Experienced Deep Reinforcement Learning with Generative Adversarial Networks (GANs) for Model-Free Ultra Reliable Low Latency Communication
Preprint 2019
Authors
Ali Taleb Zadeh Kasgari
Walid Saad
Mohammad Mozaffari
Abstract
In this paper, a novel experienced deep reinforcement learning (deep-RL) framework is proposed to provide model-free resource allocation for ultra reliable low latency communication (URLLC). The proposed experienced deep-RL framework can guarantee high end-to-end reliability and low end-to-end latency, under explicit data rate constraints, for each wireless user without any models of or assumptions on the users' traffic. In particular, to enable the deep-RL framework to account for extreme network conditions and operate in highly reliable systems, a new approach based on generative adversarial networks (GANs) is proposed. This GAN approach is used to pre-train the deep-RL framework using a mix of real and synthetic data, thus creating an experienced deep-RL framework that has been exposed to a broad range of network conditions. Formally, the URLLC resource allocation problem is posed as a power minimization problem under reliability, latency, and rate constraints. To solve this problem using experienced deep-RL, first, the rate of each user is determined. Then, these rates are mapped to the resource block and power allocation vectors of the studied wireless system. Finally, the end-to-end reliability and latency of each user are used as feedback to the deep-RL framework. It is then shown that at the fixed point of the deep-RL algorithm, the reliability and latency of the users are near-optimal. Moreover, for the proposed GAN approach, a theoretical limit for the generator output is analytically derived. Simulation results show how the proposed approach can achieve near-optimal performance within the rate-reliability-latency region, depending on the network and service requirements. The results also show that the proposed experienced deep-RL framework is able to remove the transient training time that makes conventional deep-RL methods unsuitable for URLLC.
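As a rough illustration of the kind of problem the abstract describes, a power minimization under reliability, latency, and rate constraints could be written as below. This is a generic sketch and not the paper's own notation. Every symbol is an assumption introduced here: $N$ is the number of users, $P_i$ the transmit power allocated to user $i$, $r_i$ its data rate, $r_i^{\min}$ a minimum rate, $D_i$ its end-to-end delay, $D_i^{\max}$ a latency target, and $\gamma_i$ a target reliability. Reliability is read here as the probability that the delay stays within the target, which is one common URLLC convention and may differ from the paper's definition.

```latex
% Hypothetical formulation; all symbols are illustrative assumptions.
\begin{equation}
\begin{aligned}
\min_{\{P_i\}} \quad & \sum_{i=1}^{N} P_i \\
\text{s.t.} \quad & \Pr\left( D_i \le D_i^{\max} \right) \ge \gamma_i, \quad i = 1, \dots, N, \\
& r_i \ge r_i^{\min}, \quad i = 1, \dots, N, \\
& P_i \ge 0, \quad i = 1, \dots, N.
\end{aligned}
\end{equation}
```

The abstract also mentions resource block allocation, so the actual decision variables presumably include a resource block assignment alongside the power vector; that is omitted here to keep the sketch short.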