Deep Reinforcement Learning for Mobile Video Offloading in Heterogeneous Cellular Networks

Nan Zhao, Chao Tian, Menglin Fan, Minghu Wu, Xiao He, Pengfei Fan
DOI: 10.4018/IJMCMC.2018100103

Abstract

Heterogeneous cellular networks can balance mobile video loads and reduce cell deployment costs, making them an important technology for future mobile video communication networks. Because the mobile offloading problem is non-convex, designing an optimal strategy is a challenging issue. To ensure users' quality of service and the long-term overall network utility, this article proposes a distributed optimization method based on multi-agent reinforcement learning for downlink heterogeneous cellular networks. In addition, to address the computational load caused by the large action space, deep reinforcement learning is introduced to obtain the optimal policy. The learned policy efficiently provides a near-optimal solution with fast convergence. Simulation results show that the proposed approach improves performance more effectively than the Q-learning method.

Introduction

As wireless devices proliferate rapidly, mobile video communication networks face enormous challenges in increasing network capacity to meet ever-growing demand (Huang et al., 2017; Zhao et al., 2017a). By offloading user equipment (UEs) from the macro base station (MBS) to femto base stations (FBSs), the heterogeneous network (HetNet) (Zhang et al., 2016; Chen et al., 2015; Zhao et al., 2018) balances mobile video communication network traffic (Lien et al., 2015; Wu et al., 2015). Furthermore, to improve the overall spectral efficiency of the cellular network, the FBS and the MBS (Wang et al., 2016; Bashar, 2015) can share the same channel. Consequently, HetNets can improve both network capacity and energy efficiency, and are regarded as a promising approach for future networks.
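To make the co-channel trade-off concrete, the following is a minimal sketch of the downlink SINR and achievable rate for a UE offloaded to an FBS while the MBS interferes on the same channel. It assumes a simple log-distance path-loss model and Shannon capacity; all transmit powers, distances, and the path-loss exponent are illustrative assumptions, not values from this article.

```python
import numpy as np

# Illustrative sketch: downlink SINR and Shannon rate for a UE when the
# MBS and an FBS share the same channel (co-channel deployment).
# All parameter values below are assumptions for illustration only.

def rx_power_dbm(tx_power_dbm, distance_m, path_loss_exp=3.5):
    """Received power under a simple log-distance path-loss model."""
    return tx_power_dbm - 10.0 * path_loss_exp * np.log10(distance_m)

def sinr_linear(signal_dbm, interference_dbm, noise_dbm=-104.0):
    """SINR with the co-channel interferer treated as noise."""
    to_mw = lambda dbm: 10.0 ** (dbm / 10.0)
    return to_mw(signal_dbm) / (to_mw(interference_dbm) + to_mw(noise_dbm))

bandwidth_hz = 10e6  # assumed 10 MHz channel

# UE served by a nearby FBS (30 m away) while the MBS (500 m away) interferes.
fbs_rx = rx_power_dbm(tx_power_dbm=20.0, distance_m=30.0)
mbs_rx = rx_power_dbm(tx_power_dbm=46.0, distance_m=500.0)

sinr = sinr_linear(signal_dbm=fbs_rx, interference_dbm=mbs_rx)
rate_bps = bandwidth_hz * np.log2(1.0 + sinr)  # Shannon rate
print(f"SINR = {10 * np.log10(sinr):.1f} dB, rate = {rate_bps / 1e6:.1f} Mbit/s")
```

Under these assumed parameters, the short FBS link dominates the distant MBS interference, which is precisely why offloading a UE to a nearby femtocell can raise spectral efficiency even on a shared channel.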

The mobile offloading problem is one of the factors influencing the performance of HetNets, and it has been investigated in several existing works (Ye et al., 2013; Bayat et al., 2014; Elsherif et al., 2015). In (Ye et al., 2013), user association was proposed to solve the load balancing problem in heterogeneous cellular networks. Distributed user association and femtocell allocation were investigated in HetNets (Bayat et al., 2014). The authors in (Elsherif et al., 2015) investigated resource allocation and inter-cell interference management to obtain the optimal offloading strategy. Given the non-convexity of the mobile offloading optimization problem, it is difficult to obtain the globally optimal strategy. Several methods have been developed recently: game theory was applied in (Shen et al., 2014), and Markov approximation (Chen et al., 2013) has also been used to solve this problem. Nevertheless, these existing optimization solutions cannot obtain the optimal strategy effectively without complete and accurate network information, which is usually not available in practice. This paper therefore introduces a reinforcement learning method to solve the mobile offloading optimization problem in HetNets.

Reinforcement learning (RL) methods (Katayama, 2016; Dulac-Arnold et al., 2016; Levine et al., 2017) can obtain the optimal policy for intelligent decision problems by interacting with the environment. Moreover, RL optimizes long-term goals rather than only the immediate reward (Degris et al., 2006; Dung et al., 2006; Eremeev et al., 2018). The most widely used RL technique is Q-learning. The authors in (Bennis et al., 2010) proposed a Q-learning based approach to interference avoidance in self-organized femtocell networks. In a single-agent RL system, independent agents can alter their actions without collaboration, which may result in fluctuating actions in the learned strategy (Taylor et al., 2009; D'Eramo et al., 2017). When there are multiple agents in the environment, the dynamics of the environment must be taken into account, since they depend on the behaviors of the other agents. In addition, because the cumulative reward of one UE may be inevitably influenced by other UEs' actions, cooperative multi-agent reinforcement learning (MARL) (ElTantawy et al., 2013) should be considered. Unfortunately, obtaining the optimal strategy in MARL raises many issues (Awheda et al., 2016; Graham et al., 2010; Wu et al., 2009), such as convergence, learning speed, and multiple equilibria.
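Since the article builds on Q-learning as its baseline, a minimal tabular Q-learning sketch is given below. The state encoding, action set (serving base station), and reward are placeholder assumptions for illustration, not this article's actual formulation of the offloading problem.

```python
import numpy as np

# Minimal tabular Q-learning sketch (illustrative; the state, action, and
# reward definitions are assumptions, not this article's formulation).
# States might encode channel quality; actions pick the serving BS
# (0 = stay on MBS, 1 = offload to FBS).

n_states, n_actions = 8, 2
alpha, gamma, epsilon = 0.1, 0.9, 0.1  # learning rate, discount, exploration
Q = np.zeros((n_states, n_actions))
rng = np.random.default_rng(0)

def step(state, action):
    """Toy environment: reward and next state are random placeholders."""
    reward = rng.normal(loc=float(action == state % 2))  # assumed reward model
    return reward, rng.integers(n_states)

state = rng.integers(n_states)
for _ in range(10_000):
    # Epsilon-greedy action selection.
    if rng.random() < epsilon:
        action = int(rng.integers(n_actions))
    else:
        action = int(np.argmax(Q[state]))
    reward, next_state = step(state, action)
    # Q-learning temporal-difference update.
    Q[state, action] += alpha * (reward + gamma * Q[next_state].max()
                                 - Q[state, action])
    state = next_state

print(np.round(Q, 2))
```

The Q-table grows with the product of state and action space sizes, which is exactly the computational load the abstract points to; replacing the table with a neural function approximator, as deep reinforcement learning does, is the article's proposed remedy for large action spaces.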
