Multi-Agent Reinforcement Learning-Based Resource Management for V2X Communication

Nan Zhao, Jiaye Wang, Bo Jin, Ru Wang, Minghu Wu, Yu Liu, Lufeng Zheng
DOI: 10.4018/IJMCMC.320190

Abstract

Cellular vehicle-to-everything (V2X) communication is essential to support diverse future vehicular applications. However, due to the dynamic characteristics of vehicles, resource management faces significant challenges in V2X communication. In this paper, an optimization problem for the comprehensive efficiency of the V2X communication network is established. Considering the non-convexity of the optimization problem, this paper utilizes a Markov decision process (MDP) to solve it. The MDP is formulated through the design of the state, action, and reward function for vehicle-to-vehicle links. Then, a multi-agent deep Q network (MADQN) method is proposed to improve the comprehensive efficiency of the V2X communication network. Simulation results show that the MADQN method outperforms the baseline methods, achieving higher comprehensive efficiency of the V2X communication network.

Introduction

With the development of an intelligent transportation system, vehicle-to-everything (V2X) communication can improve traffic efficiency, road safety, and vehicle entertainment experience through wireless connection between road infrastructure and vehicles (Prathiba et al., 2021; Haapola et al., 2021). In V2X communication, vehicle-to-infrastructure (V2I) and vehicle-to-vehicle (V2V) links are used to support the various vehicle applications (Chen et al., 2017; Thunberg et al., 2021).

The differing access requirements of vehicles and the limited spectrum available for V2X communication make it difficult to meet the demands of massive data transmission. Therefore, resource management is essential to improve the performance of a V2X communication network. Lee et al. (2017) used an efficient cluster-based resource management scheme to improve cellular user sum rate, average packet performance, and throughput. Bahonar et al. (2021) proposed a low-complexity resource allocation method for dense cellular V2X networks, and Bischoff et al. (2021) proposed a decentralized V2X resource allocation scheme that can be used for cooperative driving. Zhou et al. (2021) described a multichannel access management approach for software-defined cellular V2X networks, and Pang et al. (2021) proposed an intelligent network resource management system to overcome the high-mobility edge computing problem of a 5G vehicle network. Although many studies apply conventional optimization methods to V2X resource management, these methods depend on global channel state information, which is difficult to obtain in practice, making the optimal solution hard to find.

To address the challenges of obtaining global channel state information, we adopt the reinforcement learning (RL) method (Zhao et al., 2021; Zhao et al., 2020) in this paper. Currently one of the most powerful machine learning tools, RL is usually applied to time-varying dynamic systems (Wu et al., 2018; Yan et al., 2018) and wireless networks (Simsek et al., 2018; Zhao et al., 2018). Liang et al. (2019) proposed a multi-agent reinforcement learning (MARL) approach for spectrum sharing in a vehicular network. Zhang et al. (2019) proposed a deep reinforcement learning method to solve the problem of resource allocation and mode selection in a cellular V2X network. Liu et al. (2020) described the use of a deep reinforcement learning method to optimize the spectrum efficiency and the energy efficiency of a V2X network. Choi et al. (2021) described a distributed congestion control method based on deep reinforcement learning to improve the traffic efficiency of a cellular V2X network.

However, most of the aforementioned literature does not comprehensively consider the randomness of V2V link transmission data and vehicle dynamics in V2X communication networks. Therefore, this paper proposes a multi-agent deep Q network (MADQN) method to address these challenges.

In this paper, we propose the MADQN method to find the optimal solution of the optimization problem; that is, the optimal spectrum allocation and transmission power selection strategy of the V2V links that maximizes the comprehensive efficiency of the V2X communication network. The main contributions of this work are as follows:
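To make the multi-agent Q-learning idea concrete, the sketch below shows how each V2V link could act as an independent agent choosing a (sub-band, power level) action from its local state and updating its Q-function from a shared reward. This is only a minimal illustration of the general MADQN pattern, not the authors' implementation: the state features, network sizes, and reward shaping here are assumptions, and a simple linear Q-function stands in for the deep Q network.

```python
import numpy as np

# Hypothetical MADQN-style sketch: each V2V link is an agent picking a
# (sub-band, power level) action; all sizes below are illustrative.
N_AGENTS = 4          # number of V2V links (agents)
N_BANDS = 4           # spectrum sub-bands shared with V2I links
N_POWERS = 3          # discrete transmit-power levels
N_ACTIONS = N_BANDS * N_POWERS
STATE_DIM = 6         # e.g. local CSI, interference, remaining payload

rng = np.random.default_rng(0)

class LinearDQNAgent:
    """Tiny linear Q-function standing in for a deep Q network."""
    def __init__(self, lr=0.01, gamma=0.95, eps=0.1):
        self.W = rng.normal(scale=0.1, size=(N_ACTIONS, STATE_DIM))
        self.lr, self.gamma, self.eps = lr, gamma, eps

    def q_values(self, s):
        return self.W @ s

    def act(self, s):
        # epsilon-greedy over the joint (band, power) action set
        if rng.random() < self.eps:
            return int(rng.integers(N_ACTIONS))
        return int(np.argmax(self.q_values(s)))

    def update(self, s, a, r, s_next):
        # one-step TD(0) update toward r + gamma * max_a' Q(s', a')
        target = r + self.gamma * np.max(self.q_values(s_next))
        td_err = target - self.q_values(s)[a]
        self.W[a] += self.lr * td_err * s
        return float(td_err)

def reward(actions):
    # Assumed reward shaping: favor sub-band diversity among V2V links,
    # a crude proxy for lower mutual interference.
    bands = {a // N_POWERS for a in actions}
    return len(bands) / N_BANDS

agents = [LinearDQNAgent() for _ in range(N_AGENTS)]
for step in range(200):
    states = [rng.normal(size=STATE_DIM) for _ in range(N_AGENTS)]
    actions = [ag.act(s) for ag, s in zip(agents, states)]
    r = reward(actions)  # shared reward encourages cooperation
    next_states = [rng.normal(size=STATE_DIM) for _ in range(N_AGENTS)]
    for ag, s, a, ns in zip(agents, states, actions, next_states):
        ag.update(s, a, r, ns)
```

The shared reward is what makes this a cooperative multi-agent setting: each agent learns from its own local observation, but the signal it optimizes reflects the network-wide outcome, which is the core idea behind using MARL for joint spectrum and power management.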
