Matching Combined Heterogeneous Multi-Agent Reinforcement Learning for Resource Allocation in NOMA-V2X Networks.

Gao, A., Zhu, Z., Zhang, J., Liang, W. and Hu, Y., 2024. Matching Combined Heterogeneous Multi-Agent Reinforcement Learning for Resource Allocation in NOMA-V2X Networks. IEEE Transactions on Vehicular Technology, 73 (10), 15109-15124.

Full text available as:

PDF
V2V multiplexes V2I channels240530_final.pdf - Accepted Version
Restricted to Repository staff only until 4 June 2026.
Available under License Creative Commons Attribution Non-commercial.
12MB

Copyright to original material in this document is with the original owner(s). Access to this content through BURO is granted on condition that you use it only for research, scholarly or other non-commercial purposes. If you wish to use it for any other purposes, you must contact BU via BURO@bournemouth.ac.uk.

Any third party copyright material in this document remains the property of its respective owner(s). BU grants no licence for further use of that third party material.

DOI: 10.1109/TVT.2024.3409048

Abstract

To combat spectrum scarcity, non-orthogonal multiple access (NOMA) and vehicle-to-everything (V2X) systems are integrated as NOMA-V2X networks, where multiple vehicle-to-vehicle (V2V) links can opportunistically reuse the spectrum licensed to vehicle-to-infrastructure (V2I) links. However, the contradictory quality of service (QoS) requirements of V2V and V2I links make the design of an effective spectrum sharing scheme a great challenge. The high mobility of vehicles and the extra interference introduced by NOMA make the problem more complex. Hence, a matching combined multi-agent deep deterministic policy gradient (MADDPG) algorithm is proposed in the paper to maximize the sum delivery rate of V2I uplinks while guaranteeing the reliability of V2V links in NOMA-V2X networks. Specifically, the channel assignment is solved in advance by one-to-many matching, which can be theoretically proved to converge to a stable state. Heterogeneous MADDPG is then adopted to obtain proper power control for V2I link pairs and V2V links, which are treated as different types of agents and interact with the environment independently. On this basis, a fully decentralized framework is designed for the proposed algorithm to reduce the communication overhead caused by information synchronization. Simulation results demonstrate that by introducing matching theory into deep reinforcement learning (DRL), the sum delivery rate of V2I links can be greatly improved with lower computational complexity and shorter convergence time. Moreover, compared with orthogonal multiple access (OMA) communications, the outstanding energy and spectrum efficiency make NOMA worth exploring in V2X networks.
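
The abstract's channel-assignment step relies on one-to-many matching. As an illustration only, and not the authors' algorithm, the sketch below runs a generic deferred-acceptance procedure in which V2V links propose to V2I channels and each channel keeps at most a fixed number of links. The function name, the preference lists, and the quota are all assumptions made for this example; the record does not say how preferences are built.

```python
def one_to_many_match(v2v_prefs, channel_prefs, quota):
    """Generic deferred acceptance (hypothetical helper, not from the paper).

    v2v_prefs:     dict link -> list of channels, most preferred first.
    channel_prefs: dict channel -> list of links, most preferred first.
    quota:         maximum number of V2V links one channel may host.
    Returns a dict channel -> list of accepted links.
    """
    # Lower rank value means the channel prefers that link more.
    rank = {c: {l: i for i, l in enumerate(p)} for c, p in channel_prefs.items()}
    next_choice = {l: 0 for l in v2v_prefs}   # index of next channel to try
    accepted = {c: [] for c in channel_prefs}
    free = list(v2v_prefs)                    # links still looking for a channel

    while free:
        link = free.pop()
        prefs = v2v_prefs[link]
        if next_choice[link] >= len(prefs):
            continue                          # list exhausted: link stays unmatched
        ch = prefs[next_choice[link]]
        next_choice[link] += 1
        if link not in rank[ch]:
            free.append(link)                 # channel does not list this link
            continue
        accepted[ch].append(link)
        if len(accepted[ch]) > quota:
            # Over quota: the channel rejects its least-preferred link.
            worst = max(accepted[ch], key=rank[ch].__getitem__)
            accepted[ch].remove(worst)
            free.append(worst)
    return accepted


# Assumed toy instance: three V2V links, two V2I channels, quota of two.
example = one_to_many_match(
    v2v_prefs={"v1": ["c1", "c2"], "v2": ["c1", "c2"], "v3": ["c1", "c2"]},
    channel_prefs={"c1": ["v3", "v1", "v2"], "c2": ["v1", "v2", "v3"]},
    quota=2,
)
```

In this toy instance all three links first propose to c1, which is over quota, so its least-preferred link is rejected and falls back to its second choice. The loop ends once no link is both unmatched and still has a channel left to try, which is the stable state the abstract refers to in general terms.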

Item Type:	Article
ISSN:	0018-9545
Uncontrolled Keywords:	Vehicular networks; NOMA-V2X; multi-agent DRL; matching; spectrum sharing
Group:	Faculty of Science & Technology
ID Code:	39914
Deposited By:	Symplectic RT2
Deposited On:	06 Jun 2024 06:38
Last Modified:	27 Nov 2024 15:16
