In this paper, we investigated the power allocation problem of maximizing energy efficiency while guaranteeing the quality of service of all users in a downlink MIMO-NOMA system. Two deep reinforcement learning-based frameworks, referred to as the multi-agent DDPG-based and multi-agent TD3-based power allocation frameworks, are proposed to solve this non-convex and dynamic optimization problem. In particular, with the current channel conditions as input, each agent in the two multi-agent frameworks dynamically outputs the power allocation policy for all users in its cluster via the DDPG/TD3 algorithm, and an additional actor network is added to the conventional multi-agent model to adjust the power budgets allocated to the clusters, thereby improving the overall system performance. Finally, both frameworks refine the entire power allocation policy by updating the weights of the neural networks according to the feedback from the system. Simulation results show that the proposed multi-agent deep reinforcement learning-based power allocation frameworks significantly improve the energy efficiency of the MIMO-NOMA system under various transmit power limits and minimum data rate requirements compared with other benchmark approaches over MIMO-NOMA.
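The two-level actor structure described above can be sketched as follows. This is a minimal, illustrative PyTorch example only: the class names (`ClusterActor`, `GlobalActor`), layer sizes, and the softmax-based power parameterisation are our assumptions for exposition, not the exact networks, training loop, or reward used in the paper.

```python
# Minimal sketch of the hierarchical actor structure: one actor per cluster for
# intra-cluster power splitting, plus an additional actor that divides the total
# transmit power budget across clusters. Illustrative assumptions throughout.
import torch
import torch.nn as nn


class ClusterActor(nn.Module):
    """Per-cluster agent: maps the cluster's channel state to per-user power fractions."""

    def __init__(self, state_dim: int, num_users: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_users),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        # Softmax keeps the per-user fractions non-negative and summing to 1.
        return torch.softmax(self.net(state), dim=-1)


class GlobalActor(nn.Module):
    """Additional actor: splits the total transmit power budget across clusters."""

    def __init__(self, state_dim: int, num_clusters: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, num_clusters),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return torch.softmax(self.net(state), dim=-1)


def allocate_power(global_state, cluster_states, p_max, global_actor, cluster_actors):
    """Compose the two levels: cluster budgets times per-user fractions."""
    cluster_budgets = p_max * global_actor(global_state)  # one budget per cluster
    return [
        cluster_budgets[k] * actor(cluster_states[k])     # per-user transmit powers
        for k, actor in enumerate(cluster_actors)
    ]


if __name__ == "__main__":
    num_clusters, users_per_cluster, state_dim, p_max = 3, 2, 8, 10.0
    g_actor = GlobalActor(state_dim * num_clusters, num_clusters)
    c_actors = [ClusterActor(state_dim, users_per_cluster) for _ in range(num_clusters)]
    g_state = torch.randn(state_dim * num_clusters)
    c_states = [torch.randn(state_dim) for _ in range(num_clusters)]
    for k, p in enumerate(allocate_power(g_state, c_states, p_max, g_actor, c_actors)):
        print(f"cluster {k}: user powers = {p.detach().numpy()}")
```

In a DDPG/TD3 training loop, the actor outputs above would be perturbed with exploration noise, and the actor and critic weights would be updated from the system's energy-efficiency feedback, as summarised in the paragraph above.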
The results of this research have been published in IET Communications under the title "Multi-agent deep reinforcement learning-based energy efficient power allocation in downlink MIMO-NOMA systems" (https://doi.org/10.1049/cmu2.12177).