Article

Resource Allocation for IRS assisted SGF NOMA Transmission: A MADRL Approach

Abstract

Non-orthogonal multiple access (NOMA) assisted semi-grant-free (SGF) transmission has been viewed as one of the promising technologies to meet the massive connectivity requirements of next-generation networks. A novel intelligent reconfigurable surface (IRS) assisted SGF NOMA transmission system is proposed, where the IRS is employed to satisfy the channel gain requirements for grant-based users (GBUs) and grant-free users (GFUs). The dynamic optimization of the sub-carrier assignment and power allocation for roaming GFUs, together with the amplitude control and phase shift design for the reflecting elements of the IRS, is formulated. Aiming at maximizing the long-term data rate of all GFUs, the optimization problem is first modeled as a multi-agent Markov decision problem. Then, three multi-agent deep reinforcement learning based frameworks are proposed to solve the problem under three different IRS cases: the ideal IRS, the non-ideal IRS with continuous phase shifts, and the non-ideal IRS with discrete phase shifts. Specifically, for each GFU agent, a sub-carrier assignment deep Q-network (DQN) and a power allocation deep deterministic policy gradient (DDPG) are integrated to dynamically assign network resources for each GFU. For the single IRS agent, two DDPGs are integrated to dynamically assign the phase shift and amplitude for each reflecting element of the ideal IRS. A single DDPG for dynamically assigning continuous phase shifts, and parallel DQNs for dynamically assigning discrete phase shifts, are also proposed for the non-ideal IRS with fixed amplitude. Simulation results demonstrate that: 1) The network sum rates of all GFUs can achieve a significant improvement with the aid of the IRS, compared with the system without IRS. 2) The network sum rates of the NOMA assisted SGF transmissions are superior to those of OMA assisted GF transmissions.
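The three IRS cases named in the abstract differ only in which part of each element's reflection coefficient is adjustable. The following is a minimal sketch of that distinction, not the authors' implementation. It assumes the common model in which each reflecting element applies a coefficient beta * exp(j * theta), and that the discrete case uses 2**bits uniformly spaced phase levels. The function names, the `bits` parameter, and the default `fixed_amplitude` are hypothetical.

```python
import cmath
import math


def reflection_coefficient(amplitude, phase):
    """Return amplitude * exp(j * phase) for one reflecting element."""
    if not 0.0 <= amplitude <= 1.0:
        raise ValueError("amplitude must lie in [0, 1]")
    return amplitude * cmath.exp(1j * phase)


def quantize_phase(phase, bits):
    """Snap a phase to the nearest of 2**bits uniform levels in [0, 2*pi)."""
    levels = 2 ** bits
    step = 2.0 * math.pi / levels
    # Wrap into [0, 2*pi), round to the nearest level, wrap the top level to 0.
    index = round((phase % (2.0 * math.pi)) / step) % levels
    return index * step


def irs_coefficients(amplitudes, phases, case, bits=2, fixed_amplitude=1.0):
    """Build per-element coefficients for one of the three IRS cases.

    case == "ideal":      amplitude and phase are both adjustable.
    case == "continuous": amplitude is fixed, phase is continuous.
    case == "discrete":   amplitude is fixed, phase is quantized.
    """
    if case == "ideal":
        return [reflection_coefficient(a, p) for a, p in zip(amplitudes, phases)]
    if case == "continuous":
        return [reflection_coefficient(fixed_amplitude, p) for p in phases]
    if case == "discrete":
        return [
            reflection_coefficient(fixed_amplitude, quantize_phase(p, bits))
            for p in phases
        ]
    raise ValueError("case must be 'ideal', 'continuous', or 'discrete'")
```

In this sketch a continuous-action learner such as DDPG would output the `amplitudes` and `phases` lists directly, whereas a discrete-action learner such as a DQN would pick one of the 2**bits level indices per element.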

... Traditionally, these channel conditions are assumed to be fixed and determined solely by the users' propagation environment. However, using RIS presents opportunities for intelligently reconfiguring the users' propagation environment to facilitate the implementation of NOMA, which can result in significant performance gains [303], [304]. In particular, the combination of RIS with NOMA provides greater degrees of freedom for system design when the channel gains of both ... This can be exploited to not only generate a significant difference in the users' channel gains but also to customize the users' effective channel gains according to the users' QoS requirements [105], [305]. ...
... In contrast to the conventional optimizing techniques [313], [314], ML-based techniques are gaining popularity due to their ability to adapt to instantaneous channel realization based on the radio environment [212], [219], [304]. Moreover, ML-empowered optimization techniques can be designed to further improve RIS-NOMA performance. ...
... To overcome this problem, researchers proposed ML-based optimization techniques for the RIS-NOMA networks. For instance, authors in [304] proposed a semi-grant-free transmission scheme for the RIS-NOMA networks. The authors investigated ideal and non-ideal RIS cases in their proposed network and modelled a joint-optimization problem to maximize the long-term data rate. ...
Preprint
Full-text available
Revolutionary sixth-generation wireless communications technologies and applications, notably digital twin networks (DTN), connected autonomous vehicles (CAVs), space-air-ground integrated networks (SAGINs), zero-touch networks, industry 5.0, and healthcare 5.0, are driving next-generation wireless networks (NGWNs). These technologies generate massive data, requiring swift transmission and trillions of device connections, fueling the need for sophisticated next-generation multiple access (NGMA) schemes. NGMA enables massive connectivity in the 6G era, optimizing NGWN operations beyond current multiple access (MA) schemes. This survey showcases non-orthogonal multiple access (NOMA) as NGMA's frontrunner, exploring three questions: "What has NOMA delivered?", "What is NOMA providing?", and "What lies ahead?". We present NOMA variants, fundamental operations, and applicability in multi-antenna systems, machine learning, reconfigurable intelligent surfaces (RIS), cognitive radio networks (CRN), integrated sensing and communications (ISAC), terahertz networks, and unmanned aerial vehicles (UAVs). Additionally, we explore NOMA's interplay with state-of-the-art wireless technologies, highlighting its advantages and technical challenges. Finally, we unveil NOMA research trends in the 6G era and provide design recommendations and future perspectives for NOMA as the leading NGMA solution for NGWNs.
... 7: The BS broadcasts the SC selection and the PA to users. 8: All agents observe the reward r(t) in (39) and move to the next states s_m(t + 1) (∀m). 9: for m = 1, ... ...
... • Reward: Similar to the JOCDDQN method, the reward function in TS t for the FDDQN approach is also determined based on the achieved EE as in (39). ...
... , T do 5: All agents take an action a_z(t), ∀z, following the ε-greedy policy in (38), where a_z(t) is defined in (44) if z ∈ M_E ∪ M_U and in (45) if z ∈ M_M. 6: All agents observe the reward r(t) in (39) and move to the next states s_z(t + 1) (∀z). 7: for z = 1, ... ...
Article
Full-text available
The escalating number of wireless users requiring different services, such as enhanced mobile broadband (eMBB), massive machine-type communications (mMTC), and ultra-reliable low-latency communications (URLLC), has led to exploring non-orthogonal multiplexing methods like heterogeneous non-orthogonal multiple access (H-NOMA). This method allows users demanding divergent services to share the same resources. However, implementing the H-NOMA scheme faces major resource management challenges due to unpredictable interference caused by the random access mechanism of mMTC users. To address this issue, this paper proposes a joint optimization and cooperative multi-agent (MA) deep reinforcement learning-based resource allocation mechanism, aimed at maximizing the energy efficiency (EE) of H-NOMA-based networks. Specifically, this work initially establishes an optimization framework capable of determining the optimal power allocation for any specific sub-channel assignment (SA) setting for all users. Based on that, a cooperative MA double deep Q network (CMADDQN) scheme is carefully designed at the base station to conduct SA among users. In addition, a distributed full learning-based approach using MADDQN for both SA and power allocation is also designed for comparison purposes. Simulation results show that the proposed joint optimization and machine learning method outperforms the solely-learning-based approach and other benchmark schemes in terms of convergence rate and EE performance.
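The algorithm excerpts quoted for this article have every agent pick its action with an ε-greedy policy. The cited policy in (38) is not reproduced on this page, so the sketch below shows only the generic textbook form of ε-greedy selection over a list of Q-values. The function name and the `rng` parameter are hypothetical.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick an action index from a list of Q-values.

    With probability epsilon a uniformly random action is explored;
    otherwise the action with the largest Q-value is exploited.
    """
    if not q_values:
        raise ValueError("q_values must not be empty")
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    # Greedy choice: index of the maximum Q-value (first one on ties).
    return max(range(len(q_values)), key=q_values.__getitem__)
```

For example, `epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0)` always exploits and returns index 1. Training loops commonly decay `epsilon` over time, although the schedule used in the cited work is not stated here.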
... In [19], Gao et al. proposed a deep Q-network (DQN) based algorithm to jointly optimize IRS phase shifts and cluster power allocation in a NOMA system using the zero forcing approach. A multi-agent DRL-based design was proposed in [20] for solving the resource allocation problem in IRS-assisted semi-grant-free NOMA transmissions. Furthermore, Benfaid et al. proposed a resource allocation framework for unmanned aerial vehicles (UAV)-NOMA systems based on DQN [21]. ...
... With the normalized action, we then reward the agent with the sum-rate in (20) if the QoS requirements are satisfied under the channel uncertainty; otherwise, the agent is punished with a negative reward. Any negative reward will work, as the agent will try to avoid such actions in the future. ...
... Therefore, if a t satisfies the QoS constraints under some bounded error region, the agent will be given a positive reward according to (20), otherwise, it will be punished with the negative reward in (26). Algorithm 1 summarizes the proposed TD3-based algorithm for solving the original robust design problem. ...
Article
Full-text available
In this paper, we propose a robust design for an intelligent reflecting surface (IRS)-assisted multiple-input single output non-orthogonal multiple access (NOMA) system. By considering channel uncertainties, the original robust design problem is formulated as a sum-rate maximization problem under a set of constraints. In particular, the uncertainties associated with reflected channels through IRS elements and direct channels are taken into account in the design and they are modelled as bounded errors. However, the original robust problem is not jointly convex in terms of beamformers at the base station and phase shifts of IRS elements. Therefore, we reformulate the original robust design as a reinforcement learning problem and develop an algorithm based on the twin-delayed deep deterministic policy gradient agent (also known as TD3). In particular, the proposed algorithm solves the original problem by jointly designing the beamformers and the phase shifts, which is not possible with conventional optimization techniques. Numerical results are provided to validate the effectiveness and evaluate the performance of the proposed robust design. In particular, the results demonstrate the competitive and promising capabilities of the proposed robust algorithm, which achieves significant gains in terms of robustness and system sum-rates over the baseline deep deterministic policy gradient agent. In addition, the algorithm has the ability to deal with fixed and dynamic channels, which gives deep reinforcement learning methods an edge over hand-crafted convex optimization-based algorithms.
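The excerpts quoted for this article describe a reward that equals the sum-rate when every QoS requirement holds and a negative penalty otherwise. A minimal sketch of that rule follows. It checks QoS on the given rates only and does not model the bounded channel uncertainty region of the cited design. The function name, the argument names, and the default penalty of -1.0 are hypothetical; the excerpt states only that any negative value works.

```python
def qos_reward(rates, min_rates, penalty=-1.0):
    """Return the sum-rate if all QoS targets are met, else a negative penalty.

    rates:     achieved rate of each user under the chosen action
    min_rates: minimum rate (QoS requirement) of each user
    penalty:   negative reward returned when any requirement is violated
    """
    if len(rates) != len(min_rates):
        raise ValueError("rates and min_rates must have the same length")
    if penalty >= 0:
        raise ValueError("penalty must be negative")
    if all(rate >= target for rate, target in zip(rates, min_rates)):
        return sum(rates)
    return penalty
```

For example, `qos_reward([2.0, 1.5], [1.0, 1.0])` yields the sum-rate 3.5, while `qos_reward([2.0, 0.5], [1.0, 1.0])` yields the penalty because the second user misses its target.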
... This, in turn, prompts some researchers to leverage the multi-agent paradigm [26][27][28][29][30]. Specifically, in [30], researchers presented a multi-agent-based deep Q-network approach that optimized the powers of the BSs, considering discrete power levels. There is a scarcity of literature dedicated to multi-agent algorithms in IRS-assisted systems [31,32]. In [31], the considered problem of power allocation and subcarrier assignment was formulated and solved by a multi-agent-based approach, while in [32], a multi-agent reinforcement learning-based buffer was proposed for aiding relay selection in an IRS-aided system. ...
... There is a scarcity of literature dedicated to multiagent algorithms in IRS-assisted systems [31,32]. In [31], the considered problem of power allocation and subcarrier assignment was formulated and solved by a multi-agent-based approach, while in [32], a multi-agent reinforcement learning-based buffer was proposed for aiding relay selection in an IRS-aided system. Precisely, the agents were trained to optimize the IRS reflection phases and relay selection to achieve maximum throughput. ...
... The computational complexity of the proposed algorithm can be split into two primary parts: forward propagation and backpropagation for action selection and training models [31], respectively. ...
Article
Full-text available
Intelligent reflecting surface (IRS) is a revolutionary technology for improving the spectral and energy efficiency of future wireless networks. In this paper, we consider a downlink large-scale system empowered by multi-IRS to aid communication between the multiple base stations (BSs) and multiple user equipment (UEs). We target maximizing the sum rate by jointly optimizing the UE association, the transmit powers of BSs, and the configurations of the IRS beamforming. Due to the applicability restrictions of conventional optimization methods and their high complexity with large-scale networks in dynamic environments, deep reinforcement learning (DRL) is adopted as an alternative approach to finding optimal solutions. First, we model the optimization problem as a multi-agent Markov decision problem (MAMDP). Then, because large-scale wireless networks are naturally complex and changeable, and because many entities interact and affect how the whole system works, it is important to use a multi-agent approach to understand the complex dependencies and relationships between the different parts. In order to solve the problem, we propose a cooperative multi-agent deep reinforcement learning (MADRL)-based algorithm that works for both continuous and discrete IRS phase shifts. Simulation results validate that the proposed algorithm surpasses iterative optimization benchmarks regarding both sum rate performance and convergence.
... Huang et al. [17] proposed a double DQN-based spectrum access algorithm to maximize the sum throughput for D2D communication underlaying cellular networks. Chen et al. [8] solved the channel assignment, power allocation, and reflecting matrix design of the RIS-assisted semi-grant-free non-orthogonal multiple access (NOMA) systems by invoking a MADRL-based algorithm. Although there exist many DRL methods focused on the resource allocation problem in D2D communications, most works considered the optimization of only discrete variables. ...
... Otherwise, when the d-th D2D pair scheduled on channel k is working in the cellular mode, the data rate c_d^{k,c}(t) in (8) can be given by ...
... The transmit power of CUEs can then be calculated by transforming (19) with the other variables known. As shown in (16), we set the average sum data rate of D2D pairs as the performance metric of the proposed system, as it has been widely adopted in wireless systems [7,8,17]. It is noted that the achieved system performance is considered to be valid if and only if all the above constraints ((17)-(20)) are satisfied. ...
Article
Full-text available
Device-to-device (D2D) communication has been regarded as a promising solution to alleviate the mobile traffic explosion problem for its capabilities of improving system data rate and resource utilization. A reconfigurable intelligent surface (RIS) aided mobile D2D communications framework is investigated, where the RIS is deployed to improve communication quality. As the transmission distance of D2D pairs changes, the mode selection for D2D pairs and the phase shift design for the RIS are essential for mobile scenarios. Therefore, we formulate a joint optimization problem of mode selection, channel assignment, power allocation, and discrete phase shift selection to maximize the average sum data rate of D2D pairs. This problem is also constrained by the maximum transmit power and the minimum data rate requirements of users, where the latter is to guarantee the fairness of D2D pairs. We first reformulate the original sequential decision-making problem into a Markov game (MG) problem to solve the challenging optimization. Furthermore, a multi-agent deep reinforcement learning (MADRL) framework is proposed, in which multiple agents cooperatively determine the joint mode selection and resource allocation strategy. The proposed MADRL-based framework combines both the multi-pass deep Q-networks (MP-DQN) algorithm and the decaying DQN algorithm to solve the optimization problem. Specifically, we adopt the MP-DQN algorithm for D2D pairs to handle the hybrid discrete-continuous action space. Moreover, the decaying DQN algorithm is invoked by the RIS agent to select discrete phase shifts. Simulation results demonstrate that the proposed algorithm can converge under different cases. The proposed MADRL-based algorithm outperforms the combination algorithm of DQN and the deep deterministic policy gradient (DDPG) in terms of system performance. Moreover, it is also shown that the average sum data rate of D2D pairs can be significantly improved by deploying the RIS and further enhanced by increasing the number of reflecting elements (REs).
... The collision issue can be resolved using massive multiple-input multiple-output (MIMO) or non-orthogonal multiple access (NOMA) technologies. The former solution utilizes spatial degrees of freedom to mitigate multi-user collisions, while the latter focuses on spectrum sharing among multiple users with successive interference cancellation (SIC) [8], [9], [10], [11]. ...
... Even though the massive connectivity can be supported through GF schemes, GB schemes are still desired, especially when strict quality of service (QoS) requirements exist [11]. The GB and GF transmission scheme must coexist in scenarios where URLLC applications are served by the GB scheme and mMTC applications in the same system are served by the GF scheme. ...
... The analysis of the secrecy outage probability of the GB user is similar to that of the GF user, expressed in Eq. (11). Due to space limitations, the analysis of the U_B's secrecy outage probability is regrettably omitted here. ...
Article
Full-text available
The semi-grant-free (SGF) transmission scheme enables grant-free (GF) users to utilize resource blocks allocated for grant-based (GB) users while maintaining the quality of service of GB users. This work investigates the secrecy performance of non-orthogonal multiple access (NOMA)-aided SGF systems. First, analytical expressions for the exact and asymptotic secrecy outage probability (SOP) of NOMA-aided SGF systems with a single GF user are derived. Then, SGF systems with multiple GF users and the best-user scheduling scheme are considered. By utilizing order statistics theory, analytical expressions for the exact and asymptotic SOP are derived. Monte Carlo simulation results are provided and compared with two benchmark schemes. The effects of system parameters on the SOP of the considered system are demonstrated and the accuracy of the developed analytical results is verified. The results indicate that both the outage target rate for GB and the secure target rate for GF are the main factors of the secrecy performance of SGF systems.
... Different from the threshold protocols given in [24], a dynamic threshold protocol was proposed for randomly admitted GFUs from the perspective of outage performance [29] and ergodic rate [30], respectively. More recently, advanced technologies, such as intelligent reflecting surfaces (IRS) and deep reinforcement learning, have been utilized to further improve the performance of the GFUs in IRS assisted SGF-NOMA transmission [31]. ...
... Although the above-mentioned works for terrestrial networks [24], [26]- [31] have shed light on the characteristic and performance of NOMA assisted SGF transmission, they are only applicable to the ideal scenario with perfect channel state information (CSI) assumption. Since the acquisition of perfect CSI is still a challenging problem in wireless systems [10], [11], it is more favorable to exploit the imperfect CSI to implement the SGF transmission. ...
... Here, the HAP equipped with a uniform concentric ring array (UCRA) functions as an aerial base station (BS) to serve multiple mobile terminals (MTs) through SDMA. Relative to the previous works that investigate SGF schemes in terrestrial networks [24], [26]- [31] or uplink access in the ISATN [9]- [11], the proposed framework is more general and has the advantage of providing flexible connectivity with high spectral efficiency and enhancing system throughput with satisfied user experience. • Based on the presented framework, we propose two SGF-based uplink transmission schemes, where either perfect CSI or imperfect CSI is available. ...
Article
Full-text available
This paper investigates a semi-grant-free (SGF) based transmission strategy to provide a flexible connectivity for various kinds of users in an integrated satellite-aerial-terrestrial network (ISATN). Herein, a high-altitude platform (HAP) termed as a grant-based user (GBU), which serves multiple mobile terminals (MTs) through space division multiple access (SDMA), wants to access a satellite network with multiple earth stations (ESs) termed as grant-free users (GFUs) simultaneously via non-orthogonal multiple access (NOMA) assisted SGF. To this end, we first propose two SGF-based uplink transmission schemes for both perfect channel state information (CSI) and imperfect CSI cases. When perfect CSI is available, a zero-forcing based beamforming (BF) scheme is used in HAP network while an adaptive transmit power allocation (ATPA) approach is adopted for SGF transmission. When only imperfect CSI is available, BF scheme employing the derived channel correlation matrix of HAP-MT link is proposed to achieve SDMA, and a novel ATPA strategy with rate probability constraint is proposed to guarantee quality-of-service of the GBU. Next, we derive the closed-form throughput expressions to evaluate the performance of the considered ISATN with the proposed two SGF-based schemes. Finally, computer simulations are conducted to validate the theoretical performance analysis and show the superiority of the proposed schemes over the related works. Moreover, our numerical results not only demonstrate a satisfactory performance of the proposed SGF-based scheme using imperfect CSI, but also reveal the impact of CSI errors on the system performance.
... It should be noted that each agent in the proposed algorithms learns its policies in a non-cooperative way, which means the convergence of the proposed MADRL algorithm relies on the DDPG algorithm. Since the deterministic policy gradient theorem employed is a limited case, the DDPG algorithm converges to a sub-optimal state [38], [39]. Overall, the convergence of the proposed MADRL algorithm can be guaranteed and the converged solution is sub-optimal [39]. ...
... Since the deterministic policy gradient theorem employed is a limited case, the DDPG algorithm converges to a sub-optimal state [38], [39]. Overall, the convergence of the proposed MADRL algorithm can be guaranteed and the converged solution is sub-optimal [39]. 2) Complexity of the SCTPD algorithm: Let L^A_{i,l} and L^C_{i,l} denote the number of neurons in the l-th layer of the actor and critic networks, respectively, and let I^A_i and I^C_i signify the number of layers for the actor and critic networks, where i ∈ {M, J}. ...
Article
Full-text available
Unmanned aerial vehicles (UAVs) play an essential role in future wireless communication networks due to their high mobility, low cost, and on-demand deployment. In air-to-ground links, UAVs are widely used to enhance the performance of wireless communication systems due to the presence of high-probability line-of-sight (LoS) links. However, the high probability of LoS links also increases the risk of being eavesdropped, posing a significant challenge to the security of wireless communications. In this work, the security problem in a multi-UAV-assisted communication system is investigated in a moving airborne eavesdropping scenario. To improve the secrecy performance of the considered communication system, aerial eavesdropping capability is suppressed by sending jamming signals from a friendly UAV. An optimization problem under flight conditions, fairness, and limited energy consumption constraints of multiple UAVs is formulated to maximize the fair sum secrecy throughput. Given the complexity and non-convex nature of the problem, we propose a two-step-based optimization approach. The first step employs the K-means algorithm to cluster users and associate them with multiple communication UAVs. Then, a heterogeneous multi-agent deep deterministic policy gradient based algorithm is introduced to solve this optimization problem. The effectiveness of the proposed algorithm is verified both theoretically and by simulation results. Index Terms: Unmanned aerial vehicles, physical layer security, fair sum secrecy throughput, heterogeneous multi-agent deep reinforcement learning.
... Xie et al. used DDPG to jointly optimize the beamforming vectors and IRS phase shifts for the sum-rate maximization problem [23]. The work in [24] proposed a multi-agent DRL-based design that jointly optimizes the subcarrier assignment, power allocation, and IRS phase shifts in NOMA-assisted semi-grant-free systems, while the resource allocation problem for NOMA-unmanned aerial vehicle system was considered in [25]. However, there are still practical issues facing the aforementioned works. ...
... Furthermore, the DQN agent utilized to solve the problem cannot be applied to problems with large continuous action spaces as DQN is restricted to discrete action space problems. The work in [24] uses a DQN agent to solve the discrete channel-assignment problem, while a DDPG agent is utilized to solve the power allocation problem. However, since the BS and the user equipment units (UEs) are assumed to be equipped with a single antenna, no beamforming design is considered. ...
Article
Full-text available
In this paper, we propose a robust resource allocation framework for an intelligent reflecting surface (IRS)-assisted multiple-input single-output (MISO) non-orthogonal multiple access (NOMA) system. In particular, a long-term robust sum-rate maximization problem is considered. The impacts of imperfect channel estimation on both the transmitter and the receiver are taken into account with an outage-constrained robust design approach. More specifically, the statistical error model is used to model the unbounded channel uncertainty in the system. However, the joint robust resource allocation problem is a mixed-integer optimization problem, which cannot be solved directly using conventional optimization algorithms. A correlation-based user pairing algorithm is proposed to group the users into clusters. Furthermore, the resource allocation problem with clustered users is reformulated as a reinforcement learning environment. Subsequently, a twin-delayed deep deterministic policy gradient (TD3) agent is developed to solve the outage-constrained robust resource allocation problem. Extensive simulation results are provided to demonstrate the superior performance of the developed TD3 agent over existing algorithms in the literature.
... Table 6 summarizes the studies classified as Resource Allocation Based on DDPG algorithms. DDPG findings demonstrate that it outperforms traditional optimization methods in resource allocation problems, as it can learn complex policies that consider the nonlinear relationships between the system parameters and the rewards [42]. Additionally, DDPG can adapt to changes in the system and learn from experience, which makes it suitable for dynamic environments such as cloud computing and energy management. ...
... (Chen et al., 2022) [42] Techniques: Two DDPG algorithms are combined to allocate the amplitude and phase shift of individual reflecting elements for an ideal intelligent reflecting surface (IRS) in a dynamic manner. Methodology: Dynamic optimization is formulated for sub-carrier assignment, power allocation, amplitude control, and phase shift design. ...
Preprint
Full-text available
Deep Reinforcement Learning (DRL) has gained significant adoption in diverse fields and applications, mainly due to its proficiency in resolving complicated decision-making problems in spaces with high-dimensional states and actions. Deep Deterministic Policy Gradient (DDPG) is a well-known DRL algorithm that adopts an actor-critic approach, synthesizing the advantages of value-based and policy-based reinforcement learning methods. The aim of this study is to provide a thorough examination of the latest developments, patterns, obstacles, and potential opportunities related to DDPG. A systematic search was conducted using relevant academic databases (Scopus, Web of Science, and ScienceDirect) to identify 85 relevant studies published in the last five years (2018-2023). We provide a comprehensive overview of the key concepts and components of DDPG, including its formulation, implementation, and training. Then, we highlight the various applications and domains of DDPG, including Autonomous Driving, Unmanned Aerial Vehicles, Resource Allocation, Communications and the Internet of Things, Robotics, and Finance. Additionally, we provide an in-depth comparison of DDPG with other DRL algorithms and traditional RL methods, highlighting its strengths and weaknesses. We believe that this review will be an essential resource for researchers, offering them valuable insights into the methods and techniques utilized in the field of DRL and DDPG.
... The secrecy performance of NOMA-aided SGF was investigated in [22] wherein the analytical expressions for the Secrecy Outage Probability (SOP) for the scenarios with a single GF user and multiple GF users were derived, respectively. In [23], an Intelligent Reconfigurable Surface (IRS)-assisted SGF NOMA system was investigated, in which the IRS enhanced the channel gains for GB and GF users. The sum rates of GF users were maximized by jointly optimizing the subcarrier assignment, the power allocation of GF users, and the IRS amplitude and phase shift. ...
... We also analyze the asymptotic OP and the achievable diversity orders in the high signal-to-noise ratio (SNR) region to obtain more insights. The results demonstrate that ... [Flattened comparison table, column headers not recoverable: [4] downlink | ✓ | Performance analysis | OP, EC; [5] downlink | ✓ | Performance analysis | OP, EC; [6] downlink | ✓ | Performance analysis | CP; [7] uplink | ✓ | Performance analysis | CP; [8] downlink | Optimization | Sum rate; [9] downlink | ✓ | Optimization | Sum rate; [10] downlink | ✓ | Optimization | Sum rate; [11] downlink | ✓ | Optimization | EE; [12] uplink | ✓ | Optimization | Flight time; [13] uplink | ✓ | Optimization | SE, EE; [16] uplink, SGF | Performance analysis | OP; [17] uplink, SGF | Performance analysis | OP; [18] uplink, SGF | Performance analysis | OP; [19] uplink, SGF | Performance analysis | OP; [20] uplink, SGF | Performance analysis | OP; [21] uplink, SGF | Performance analysis | EC; [22] uplink, SGF | Performance analysis | SOP; [23] uplink ...] ... [16]-[19] wherein the performance of the uplink NOMA-aided SGF systems was investigated, the transmit power and the Channel State Information (CSI) of GB users must be known at the GF user to realize the power control. This work studied the performance of the downlink NOMA-aided SGF systems, where the power allocation is utilized at the base station and the GF user need not know the CSI of the GB user. ...
... Indeed, a recent survey on GF-NOMA has also pointed out that RIS-aided GF-NOMA is a promising technology but has yet to be explored in-depth [10]. To the best of the authors' knowledge, the RIS-assisted GF-NOMA has been studied only in [39], where authors exploit deep reinforcement learning to solve sub-carrier assignment, power control, and RIS phase-shift alignment. Unlike the single RIS assumption in [39], we consider a network of users and RISs distributed randomly over the cell coverage area, which necessitates a joint solution of UE clustering, RIS assignment, and RIS phase shift alignment problems. ...
... To the best of the authors' knowledge, the RIS-assisted GF-NOMA has been studied only in [39], where authors exploit deep reinforcement learning to solve sub-carrier assignment, power control, and RIS phase-shift alignment. Unlike the single RIS assumption in [39], we consider a network of users and RISs distributed randomly over the cell coverage area, which necessitates a joint solution of UE clustering, RIS assignment, and RIS phase shift alignment problems. ...
Article
Full-text available
This paper introduces a reconfigurable intelligent surface (RIS)-assisted grant-free non-orthogonal multiple access (GF-NOMA) scheme. We propose a joint user equipment (UE) clustering and RIS assignment/alignment approach that jointly ensures the power reception disparity required by the power domain NOMA (PD-NOMA). The proposed approach maximizes the network sum rate by judiciously pairing UE with distinct channel gains and assigning RISs to proper clusters. To alleviate the computational complexity of the joint approach, we decouple UE clustering and RIS assignment/alignment subproblems, which reduces run times by a factor of 80 while attaining almost the same performance. Once the proposed approaches acknowledge UEs with the cluster index, UEs are allowed to access corresponding resource blocks (RBs) at any time requiring neither further grant acquisitions from the base station (BS) nor power control as all UEs are requested to transmit at the same power. In addition to passive RISs containing only passive elements and giving an 18% better performance, an active RIS structure that enhances the performance by 37% is also used to overcome the double path loss problem. The numerical results also investigate the impact of UE density, RIS deployment, RIS hardware specifications, and the fairness among the UEs in terms of bit-per-joule energy efficiency.
... It is worth noting that the aforementioned works are based on the conventional grant-based (GB) transmission, which requires a large amount of signaling overhead pertaining to handshaking and might not be suitable for mMTC scenarios. Very recently, semi-grant-free (SGF) transmission protocol assisted uplink NOMA was proposed to further boost the spectral efficiency and reduce the access delay in radio frequency (RF) based communication system, where each grant-free (GF) user might have a small amount of data to send only [11]-[24]. In particular, Yang et al. ...
... [18], in which both the greedy best user scheduling SGF (BU-SGF) and cumulative distribution function (CDF)-based scheduling (CS-SGF) schemes are analyzed. Driven by the thriving development of artificial intelligence techniques, the intelligent reconfigurable surface (IRS) assisted SGF-NOMA system was first proposed by Chen et al. [19]. The authors proposed three multi-agent deep reinforcement learning (MADRL) based frameworks to effectively solve the joint dynamic optimization of sub-carrier assignment, power allocation, and reflection coefficient matrix design. ...
Article
This work aims to enhance the physical layer security (PLS) of non-orthogonal multiple access (NOMA) aided indoor visible light communication (VLC) system with semi-grant-free (SGF) transmission scheme, in which a grant-free (GF) user shares the same resource block with a grant-based (GB) user whose quality of service (QoS) must be strictly guaranteed. Besides, the GF user is also provided with an acceptable QoS experience, which is closely aligned with the practical application. Both active and passive eavesdropping attacks are discussed in this work, where users’ random distributions are taken into account. Specifically, to maximize the secrecy rate of the GB user in the presence of an active eavesdropper, the optimal power allocation policy is obtained in exact closed-form and the user fairness is then assessed by Jain’s fairness index. Moreover, the secrecy outage performance of the GB user is analyzed in the presence of the passive eavesdropping attack. Both exact and asymptotic theoretical expressions for the secrecy outage probability (SOP) of the GB user are derived, respectively. Furthermore, the effective secrecy throughput (EST) is investigated on the basis of the derived SOP expression. Through simulations, it is found that the PLS of this VLC system can be significantly improved by the proposed optimal power allocation scheme. The radius of the protected zone, the outage target rate for the GF user, and the secrecy target rate for the GB user would have pronounced impacts on the PLS and user fairness performance of this SGF-NOMA assisted indoor VLC system. The maximum EST will increase with the increasing transmit power and it is hardly influenced by the target rate for the GF user. This work will benefit the design of indoor VLC system.
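The abstract above assesses user fairness with Jain's fairness index. As a minimal sketch of that metric (the function name `jain_index` and the sample rate vectors are illustrative assumptions, not taken from the cited work), the index for n non-negative allocations x_i is (sum of x_i)^2 / (n * sum of x_i^2), ranging from 1/n (one user gets everything) to 1 (all users equal):

```python
def jain_index(rates):
    """Jain's fairness index for a list of non-negative allocations.

    Hypothetical helper for illustration; `rates` could hold per-user
    data rates or secrecy rates.
    """
    n = len(rates)
    total = sum(rates)
    squares = sum(r * r for r in rates)
    if n == 0 or squares == 0:
        raise ValueError("need at least one non-zero allocation")
    # (sum x)^2 / (n * sum x^2)
    return total * total / (n * squares)


# Equal allocations give the maximum index, a single active user the minimum.
equal_case = jain_index([1.0, 1.0, 1.0, 1.0])
single_case = jain_index([1.0, 0.0, 0.0, 0.0])
```

With four users, the equal allocation reaches the upper bound of 1 and the single-user allocation reaches the lower bound of 1/4.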
... An experience replay buffer (B), with a predefined capacity (C), serves as the memory infrastructure, storing transitions that encapsulate the state-action-reward sequences experienced by the agent. Within each episode of the learning process, a fresh initialization of channel gain (hk), RIS phase shifts (Φ), and user positions (u) within the designated area (A) is conducted. The UAV's horizontal position (v) is set at a predetermined point, and power allocations (ρ) are uniformly distributed as initial conditions. ...
Article
In this work, we apply the Deep Deterministic Policy Gradient (DDPG) technique to improve the security of a non-orthogonal multiple access (NOMA) downlink network by using an unmanned aerial vehicle (UAV) equipped with a reconfigurable intelligent surface (RIS). Our main objective is to prevent eavesdroppers from accessing the network while preserving seamless communication for authorized users. The system is made up of a Base Station and a UAV integrated with an RIS, which is essential for optimizing signal paths. Our work aims to maximize secrecy rates for all users under possible eavesdropping scenarios by dynamically adjusting the RIS’s phase shifts and power allocations. This method not only shows how flexible and successful the DDPG algorithm is at protecting wireless communications when used in conjunction with an RIS, but also highlights how much the algorithm has advanced secure communication systems.
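The citing excerpt for this entry describes an experience replay buffer B with a predefined capacity C that stores the agent's transitions. A minimal sketch of such a buffer follows; the class name `ReplayBuffer`, the `Transition` fields, and the uniform sampling rule are assumptions chosen for illustration and are not the cited authors' implementation:

```python
import random
from collections import deque, namedtuple

# One stored experience; the field names are illustrative assumptions.
Transition = namedtuple("Transition", ["state", "action", "reward", "next_state"])


class ReplayBuffer:
    """Fixed-capacity first-in-first-out memory of transitions."""

    def __init__(self, capacity):
        # A deque with maxlen drops the oldest entry once capacity is reached.
        self.memory = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state):
        self.memory.append(Transition(state, action, reward, next_state))

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks temporal correlation.
        return random.sample(list(self.memory), batch_size)

    def __len__(self):
        return len(self.memory)


# Usage: capacity 3, five pushes, so only the last three transitions survive.
buffer = ReplayBuffer(capacity=3)
for step in range(5):
    buffer.push(state=step, action=0, reward=1.0, next_state=step + 1)
batch = buffer.sample(2)
```

In a DDPG or DQN training loop, mini-batches drawn this way would feed the critic or Q-network update; the batch size and capacity are tuning choices.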
... For the DDQN algorithm, two fully connected hidden layers as well as DQN are deployed in both the evaluation and target networks. Regarding the proposed DDPG algorithm, we adopt a network architecture with two fully connected hidden layers, and AdamOptimizer is utilized to train deep-learning neural networks (DNNs) [42]. The values of  and N B are 3200 and 32, respectively [28]. ...
Article
In recent years, the Integrated Satellite Aerial Terrestrial (I‐SAT) network has garnered significant attention as an innovative and integrated communication system. However, it still encounters interference in the face of the complex external environment. In this context, reconfigurable intelligent surface (RIS) provides a key way of solving this problem and effectively improves the performance and stability of the I‐SAT network. This article considers the combination of unmanned aerial vehicle (UAV) and RIS and proposes a novel architecture for sub‐connected active RIS (ARIS) under the energy consumption constraints of UAV and ARIS. The authors first provide a UAV‐ARIS based position prediction strategy for the vehicle. Then, a joint RIS phase shift, amplification and UAV trail optimization algorithm is proposed to pursue a high achievable rate. The interference between each link and the total energy consumption are all taken into consideration. In addition, a deep deterministic policy gradient (DDPG) algorithm is utilized for the optimization problem, and achieves convergence in continuous action space. Finally, the simulation results affirm the precision of the proposed method in significantly enhancing performance compared to other schemes.
... DDPG findings demonstrate that it outperforms traditional optimization methods in resource allocation problems, as it can learn complex policies that consider the nonlinear relationships between the system parameters and the rewards [36]. Additionally, DDPG can adapt to changes in the system and learn from experience, which makes it suitable for dynamic environments such as cloud computing and energy management. ...
Article
Deep Reinforcement Learning (DRL) has gained significant adoption in diverse fields and applications, mainly due to its proficiency in resolving complicated decision-making problems in spaces with high-dimensional states and actions. Deep Deterministic Policy Gradient (DDPG) is a well-known DRL algorithm that adopts an actor-critic approach, synthesizing the advantages of value-based and policy-based reinforcement learning methods. The aim of this study is to provide a thorough examination of the latest developments, patterns, obstacles, and potential opportunities related to DDPG. A systematic search was conducted using relevant academic databases (Scopus, Web of Science, and ScienceDirect) to identify 85 relevant studies published in the last five years (2018-2023). We provide a comprehensive overview of the key concepts and components of DDPG, including its formulation, implementation, and training. Then, we highlight the various applications and domains of DDPG, including Autonomous Driving, Unmanned Aerial Vehicles, Resource Allocation, Communications and the Internet of Things, Robotics, and Finance. Additionally, we provide an in-depth comparison of DDPG with other DRL algorithms and traditional RL methods, highlighting its strengths and weaknesses. We believe that this review will be an essential resource for researchers, offering them valuable insights into the methods and techniques utilized in the field of DRL and DDPG.
... Recently, NOMA-based semi-grant-free (SGF) transmission has been proposed and widely investigated to support massive connectivity in IoT [11], [12], [13], [14]. Compared with pure GF and GB transmissions, SGF transmissions admit one GB user (GBU) to access the spectrum, and multiple GF users (GFUs) are encouraged to access the same spectrum opportunistically in a GF manner. ...
Article
Internet of Things (IoT) devices frequently encounter various challenges, including limited power, spectrum, and memory resources, as well as harsh environmental conditions. Therefore, the development of an efficient transmission scheme is crucial for ensuring reliable and secure communication in IoT networks. In this article, an adaptive semi-grant-free (SGF) transmission scheme is proposed for reliable uplink nonorthogonal multiple access systems with enhanced security, in which a ratio-based user scheduling criterion and a hybrid successive interference cancellation technique are employed to suppress the activity of untrusted nodes while ensuring reliable transmission. To evaluate the superiority of the adaptive scheme, a conventional static transmission scheme and a worst-case eavesdropping scenario are used as benchmarks. Simulation results show that the adaptive scheme outperforms the conventional schemes in terms of outage and intercept probability. In addition, the closed-form results of grant-based user's and grant-free user's outage probability and untrusted node's intercept probability are derived. Compared to existing literature, this work provides a comprehensive view of security-reliability tradeoff analysis of SGF transmissions.
... DRL is considered as a promising method to address physical layer optimization [31], [32], such as modulation, beamforming design and channel estimation [33], [34], [35], [36], and [37]. Compared with centralized processing, MADRL can compromise cooperative and competitive trade-offs of agents to achieve a flexible balance in V2I networks [38], [39], [40], and [41]. Incorporating the advantages of MADRL into the target-mounted STARS system can bring two-fold benefits. ...
Article
The utilization of integrated sensing and communication (ISAC) technology has the potential to enhance the communication performance of road side units (RSUs) through the active sensing of target vehicles. Furthermore, installing a simultaneous transmitting and reflecting surface (STARS) on the target vehicle can provide an extra boost to the reflection of the echo signal, thereby improving the communication quality for in-vehicle users. However, the design of this target-mounted STARS system exhibits significant challenges, such as limited information sharing and distributed STARS control. In this paper, we propose an end-to-end multi-agent deep reinforcement learning (MADRL) framework to tackle the challenges of joint sensing and communication optimization in the considered target-mounted STARS assisted vehicle networks. By deploying agents on both RSU and vehicle, the MADRL framework enables RSU and vehicle to perform beam prediction and STARS pre-configuration using their respective local information. To ensure efficient and stable learning for continuous decision-making, we employ the multi-agent soft actor critic (MASAC) algorithm and the multi-agent proximal policy optimization (MAPPO) algorithm on the proposed MADRL framework. Extensive experimental results confirm the effectiveness of our proposed MADRL framework in improving both sensing and communication performance through the utilization of target-mounted STARS. Finally, we conduct a comparative analysis of the two proposed algorithms under various environmental conditions.
... Also, at high frequencies, achieving optimal beamforming is another task for IRS-assisted NOMA networks. A novel IRS-assisted semi-grant-free NOMA transmission system was proposed in [165] to satisfy the channel gain requirements for grant-based users and grant-free users (GFUs). The dynamic optimization on the sub-carrier assignment, power allocation for roaming GFUs, amplitude control, and phase shift design for reflecting elements of the IRS was formulated to maximize the long-term data rate of all GFUs. ...
Article
The propagation environment was uncontrollable in first-generation to fifth-generation (5G) wireless technologies. This behavior of the wireless propagation environment is one of the prime constraints in harnessing the performance of wireless networks. This problem can be addressed in sixth-generation (6G) wireless networks by deploying intelligent reflecting surfaces (IRSs). The amplitude and phase reflection coefficients of an IRS's reflecting units (RUs) can be adjusted via a programmable controller to meet the network requirements. On the other hand, in 5G and 6G wireless communication networks, non-orthogonal multiple access (NOMA) is a robust and well-admired multiple access scheme among the other multiple access counterparts in terms of spectrum efficiency and link capacity. NOMA serves many user equipment (UE) by utilizing non-orthogonal distribution of resources. Therefore, the combination of IRS and NOMA is one of the dominant technologies for 6G wireless networks. Based upon the importance of NOMA and IRS in the initial development of 6G wireless networks, this paper presents a comprehensive survey on IRS-assisted NOMA-based networks, considering their designs and challenges. In this work, the concept and structure of IRS-assisted NOMA have been explained with an in-depth analysis of the frameworks. It also includes some challenges of IRS-assisted NOMA in wireless communication networks. Further, applications and future research directions of IRS-assisted NOMA networks are discussed.
... Numerical results demonstrated that the two algorithms have a lower complexity and outperformed OMA systems. In [170], DQN and DDPG were proposed for subcarrier assignment and power allocation in RIS-assisted semi-grant-free NOMA transmission, respectively. Two DDPGs were integrated to assign amplitude and phase shift to RIS's reflecting elements. ...
Preprint
Wireless communication systems to date primarily rely on the orthogonality of resources to facilitate the design and implementation, from user access to data transmission. Emerging applications and scenarios in the sixth generation (6G) wireless systems will require massive connectivity and transmission of a deluge of data, which calls for more flexibility in the design concept that goes beyond orthogonality. Furthermore, recent advances in signal processing and learning have attracted considerable attention, as they provide promising approaches to various complex and previously intractable problems of signal processing in many fields. This article provides an overview of research efforts to date in the field of signal processing and learning for next-generation multiple access, with an emphasis on massive random access and non-orthogonal multiple access. The promising interplay with new technologies and the challenges in learning-based NGMA are discussed.
... In [30], the authors proposed a deep learning approach to solve the variational optimization problem for GF-NOMA, while, in [31], random and structured sparsity learning was utilized to reduce users' signaling overhead. Finally, the use of DRL was proposed in [32] and [33] to optimize the transmit power in semi- and full GF-NOMA schemes, respectively. ...
Article
This study delves into the capabilities of reconfigurable intelligent surfaces (RISs) in enhancing bidirectional non-orthogonal multiple access (NOMA) networks. The proposed approach partitions RIS to optimize the channel conditions for NOMA users, improving NOMA gain and eliminating the requirement for uplink (UL) power control. The proposed approach is rigorously evaluated under four practical operational regimes; 1) Quality-of-Service (QoS) sufficient regime, 2) RIS and power efficient regime, 3) max-min fair regime, and 4) maximum throughput regime, each subject to both UL and downlink (DL) QoS constraints. By leveraging decoupled nature of RIS portions and base station (BS) transmit power, closed-form solutions are derived to demonstrate how optimal RIS partitioning can meet UL-QoS requirements while optimal BS power control can ensure DL-QoS compliance. Our analytical findings are validated through simulations, highlighting the significant benefits that RISs can bring to the NOMA networks in the aforementioned operational scenarios.
... In [30], the authors proposed a deep learning approach to solve the variational optimization problem for GF-NOMA, while, in [31], random and structured sparsity learning was utilized to reduce users' signaling overhead. Finally, the use of DRL was proposed in [32] and [33] to optimize the transmit power in semi- and full GF-NOMA schemes, respectively. ...
Preprint
This study delves into the capabilities of reconfigurable intelligent surfaces (RISs) in enhancing bidirectional non-orthogonal multiple access (NOMA) networks. The proposed approach partitions RIS to optimize the channel conditions for NOMA users, improving NOMA gain and eliminating the requirement for uplink (UL) power control. The proposed approach is rigorously evaluated under four practical operational regimes; 1) Quality-of-Service (QoS) sufficient regime, 2) RIS and power efficient regime, 3) max-min fair regime, and 4) maximum throughput regime, each subject to both UL and downlink (DL) QoS constraints. By leveraging decoupled nature of RIS portions and base station (BS) transmit power, closed-form solutions are derived to demonstrate how optimal RIS partitioning can meet UL-QoS requirements while optimal BS power control can ensure DL-QoS compliance. Our analytical findings are validated through simulations, highlighting the significant benefits that RISs can bring to the NOMA networks in the aforementioned operational scenarios.
... As pointed out by a recent survey on GF-NOMA, RIS-aided GF-NOMA is a promising technology but has not been explored in-depth yet [9]. To the best of the authors' knowledge, RIS-aided GF-NOMA is considered only in [21], where authors propose a semi GF-NOMA by using deep reinforcement learning to control power and phase shifts. ...
Conference Paper
This paper introduces a reconfigurable intelligent surface (RIS)-assisted grant-free non-orthogonal multiple access (GF-NOMA) scheme. To ensure the power reception disparity required by the power domain NOMA (PD-NOMA), we propose a joint user clustering and RIS assignment/alignment approach that maximizes the network sum rate by judiciously pairing user equipments (UEs) with distinct channel gains, assigning RISs to proper clusters, and aligning RIS phase shifts to the cluster members yielding the highest cluster sum rate. Once UEs are acknowledged with the cluster index, they are allowed to access their resource blocks (RBs) at any time requiring neither further grant acquisitions from the base station (BS) nor power control as all UEs are requested to transmit at the same power. In this way, the proposed approach performs an implicit over-the-air power control with minimal control signaling between the BS and UEs, which has shown to deliver up to 20% higher network sum rate than benchmark GF-NOMA and grant-based optimal (OPT) PD-NOMA schemes depending on the network parameters. The given numerical results also investigate the impact of UE density, RIS deployment, and RIS hardware specifications on the overall performance of the proposed RIS-aided GF-NOMA scheme.
... In [40] and [41], multi-agent deep reinforcement learning (MADRL) was applied to optimize the transmit power for NOMA-SGF and NOMA-aided GF (NOMA-GF) transmissions. Moreover, MADRL was used to optimize transmit power allocation, sub-channel assignment and reflection beamforming for an intelligent reflecting surface (IRS) aided NOMA-SGF system [42]. ...
Article
In this paper, we analyze the outage performance of a rate-splitting multiple access (RSMA)-aided semi-grant-free (SGF) transmission system, in which a grant-based user (GBU) and multiple grant-free users (GFUs) access the base station by sharing the same resource blocks. In the RSMA-aided SGF (RSMA-SGF) transmission system, the GBU and admitted GFU are respectively treated as the primary and secondary users by using the cognitive radio principle. With the aid of RSMA, the admitted GFU’s transmit power allocation, target rate allocation, and successive interference cancellation decoding order are jointly optimized to attain the maximum achievable rate for the admitted GFU, without deteriorating the GBU’s outage performance compared to orthogonal multiple access. Taking into account the extended non-outage zone achieved by rate-splitting, a closed-form expression is derived for the outage probability of the admitted GFU in the considered RSMA-SGF system. Asymptotic analysis for the admitted GFU’s outage probability is also provided. The superior outage performance and full multiuser diversity gain achieved by the RSMA-SGF transmission system are verified by the analytical and simulation results.
... Taking the network overhead and IRS implementation cost issues into account, the IRS phase shift methods were considered with a limited phase shift resolution, which is called a discrete phase shift. For example, the IRS discrete phase shift methods were utilized in MISO systems [44][45][46][47][48], device-to-device [43,49], Internet of Things [50,51], coordinated multipoint transmission [52], millimeter-wave system [53], NOMA systems [54,55], and orthogonal frequency-division multiplexing (OFDM) systems [56]. Particularly, Ref. [44] evaluated the effect of received power loss with respect to the resolution of IRS discrete phase shift and compared the performance of discrete phase shift to the ideal analog phase shift method. ...
Article
In this study, the performance of intelligent reflecting surfaces (IRSs) with a discrete phase shift strategy is examined in multiple-antenna systems. Considering the IRS network overhead, the achievable rate model is newly designed to evaluate the practical IRS system performance. Finding the optimal resolution of the IRS discrete phase shifts and a corresponding phase shift vector is an NP-hard combinatorial problem with an extremely large search complexity. Recognizing the performance trade-off between the IRS passive beamforming gain and IRS signaling overheads, the incremental search method is proposed to present the optimal resolution of the IRS discrete phase shift. Moreover, two low-complexity sub-algorithms are suggested to obtain the IRS discrete phase shift vector during the incremental search algorithms. The proposed incremental search-based discrete phase shift method can efficiently obtain the optimal resolution of the IRS discrete phase shift that maximizes the overhead-aware achievable rate. Simulation results show that the discrete phase shift with the incremental search method outperforms the conventional analog phase shift by choosing the optimal resolution of the IRS discrete phase shift. Furthermore, the cumulative distribution function comparison shows the superiority of the proposed method over the entire coverage area. Specifically, it is shown that more than 20% of coverage extension can be accomplished by deploying IRS with the proposed method.
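A discrete phase shift restricts each IRS element to a finite set of phase levels fixed by the resolution in bits. The sketch below shows one common way to model this, nearest-level uniform quantization over [0, 2π); the function name `quantize_phase` and the uniform level spacing are assumptions for illustration, and this is not the incremental search method proposed in the abstract above:

```python
import math


def quantize_phase(theta, bits):
    """Map a phase in radians to the nearest of 2**bits uniform levels in [0, 2*pi).

    Hypothetical helper: assumes levels at k * 2*pi / 2**bits, k = 0..2**bits - 1.
    """
    levels = 2 ** bits
    step = 2 * math.pi / levels
    # Wrap into [0, 2*pi), round to the nearest level, and wrap the index
    # so phases just below 2*pi map back to level 0.
    index = round((theta % (2 * math.pi)) / step) % levels
    return index * step


# Usage: a 1-bit element can only apply 0 or pi; a 2-bit element has four levels.
one_bit = [quantize_phase(t, bits=1) for t in (0.1, 3.0)]
two_bit = [quantize_phase(t, bits=2) for t in (1.0, 6.2)]
```

Higher resolution lowers the quantization error per element but raises the control signaling needed to configure the surface, which is the trade-off the abstract describes.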
... In [38], the author presents low complexity multi-level and single GF-NOMA schemes and an UE clustering approach exploiting the channel gains of the UEs. RIS technology is used in [39] where authors propose a semi GF-NOMA scheme using deep reinforcement learning to control the transmit power and phase shifts. As pointed out by a recent survey on GF-NOMA [10], RIS-aided GF-NOMA is a promising technology but has not been explored in-depth yet. ...
Preprint
This paper introduces a reconfigurable intelligent surface (RIS)-assisted grant-free non-orthogonal multiple access (GF-NOMA) scheme. We propose a joint user equipment (UE) clustering and RIS assignment/alignment approach that ensures the power reception disparity required by the power domain NOMA (PD-NOMA). The proposed approach maximizes the network sum rate by judiciously pairing UE with distinct channel gains and assigning RISs to proper clusters. To alleviate the computational complexity of the joint approach, we decouple UE clustering and RIS assignment/alignment subproblems, which reduces run times 80 times while attaining almost the same performance. Once the proposed approaches acknowledge UEs with the cluster index, UEs are allowed to access corresponding resource blocks (RBs) at any time requiring neither further grant acquisitions from the base station (BS) nor power control as all UEs are requested to transmit at the same power. In addition to passive RISs containing only passive elements and giving an 18% better performance, an active RIS structure that enhances the performance by 37% is also used to overcome the double path loss problem. The numerical results also investigate the impact of UE density, RIS deployment, RIS hardware specifications, and the fairness among the UEs in terms of bit-per-joule energy efficiency.
Article
Because resource allocation skew leads to a low resource utilization rate, we study an optimization method for dynamic resource allocation of the compiler subsystem in the electric power IoT (Internet of Things) environment. The skewness willingness function is constructed to reflect the degree of imbalance in the compiler subsystem partitioning in the power IoT environment, and it also provides quantitative indexes for evaluating the effectiveness of the compiler subsystem partitioning algorithm in this environment. In the resource allocation stage, combined with the allowable resource tilt threshold of the compiler subsystem, resources are allocated dynamically to the existing compiler tasks in order of priority, with the goal of minimizing the additional waiting time (tilt). In the test results, the distribution of η_EE for the compiler subsystem under different types of requests is always within 0.25 J/bit. Compared with the control group, it has obvious advantages in processing performance and stability.
Article
Low-earth orbit (LEO) satellite networks can achieve global network coverage without geographical restrictions and are essential to the future communication network. In this paper, we study the computing offloading problem in a satellite-terrestrial integrated network for the Internet of Remote Things (IoRT), which aims to reduce the total cost (weighted sum of energy consumption and delay) by jointly optimizing offloading node selection, offloading ratio, and computational resource allocation to achieve the dynamic management of network resources. First, we propose a hybrid cloud and satellite multi-layer multi-access edge computing (MEC) network architecture that can provide heterogeneous computing resources to terrestrial users. Subsequently, since the problem under consideration is a mixed-integer nonlinear programming problem, we propose a computing offloading algorithm for multi-agent reinforcement learning, which is an integration of double deep Q learning (DDQN) and deep deterministic policy gradient (DDPG). The algorithm can learn the optimal policy for actions containing a mixture of discrete and continuous variables. Finally, an optimal computational resource allocation scheme is proposed to improve the task computation efficiency. Simulation results show that the proposed task offloading and resource allocation scheme can achieve reasonable scheduling of computational tasks and optimal allocation of computational resources, reducing the cost of task computation.
Article
This paper presents a novel simultaneously transmitting and reflecting reconfigurable intelligent surfaces (STAR-RIS) assisted mobile edge computing (MEC) system. We employ the semi-grant-free (SGF) non-orthogonal multiple access (NOMA) to improve the system's spectrum and energy efficiency, and the STAR-RIS to enhance the uplink communication from mobile users to the BS. The joint task offloading and resource allocation (JTORA) for the STAR-RIS-assisted SGF-NOMA MEC system with imperfect channel state information is investigated to minimize the average energy consumption. Considering user mobility and dynamic arrival tasks, a JTORA framework comprised of reinforcement learning and a convex optimization module is proposed to tackle this resultant optimization problem. Specifically, a novel quantile regression multi-pass deep Q-network (QRMP-DQN) algorithm is proposed to deal with the hybrid discrete-continuous action structure of MUs and STAR-RIS. Moreover, the convex optimization module adopts the Karush-Kuhn-Tucker conditions to derive the optimal computing resource allocation scheme. Simulation results unveil that: 1) the proposed framework can effectively solve the dynamic optimization problem and outperform the conventional DQN algorithm; 2) the STAR-RIS can significantly improve the performance of the SGF-NOMA MEC system compared to the benchmark cases.
Article
Reconfigurable intelligent surface (RIS) is a promising paradigm for implementing intelligent reconfigurable wireless propagation environments in the 6G era. However, most of the existing studies focus on utilizing RIS deployed on buildings to provide services to users or constructing a RIS-assisted system framework for static users, which greatly limits their application in real-time changing vehicular communication environments. As a result, combining unmanned aerial vehicles (UAVs) with RIS (RIS-UAV) plays a crucial role in various wireless networks due to their high mobility. To maximize the communication rate between base station (BS) and mobile vehicle, we propose a position prediction strategy for vehicles that facilitates real-time adjustment of UAV trajectories and RIS phase shifts, enhancing communication in dynamic environments. A deep reinforcement learning (DRL) algorithm is utilized to solve this problem and converges well in the continuous action space. Simulation results demonstrate that, compared with benchmark schemes, the proposed algorithm achieves significant performance gains, maximizing the communication rate under system constraints while guaranteeing communication reliability.
Article
An introduction of intelligent interconnectivity for people and things has posed higher demands and more challenges for sixth-generation (6G) networks, such as high spectral efficiency and energy efficiency (EE), ultralow latency, and ultrahigh reliability. Cell-free (CF) massive multiple-input-multiple-output (mMIMO) and reconfigurable intelligent surface (RIS), also called intelligent reflecting surface (IRS), are two promising technologies for coping with these unprecedented demands. Given their distinct capabilities, integrating the two technologies to further enhance wireless network performances has received great research and development attention. In this article, we provide a comprehensive survey of research on RIS-aided CF mMIMO wireless communication systems. We first introduce system models focusing on system architecture and application scenarios, channel models, and communication protocols. Subsequently, we summarize the relevant studies on system operation and resource allocation, providing in-depth analyses and discussions. Following this, we present practical challenges faced by RIS-aided CF mMIMO systems, particularly those introduced by RIS, such as hardware impairments (HIs) and electromagnetic interference (EMI). We summarize the corresponding analyses and solutions to further facilitate the implementation of RIS-aided CF mMIMO systems. Furthermore, we explore an interplay between RIS-aided CF mMIMO and other emerging 6G technologies, such as millimeter wave (mmWave) and terahertz (THz), simultaneous wireless information and power transfer (SWIPT), next-generation multiple access (NGMA), and unmanned aerial vehicle (UAV). Finally, we outline several research directions for future RIS-aided CF mMIMO systems.
Article
Semantic communication and spectrum sharing are pivotal technologies in addressing the perennial challenge of scarce spectrum resources for the sixth-generation (6G) communication networks. Notably, scant attention has been devoted to investigating semantic resource allocation within spectrum sharing semantic communication networks, thereby constraining the full exploitation of spectrum efficiency. To mitigate interference issues between primary users and secondary users while augmenting legitimate signal strength, the introduction of Intelligent Reflective Surfaces (IRS) emerges as a salient solution. In this study, we delve into the intricacies of resource allocation for IRS-enhanced semantic spectrum sharing networks. Our focal point is the maximization of semantic spectral efficiency (S-SE) for the secondary semantic network while upholding the minimum quality of service standards for the primary semantic network. This entails the joint optimization of parameters such as semantic symbol allocation, subchannel allocation, reflective coefficients of IRS elements, and beamforming adjustment of the secondary base station. Recognizing the computational intricacies and interdependence of variables in the formulated non-convex optimization problem, we present a hybrid intelligent resource allocation approach that couples dueling double-deep Q networks with the twin-delayed deep deterministic policy. Simulation results affirm the efficacy of our proposed resource allocation approach, showcasing its superior performance relative to baseline schemes. Our approach markedly enhances the S-SE of the secondary network, thereby advancing the frontiers of semantic spectrum sharing.
Article
Unmanned aerial vehicles (UAVs) have been envisioned as essential technology to enhance the service quality of wireless systems, whereas the security issue is unavoidable. In this paper, a reconfigurable intelligent surfaces (RIS)-aided air-to-ground secure communication paradigm is conceived, where the RIS is used to boost the security of confidential signals from UAVs to ground users. However, robust trajectory and beamforming designs are required to fully reap the secure enhancement capabilities of RIS for UAV links under imperfect channel state information (CSI) of eavesdroppers. Therefore, we formulate a robust minimum multicast rate maximization problem for jointly optimizing the UAVs' trajectories, the active and passive beamforming. The problem is also constrained by the maximum flight duration and the secrecy outage probability (SOP). After an approximate transformation of SOP, we provide an online decision-making framework that combines multi-agent reinforcement learning (MARL) methods with conventional optimization algorithms. To overcome the insufficient learning caused by random rewards and uncertain environments, we propose a novel regularized softmax risk-sensitive QMIX (RES-RMIX) algorithm to guide the UAVs' flight. Simulation results demonstrate that: 1) the proposed RES-RMIX algorithm outperforms the state-of-the-art MARL algorithms; 2) the RIS-aided multi-UAVs system attains significant rate gain over the cases of single UAV and no RIS.
Article
In order to improve the security aspects of a nonorthogonal multiple access (NOMA) downlink network, we investigate the use of the proximal policy optimization (PPO) technique in this study. This network is uniquely augmented with reconfigurable intelligent surfaces (RIS) mounted on unmanned aerial vehicles (UAVs). The main objective of this article is to prevent possible eavesdroppers from accessing the network while still assuring continuous and secure connectivity for legitimate users. The network configuration includes a UAV outfitted with a RIS, which plays an important role in signal propagation. This work aims to increase the secrecy rates of all user communications in eavesdropping-prone locations. By adaptively modifying the phase shifts and power distributions through the RIS, the proposed PPO algorithm not only shows adaptability and effectiveness in securing wireless communications, but also highlights the significant advances made in communication system security through this technology.
Article
In this paper, we design a resource block (RB) oriented power pool (PP) for semi-grant-free non-orthogonal multiple access (SGF-NOMA) in the presence of residual errors resulting from imperfect successive interference cancellation (SIC). In the proposed method, the BS allocates one orthogonal RB to each grant-based (GB) user, determines the acceptable received power from grant-free (GF) users, and calculates a threshold for this RB, which it then broadcasts. Each GF user, acting as an agent, tries to find the optimal transmit power and RB without affecting the quality-of-service (QoS) and ongoing transmission of the GB user. To this end, we formulate the transmit power and RB allocation problem as a stochastic Markov game to design the desired PPs and maximize the long-term system throughput. The problem is then solved using multi-agent (MA) deep reinforcement learning algorithms, such as double deep Q networks (DDQN) and Dueling DDQN, due to their enhanced capabilities in value estimation and policy learning, with the latter performing optimally in environments characterized by extensive state and action spaces. The agents (GF users) undertake actions, specifically adjusting power levels and selecting RBs, in pursuit of maximizing cumulative rewards (throughput). Simulation results indicate computational scalability and minimal signaling overhead of the proposed algorithm, with notable gains in system throughput compared to existing SGF-NOMA systems. We examine the effect of SIC error levels on sum rate and user transmit power, revealing a decrease in sum rate and an increase in user transmit power as QoS requirements and error variance escalate. We demonstrate that PPs can benefit new (untrained) users joining the network and outperform conventional SGF-NOMA without PPs in spectral efficiency.
Article
Simultaneously transmitting and reflecting reconfigurable intelligent surfaces (STAR-RISs), as a revolutionary technique, can boost transmission security by controlling unfavorable environments for signal eavesdropping and reducing interference. Integrating unmanned aerial vehicles (UAVs) with STAR-RISs has generated considerable interest due to its enhanced deployment flexibility. However, developing secure communication capabilities using flying STAR-RIS remains an open issue. Therefore, this work investigates the secrecy energy efficiency (SEE) maximization problem for the uplink non-orthogonal multiple access (NOMA) systems, where the UAV-mounted STAR-RIS is employed against the eavesdroppers. Specifically, we consider the joint optimization of the power control, the transmission/reflection coefficients, and the UAV/STAR-RIS’s placement for static and mobile scenarios. The problems are also subject to the minimum data rate requirements and the safety flight region. To tackle the intractable problems, we first adopt the iterative-based method to solve the problem under the static scenario. After that, we invoke the fractional programming and successive convex approximation methods to get the power control scheme, the semidefinite relaxation method to get the transmission/reflection (T/R) coefficients design, and the search-based method to obtain the UAV/STAR-RIS position. Extending to the mobile scenario, we adopt the double deep Q-network (DDQN) algorithm to learn the online UAV trajectory design policy from a long-term perspective. Numerical results unveil that: 1) the proposed iterative-based joint optimization algorithm for static scenarios achieves a near-optimal solution; 2) the NOMA communications aided by the UAV-mounted STAR-RIS achieve significant SEE gain over the conventional reflection-only RIS and the fixed STAR-RIS cases; 3) the DDQN-based algorithm for mobile scenario achieves a near-optimal solution and obtains a valuable performance gain over the short-sighted greedy algorithm.
Article
In fifth-generation (5G) and next-generation mobile communication systems, one of the main design objectives is to support different types of devices with disparate requirements. While a non-orthogonal coexistence of different services on the radio interface may potentially yield higher spectral efficiency, a significant performance bottleneck is set by the mutual interference between heterogeneous devices. Interference management strategies of different complexity, ranging from treating interference as noise to successive interference cancellation, may be implemented at the base station to address this problem. In this context, this work investigates the role that intelligent reflecting surfaces (IRS) may play in shaping the interference between enhanced mobile broadband (eMBB) and ultra-reliable low-latency communications (URLLC) devices. Specifically, we investigate the joint design of power allocation and IRS reflection matrix, and draw conclusions on the relative advantages of decoding strategies at the base station by accounting for the interference management capabilities of an IRS-aided uplink.
Article
Network availability and service continuity are major concerns for network operators to provide reliable communication services for Internet of Things (IoT), which are particularly challenging to achieve in virtualized network slicing environment where network services are exposed to the failure risks of both software (virtual network function (VNF) instances) and hardware (physical nodes). In general, the redundancy-based VNF backup solutions are used to improve the reliability of virtualized network slices. However, backup VNFs require the same amount of resources as the primary VNFs, which will result in high resource cost. In this paper, we propose a joint VNF partition and hybrid backup scheme for VNF orchestration, backup and mapping, whose aim is to construct the reliability-enhanced and delay-guaranteed network slices at minimum cost. Specifically, the VNF partition method divides a single VNF into multiple thinner VNFs with lower processing capacity and is expected to enhance the reliability of network slices with less additional resources. The hybrid backup scheme includes both onsite and offsite backup forms. Then, considering the time-varying network environment and IoT service requirements, we formulate the VNF orchestration, backup and mapping as a dynamic mixed integer linear programming (DMILP) problem, and model the dynamic problem as a Markov decision process (MDP). In view of the large action space of the formulated MDP, we propose a multi-agent deep reinforcement learning (DRL) approach with an action space reduction strategy to achieve the dynamic VNF orchestration, backup and mapping solution. Simulation results demonstrate that the proposed joint VNF partition and hybrid backup scheme can obtain superior delay and reliability performance with low network cost.
Article
The aim of this paper is to characterize the impact of non-orthogonal multiple access (NOMA) on the age of information (AoI) of grant-free transmission. In particular, a low-complexity form of NOMA, termed NOMA-assisted random access, is applied to grant-free transmission in order to illustrate the two benefits of NOMA for AoI reduction, namely increasing channel access and reducing user collisions. Closed-form analytical expressions for the time average AoI achieved by NOMA assisted grant-free transmission are obtained, and asymptotic studies are carried out to demonstrate that the use of the simplest form of NOMA is already sufficient to reduce the AoI of orthogonal multiple access (OMA) by more than 40%. In addition, the developed analytical expressions are also shown to be useful for optimizing the users’ transmission attempt probabilities, which are key parameters for grant-free transmission.
Article
This paper investigates a dynamic Mobile Edge Computing (MEC) system enhanced by Unmanned Aerial Vehicles (UAVs) and Intelligent Reflective Surfaces (IRSs). We introduce a scalable resource scheduling algorithm aimed at reducing energy consumption for all User Equipments (UEs) and UAVs within the MEC system, accommodating a varying number of UAVs. To address this challenge, we present a Multi-tAsk Resource Scheduling (MARS) framework that employs Deep Reinforcement Learning (DRL). Firstly, we present a novel Advantage Actor-Critic (A2C) structure with the state-value critic and entropy-enhanced actor to reduce variance and enhance the policy search of DRL. Then, we present a multi-head agent with three different heads in which a classification head is applied to make offloading decisions and a regression head is presented to allocate computational resource, and a critic head is introduced to estimate the state value of the selected action. Next, we introduce a multi-task controller to adjust the agent to adapt to the varying number of UAVs by loading or unloading a part of the weights in the agent. Finally, a Light Wolf Search (LWS) is introduced as the action refinement to enhance the exploration in the dynamic action space. Numerical results demonstrate the feasibility and efficiency of the MARS framework.
Article
Task-oriented semantic communication (TOSC) has significant advantages in reducing the amount of data transmission and alleviating the scarcity of spectrum resources. Unlike traditional communication, the resource allocation in semantic communication is tightly linked to target intelligent tasks and specific interaction requirements. In this article, the intelligent resource allocation in a task-oriented manner is investigated. To further improve spectrum utilization and energy sustainability, a communication network combining energy harvesting (EH), cognitive radio (CR), and non-orthogonal multiple access (NOMA) is considered. This article proposes a semantic-aware resource allocation scheme in the EH-CR-NOMA scenario, where the quality of experience (QoE) is adopted as the evaluation metric. To achieve the preferential occupation of resources by data with richer semantic information, a joint optimization problem of the transmit power, time slot division factor, and semantic compression ratio of the semantic communication user is formulated. With the goal of maximizing the long-term QoE of TOSC, a two-tier deep reinforcement learning framework is designed to solve the semantic-aware resource allocation problem. By striking a trade-off between semantic rate and semantic fidelity, the proposed scheme can better satisfy user intentions.
Article
As a promising technology in the 5G era, the artificial intelligence (AI) enabled Internet of controllable things (IoCT) is expected to be an integral part of heterogeneous networks (HetNets) in the future. However, the realization of ultra-reliable low-latency communications (URLLC) in IoCT communications underlaid HetNet has stringent quality of service (QoS) requirements, resulting in unprecedented challenges for existing wireless resource allocation methods. In this paper, we first describe a cellular HetNet model with uplink IoCT communications, then formulate a dynamic mixed-integer nonlinear programming (MINLP) resource allocation problem for maximizing the long-term average energy efficiency under URLLC requirements including reliability, latency, and transmission rate. To solve the problem, we propose a decentralized MADRL-based resource allocation algorithm with a decentralized partially observable Markov decision process (dec-POMDP) and a mixed-centralized-decentralized (MCD) framework to address the partial observability and the scalability issues, respectively. In addition, we design a reward function featuring the objective decomposition, baseline-guided scaling, and QoS violation penalty so that the agents are coordinated. Extensive experiments demonstrate the convergence, scalability, and robustness of the proposed algorithm. Besides, the proposed algorithm substantially outperforms conventional resource allocation methods and different agent communication mechanisms in terms of maximizing energy efficiency.
Preprint
The aim of this paper is to characterize the impact of non-orthogonal multiple access (NOMA) on the age of information (AoI) of grant-free transmission. In particular, a low-complexity form of NOMA, termed NOMA-assisted random access, is applied to grant-free transmission in order to illustrate the two benefits of NOMA for AoI reduction, namely increasing channel access and reducing user collisions. Closed-form analytical expressions for the AoI achieved by NOMA assisted grant-free transmission are obtained, and asymptotic studies are carried out to demonstrate that the use of the simplest form of NOMA is already sufficient to reduce the AoI of orthogonal multiple access (OMA) by more than 40%. In addition, the developed analytical expressions are also shown to be useful for optimizing the users' transmission attempt probabilities, which are key parameters for grant-free transmission.
Article
Artificial intelligence (AI) provides a promising and novel direction to design future time-varying wireless networks by leading to significantly superior performances compared to conventional methods. In addition, the advanced deployment of unmanned aerial vehicles (UAVs) has boosted extensive novel research results and industrial products in terms of aerial-ground networks. However, with the rapid development of mobile networks and growing requirements for low-latency services, the conventional centralized aerial-ground network has failed to meet the time-varying expectations of mobile users in the dynamic network environment. To cope with these problems, the marriage of the aerial-ground network and innovative AI techniques, i.e., distributed artificial intelligence enabled aerial-ground network (DAIAGN), is proposed in this article, which consists of three vital components: deep reinforcement learning enabled distributed information sharing, edge intelligence enabled distributed security management, and multi-agent reinforcement learning enabled distributed decision making. The functions of the three components are elaborated, and recent related advances are surveyed in detail. A specific case study is also provided with respect to multi-agent reinforcement learning enabled distributed decision making. Furthermore, key challenges and open issues are discussed to provide some guidance for potential future directions.
Article
This letter proposes a novel design of reconfigurable intelligent surface (RIS) to enhance the physical layer security (PLS) in the RIS-aided non-orthogonal multiple access (NOMA) network. Under the design of the RIS, the problem of increasing the number of RIS elements damaging the secrecy performance is solved. Besides, it also ensures that the networks can use traditional channel coding schemes to achieve secrecy. Our results show that the novel design of the RIS is ready for enhancing secrecy performance.
Article
Grant-free non-orthogonal multiple access (GF-NOMA) is a potential multiple access framework for short-packet internet-of-things (IoT) networks to enhance connectivity. However, the resource allocation problem in GF-NOMA is challenging due to the absence of closed-loop power control. We design a prototype of transmit power pool (PP) to provide open-loop power control. IoT users acquire their transmit power in advance from this prototype PP solely according to their communication distances. Firstly, a multi-agent deep Q-network (DQN) aided GF-NOMA algorithm is proposed to determine the optimal transmit power levels for the prototype PP. More specifically, each IoT user acts as an agent and learns a policy by interacting with the wireless environment that guides it to select optimal actions. Secondly, to prevent the Q-learning model overestimation problem, a double DQN (DDQN) based GF-NOMA algorithm is proposed. Numerical results confirm that the DDQN based algorithm finds the optimal transmit power levels that form the PP. Compared with the conventional online learning approach, the proposed algorithm with the prototype PP converges faster under changing environments because the action space is limited based on previous learning. In terms of throughput, the considered GF-NOMA system outperforms both networks with fixed transmission power, in which all users have the same transmit power, and traditional GF with orthogonal multiple access techniques.
Article
Non-orthogonal multiple access (NOMA) exploits the potential of power domain to enhance the connectivity for Internet of Things (IoT). Due to time-varying communication channels, dynamic user clustering is a promising method to increase the throughput of NOMA-IoT networks. This paper develops an intelligent resource allocation scheme for uplink NOMA-IoT communications. To maximise the average performance of sum rates, this work designs an efficient optimization approach based on two reinforcement learning algorithms, namely deep reinforcement learning (DRL) and SARSA-learning. For light traffic, SARSA-learning is used to explore the safest resource allocation policy with low cost. For heavy traffic, DRL is used to handle traffic-introduced huge variables. With the aid of the considered approach, this work addresses two main problems of the fair resource allocation in NOMA techniques: 1) allocating users dynamically and 2) balancing resource blocks and network traffic. We analytically demonstrate that the rate of convergence is inversely proportional to network sizes. Numerical results show that: 1) compared with the optimal benchmark scheme, the proposed DRL and SARSA-learning algorithms achieve high accuracy with low complexity and 2) NOMA-enabled IoT networks outperform the conventional orthogonal multiple access based IoT networks in terms of system throughput.
Article
Considering reconfigurable intelligent surfaces (RISs), we study a multi-cluster multiple-input-single-output (MISO) non-orthogonal multiple access (NOMA) downlink communication network. In the network, RISs assist the communication from the base station (BS) to all users by passive beamforming. Our goal is to minimize the total transmit power by jointly optimizing the active beamforming matrices at the BS and the reflection coefficient vector at the RISs. Because of the constraints on the RIS reflection amplitudes and phase shifts, the formulated quadratically constrained quadratic problem is highly non-convex. For the aforementioned problem, the conventional semidefinite programming (SDP) based algorithm has prohibitively high computational complexity and deteriorating performance. Here, we propose an effective second-order cone programming (SOCP)-alternating direction method of multipliers (ADMM) based algorithm to obtain the locally optimal solution. To reduce the computational complexity, we also propose a low-complexity zero-forcing based suboptimal algorithm. It is shown through simulation results that our proposed SOCP-ADMM based algorithm achieves significant performance gain over the conventional SDP based algorithm. Furthermore, when the target transmission rates of central and cell-edge users are 0.5 bps/Hz, our proposed NOMA RIS-aided system with 32 RIS elements has about 2.5 dB performance gain over the conventional massive multiple-input-multiple-output system with 64 transmit antennas. Index Terms: alternating direction method of multipliers (ADMM), multiple-input-single-output (MISO), non-orthogonal multiple access (NOMA), reconfigurable intelligent surfaces (RISs), zero-forcing (ZF).
Article
An intelligent reflecting surface (IRS) consists of a large number of low-cost reflecting elements, which can steer the incident signal collaboratively by passive beamforming. This way, IRS reconfigures the wireless environment to boost the system performance. In this paper, we consider an IRS-assisted uplink non-orthogonal multiple access (NOMA) system. The objective is to maximize the sum rate of all users under individual power constraint. The considered problem requires a joint power control at the users and beamforming design at the IRS, and is non-convex. To handle it, semidefinite relaxation is employed, which provides a near-optimal solution. Presented numerical results show that the proposed NOMA-based scheme achieves a larger sum rate than orthogonal multiple access (OMA)-based one. Moreover, the impact of the number of reflecting elements on the sum rate is revealed.
Article
The key idea of non-orthogonal multiple access (NOMA) is to serve multiple users simultaneously at the same time and frequency, which can result in excessive multiple-access interference. As a crucial component of NOMA systems, successive interference cancelation (SIC) is key to combating this multiple-access interference, and is the focus of this letter, where an overview of SIC decoding order selection schemes is provided. In particular, selecting the SIC decoding order based on the users' channel state information (CSI) and the users' quality of service (QoS), respectively, is discussed. The limitations of these two approaches are illustrated, and then a recently proposed scheme, termed hybrid SIC, which dynamically adapts the SIC decoding order is presented and shown to achieve a surprising performance improvement that cannot be realized by the conventional SIC decoding order selection schemes individually.
Article
Reconfigurable intelligent surfaces (RISs) constitute a promising performance enhancement for next-generation (NG) wireless networks in terms of enhancing both their spectral efficiency (SE) and energy efficiency (EE). We conceive a system for serving paired power-domain non-orthogonal multiple access (NOMA) users by designing the passive beamforming weights at the RISs. In an effort to evaluate the network performance, we first derive the best-case and worst-case of new channel statistics for characterizing the effective channel gains. Then, we derive best-case and worst-case closed-form expressions for both the outage probability and the ergodic rate of the prioritized user. For gleaning further insights, we investigate both the diversity orders of the outage probability and the high signal-to-noise ratio (SNR) slopes of the ergodic rate. We also derive both the SE and EE of the proposed network. Our analytical results demonstrate that the base station (BS)-user links have almost no impact on the diversity orders attained when the number of RISs is high enough. Numerical results are provided for confirming that: i) the high-SNR slope of the RIS-aided network is one; ii) the proposed RIS-aided NOMA network has superior network performance compared to its orthogonal counterpart.
Article
A novel framework is proposed for enhancing the driving safety and fuel economy of autonomous vehicles (AVs) with the aid of vehicle-to-infrastructure (V2I) communication networks. The problem of driving trajectory design is formulated for minimizing the total fuel consumption while guaranteeing safe driving (by obeying the traffic rules and avoiding obstacles). In an effort to solve this pertinent problem, a deep reinforcement learning (DRL) approach is proposed for making collision-free decisions. Firstly, a deep Q-network (DQN) aided algorithm is proposed for determining the trajectory and velocity of the AV by receiving real-time traffic information from the base stations (BSs). More particularly, the AV acts as an agent to carry out optimal actions such as lane change and velocity change by interacting with the environment. Secondly, to overcome the large overestimation of action values by the Q-learning model, a double deep Q-network (DDQN) algorithm is proposed by decomposing the max-Q-value operation into action selection and action evaluation. Additionally, three practical driving policies are also proposed as benchmarks. Numerical results are provided for demonstrating that the proposed trajectory design algorithms are capable of enhancing the driving safety and fuel economy of AVs. We demonstrate that the proposed DDQN based algorithm outperforms the DQN based algorithm. Additionally, it is also demonstrated that the proposed fuel-economy (FE) based driving policy derived from the DRL algorithm is capable of achieving in excess of 24% of fuel savings over the benchmarks.
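The decomposition of the max-Q-value operation into action selection and action evaluation, mentioned in the abstract above, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the plain-list Q-values, and the numbers are hypothetical, and the networks are stood in for by precomputed Q-value lists for the next state.

```python
from typing import Sequence


def dqn_target(reward: float, gamma: float,
               q_target_next: Sequence[float], done: bool = False) -> float:
    """Standard DQN target: one set of Q-values both selects and evaluates."""
    if done:
        return reward
    return reward + gamma * max(q_target_next)


def double_dqn_target(reward: float, gamma: float,
                      q_online_next: Sequence[float],
                      q_target_next: Sequence[float],
                      done: bool = False) -> float:
    """Double DQN target: the online values select, the target values evaluate."""
    if done:
        return reward
    # Action selection: greedy action under the online network's Q-values.
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # Action evaluation: score that action with the target network's Q-values.
    return reward + gamma * q_target_next[best_action]


# Hypothetical next-state Q-values for three actions (illustrative only).
q_online_next = [1.0, 2.5, 2.0]
q_target_next = [1.2, 1.8, 3.0]

y_dqn = dqn_target(1.0, 0.9, q_target_next)
y_ddqn = double_dqn_target(1.0, 0.9, q_online_next, q_target_next)
print(y_dqn, y_ddqn)
```

In this toy case the double DQN target is smaller than the standard target, because the action preferred by the online values is not the one the target values rate highest. This is the mechanism by which the decomposition tempers overestimation.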
Article
Massive machine-type communications (mMTC) is one of the three main focus areas in the 5th generation (5G) of wireless communications technologies, enabling connectivity of a massive number of internet of things (IoT) devices with little or no human intervention. In conventional human-type communications (HTC), due to the limited number of available channel resources and orthogonal resource allocation techniques, users get a transmission slot by making scheduling/connection requests. The involved control channel signaling, negligible with respect to the huge transmit data, is not a major issue. However, this may turn into a potential performance bottleneck in mMTC, where a huge number of devices transmit short packet data in a sporadic way. To tackle the limited radio resources and massive connectivity challenges, non-orthogonal multiple access (NOMA) has emerged as a promising technology that allows multiple users to simultaneously transmit their data over the same channel resource. This is achieved by employing user-specific signature sequences at the transmitting devices, which are exploited by the receiver for multi-user data detection. Due to its massive connectivity potential, NOMA has also been considered to enable grant-free transmissions, especially in mMTC, where devices can transmit their data whenever they need without scheduling requests. The existing surveys mainly discuss different NOMA schemes, and exploit their potential, in typical grant-based HTC scenarios, where users are connected with the base station, and various system parameters are pre-defined in the scheduling phase. Different from these works, this survey provides a comprehensive review of the recent advances in NOMA from a grant-free connectivity perspective. Various grant-free NOMA schemes are presented, their potential and related practical challenges are highlighted, and possible future directions are thoroughly discussed at the end.
Article
Facing the dramatic increase of mobile devices and the scarcity of spectrum resources, grant-free non-orthogonal multiple access (NOMA) emerges as an enabling technology for massive access, which also reduces signaling overhead and access latency effectively. However, in grant-free NOMA systems, the collisions resulting from uncoordinated resource selection can cause severe interference and reduce system throughput. In this work, we apply deep reinforcement learning (DRL) in the decision making for grant-free NOMA systems, to mitigate collisions and improve the system throughput in an unknown network environment. To reduce collisions in the frequency domain and the computational complexity of DRL, subchannel and device clustering are first designed, where a cluster of devices competes for a cluster of subchannels following grant-free NOMA. Further, discrete uplink power control is proposed to reduce intra-cluster collisions. Then, the long-term cluster throughput maximization problem is formulated as a Partially Observable Markov Decision Process (POMDP). To address the POMDP, a DRL-based grant-free NOMA algorithm is proposed to learn about network contention status and output subchannel and received power level selection with fewer collisions. Numerical results verify the effectiveness of the proposed algorithm and reveal that DRL-based grant-free NOMA outperforms slotted ALOHA NOMA with 32.9% and 156% performance gains on the system throughput when the number of devices is twice and five times that of the subchannels, respectively. When the number of devices is five times that of the subchannels, the successful access probability of DRL-based grant-free NOMA is above 85%, compared to 33% in the slotted ALOHA NOMA system.
Article
Full-text available
Transmission through reconfigurable intelligent surfaces (RISs), which control the reflection/scattering characteristics of incident waves in a deliberate manner to enhance the signal quality at the receiver, appears as a promising candidate for future wireless communication systems. In this paper, we bring the concept of RIS-assisted communications to the realm of index modulation (IM) by proposing RIS-space shift keying (RIS-SSK) and RIS-spatial modulation (RIS-SM) schemes. These two schemes are realized through not only intelligent reflection of the incoming signals to improve the reception but also utilization of the IM principle for the indices of multiple receive antennas in a clever way to improve the spectral efficiency. Maximum energy-based suboptimal (greedy) and exhaustive search-based optimal (maximum likelihood) detectors of the proposed RIS-SSK/SM schemes are formulated and a unified framework is presented for the derivation of their theoretical average bit error probability. Extensive computer simulation results are provided to assess the potential of RIS-assisted IM schemes as well as to verify our theoretical derivations. Our findings also reveal that RIS-based IM, which enables high data rates with remarkably low error rates, can become a potential candidate for future wireless communication systems in the context of beyond multiple-input multiple-output (MIMO) solutions.
Article
Full-text available
This article investigates a non-orthogonal multiple access (NOMA) enhanced Internet of Things (IoT) network. In order to provide connectivity, a novel cluster strategy is proposed, where multiple devices can be served simultaneously. Two potential scenarios are investigated: 1) NOMA enhanced terrestrial IoT networks and 2) NOMA enhanced aerial IoT networks. We utilize stochastic geometry tools to model the spatial randomness of both terrestrial and aerial devices. New channel statistics are derived for both terrestrial and aerial devices. The exact and the asymptotic expressions in terms of coverage probability are derived. In order to obtain further engineering insights, short-packet communication scenarios are investigated. From our analysis, we show that the performance of NOMA enhanced IoT networks is capable of outperforming OMA enhanced IoT networks. Moreover, based on simulation results, there exists an optimal value of the transmit power that maximizes the coverage probability.
Article
Full-text available
In this paper, a novel approach is introduced to study the achievable delay-guaranteed secrecy rate, by introducing the concept of the effective secrecy rate (ESR). This study focuses on the downlink of a non-orthogonal multiple access (NOMA) network with one base station, multiple single-antenna NOMA users and an eavesdropper. Two possible eavesdropping scenarios are considered: 1) an internal, unknown, eavesdropper in a purely antagonistic network; and 2) an external eavesdropper in a network with trustworthy peers. For a purely antagonistic network with an internal eavesdropper, the only receiver with a guaranteed positive ESR is the one with the highest channel gain. A closed-form expression is obtained for the ESR at high signal-to-noise ratio (SNR) values, showing that the strongest user’s ESR in the high SNR regime approaches a constant value irrespective of the power coefficients. Furthermore, it is shown the strongest user can achieve higher ESR if it has a distinctive advantage in terms of channel gain with respect to the second strongest user. For a trustworthy NOMA network with an external eavesdropper, a lower bound and an upper bound on the ESR are proposed and investigated for an arbitrary legitimate user. For the lower bound, a closed-form expression is derived in the high SNR regime. For the upper bound, the analysis shows that if the external eavesdropper cannot attain any channel state information (CSI), the legitimate NOMA user at high SNRs can always achieve positive ESR, and the value of it depends on the power coefficients. Simulation results numerically validate the accuracy of the derived closed-form expressions and verify the analytical results given in the theorems and lemmas.
Article
Full-text available
In this paper, resource allocation for a multi-carrier uplink non-orthogonal multiple access (NOMA) system is studied. Unlike existing works on multi-carrier uplink NOMA, in which each user is assumed to access only one subcarrier, we consider a more general scenario where the number of subcarriers allocated to a single user is not constrained. We first aim to maximize the system's sum rate, which requires selecting the appropriate subcarriers for each user and distributing the transmission power. The formulated non-convex problem is transformed into a convex one, and further, an optimal and low-complexity iterative water-filling solution is proposed. Nonetheless, it is shown that maximum transmit power is employed by each user to maximize the sum rate. Motivated by the fact that users are power-constrained, the energy efficiency (EE) maximization problem is also studied. Based on fractional programming, the EE maximization problem is transformed into a series of sum rate maximization subproblems, and the proposed iterative water-filling solution is applied to each subproblem. The proposed schemes are compared with other NOMA- and orthogonal multiple access based algorithms, and their superiority is fully validated.
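As a rough illustration of the water-filling idea mentioned in this abstract, the sketch below implements the classic single-user water-filling rule over parallel subcarriers. It is not the paper's iterative multi-user algorithm; the function names `water_filling` and `sum_rate`, the normalized channel gains, and the unit noise power are assumptions made only for this example.

```python
import math

def water_filling(gains, total_power):
    """Classic water-filling: maximize sum(log2(1 + g * p)) subject to
    sum(p) == total_power and p >= 0 (noise power normalized to 1)."""
    inv = sorted(1.0 / g for g in gains)  # "floor heights", best channel first
    level = 0.0
    # Try using the k best subcarriers, shrinking k until the water level
    # sits above the worst floor that is still in use.
    for k in range(len(inv), 0, -1):
        level = (total_power + sum(inv[:k])) / k
        if level > inv[k - 1]:
            break
    return [max(0.0, level - 1.0 / g) for g in gains]

def sum_rate(gains, powers):
    """Sum rate in bits/s/Hz for the given per-subcarrier powers."""
    return sum(math.log2(1.0 + g * p) for g, p in zip(gains, powers))
```

Under these assumptions, weak subcarriers whose floor lies above the water level receive no power, and the remaining power is shared so that stronger subcarriers get more.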
Article
Full-text available
Grant-free transmission is an important feature to be supported by future wireless networks since it reduces the signalling overhead caused by conventional grant-based schemes. However, for grant-free transmission, the number of users admitted to the same channel is not capped, which can lead to a failure of multi-user detection. This paper proposes non-orthogonal multiple-access (NOMA) assisted semi-grant-free (SGF) transmission, which is a compromise between grant-free and grant-based schemes. In particular, instead of reserving channels either for grant-based users or grant-free users, we focus on an SGF communication scenario, where users are admitted to the same channel via a combination of grant-based and grant-free protocols. As a result, a channel reserved by a grant-based user can be shared by grant-free users, which improves both connectivity and spectral efficiency. Two NOMA assisted SGF contention control mechanisms are developed to ensure that, with a small amount of signalling overhead, the number of admitted grant-free users is carefully controlled and the interference from the grant-free users to the grant-based users is effectively suppressed. Analytical results are provided to demonstrate that the two proposed SGF mechanisms employing different successive interference cancelation decoding orders are applicable to different practical network scenarios.
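The role of the successive interference cancellation (SIC) decoding order mentioned in this abstract can be sketched with a simple two-user uplink example. This is a minimal illustration under assumed values, not the contention control mechanisms of the paper; the helper `sic_rates`, the user labels, and the received powers are hypothetical.

```python
import math

def sic_rates(rx_powers, noise, order):
    """Per-user rates (bits/s/Hz) when users sharing one channel are decoded
    in `order`; each decoded signal is removed before the next is decoded."""
    rates = {}
    remaining = sum(rx_powers.values())  # total received power not yet cancelled
    for user in order:
        signal = rx_powers[user]
        interference = remaining - signal
        rates[user] = math.log2(1.0 + signal / (interference + noise))
        remaining -= signal
    return rates

# Hypothetical received powers for one grant-based (GB) and one grant-free (GF) user.
rx = {"GB": 4.0, "GF": 1.0}
gb_first = sic_rates(rx, noise=1.0, order=["GB", "GF"])
gf_first = sic_rates(rx, noise=1.0, order=["GF", "GB"])
```

In this sketch the decoding order changes how the rate is split between the two users, while the sum rate stays the same. A user decoded last sees no interference, which is why the order matters when the grant-based user's rate has to be protected.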
Article
Full-text available
This two-part paper aims to quantify the cost of device activity detection in an uplink massive connectivity scenario with a large number of devices whose activities are sporadic. Part I of this paper shows that in an asymptotic massive multiple-input multiple-output (MIMO) regime, device activity detection can always be made perfect. Part II of this paper subsequently shows that despite the perfect device activity detection, there is nevertheless significant cost due to device detection in terms of overall achievable rate, because non-orthogonal pilot sequences have to be used in order to accommodate the large number of potential devices, resulting in significantly larger channel estimation error as compared to conventional massive MIMO systems with orthogonal pilots. Specifically, this paper characterizes each active user's achievable rate using random matrix theory under either maximal-ratio combining (MRC) or minimum mean-squared error (MMSE) receive beamforming at the base-station (BS), assuming the statistics of their estimated channels as derived in Part I. The characterization of user rate also allows the optimization of the pilot sequence length. Moreover, in contrast to the conventional massive MIMO system, the MMSE beamforming is shown to achieve much higher rate than the MRC beamforming for the massive connectivity scenario under consideration. Finally, this paper illustrates the necessity of user scheduling for rate maximization when the number of active users is larger than the number of antennas at the BS. Index Terms: Beamforming, massive connectivity, massive multiple-input multiple-output (MIMO), random matrix theory, large-system analysis, Internet-of-Things (IoT), machine-type communication (MTC).
Article
Full-text available
This two-part paper considers an uplink massive device communication scenario in which a large number of devices are connected to a base-station (BS), but user traffic is sporadic so that in any given coherence interval, only a subset of users are active. The objective is to quantify the cost of active user detection and channel estimation and to characterize the overall achievable rate of a grant-free two-phase access scheme in which device activity detection and channel estimation are performed jointly using pilot sequences in the first phase and data is transmitted in the second phase. In order to accommodate a large number of simultaneously transmitting devices, this paper studies an asymptotic regime where the BS is equipped with a massive number of antennas. The main contributions of Part I of this paper are as follows. First, we note that as a consequence of having a large pool of potentially active devices but limited coherence time, the pilot sequences cannot all be orthogonal. However, despite this non-orthogonality, this paper shows that in the asymptotic massive multiple-input multiple-output (MIMO) regime, both the missed device detection and the false alarm probabilities for activity detection can always be made to go to zero by utilizing compressed sensing techniques that exploit sparsity in the user activity pattern. Part II of this paper further characterizes the achievable rates using the proposed scheme and quantifies the cost of using non-orthogonal pilot sequences for channel estimation in achievable rates. Index Terms: Compressed sensing, approximate message passing (AMP), state evolution, massive connectivity, massive multiple-input multiple-output (MIMO), Internet-of-Things (IoT), machine-type communication (MTC).
Article
Full-text available
In the general (nonlocally convex) case we prove a Stone–Weierstrass-type theorem for sets of continuous vector-valued functions on Hausdorff topological spaces whose compact subsets have finite Lebesgue covering dimension (topological dimension). For such topological space T and Hausdorff topological vector space X (real or complex), in the presence of a separating set of multipliers, the theorem characterizes the closure of a subset in: C(T,X) (resp. C0(T,X) and Cb(T,X)), endowed with the compact-open topology (respectively, the uniform convergence topology and the strict topology). Applications include a Stone–Weierstrass theorem for vector subspaces, range-support uniform approximation results (under constraints on both the range and the support of the approximant function), extension theorems for vector-valued functions, and a short proof of a Schauder-type fixed point theorem. Our noncompact version of the Stone–Weierstrass theorem has significant consequences, among which we mention the extension theorem for vector-valued functions defined on closed subsets of paracompact spaces.
Article
Full-text available
This paper investigates the application of non-orthogonal multiple access (NOMA) in millimeter wave (mmWave) communications by exploiting beamforming, user scheduling and power allocation. Random beamforming is invoked for reducing the feedback overhead of the considered systems. A non-convex optimization problem for maximizing the sum rate is formulated, which is proved to be NP-hard. The branch and bound (BB) approach is invoked to obtain the ϵ-optimal power allocation policy, which is proved to converge to a global optimal solution. To elaborate further, a low complexity suboptimal approach is developed for striking a good computational complexity-optimality tradeoff, where matching theory and successive convex approximation (SCA) techniques are invoked for tackling the user scheduling and power allocation problems, respectively. Simulation results reveal that: i) the proposed low complexity solution achieves a near-optimal performance; and ii) the proposed mmWave NOMA system is capable of outperforming conventional mmWave orthogonal multiple access (OMA) systems in terms of sum rate and the number of served users.
Article
Full-text available
In the framework of fully cooperative multi-agent systems, independent (non-communicative) agents that learn by reinforcement must overcome several difficulties to manage to coordinate. This paper identifies several challenges responsible for the non-coordination of independent agents: Pareto-selection, non-stationarity, stochasticity, alter-exploration and shadowed equilibria. A selection of multi-agent domains is classified according to those challenges: matrix games, Boutilier's coordination game, predators pursuit domains and a special multi-state game. Moreover, the performance of a range of algorithms for independent reinforcement learners is evaluated empirically. Those algorithms are Q-learning variants: decentralized Q-learning, distributed Q-learning, hysteretic Q-learning, recursive frequency maximum Q-value and win-or-learn fast policy hill climbing. An overview of the learning algorithms’ strengths and weaknesses against each challenge concludes the paper and can serve as a basis for choosing the appropriate algorithm for a new domain. Furthermore, the distilled challenges may assist in the design of new learning algorithms that overcome these problems and achieve higher performance in multi-agent applications.
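Of the Q-learning variants listed in this abstract, hysteretic Q-learning has a particularly compact update rule: an independent learner uses a larger learning rate when the temporal-difference error is non-negative and a smaller one when it is negative, so that penalties caused by teammates' exploration are discounted. The sketch below shows that rule for a tabular learner; the nested-dictionary Q-table layout, the function name `hysteretic_update`, and the default hyperparameters are assumptions made for illustration.

```python
def hysteretic_update(q, state, action, reward, next_state,
                      alpha=0.1, beta=0.01, gamma=0.9):
    """One hysteretic Q-learning step on a tabular Q-function.

    q is a dict of dicts: q[state][action] -> value (assumed layout).
    alpha is applied to non-negative TD errors, beta (< alpha) to negative ones.
    """
    best_next = max(q[next_state].values())
    delta = reward + gamma * best_next - q[state][action]
    rate = alpha if delta >= 0 else beta  # optimistic: learn good news faster
    q[state][action] += rate * delta
    return q[state][action]

# Tiny usage example with two states and one action each.
q_table = {"s": {"a": 0.0}, "t": {"a": 0.0}}
after_gain = hysteretic_update(q_table, "s", "a", reward=1.0, next_state="t")
after_loss = hysteretic_update(q_table, "s", "a", reward=-1.0, next_state="t")
```

Setting `beta` equal to `alpha` recovers ordinary decentralized Q-learning. As `beta` approaches zero the learner ignores negative TD errors, so it only ever raises its estimates.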
Article
The fundamental intelligent reflecting surface (IRS) deployment problem is investigated for IRS-assisted networks, where one IRS is arranged to be deployed in a specific region for assisting the communication between an access point (AP) and multiple users. Specifically, three multiple access schemes are considered, namely non-orthogonal multiple access (NOMA), frequency division multiple access (FDMA), and time division multiple access (TDMA). The weighted sum rate maximization problem for joint optimization of the deployment location and the reflection coefficients of the IRS as well as the power allocation at the AP is formulated. The non-convex optimization problems obtained for NOMA and FDMA are solved by employing monotonic optimization and semidefinite relaxation to find a performance upper bound. The problem obtained for TDMA is optimally solved by leveraging the time-selective nature of the IRS. Furthermore, for all three multiple access schemes, low-complexity suboptimal algorithms are developed by exploiting alternating optimization and successive convex approximation techniques, where a local region optimization method is applied for optimizing the IRS deployment location. Numerical results are provided to show that: 1) near-optimal performance can be achieved by the proposed suboptimal algorithms; 2) asymmetric and symmetric IRS deployment strategies are preferable for NOMA and FDMA/TDMA, respectively; 3) the performance gain achieved with IRS can be significantly improved by optimizing the deployment location.
Article
A novel framework is proposed for cellular offloading with the aid of multiple unmanned aerial vehicles (UAVs), while the non-orthogonal multiple access (NOMA) technique is employed at each UAV to further improve the spectrum efficiency of the wireless network. The optimization problem of joint three-dimensional (3D) trajectory design and power allocation is formulated for maximizing the throughput. Since ground mobile users are considered as roaming continuously, the UAVs need to be re-deployed in a timely manner based on the movement of users. In an effort to solve this pertinent dynamic problem, a K-means based clustering algorithm is first adopted for periodically partitioning users. Afterward, a mutual deep Q-network (MDQN) algorithm is proposed to jointly determine the optimal 3D trajectory and power allocation of UAVs. In contrast to the conventional deep Q-network (DQN) algorithm, the MDQN algorithm enables the experience of multiple agents to be input into a shared neural network to shorten the training time with the assistance of state abstraction. Numerical results demonstrate that: 1) the proposed MDQN algorithm is capable of converging under minor constraints and has a faster convergence rate than the conventional DQN algorithm in the multi-agent case; 2) the achievable sum rate of the NOMA enhanced UAV network is 23% superior to the case of orthogonal multiple access (OMA); 3) by designing the optimal 3D trajectory of UAVs with the MDQN algorithm, the sum rate of the network enjoys 142% and 56% gains over invoking the circular trajectory and the 2D trajectory, respectively.
Article
Reconfigurable intelligent surfaces (RISs), also known as intelligent reflecting surfaces (IRSs) or large intelligent surfaces (LISs), have received significant attention for their potential to enhance the capacity and coverage of wireless networks by smartly reconfiguring the wireless propagation environment. Therefore, RISs are considered a promising technology for the sixth-generation (6G) of communication networks. In this context, we provide a comprehensive overview of the state-of-the-art on RISs, with focus on their operating principles, performance evaluation, beamforming design and resource management, applications of machine learning to RIS-enhanced wireless networks, as well as the integration of RISs with other emerging technologies. We describe the basic principles of RISs both from physics and communications perspectives, based on which we present performance evaluation of multiantenna assisted RIS systems. In addition, we systematically survey existing designs for RIS-enhanced wireless networks encompassing performance analysis, information theory, and performance optimization perspectives. Furthermore, we survey existing research contributions that apply machine learning for tackling challenges in dynamic scenarios, such as random fluctuations of wireless channels and user mobility in RIS-enhanced wireless networks. Last but not least, we identify major issues and research opportunities associated with the integration of RISs and other emerging technologies for applications to next-generation networks. (Footnote: Without loss of generality, we use the name of RIS in the remainder of this paper.)
Article
Grant-free (GF) transmission holds promise in terms of low latency communication by directly transmitting messages without waiting for any permissions. However, collisions may frequently happen when limited spectrum is occupied by numerous GF users. The non-orthogonal multiple access (NOMA) technique can be a promising solution to achieve massive connectivity and fewer collisions for GF transmission by multiplexing users in the power domain. We utilize a semi-grant-free (semi-GF) NOMA scheme for enhancing network connectivity and spectral efficiency by enabling grant-based (GB) and GF users to share the same spectrum resources. With the aid of semi-GF protocols, uplink NOMA networks are investigated by invoking stochastic geometry techniques. We propose a novel dynamic protocol to interpret which part of the GF users are allocated in NOMA transmissions via transmitting various channel quality thresholds by an added handshake. We utilize an open-loop protocol with a fixed average threshold as the benchmark to investigate the performance improvement. It is observed that the dynamic protocol provides more accurate channel quality thresholds than the open-loop protocol, whereby the interference from the GF users is reduced to a large extent. We analyze the outage performance and diversity gains under the two protocols. Numerical results demonstrate that the dynamic protocol is capable of enhancing the outage performance compared with the open-loop protocol.
Article
A novel reconfigurable intelligent surface (RIS) aided non-orthogonal multiple access (NOMA) downlink transmission framework is proposed. We formulate a long-term stochastic optimization problem that involves a joint optimization of NOMA user partitioning and RIS phase shifting, aiming at maximizing the sum data rate of the mobile users (MUs) in NOMA downlink networks. To solve the challenging joint optimization problem, we invoke a modified object migration automation (MOMA) algorithm to partition the users into equal-size clusters. To optimize the RIS phase shifting matrix, we propose a deep deterministic policy gradient (DDPG) algorithm to collaboratively control multiple reflecting elements (REs) of the RIS. Different from conventional training-then-testing processing, we consider a long-term self-adjusting learning model where the intelligent agent is capable of learning the optimal action for every given state through exploration and exploitation. Extensive numerical results demonstrate that: 1) The proposed RIS-aided NOMA downlink framework achieves enhanced sum data rate compared with the conventional orthogonal multiple access (OMA) framework. 2) The proposed DDPG algorithm is capable of learning a dynamic resource allocation policy in a long-term manner. 3) The performance of the proposed RIS-aided NOMA framework can be improved by increasing the granularity of the RIS phase shifts. The numerical results also show that increasing the number of reflecting elements (REs) is an efficient method to improve the sum data rate of the MUs.
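The observation about phase-shift granularity in this abstract can be illustrated with a small numerical sketch: each reflecting element's ideal phase is rounded to one of 2^b uniformly spaced levels, and the cascaded channel gain is compared with the continuous-phase case. The sketch assumes unit-amplitude elements, a single-antenna link, and no direct path; the helper names `ideal_phases`, `quantize_phase` and `cascaded_gain` and the channel values are hypothetical, and no learning is involved.

```python
import cmath
import math

def ideal_phases(h, g):
    """Continuous phases that co-phase every cascaded path h_n * g_n."""
    return [-cmath.phase(hn * gn) for hn, gn in zip(h, g)]

def quantize_phase(theta, bits):
    """Round theta to the nearest of 2**bits uniformly spaced phase levels."""
    levels = 2 ** bits
    step = 2.0 * math.pi / levels
    return (round(theta / step) % levels) * step

def cascaded_gain(h, g, phases):
    """Magnitude of sum_n h_n * exp(j * theta_n) * g_n (unit-amplitude elements)."""
    return abs(sum(hn * cmath.exp(1j * th) * gn
                   for hn, gn, th in zip(h, g, phases)))

# Hypothetical channels: h from transmitter to RIS, g from RIS to receiver.
h = [1 + 1j, 0.5 - 2j, -1 + 0.3j]
g = [0.7j, 1 - 1j, 2 + 0j]
best = cascaded_gain(h, g, ideal_phases(h, g))
gain_by_bits = {
    b: cascaded_gain(h, g, [quantize_phase(t, b) for t in ideal_phases(h, g)])
    for b in (1, 2, 3)
}
```

With b-bit quantization the per-element phase error is at most π/2^b, so for b ≥ 1 the quantized gain is at least cos(π/2^b) times the ideal gain. This bound tightens quickly as the number of bits grows, which agrees with the trend the abstract reports.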
Article
Semi-grant-free (SGF) transmission has recently received significant attention due to its capability to accommodate massive connectivity and reduce access delay by admitting grant-free users to channels that would otherwise be solely occupied by grant-based users. In this paper, a new SGF transmission scheme that exploits the flexibility in choosing the decoding order in non-orthogonal multiple access (NOMA) is proposed. Compared to existing SGF schemes, this new scheme can ensure that admitting the grant-free users is completely transparent to the grant-based users, i.e., the grant-based users’ quality-of-service experience is guaranteed to be the same as for orthogonal multiple access. In addition, compared to existing SGF schemes, the proposed SGF scheme can significantly improve the robustness of the grant-free users’ transmissions and effectively avoid outage probability error floors. To facilitate the performance evaluation of the proposed SGF transmission scheme, an exact expression for the outage probability is obtained and an asymptotic analysis is conducted to show that the achievable multi-user diversity gain is proportional to the number of participating grant-free users. Computer simulation results demonstrate the performance of the proposed SGF transmission scheme and verify the accuracy of the developed analytical results.
Article
The intrinsic integration of the nonorthogonal multiple access (NOMA) and reconfigurable intelligent surface (RIS) techniques is envisioned to be a promising approach to significantly improve both the spectrum efficiency and energy efficiency for future wireless communication networks. In this paper, the physical layer security (PLS) for an RIS-aided NOMA 6G network is investigated, in which an RIS is deployed to assist the two "dead zone" NOMA users and both internal and external eavesdropping are considered. For the scenario with only internal eavesdropping, we consider the worst case in which the near-end user is untrusted and may try to intercept the information of the far-end user. A joint beamforming and power allocation sub-optimal scheme is proposed to improve the system PLS. Then we extend our work to a scenario with both internal and external eavesdropping. Two sub-scenarios are considered in this scenario: one is the sub-scenario without channel state information (CSI) of the eavesdroppers, and the other is the sub-scenario where the eavesdroppers' CSI is available. For both sub-scenarios, a noise beamforming scheme is introduced to counter the external eavesdroppers. An optimal power allocation scheme is proposed to further improve the system physical security for the second sub-scenario. Simulation results show the superior performance of the proposed schemes. Moreover, it is also shown that increasing the number of reflecting elements can bring more gain in secrecy performance than increasing the number of transmit antennas.