Source publication
Current trends in interconnecting myriad smart objects to monetize on Internet of Things applications have led to high-density communications in wireless sensor networks. This aggravates the already over-congested unlicensed radio bands, calling for new mechanisms to improve spectrum management and energy efficiency, such as transmission power cont...
Citations
... Robust communications [6] and optimised power consumption [7] are critical objectives considered for multi-radio implementations. As such, these themes form the basis of the approach presented in this work. ...
... Chincoli and Liotta [7] also employed Q-learning but for transmission power control of a single radio. The reward function proposed was a combination of discrete power levels and a linearly quantised packet reception rate. ...
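The Q-learning formulation described in [7] lends itself to a compact sketch. The following is a minimal, hedged illustration of the idea, not the authors' implementation: the number of power levels, the PRR quantisation granularity, and the reward weighting are all assumptions.

```python
import random
from collections import defaultdict

# Assumed discretisation: 8 TX power indices, PRR quantised into 10 linear bins.
POWER_LEVELS = list(range(8))
PRR_BINS = 10

def quantise_prr(prr):
    """Linearly quantise a packet reception rate in [0, 1] to a bin index."""
    return min(int(prr * PRR_BINS), PRR_BINS - 1)

def reward(power_idx, prr):
    """Illustrative reward: favour a high PRR bin, penalise a high power level."""
    return quantise_prr(prr) / (PRR_BINS - 1) - power_idx / (len(POWER_LEVELS) - 1)

Q = defaultdict(float)                  # Q[(state, action)] table
alpha, gamma, epsilon = 0.4, 0.5, 0.1   # assumed learning parameters

def select_power(state):
    """Epsilon-greedy selection over the discrete power levels."""
    if random.random() < epsilon:
        return random.choice(POWER_LEVELS)
    return max(POWER_LEVELS, key=lambda a: Q[(state, a)])

def update(state, action, prr, next_state):
    """One-step Q-learning update after observing the resulting PRR."""
    best_next = max(Q[(next_state, a)] for a in POWER_LEVELS)
    Q[(state, action)] += alpha * (reward(action, prr)
                                   + gamma * best_next - Q[(state, action)])
```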
... The online learning performance of the WAMO-SARSA agent was investigated with two exploration strategies. The first strategy reduced the exploration parameter over time to a minimum value as used in [7,8] and is referred to as the decayed exploration rate. The second was the multi-objective VDBE proposed in Section 3.3, which is referred to as the adaptive exploration rate. ...
The advent of the Internet of Things (IoT) has triggered an increased demand for sensing devices with multiple integrated wireless transceivers. These platforms often support the advantageous use of multiple radio technologies to exploit their differing characteristics. Intelligent radio selection techniques allow these systems to become highly adaptive, ensuring more robust and reliable communications under dynamic channel conditions. In this paper, we focus on the wireless links between devices carried by deployed operating personnel and intermediary access-point infrastructure. We use multi-radio platforms and wireless devices with multiple and diverse transceiver technologies to produce robust and reliable links through the adaptive control of the available transceivers. In this work, the term ‘robust’ refers to communications that can be maintained despite changes in the environmental and radio conditions, i.e., during periods of interference caused by non-cooperative actors or multi-path or fading conditions in the physical environment. In this paper, a multi-objective reinforcement learning (MORL) framework is applied to address a multi-radio selection and power control problem. We propose independent reward functions to manage the trade-off between the conflicting objectives of minimised power consumption and maximised bit rate. We also adopt an adaptive exploration strategy for learning a robust behaviour policy and compare its online performance to conventional methods. An extension to the multi-objective state–action–reward–state–action (SARSA) algorithm is proposed to implement this adaptive exploration strategy. When applying adaptive exploration to the extended multi-objective SARSA algorithm, we achieve a 20% increase in the F1 score compared with the same algorithm under decayed exploration policies.
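The adaptive exploration strategy referenced in this abstract builds on Value-Difference Based Exploration (VDBE). Below is a minimal single-objective sketch of the VDBE update contrasted with the decayed schedule of [7,8]; the multi-objective combination used by the extended SARSA agent is not reproduced, and sigma and delta are assumed parameter choices.

```python
import math

def vdbe_epsilon(eps, td_error, alpha, sigma=1.0, delta=0.1):
    """Adapt the per-state exploration rate from the latest TD error.

    Large value differences (the agent is still learning) push epsilon up;
    small ones (estimates have converged) let it settle towards exploitation.
    sigma (inverse sensitivity) and delta (update weight) are assumed values.
    """
    x = math.exp(-abs(alpha * td_error) / sigma)
    f = (1.0 - x) / (1.0 + x)   # Boltzmann-shaped factor in [0, 1)
    return delta * f + (1.0 - delta) * eps

def decayed_epsilon(eps, decay=0.999, eps_min=0.01):
    """Baseline schedule: multiplicative decay to a floor, blind to progress."""
    return max(eps * decay, eps_min)
```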
... As a result, wireless sensor networks have found extensive applications in various aspects of our lives, such as medicine, industrial manufacturing, and the Internet of Things (IoT) [1][2][3][4]. However, issues remain to be addressed in wireless communication systems, such as equipment service life, complex environments, and the impact of channel states on energy harvesting and information transmission [5,6]. ...
This paper investigates the problem of RF energy harvesting in wireless sensor networks, with the aim of finding a suitable communication protocol by comparing the performance of the system under different protocols. The system operates in two stages: at the beginning of each timeslot, the sensor nodes harvest energy from the base station (BS), and they then send packets to the BS using the harvested energy. For the energy-harvesting part of the wireless sensor network, we consider two methods: point-to-point and multi-point-to-point energy harvesting. For each method, we use two independent control protocols, namely head harvesting energy of each timeslot (HHT) and head harvesting energy of dedicated timeslot (HDT). Additionally, for complex channel states, we derive the cumulative distribution function (CDF) of the packet transmission time under selective combining (SC) and maximum ratio combining (MRC) techniques. Analytical expressions for system reliability and packet timeout probability are obtained. We also use the Monte Carlo method to simulate the system and analyze both the numerical and simulation results. The results show that the HHT protocol outperforms the HDT protocol, and that, for the HHT protocol, the MRC technique outperforms the SC technique across the energy-harvesting efficiency coefficient, sensor positions, transmit signal-to-noise ratio (SNR), and length of the energy-harvesting time.
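The SC/MRC comparison at the heart of this abstract can be illustrated with a small Monte Carlo experiment. The sketch below assumes i.i.d. Rayleigh-fading branches and illustrative threshold values; it is not the paper's analytical CDF derivation.

```python
import numpy as np

rng = np.random.default_rng(0)
branches, trials = 3, 200_000
mean_snr, snr_threshold = 2.0, 1.5      # linear scale, assumed values

# Rayleigh fading => exponentially distributed per-branch SNRs (assumption).
branch_snr = rng.exponential(mean_snr, size=(trials, branches))
sc_snr = branch_snr.max(axis=1)         # selective combining: best branch only
mrc_snr = branch_snr.sum(axis=1)        # maximum ratio combining: coherent sum

print("SC outage: ", np.mean(sc_snr < snr_threshold))
print("MRC outage:", np.mean(mrc_snr < snr_threshold))  # never worse than SC
```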
... When data transmission is blocked, Wireless Sensor Network nodes need to supply more power to the radio transceiver. Therefore, to overcome the blocking, the network needs self-learning power control [18]. ...
The Wireless Sensor Network needs to become a dynamic and adaptive network to conserve the energy stored in the node battery. Such a dynamic and adaptive network is sometimes called a SON (Self-Organizing Network). Several SON concepts have been developed, such as routing, clustering, and intrusion detection; however, there is no SON concept for dynamic radio configuration. The authors' contribution to this field is therefore a dynamic and adaptive radio configuration for Wireless Sensor Network nodes. The significance of the work lies in modelling a SON network built on measurements taken in a real-world jungle environment. The authors propose SNR, the distance between transmitter and receiver, and frequency as static input parameters. As adaptive parameters, they propose bandwidth, spreading factor, and, most importantly, data transmission power. Using the Levenberg-Marquardt Artificial Neural Network (LM-ANN) self-organizing network model, transmission power within a 100-m range can be reduced and optimised from 20 dBm to 14.9 dBm for SNR 3, to 11.5 dBm for SNR 6, and to 12.9 dBm for SNR 9. With this result, the authors conclude that the LM-ANN can be used as a wireless sensor network SON model in the jungle environment.
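As a rough illustration of the LM-ANN idea, the sketch below fits a tiny one-hidden-layer network with SciPy's Levenberg-Marquardt solver on synthetic (SNR, distance) → power data. The data-generating rule, network size, and input set are assumptions for illustration; the authors trained on real jungle measurements with additional inputs such as frequency.

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(1)
X = np.column_stack([rng.uniform(3, 9, 200),      # SNR (dB)
                     rng.uniform(10, 100, 200)])  # distance (m)
y = 8.0 + 0.06 * X[:, 1] - 0.4 * X[:, 0]          # assumed power rule (dBm)
Xn = (X - X.mean(axis=0)) / X.std(axis=0)         # normalise inputs for tanh

H = 4  # hidden units in the toy network

def predict(theta, inputs):
    """One-hidden-layer tanh network with all parameters packed into theta."""
    w1 = theta[:2 * H].reshape(2, H)
    b1 = theta[2 * H:3 * H]
    w2 = theta[3 * H:4 * H]
    b2 = theta[4 * H]
    return np.tanh(inputs @ w1 + b1) @ w2 + b2

def residuals(theta):
    return predict(theta, Xn) - y

theta0 = rng.normal(scale=0.1, size=4 * H + 1)
fit = least_squares(residuals, theta0, method="lm")  # Levenberg-Marquardt
print("RMS error (dBm):", np.sqrt(np.mean(fit.fun ** 2)))
```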
... The lack of a standardized parking scheme also causes problems [18,19]. The increasing popularity of linking disparate smart objects for Internet of Things applications has led to an increase in the communication density of wireless sensor networks. More devices sharing the same unlicensed radio channels create congestion, prompting research into better ways to manage the airwaves and save power, such as transmission power control. ...
... In [4], Dai et al. investigated the joint optimization of base station (BS) clustering and power control for non-orthogonal multiple access (NOMA)-enabled coordinated multipoint (CoMP) transmission in dense cellular networks, maximizing the sum rate of the system. In addition, in terms of wireless sensor networks (WSNs), Ref. [5] investigated how machine learning could be used to reduce the possible transmission power level of wireless nodes and, in turn, satisfy the quality requirements of the overall network. Reducing the transmission power has benefits in terms of both energy consumption and interference. ...
The intensity of radio waves decays rapidly with increasing propagation distance, and an edge server’s antenna needs more power to form a larger signal coverage area. Therefore, the power of the edge server should be controlled to reduce energy consumption. In addition, edge servers with capacitated resources provide services for only a limited number of users, to ensure quality of service (QoS). We set the signal transmission power for the antenna of each edge server, forming a signal disk, so that all users are covered by an edge-server signal while the total power of the system is minimized. This scenario is a typical geometric set-covering problem, and even simple cases without capacity limits are NP-hard. In this paper, we propose a primal–dual-based algorithm and obtain an m-approximation result. We compare our algorithm with two other algorithms through simulation experiments. The results show that our algorithm obtains a result close to the optimal value in polynomial time.
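For readers unfamiliar with the primal-dual technique this abstract invokes, the sketch below shows the classic primal-dual scheme for uncapacitated set cover, with disk costs playing the role of antenna powers. The capacity constraints and the paper's m-approximation analysis are not reproduced; the example instance is invented.

```python
def primal_dual_cover(universe, sets, power):
    """Primal-dual set cover: sets maps disk name -> covered users,
    power maps disk name -> its cost (antenna power)."""
    y = {u: 0.0 for u in universe}          # dual variable per user
    slack = dict(power)                     # remaining slack per disk
    chosen, covered = [], set()
    while covered != universe:
        u = next(iter(universe - covered))  # any still-uncovered user
        containing = [s for s in sets if u in sets[s]]
        tight = min(containing, key=lambda s: slack[s])
        raise_by = slack[tight]             # grow y[u] until one disk goes tight
        y[u] += raise_by
        for s in containing:
            slack[s] -= raise_by
        chosen.append(tight)
        covered |= sets[tight]
    return chosen

users = {1, 2, 3, 4}
disks = {"A": {1, 2}, "B": {2, 3, 4}, "C": {4}}
power = {"A": 2.0, "B": 3.0, "C": 1.0}
print(primal_dual_cover(users, disks, power))   # e.g. ['A', 'B']
```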
... In [10], Dai et al. investigated the joint optimization of BS clustering and power control for NOMA-enabled CoMP transmission in dense cellular networks to maximize system sum-rate. In addition, the scope of [7] was to investigate how machine learning may be used to bring wireless nodes to the lowest possible transmission power level and, in turn, to respect the quality requirements of the overall network. ...
... [Pseudocode fragment garbled in extraction; the recoverable steps (6-8) select the arg-min candidate from D and assign the central AP of the selected disk.] ...
Terminal devices (TDs) connect to networks through access points (APs) integrated into the edge server. This provides a prerequisite for TDs to upload tasks to cloud data centers or offload them to edge servers for execution. In this process, signal coverage, data transmission, and task execution consume energy, and the energy consumption of signal coverage increases sharply as the radius increases. Lower power leads to less energy consumption in a given time segment. Thus, power control for APs is essential for reducing energy consumption. Our objective is to determine the power assignment for each AP, under identical capacity constraints, such that all TDs are covered and the total power is minimized. We define this problem as the minimum power capacitated cover (MPCC) problem and present a minimum local ratio (MLR) power-control approach that obtains accurate results in polynomial time. Power assignments are chosen in a sequence of rounds. In each round, we choose the power assignment that minimizes the ratio of its power to the number of currently uncovered TDs it contains. In the event of a tie, we pick an arbitrary power assignment that achieves the minimum ratio. We continue choosing power assignments until all TDs are covered. Finally, various experiments verify that this method outperforms a greedy-based alternative.
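The round-based ratio rule described above translates directly into a short greedy routine. The sketch below is an illustrative reading of that rule, omitting the capacity handling and tie-breaking details of the full MLR method; candidate power assignments are modelled as (power, covered-TD set) pairs, and the instance is invented.

```python
def mlr_select(candidates, tds):
    """Greedy local-ratio selection. candidates: list of (power, covered_tds)
    pairs; assumes every TD is coverable by at least one candidate."""
    chosen, uncovered = [], set(tds)
    while uncovered:
        best = min(
            (c for c in candidates if c[1] & uncovered),
            key=lambda c: c[0] / len(c[1] & uncovered),  # power per new TD
        )
        chosen.append(best)
        uncovered -= best[1]
    return chosen

tds = {1, 2, 3, 4, 5}
candidates = [(4.0, frozenset({1, 2, 3})),
              (3.0, frozenset({3, 4})),
              (2.0, frozenset({5}))]
print(mlr_select(candidates, tds))  # picks the lowest power-per-TD each round
```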
... The values for the learning parameters were obtained by checking 80 learning-parameter combinations and conducting 10 rounds of simulations for each combination. The value range for each variable was selected after investigating several previous works in the literature, such as [38] and [39]. For ϵ, the values 0.1, 0.2, 0.4, 0.6, and 0.8; for γ, the values 0, 0.3, 0.5, and 0.8; and for α, the values 0.2, 0.4, 0.6, and 0.8 were tested to obtain the best combination on a randomly deployed 10-node network. ...
Wireless sensor networks (WSN) are widely used in multi-disciplinary applications. The network is designed, and the protocol tuned, according to the requirements and goal of the application to obtain the best performance from the WSN. In real-world applications, all nodes in the network share a common protocol parameter set, irrespective of their position in the network. In several experiments with multi-hop sensor networks, we observed that individual nodes perform differently depending on the protocol parameter values. This observation raised the question of whether network performance can be improved by using a tuned parameter set for each individual node. Tuning protocol parameters for each node manually is tedious and may not be practical for a large number of nodes. As a solution, adaptive protocol parameters are introduced using reinforcement learning. The learning algorithm gradually approaches an optimal set of protocol parameter values for each node during runtime, improving average network performance by 13.44% and 29.41% compared to networks with static common parameter sets in simulated networks of 20 and 30 nodes, respectively. The performance of the adaptive protocol is validated on a real testbed with 10 nodes, where the improvement is 16.21%. The simulation results show that networks with more nodes obtain a greater performance gain from the adaptive protocol algorithm than networks with fewer nodes.
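The parameter sweep described in the citation context above is straightforward to reproduce in outline. In this sketch, run_simulation is a hypothetical stand-in for the authors' network simulator; the ϵ, γ, and α grids are the values quoted above and yield the stated 80 combinations.

```python
import itertools

# Grids quoted above: 5 x 4 x 4 = 80 combinations, 10 rounds each.
EPSILONS = [0.1, 0.2, 0.4, 0.6, 0.8]
GAMMAS = [0.0, 0.3, 0.5, 0.8]
ALPHAS = [0.2, 0.4, 0.6, 0.8]
ROUNDS = 10

def sweep(run_simulation):
    """run_simulation(eps, gamma, alpha) -> performance score (hypothetical)."""
    best, best_score = None, float("-inf")
    for eps, gamma, alpha in itertools.product(EPSILONS, GAMMAS, ALPHAS):
        score = sum(run_simulation(eps, gamma, alpha)
                    for _ in range(ROUNDS)) / ROUNDS
        if score > best_score:
            best, best_score = (eps, gamma, alpha), score
    return best, best_score
```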
... Advanced automatic systems are trained to make intelligent decisions using RL schemes, where an RL agent communicates directly with an unknown environment, receiving feedback as a reward or penalty corresponding to the quality of the action taken to perform a specific job [30]. Being an RL agent, each UAV adapts to the environment through a self-learning procedure and decides an action according to its learning experiences. Conventionally [19], the optimal placement of UAVs, channel allocation, and device association have been obtained by a UAV-assisted relaying system under an offline framework, where the devices know the exact ground-to-air channel conditions. ...
Unmanned aerial vehicle (UAV)-aided aerial base stations have emerged as a promising technique to provide rapid on-demand wireless coverage for ground communicating devices in a geographical area. However, existing works on UAV-enabled wireless communication systems overlook the optimal deployment of UAVs under quality of service (QoS)-aware device-to-device (D2D) communication. Therefore, this work proposes a UAV-supported self-organized device-to-device (USSD2D) network that employs multiple UAVs as relays for reliable D2D data transmission. The aim is to maximize the total instantaneous transmission rate of the USSD2D network by jointly optimizing the devices' association with UAVs, the UAVs' channel selection, and their deployed locations under a signal-to-interference-noise ratio (SINR) threshold. As this joint optimization problem is nonconvex and combinatorial, the formulated problem is transformed into a Markov decision process (MDP) that effectively splits it into three individual optimization subproblems: device association, the UAVs' channel selection indicator, and the UAVs' location at each instance. Finally, a reinforcement learning (RL) method based on a low-complexity iterative state–action–reward–state–action (SARSA) algorithm is developed to update the UAVs' policy to solve the formulated problem. UAVs adapt the system parameters according to the current state and corresponding action to maximize the generated long-term discounted reward under the current policy, without prior knowledge of the environment. Numerical results validate the proposed approaches and provide various insights into optimal UAV deployment. This investigation demonstrates that the total instantaneous transmission rate of the USSD2D network can be improved by 75.25%, 51.31%, and 13.96% with respect to the RS-FORD, ES-FIRD, and AOIV schemes, respectively.
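The on-policy SARSA recursion at the core of the proposed scheme can be sketched in tabular form. Here env is a hypothetical interface abstracting the UAV-specific state (device association, channel selection, location) and reward; the paper's actual MDP decomposition is not reproduced.

```python
import random
from collections import defaultdict

def sarsa(env, episodes, alpha=0.1, gamma=0.9, epsilon=0.1):
    """Tabular SARSA; env is a hypothetical interface with reset(),
    step(action) -> (next_state, reward, done), and actions(state)."""
    Q = defaultdict(float)

    def policy(s):
        if random.random() < epsilon:
            return random.choice(env.actions(s))
        return max(env.actions(s), key=lambda a: Q[(s, a)])

    for _ in range(episodes):
        s = env.reset()
        a = policy(s)
        done = False
        while not done:
            s2, r, done = env.step(a)
            a2 = policy(s2)
            # On-policy update: bootstrap from the action actually taken next.
            Q[(s, a)] += alpha * (r + gamma * Q[(s2, a2)] - Q[(s, a)])
            s, a = s2, a2
    return Q
```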
... In [88], a Q-learning-based Transmission Power Control (QL-TPC) approach is proposed to adapt, by learning, the transmission power values under different conditions. Every RL agent is a player in a common-interest game. ...
... Through this survey, we found that both RL and DRL algorithms can enhance network performance, for example by lowering transmission power [67,79,88], improving routing decision-making (as in the works described in Section 4.1), and raising throughput [76,91,92], across various wireless networks, including underwater wireless sensor networks [64][65][66], the Internet of Vehicles [56,78,81], cellular networks [85,86], Wireless Body Area Networks [79], etc. Reinforcement learning models allow a wireless node to take its locally observable environment as input and then learn effectively from its collected data to prioritize the right experience and choose the best next decision. The Deep RL approach, a mix of RL and DL, allows better decisions to be made on high-dimensional problems and uses neural networks to address scalability issues, one of the challenges in edge-caching methods, for example. ...
... To evaluate DRL-based approaches, authors turn increasingly to Python-based tools such as TensorFlow [109] and OpenAI Gym [110], which offer several libraries for ML and DL. In addition to simulation results, the authors in [54,67,73,80,88,89] have evaluated the performance of their approaches in real-world experiments. This gives more realistic results on the performance of the proposed approaches for IoT networks. ...
Nowadays, many research studies and industrial investigations have enabled the integration of the Internet of Things (IoT) into current and future networking applications by deploying a diversity of wireless-enabled devices, ranging from smartphones and wearables to sensors, drones, and connected vehicles. The growing number of IoT devices, the increasing complexity of IoT systems, and the large volume of generated data have made the monitoring and management of these networks extremely difficult. Numerous research papers have applied Reinforcement Learning (RL) and Deep Reinforcement Learning (DRL) techniques to overcome these difficulties by building IoT systems with effective and dynamic decision-making mechanisms that deal with incomplete information about their environments. The paper first reviews pre-existing surveys covering the application of RL and DRL techniques in IoT communication technologies and networking. It then analyzes the research papers that apply these techniques in wireless IoT to resolve issues related to routing, scheduling, resource allocation, dynamic spectrum access, energy, mobility, and caching. Finally, a discussion of the proposed approaches and their limits is followed by the identification of open issues, establishing grounds for proposing future research directions.