Figure 2 (uploaded by Piotr Gawlowicz): Architecture of the ns3-gym framework.

Source publication
Conference Paper
Full-text available
Recently, we have seen a boom of attempts to improve the operation of networking protocols using machine learning techniques. The proposed reinforcement learning (RL) based control solutions very often outperform traditionally designed ones in terms of performance and efficiency. However, in order to reach such a level of performance, an RL control agent req...

Contexts in source publication

Context 1
... The architecture of ns3-gym, as depicted in Fig. 2, consists of two major components: the ns-3 network simulator and the OpenAI Gym framework. The former is used to implement environments, while the latter unifies their interface. The main contribution of this work is the design and implementation of a generic interface between OpenAI Gym and ns-3 that allows for ...
Context 2
... The typical workflow of developing and training an RL-based agent is shown as numbers in Fig. 2: (1) Create a model of the network and configure scenario conditions (i.e., traffic, mobility, etc.) using standard ns-3 functions; (2) Instantiate the ns3-gym environment gateway in the simulation, i.e., create an OpenGymGateway object and implement the callback functions that collect the state of the environment to be shared with the agent and ...
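A minimal Python-side sketch of this architecture is given below. It assumes the ns3gym package shipped with the ns3-gym project and an Ns3Env constructor taking port, stepTime, startSim and simArgs arguments, as in the project's published examples; the port number and the scenario argument are illustrative assumptions, not values prescribed by the paper.

# Minimal sketch of the OpenAI Gym side of ns3-gym (agent process).
# Assumes the `ns3gym` Python package; constructor arguments follow the project's
# examples and may differ between versions.
from ns3gym import ns3env

# Connect to (or launch) the ns-3 simulation that contains an OpenGymGateway.
env = ns3env.Ns3Env(port=5555,                 # port shared with the ns-3 process (assumed)
                    stepTime=0.5,              # simulated seconds per environment step
                    startSim=True,             # let the wrapper start the ns-3 process
                    simArgs={"--simTime": 20}) # hypothetical scenario argument

print("Observation space:", env.observation_space)
print("Action space:", env.action_space)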
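The workflow excerpt above is cut off after step (2); the remaining steps in the paper concern training the agent against the exposed environment. Purely as an illustration, the sketch below shows the standard Gym interaction loop such an agent would run, reusing the env object from the previous sketch; the random action is a stand-in for whatever RL algorithm is actually used.

# Generic Gym interaction loop for an ns3-gym environment (illustrative only).
# `env` is the Ns3Env from the previous sketch; a random policy stands in for
# the RL agent that would normally select actions.
obs = env.reset()
done = False
total_reward = 0.0

while not done:
    action = env.action_space.sample()   # replace with the agent's policy
    # Observation and reward are gathered by the OpenGymGateway callbacks on the ns-3 side.
    obs, reward, done, info = env.step(action)
    total_reward += reward

print("Episode return:", total_reward)
env.close()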

Similar publications

Article
Full-text available
For the efficient design of wireless local-area networks (WLANs), simulation tools are important for accurately estimating the IEEE 802.11n/ac link features of WLANs. However, such faithful simulation of network behavior is critical in designing high-performance WLANs. Through repeated testing, analysis, and modeling of the proposed scheme, the de...
Article
Full-text available
5G Ultra-Dense Networks (UDNs) will involve massive deployment of small cells, which in turn form a complex backhaul network. This backhaul network must be energy efficient for the 5G UDN to be green. V-band and E-band mmWave technologies are among the wireless backhaul solutions tipped for 5G UDNs. In this paper, we have compared the performan...
Article
Full-text available
The quality of inter-network communication is often detrimentally affected by the large deployment of heterogeneous networks, including Long Fat Networks, as a result of wireless media introduction. Legacy transport protocols assume an independent wired connection to the network. When a loss occurs, the protocol considers it as a congestion loss, d...
Article
Full-text available
New Radio-based access to Unlicensed spectrum (NR-U) intends to expand the applicability of 5G NR access technology to support operation in unlicensed bands by adhering to the Listen-Before-Talk (LBT) requirement for accessing the channel. As the NR-U specification is being developed, simulations to assess the performance of NR-U and IEEE 802.11 techno...
Article
Full-text available
Network simulators are used for the research and development of several types of networks. However, one of the limitations of these simulators is the usage of simplified theoretical models of the Packet Error Rate (PER) at the Physical Layer (PHY) of the IEEE 802.11 family of wireless standards. Although the simplified PHY model can significantly r...

Citations

... In this section, we use the NS3 Simulator and RL framework ns3gym [30] to verify our protocol in comparison with the federated reinforcement multiple access (FRMA) scheme [24] and conventional CSMA/CA. ...
Article
Full-text available
In dynamic wireless networks, nodes move in large-scale spaces with different communications scenarios, including network traffic and unpredicted link state change. However, optimizing multi-user access mechanisms in multiple scenarios to maximize aggregate throughput still remains a practically essential and challenging issue. An efficient method to predict channel conditions and adapt to different communication environments for better performance in real-time is necessary. In this paper, we propose a novel Q-learning based MAC protocol using an intelligent backoff selection scheme to adaptively make decisions by evaluating rewards and variable learning parameters. Furthermore, an efficient channel observation scheme is proposed to optimize real-time decision-making more accurately with better assessment of channel states in different communication environments. Two typical wireless networks, i.e., wireless local area networks with dense users as infrastructure networks and mobile ad hoc networks with changing topologies as infrastructureless networks, are taken into account in simulations to show that the proposed protocol achieves significant performance improvement in terms of both aggregate throughput and packet loss rate with strong environmental adaptability.
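The abstract above describes Q-learning-driven backoff selection. As a rough illustration of the kind of update such a protocol relies on, the sketch below shows generic tabular Q-learning over hypothetical channel-state observations and candidate backoff values; none of the names, actions, or parameter values are taken from the cited paper.

# Generic tabular Q-learning for backoff selection (illustrative sketch; the state
# encoding, action set, and parameters are hypothetical, not from the cited paper).
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1      # learning rate, discount, exploration rate
BACKOFF_ACTIONS = [15, 31, 63, 127, 255]   # candidate contention-window values

Q = defaultdict(float)                     # Q[(state, action)] -> estimated value

def choose_backoff(state):
    """Epsilon-greedy selection of a backoff value for the observed channel state."""
    if random.random() < EPSILON:
        return random.choice(BACKOFF_ACTIONS)
    return max(BACKOFF_ACTIONS, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state):
    """One Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    best_next = max(Q[(next_state, a)] for a in BACKOFF_ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])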
... At the same time, we think that developing a network simulation environment is a huge and meticulous project, and that our effort should instead be focused on the study of congestion control algorithms; therefore, we chose the ns-3 platform to build the training and testing environment. We built our own simulation environment based on NS3-gym [24] and Quic-NS-3 [25] to test PBQ-enhanced QUIC. ...
Article
Full-text available
Currently, the most widely used protocol for reliable transport at the transport layer of computer networks is the Transmission Control Protocol (TCP). However, TCP has some problems such as high handshake delay, head-of-line (HOL) blocking, and so on. To solve these problems, Google proposed the Quick User Datagram Protocol Internet Connection (QUIC) protocol, which supports a 0- or 1-round-trip-time (RTT) handshake and congestion control algorithm configuration in user mode. So far, the QUIC protocol has been integrated with traditional congestion control algorithms, which are not efficient in numerous scenarios. To solve this problem, we propose an efficient congestion control mechanism based on deep reinforcement learning (DRL), i.e., proximal bandwidth-delay quick optimization (PBQ) for QUIC, which combines traditional bottleneck bandwidth and round-trip propagation time (BBR) with proximal policy optimization (PPO). In PBQ, the PPO agent outputs the congestion window (CWnd) and improves itself according to the network state, and BBR specifies the pacing rate of the client. Then, we apply the presented PBQ to QUIC and form a new version of QUIC, i.e., PBQ-enhanced QUIC. The experimental results show that the proposed PBQ-enhanced QUIC achieves much better performance in both throughput and RTT than existing popular versions of QUIC, such as QUIC with Cubic and QUIC with BBR.
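As the abstract explains, PBQ splits the sender's control variables: a PPO agent sets the congestion window while BBR-style estimates set the pacing rate. The fragment below is only a schematic of how those two outputs could be combined at a sender; every name in it is hypothetical and it is not the PBQ-enhanced QUIC implementation.

# Schematic combination of an RL-chosen congestion window with BBR-style pacing.
# All names are hypothetical; this is not the PBQ-enhanced QUIC code.
from dataclasses import dataclass

@dataclass
class PathEstimates:
    bottleneck_bw: float   # bytes/s, maximum observed delivery rate (as in BBR)
    min_rtt: float         # seconds, minimum observed round-trip time

def apply_control(est: PathEstimates, agent_cwnd_bytes: int, pacing_gain: float = 1.25):
    """Return (cwnd, pacing_rate): cwnd comes from the RL agent, pacing from BBR estimates."""
    cwnd = max(agent_cwnd_bytes, 4 * 1460)           # never drop below roughly 4 MSS
    pacing_rate = pacing_gain * est.bottleneck_bw    # BBR-style: rate = gain * BtlBw
    return cwnd, pacing_rate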
... For the experiments, we use MR-iNet Gym [19] that uses ns3-gym [20]. This includes our custom DS-CDMA module for ns-3 to simulate a distributed LPI/D wireless network controlled by RL running in an OpenAI-Gym. ...
Preprint
Full-text available
We tackle the problem of joint frequency and power allocation while emphasizing the generalization capability of a deep reinforcement learning model. Most of the existing methods solve reinforcement learning-based wireless problems for a specific pre-determined wireless network scenario. The performance of a trained agent tends to be very specific to the network and deteriorates when used in a different network operating scenario (e.g., different in size, neighborhood, and mobility, among others). We demonstrate our approach to enhance training to enable a higher generalization capability during inference of the deployed model in a distributed multi-agent setting in a hostile jamming environment. With all these, we show the improved training and inference performance of the proposed methods when tested on previously unseen simulated wireless networks of different sizes and architectures. More importantly, to prove practical impact, the end-to-end solution was implemented on the embedded software-defined radio and validated using over-the-air evaluation.
... In addition, an adaptive-rate data mode is considered with UDP downlink traffic. We implement our proposed solutions using ns-3, and we also use OpenAI Gym to interface between ns-3 and the MA-MAB solution [41]. In Table II and Table III we present the learning hyperparameters and the network setting parameters, respectively. ...
Preprint
Full-text available
The exponential increase of wireless devices with highly demanding services such as streaming video, gaming and others has imposed several challenges on Wireless Local Area Networks (WLANs). In the context of Wi-Fi, IEEE 802.11ax brings high data rates in dense user deployments. Additionally, it comes with new flexible features in the physical layer, such as a dynamic Clear-Channel-Assessment (CCA) threshold, with the goal of improving spatial reuse (SR) in response to radio spectrum scarcity in dense scenarios. In this paper, we formulate the Transmission Power (TP) and CCA configuration problem with the objective of maximizing fairness and minimizing station starvation. We present four main contributions to distributed SR optimization using Multi-Agent Multi-Armed Bandits (MAMABs). First, we propose to reduce the action space given the large cardinality of combinations of TP and CCA threshold values per Access Point (AP). Second, we present two deep Multi-Agent Contextual MABs (MA-CMABs), named Sample Average Uncertainty (SAU)-Coop and SAU-NonCoop, as cooperative and non-cooperative versions to improve SR. In addition, we present an analysis of whether cooperation is beneficial using MA-MAB solutions based on the e-greedy, Upper Confidence Bound (UCB) and Thompson sampling techniques. Finally, we propose a deep reinforcement transfer learning technique to improve adaptability in dynamic environments. Simulation results show that cooperation via the SAU-Coop algorithm contributes to an improvement of 14.7% in cumulative throughput and a 32.5% improvement in PLR when compared with non-cooperative approaches. Finally, under dynamic scenarios, transfer learning contributes to the mitigation of service drops for at least 60% of users.
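The abstract compares e-greedy, UCB, and Thompson-sampling bandits for choosing transmission-power/CCA configurations. As a generic illustration of the UCB1 rule such agents rely on, the sketch below selects one arm per step; the arm set and the reward source are placeholders, not the paper's action space or objective.

# Generic UCB1 bandit over a discrete set of (TP, CCA threshold) configurations.
# The arm set and reward are placeholders, not taken from the cited paper.
import math

class UCB1:
    def __init__(self, num_arms: int):
        self.counts = [0] * num_arms      # number of pulls per arm
        self.values = [0.0] * num_arms    # running mean reward per arm

    def select(self) -> int:
        """Pick the arm with the highest upper confidence bound."""
        for arm, n in enumerate(self.counts):
            if n == 0:                    # play every arm at least once
                return arm
        total = sum(self.counts)
        return max(range(len(self.counts)),
                   key=lambda a: self.values[a]
                   + math.sqrt(2 * math.log(total) / self.counts[a]))

    def update(self, arm: int, reward: float) -> None:
        """Fold the observed reward into the pulled arm's running mean."""
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]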
... For example, in [14], the authors propose a CW optimization mechanism for IEEE 802.11ax under dynamically varying network conditions employing RL algorithms. The RL algorithms were implemented on the NS-3 [15] simulator using the NS3-gym [16] framework, which enables integration with Python frameworks [16]. They proved to have efficiency close to optimal according ...
... The proposed centralized DRL-based CW optimization solution is implemented on NS3-gym [16], which runs on top of the NS-3 simulator [15]. NS3-gym enables the communication between NS-3 (C++) and the OpenAI Gym framework (Python) [46]. ...
Preprint
Full-text available
The collision avoidance mechanism adopted by the IEEE 802.11 standard is not optimal. The mechanism employs a binary exponential backoff (BEB) algorithm in the medium access control (MAC) layer. Such an algorithm increases the backoff interval whenever a collision is detected to minimize the probability of subsequent collisions. However, the expansion of the backoff interval causes degradation of the radio spectrum utilization (i.e., bandwidth wastage). That problem worsens when the network has to manage channel access for a dense deployment of stations, leading to a dramatic decrease in network performance. Furthermore, a wrong backoff setting increases the probability of collisions such that the stations experience numerous collisions before achieving the optimal backoff value. Therefore, to mitigate bandwidth wastage and, consequently, maximize the network performance, this work proposes using reinforcement learning (RL) algorithms, namely Deep Q Learning (DQN) and Deep Deterministic Policy Gradient (DDPG), to tackle such an optimization problem. As for the simulations, the NS-3 network simulator is used along with a toolkit known as NS3-gym, which integrates a reinforcement-learning (RL) framework into NS-3. The results demonstrate that DQN and DDPG have much better performance than BEB for static and dynamic scenarios, regardless of the number of stations. Moreover, the performance difference is amplified as the number of stations increases, with DQN and DDPG showing a 27% increase in throughput with 50 stations compared to BEB. Furthermore, DQN and DDPG presented similar performance.
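Both agents above act on the contention window; a common encoding, suggested by 802.11's power-of-two windows, maps a small discrete action index to a CW value. The mapping below is only an illustration of that idea and is not taken from the cited work.

# Illustrative mapping from a discrete RL action index to an IEEE 802.11
# contention window of the form 2^k - 1 (not the cited paper's code).
CW_MIN, CW_MAX = 15, 1023   # standard 802.11 bounds: 2^4 - 1 .. 2^10 - 1

def action_to_cw(action_index: int) -> int:
    """Map action 0..6 to a CW in {15, 31, 63, 127, 255, 511, 1023}."""
    cw = 2 ** (4 + action_index) - 1
    return min(max(cw, CW_MIN), CW_MAX)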
... For example, Matlab now includes a comprehensive machine learning tool to easily and automatically select hyperparameters for ML algorithms [71]. In 2019, a new module allowing communication between the TensorFlow framework, an open-source machine learning tool developed by Google, and ns-3 was created [72]. In this situation, simulating jamming attacks could save researchers a lot of time and money. ...
Thesis
In recent years, Internet of Things (IoT) networks have become new favorite targets for attackers. Their fundamental characteristics, such as their energy and computation constraints, open them to new attack vectors. In this thesis, we focus on the study of vulnerabilities present in wireless networks in order to create frameworks that allow launching several types of attacks. We also show how these frameworks can be used as a defense system. At the same time, the threat landscape is changing dramatically and new attacks, commonly referred to as smart attacks, are emerging through the use of new processes such as machine learning. Attackers are now able to create more autonomous, robust and efficient attacks that manage to stay ahead of current detection and countermeasure systems. This is why studying the security of wireless networks in the face of these new types of attacks, in order to better understand them, has become an important research issue. In this thesis, we evaluate several vulnerabilities present in wireless networks that allow creating new intelligent attacks. First, we develop HARPAGON, a framework based on Markov chain theory that exploits the vulnerabilities generated by the duty-cycle mechanism. The main advantage of HARPAGON is that it predicts the optimal moment to carry out its attack in order to reduce its probability of being detected. At the same time, this framework also allows the attacker to conserve energy. We then propose another framework, called FOLPETTI, that allows creating several types of attacks thwarting a well-known countermeasure in wireless networks: channel hopping. We show that, with the help of FOLPETTI, an attacker is able to predict the future transmission channel in order to increase its impact. In order to evaluate their effectiveness, we developed a new module for the NS-3 simulator to simulate jamming attacks. Then, after validating their components on the simulator, we assessed them through experimentation on a real testbed. These two solutions, which increase the performance of several attacks, do not require prior knowledge on the part of the attacker and can be implemented on inexpensive components. Finally, strongly inspired by the FOLPETTI and HARPAGON frameworks, we implemented a new jamming attack, called ICARO, which targets illicit drones. In this case, we show how a jamming attack can be repurposed as a defense method to counter drones flying over illicit areas. The main advantage of this contribution is that this new type of attack makes it possible to cut off the communication of an illicit drone with its controller without disrupting communications in the surrounding area.
... In recent years, different works have extended the normal capabilities of ns-3 to combine its potential with some well-known ML development software. In [33], the authors propose ns3-gym, a framework that integrates both OpenAI Gym and ns-3 in order to encourage the usage of RL in networking research. Following the same principles as ns3-gym, ns3ai [34] provides a high-efficiency solution to enable data interaction between ns-3 and other Python-based AI frameworks. ...
Preprint
Full-text available
5G and beyond mobile networks will support heterogeneous use cases at an unprecedented scale, thus demanding automated control and optimization of network functionalities customized to the needs of individual users. Such fine-grained control of the Radio Access Network (RAN) is not possible with the current cellular architecture. To fill this gap, the Open RAN paradigm and its specification introduce an open architecture with abstractions that enable closed-loop control and provide data-driven, and intelligent optimization of the RAN at the user level. This is obtained through custom RAN control applications (i.e., xApps) deployed on near-real-time RAN Intelligent Controller (near-RT RIC) at the edge of the network. Despite these premises, as of today the research community lacks a sandbox to build data-driven xApps, and create large-scale datasets for effective AI training. In this paper, we address this by introducing ns-O-RAN, a software framework that integrates a real-world, production-grade near-RT RIC with a 3GPP-based simulated environment on ns-3, enabling the development of xApps and automated large-scale data collection and testing of Deep Reinforcement Learning-driven control policies for the optimization at the user-level. In addition, we propose the first user-specific O-RAN Traffic Steering (TS) intelligent handover framework. It uses Random Ensemble Mixture, combined with a state-of-the-art Convolutional Neural Network architecture, to optimally assign a serving base station to each user in the network. Our TS xApp, trained with more than 40 million data points collected by ns-O-RAN, runs on the near-RT RIC and controls its base stations. We evaluate the performance on a large-scale deployment, showing that the xApp-based handover improves throughput and spectral efficiency by an average of 50% over traditional handover heuristics, with less mobility overhead.
... A three-phase algorithm is designed to (1) evaluate the history of collision probabilities, (2) train both DRL models by maximizing the reward (throughput), and (3) deploy them in the network. The algorithm is implemented in ns3-gym [96] with a single AP and up to 50 stations. Compared to the 802.11ax standard, which suffers a decrease in network throughput of up to 28%, the two algorithms exhibit a stable throughput value for an increasing number of stations. ...
... The network state is observed through timeout events, defined as the total number of missing ACKs. Simulations in ns3-gym [96] consider a dynamic scenario, where the receiver station moves away from the sender at a speed of 80 m/s, achieving throughput comparable to Minstrel. ...
... The general role of network simulators in bridging the gap between ML and communication systems like Wi-Fi is discussed by Wilhelmi et al. [360], where possible workflows for ML in networking and the use of existing tools are presented. Among these is ns3-gym, a software framework enabling the design of RL-driven solutions for communication networks, proposed by Gawlowicz et al. [96]. This framework is based on the OpenAI Gym toolkit and provides an extension to the ns-3 network simulator (Figure 20). ...
Preprint
Full-text available
Wireless local area networks (WLANs) empowered by IEEE 802.11 (Wi-Fi) hold a dominant position in providing Internet access thanks to their freedom of deployment and configuration as well as the existence of affordable and highly interoperable devices. The Wi-Fi community is currently deploying Wi-Fi 6 and developing Wi-Fi 7, which will bring higher data rates, better multi-user and multi-AP support, and, most importantly, improved configuration flexibility. These technical innovations, including the plethora of configuration parameters, are making next-generation WLANs exceedingly complex as the dependencies between parameters and their joint optimization usually have a non-linear impact on network performance. The complexity is further increased in the case of dense deployments and coexistence in shared bands. While classical optimization approaches fail in such conditions, machine learning (ML) is well known for being able to handle complexity. Much research has been published on using ML to improve Wi-Fi performance and solutions are slowly being adopted in existing deployments. In this survey, we adopt a structured approach to describing the various areas where Wi-Fi can be enhanced using ML. To this end, we analyze over 200 papers in the field, providing readers with an overview of the main trends. Based on this review, we identify both open challenges in each Wi-Fi performance area as well as general future research directions.
... The design and implementation of the MetaLearn protocol is based on the OpenAI Gym extension and ns-3, which allows for executing an RL environment in ns-3 [45]. The ns3-Gym interface handles the life cycle of the simulation process, communicating state and action data between the Gym agent and the simulation setting. ...
Article
Routing protocols in vehicular ad-hoc networks (VANETs) are typically challenged by high vehicular mobility and changing network topology. This becomes more apparent as the inherently dispersed nature of VANETs affects the Quality-of-Service (QoS), which makes it challenging to find a routing algorithm that maximizes the network throughput. Integrating Reinforcement Learning (RL) with Meta-Heuristic (MH) techniques allows for solving constrained, high-dimensional problems such as routing optimization. Motivated by this fact, we introduce MetaLearn, a technique akin to global search, which employs a parameterized approach to remove uncertainty about future rewards as well as vehicular state exploration to optimize the multilevel network structure. The proposed technique searches for the optimal solution, and the search may be sped up by balancing global exploration using Grey Wolf Optimization (GWO) and exploitation through Temporal Difference Learning (particularly Q(λ)). The MetaLearn approach enables cluster heads to learn how to adjust route request forwarding according to QoS parameters. The input received by a vehicle from previous evaluations is used to learn and adapt the subsequent actions accordingly. Furthermore, a customized reward function is developed to select the cluster head and identify stable clusters through GWO. An in-depth experimental demonstration of the proposed protocol addresses applicability and solution challenges for hybrid MH-RL algorithms in VANETs.
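The abstract names Q(λ), i.e., Q-learning with eligibility traces, as the temporal-difference component. For reference only (this is the textbook Watkins Q(λ) rule, not a formula reproduced from the cited paper), the update can be written as:

\delta_t = r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t)
e_t(s, a) = \gamma \lambda \, e_{t-1}(s, a) + \mathbf{1}\{s = s_t, a = a_t\}
Q(s, a) \leftarrow Q(s, a) + \alpha \, \delta_t \, e_t(s, a) \quad \text{for all } (s, a)

where α is the learning rate, γ the discount factor, and λ the trace-decay parameter; in Watkins' variant the traces are reset to zero after an exploratory (non-greedy) action.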
... For the reinforcement learning, we used OpenAI Gym, which is a toolkit for developing RL algorithms. To integrate our network environment with Gym, we used the ns3-Gym framework [52], which simplifies exchanging observations, actions, and rewards between the RL agent and the network environment (see Figure 3). We implement two different architectures for ICRAN, as shown in Figure 4. ...
Article
Full-text available
Mobile networks are increasingly expected to support use cases with diverse performance expectations at a very high level of reliability. These expectations imply the need for approaches that detect and correct performance problems in a timely manner. However, current approaches often focus on optimizing a single performance metric. Here, we aim to address this gap by proposing a novel control framework that maximizes radio resource utilization and minimizes performance degradation in the most challenging part of the cellular architecture, the radio access network (RAN). We devise a method called Intelligent Control for Self-driving RAN (ICRAN), which involves two deep reinforcement learning based approaches that control the RAN in a centralized and a distributed way, respectively. ICRAN defines dual-objective optimization goals that are achieved through a set of diverse control actions. Using extensive discrete event simulations, we confirm that ICRAN succeeds in achieving its design goals, showing a greater edge over competing approaches. We believe that ICRAN is implementable and can serve as an important point on the way to realizing self-driving mobile networks.