Article

Joint Switch-Controller Association and Control Devolution for SDN Systems: An Integrated Online Perspective of Control and Learning

Authors:
  • Shenzhen Institute of Artificial Intelligence and Robotics for Society

Abstract

In software-defined networking (SDN) systems, it is a common practice to adopt a multi-controller design and control devolution techniques to improve the performance of the control plane. However, in such systems the decision-making for joint switch-controller association and control devolution often involves various uncertainties, e.g., the temporal variations of controller accessibility, and computation and communication costs of switches. In practice, statistics of such uncertainties are unattainable and need to be learned in an online fashion, calling for an integrated design of learning and control. In this article, we formulate a stochastic network optimization problem that aims to minimize time-average system costs and ensure queue stability. By transforming the problem into a combinatorial multi-armed bandit problem with long-term stability constraints, we adopt bandit learning methods and optimal control techniques to handle the exploration-exploitation tradeoff and long-term stability constraints, respectively. Through an integrated design of online learning and online control, we propose an effective Learning-Aided Switch-Controller Association and Control Devolution (LASAC) scheme. Our theoretical analysis and simulation results show that LASAC achieves a tunable tradeoff between queue stability and system cost reduction with a sublinear time-averaged regret bound over a finite time horizon.
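The abstract does not give the details of LASAC, so the following is only a minimal sketch of how bandit-style cost estimates and a queue-aware control rule can be combined in general. It assumes costs lie in [0, 1], uses a standard lower-confidence-bound estimate for unknown costs, and applies a drift-plus-penalty style score in which a hypothetical parameter `V` weights cost against queue backlog. The function names, the confidence-bonus constant, and the inputs are all illustrative assumptions, not the authors' algorithm.

```python
import math

def optimistic_cost(mean_cost, pulls, t):
    """Lower-confidence-bound estimate of an unknown cost in [0, 1].

    mean_cost: empirical average cost observed so far (assumed in [0, 1])
    pulls:     number of times this option has been tried
    t:         current time slot, assumed to be >= 1
    """
    if pulls == 0:
        return 0.0  # untried option: assume the best case so it gets explored
    bonus = math.sqrt(1.5 * math.log(t) / pulls)  # illustrative bonus constant
    return max(0.0, mean_cost - bonus)

def choose_option(backlog, est_costs, service_rates, V):
    """Pick the option with the smallest drift-plus-penalty style score.

    Score = V * estimated_cost - backlog * service_rate, so a large V favors
    cheap options and a large backlog favors options that drain the queue.
    """
    scores = [V * c - backlog * mu for c, mu in zip(est_costs, service_rates)]
    return min(range(len(scores)), key=scores.__getitem__)

# With an empty queue the cheaper option wins; with a long queue the faster one wins.
cheap_first = choose_option(0, [0.2, 0.9], [1, 3], V=10)
fast_first = choose_option(10, [0.2, 0.9], [1, 3], V=10)
```

In this sketch, raising `V` pushes decisions toward lower cost at the price of longer queues, which is one way to picture the tunable tradeoff the abstract mentions.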


... The experiment was emulated in a tree topology as shown in Figure 1, consisting of one RYU Controller [21], seven Open Virtual Switch (OvS) instances supporting OpenFlow v.1.3.0 [22] (S1-S7), and eight hosts (H1-H8). The emulation process utilized the mininet-IoT emulator [23]. In the experiment's scenario, H1 acted as an attacker that sent LRDDoS attack packets in the form of *.pcap files [24]. ...
Article
Software Defined Internet of Things (SD-IoT) is currently being developed extensively. The architecture of the Software Defined Network (SDN) allows Internet of Things (IoT) networks to separate control and data delivery areas into different abstraction layers. However, Low-Rate Distributed Denial of Service (LRDDoS) attacks are a major problem in SD-IoT networks, because they can overwhelm centralized control systems or controllers. Therefore, a system is needed that can identify and detect these attacks comprehensively. In this paper, the authors built an LRDDoS detection system using the Random Forest (RF) algorithm as the classification method. The dataset used during the experiment followed a new dataset schema with 21 features. The dataset was selected using feature importance with logistic regression, with the aim of increasing classification accuracy and reducing the computational burden on the controller during the attack prediction process. The RF classification achieved its highest accuracy, 98.7%, at an LRDDoS packet delivery rate of 200 packets per second (pps). Accuracy increased as the delivery rate of the attack pattern increased.
Article
The Softwarization of networks is enabled by the SDN (Software-Defined Networking), NV (Network Virtualization), and NFV (Network Function Virtualization) paradigms, and offers many advantages for network operators, service providers and data-center providers. Given the strong interest in both industry and academia in the softwarization of telecommunication networks and cloud computing infrastructures, a series of special issues was established in IEEE Transactions on Network and Service Management, which aims at the timely publication of recent innovative research results on the management of softwarized networks.
Article
In this paper, we propose a dynamic controller assignment scheme that considers flow-specific requirements, with the aim of minimizing controller response time in software-defined networks (SDN). We adopt the concept of FlowVisor, in which a virtualized platform acts as a manager between the control and data planes of the SDN architecture. The proposed scheme consists of two phases: adaptive window selection and controller assignment. In the window selection phase, the virtualized manager adaptively determines how long to wait before incoming flows are assigned to controllers. Based on the adaptive window size, the flows are assigned to the controllers in the second phase. We use a dynamic stable-matching game to assign flows to controllers, while defining their preference lists. Extensive simulation results show that the proposed scheme reduces controller response time by 31%, 39%, and 37% compared to the state-of-the-art schemes simple stable-matching (SM), static assignment (Static), and minimum quota processing (MQP), respectively. Further, the proposed scheme also reduces the percentage of QoS-violated flows in the network by 72%, 73%, and 74% compared to SM, Static, and MQP, respectively.
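To make the idea of matching flows to controllers through preference lists concrete, here is a minimal sketch of the classical flow-proposing deferred-acceptance procedure with per-controller quotas. It is a static textbook variant, not the dynamic stable-matching game of the paper; the function name, the dictionary-based inputs, and the quota model are assumptions made for illustration.

```python
def deferred_acceptance(flow_prefs, ctrl_prefs, quota):
    """Flow-proposing deferred acceptance with controller quotas.

    flow_prefs[f]: controllers in f's order of preference (best first)
    ctrl_prefs[c]: flows in c's order of preference (best first)
    quota[c]:      maximum number of flows controller c accepts
    Returns a dict mapping each controller to its accepted flows.
    """
    rank = {c: {f: i for i, f in enumerate(p)} for c, p in ctrl_prefs.items()}
    next_choice = {f: 0 for f in flow_prefs}   # index of next controller to try
    accepted = {c: [] for c in ctrl_prefs}
    free = list(flow_prefs)
    while free:
        f = free.pop()
        if next_choice[f] >= len(flow_prefs[f]):
            continue  # f has exhausted its list and stays unassigned
        c = flow_prefs[f][next_choice[f]]
        next_choice[f] += 1
        accepted[c].append(f)
        accepted[c].sort(key=lambda x: rank[c][x])  # keep best-ranked first
        if len(accepted[c]) > quota[c]:
            free.append(accepted[c].pop())  # reject the least-preferred flow
    return accepted

# Hypothetical example: three flows all prefer controller X, which has quota 1.
matching = deferred_acceptance(
    {"a": ["X", "Y"], "b": ["X", "Y"], "c": ["X", "Y"]},
    {"X": ["b", "a", "c"], "Y": ["a", "b", "c"]},
    {"X": 1, "Y": 2},
)
```

In the example, controller X keeps its top-ranked flow and the other two flows fall back to their second choice.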
Article
With a centralized control over the forwarding devices and the embedded flows, Software Defined Networking promises to increase the flexibility of communication networks. Meanwhile, a dynamic control plane would adapt itself in a timely manner to sustain flow setup performance in the face of traffic variations. Such adaptation depends on a careful decision of the controller placement, which is challenging because we need to consider two contradictory objectives, namely the cost of operating the control plane and the cost of its adaptation. In this work, we model the problem of operating the control plane as a multi-period offline optimization problem to minimize the total cost induced by the flow setup performance and the control plane adaptation. We leverage the lookahead control scheme and decompose the intractable offline problem into smaller instances, which are solved in an online fashion efficiently with an algorithm based on simulated annealing. We perform extensive simulations on real world topologies and show that our proposed algorithm can reduce the total cost by up to 20% compared with the reference algorithms. Further, we analyze the need of frequent control plane adaptation, and compare different control plane design choices according to a novel flexibility measure.
Article
Decentralized orchestration of the control plane is critical to the scalability and reliability of software-defined networks (SDN). However, existing orchestrations of SDN are either one-off or centralized, and would be inefficient in the presence of temporal and spatial variations in traffic requests. In this paper, a fully distributed orchestration is proposed to minimize the time-average cost of SDN, adapting to the variations. This is achieved by stochastically optimizing the on-demand activation of controllers, adaptive association of controllers and switches, and real-time request processing and dispatching. The proposed approach is able to operate at multiple timescales for activation and association of controllers, and request processing and dispatching, thereby alleviating potential service interruptions caused by orchestration. A new analytic framework is developed to confirm the asymptotic optimality of the proposed approach in the presence of non-negligible signaling delays between controllers. As corroborated by extensive simulations, the proposed approach can save up to 73% of the time-average operational cost of SDN, as compared to the existing static orchestration.
Conference Paper
We consider a distributed Software Defined Networking (SDN) architecture adopting a cluster of multiple controllers to improve network performance and reliability. Differently from previous work, we focus on the control traffic exchanged among the controllers, in addition to the OpenFlow control traffic exchanged between controllers and switches. We develop an analytical model to estimate the reaction time perceived at the switches due to the inter-controller communications, based on the data-ownership model adopted in the cluster. We advocate a careful placement of the controllers, taking into account the two kinds of control traffic above. We evaluate, for some real ISP network topologies, the possible delay tradeoffs for the controller placement problem.
Article
Softwarization of networks simplifies the deployment, configuration, and management of network functions. The driving force behind this evolution is Software Defined Networking (SDN), which allows more flexible and dynamic network resource allocation and management. Efficient resource allocation and orchestration are two primary targets of this softwarization process; however, centralized methodologies are complex and exhibit scalability issues. Distributed solutions are therefore preferable but, in order to be effective, they should converge quickly towards equilibrium solutions. In this paper, we focus on making distributed resource allocation and orchestration a viable approach, and prove convergence of the relevant mechanisms. Specifically, we exploit game theory to model interactions between users requesting network functions and servers providing these functions. Accordingly, a two-stage Stackelberg game is presented where servers act as leaders of the game and users as followers. Servers have conflicting interests and try to maximize their utility; users, on the other hand, use a replicator behavior and try to imitate other users' decisions to improve their benefit. The framework proves the existence and uniqueness of an equilibrium, and a learning mechanism to converge to such equilibrium is proposed. Numerical results show the effectiveness of the approach.
Article
In this paper, the problem of proactive deployment of cache-enabled unmanned aerial vehicles (UAVs) for optimizing the quality-of-experience (QoE) of wireless devices in a cloud radio access network (CRAN) is studied. In the considered model, the network can leverage human-centric information such as users' visited locations, requested contents, gender, job, and device type to predict the content request distribution and mobility pattern of each user. Then, given these behavior predictions, the proposed approach seeks to find the user-UAV associations, the optimal UAVs' locations, and the contents to cache at UAVs. This problem is formulated as an optimization problem whose goal is to maximize the users' QoE while minimizing the transmit power used by the UAVs. To solve this problem, a novel algorithm based on the machine learning framework of conceptor-based echo state networks (ESNs) is proposed. Using ESNs, the network can effectively predict each user's content request distribution and its mobility pattern when limited information on the states of users and the network is available. Based on the predictions of the users' content request distribution and their mobility patterns, we derive the optimal user-UAV association, optimal locations of the UAVs as well as the content to cache at UAVs. Simulation results using real pedestrian mobility patterns from BUPT and actual content transmission data from Youku show that the proposed algorithm can yield 40% and 61% gains, respectively, in terms of the average transmit power and the percentage of the users with satisfied QoE compared to a benchmark algorithm without caching and a benchmark solution without UAVs.
Conference Paper
Classical collaborative filtering, and content-based filtering methods try to learn a static recommendation model given training data. These approaches are far from ideal in highly dynamic recommendation domains such as news recommendation and computational advertisement, where the set of items and users is very fluid. In this work, we investigate an adaptive clustering technique for content recommendation based on exploration-exploitation strategies in contextual multi-armed bandit settings. Our algorithm takes into account the collaborative effects that arise due to the interaction of the users with the items, by dynamically grouping users based on the items under consideration and, at the same time, grouping items based on the similarity of the clusterings induced over the users. The resulting algorithm thus takes advantage of preference patterns in the data in a way akin to collaborative filtering methods. We provide an empirical analysis on medium-size real-world datasets, showing scalability and increased prediction performance (as measured by click-through rate) over state-of-the-art methods for clustering bandits. We also provide a regret analysis within a standard linear stochastic noise setting.
Article
The IMT 2020 requirements of 20 Gbps peak data rate and 1 millisecond latency present significant engineering challenges for the design of 5G cellular systems. Use of the millimeter wave (mmWave) bands above 10 GHz, where vast quantities of spectrum are available, is a promising 5G candidate that may be able to rise to the occasion. However, while the mmWave bands can support massive peak data rates, delivering these data rates on end-to-end service while maintaining reliability and ultra-low latency performance will require rethinking all layers of the protocol stack. This paper surveys some of the challenges and possible solutions for delivering end-to-end, reliable, ultra-low latency services in mmWave cellular systems in terms of the Medium Access Control (MAC) layer, congestion control and core network architecture.
Article
Adaptive and sequential experiment design is a well-studied area in numerous domains. We survey and synthesize the work of the online statistical learning paradigm referred to as multi-armed bandits integrating the existing research as a resource for a certain class of online experiments. We first explore the traditional stochastic model of a multi-armed bandit, then explore a taxonomic scheme of complications to that model, for each complication relating it to a specific requirement or consideration of the experiment design context. Finally, at the end of the paper, we present a table of known upper-bounds of regret for all studied algorithms providing both perspectives for future theoretical work and a decision-making tool for practitioners looking for theoretical guarantees.
Article
Middleboxes are special network devices that perform various functions such as enabling security and efficiency. SDN-based routing approaches in networks with middleboxes need to address resource constraints, such as memory in the switches and processing power of middleboxes, and traversal constraint where a flow must visit the required middleboxes in a specific order. In this work we propose a solution based on MultiPoint-To-Point Trees (MPTPT) for routing traffic in SDN-enabled networks with consolidated middleboxes. We show both theoretically and via simulations that our solution significantly reduces the number of routing rules in the switches, while guaranteeing optimum throughput and meeting processing requirements. Additionally, the underlying algorithm has low complexity making it suitable in dynamic network environment.
Article
Software-Defined Networking (SDN) is an emerging paradigm that promises to change the state of affairs of current networks, by breaking vertical integration, separating the network's control logic from the underlying routers and switches, promoting (logical) centralization of network control, and introducing the ability to program the network. The separation of concerns introduced between the definition of network policies, their implementation in switching hardware, and the forwarding of traffic, is key to the desired flexibility: by breaking the network control problem into tractable pieces, SDN makes it easier to create and introduce new abstractions in networking, simplifying network management and facilitating network evolution. Today, SDN is both a hot research topic and a concept gaining wide acceptance in industry, which justifies the comprehensive survey presented in this paper. We start by introducing the motivation for SDN, explain its main concepts and how it differs from traditional networking. Next, we present the key building blocks of an SDN infrastructure using a bottom-up, layered approach. We provide an in-depth analysis of the hardware infrastructure, southbound and northbound APIs, network virtualization layers, network operating systems, network programming languages, and management applications. We also look at cross-layer problems such as debugging and troubleshooting. In an effort to anticipate the future evolution of this new paradigm, we discuss the main ongoing research efforts and challenges of SDN. In particular, we address the design of switches and control platforms (with a focus on aspects such as resiliency, scalability, performance, security and dependability) as well as new opportunities for carrier transport networks and cloud providers. Last but not least, we analyze the position of SDN as a key enabler of a software-defined environment.
Article
Network architectures such as Software-Defined Networks (SDNs) move the control logic off packet processing devices and onto external controllers. These network architectures with decoupled control planes open many unanswered questions regarding reliability, scalability, and performance when compared to more traditional purely distributed systems. This paper opens the investigation by focusing on two specific questions: given a topology, how many controllers are needed, and where should they go? To answer these questions, we examine fundamental limits to control plane propagation latency on an upcoming Internet2 production deployment, then expand our scope to over 100 publicly available WAN topologies. As expected, the answers depend on the topology. More surprisingly, one controller location is often sufficient to meet existing reaction-time requirements (though certainly not fault tolerance requirements).
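The placement question in the abstract above ("how many controllers are needed, and where should they go?") can be illustrated with a brute-force search on a small topology. The sketch below is an assumption-laden illustration, not the paper's procedure: it takes a precomputed all-pairs latency matrix `dist`, tries every set of `k` candidate nodes, and scores each set by either the average or the worst-case latency from a node to its nearest controller. Exhaustive search is only practical for small graphs.

```python
from itertools import combinations

def best_placement(dist, k, worst_case=False):
    """Exhaustively search for k controller locations.

    dist[u][v]: propagation latency between nodes u and v (assumed symmetric)
    worst_case: if True, minimize the maximum node-to-controller latency;
                otherwise minimize the average.
    Returns (placement, cost), where placement is a tuple of node indices.
    """
    nodes = range(len(dist))

    def cost(placement):
        # Each node is served by its nearest controller in the placement.
        per_node = [min(dist[v][c] for c in placement) for v in nodes]
        return max(per_node) if worst_case else sum(per_node) / len(per_node)

    best = min(combinations(nodes, k), key=cost)
    return best, cost(best)

# Hypothetical 5-node line topology where latency equals hop distance.
line = [[abs(i - j) for j in range(5)] for i in range(5)]
single, single_avg = best_placement(line, k=1)
_, pair_worst = best_placement(line, k=2, worst_case=True)
```

On this line topology a single controller is best placed in the middle, and adding a second controller halves the worst-case latency.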
Article
Unexpected bursty traffic brought by certain sudden events, such as news in the spotlight on a social network or discounted items on sale, can cause severe load imbalance in backend services. Migrating hot data, the standard approach to achieve load balance, meets a challenge when handling such unexpected load imbalance, because migrating data will slow down the server that is already under heavy pressure. This paper proposes PostMan, an alternative approach to rapidly mitigate load imbalance for services processing small requests. Motivated by the observation that processing large packets incurs far less CPU overhead than processing small ones, PostMan deploys a number of middleboxes called helpers to assemble small packets into large ones for the heavily-loaded server. This approach essentially offloads the overhead of packet processing from the heavily-loaded server to helpers. To minimize the overhead, PostMan activates helpers on demand, only when bursty traffic is detected. The heavily-loaded server determines when clients connect/disconnect to/from helpers based on the real-time load statistics. To tolerate helper failures, PostMan can migrate connections across helpers and can ensure packet ordering despite such migration. Driven by real-world workloads, our evaluation shows that, with the help of PostMan, a Memcached server can mitigate bursty traffic within hundreds of milliseconds, while migrating data takes tens of seconds and increases the latency during migration.
Article
In recent years, Software Defined Networking (SDN) has emerged as a pivotal element not only in data-centers and wide-area networks, but also in next generation networking architectures such as Vehicular ad hoc network and 5G. SDN is characterized by decoupled data and control planes, and logically centralized control plane. The centralized control plane in SDN offers several opportunities as well as challenges. A key design choice of the SDN control plane is placement of the controller(s), which impacts a wide range of network issues ranging from latency to resiliency, from energy efficiency to load balancing, and so on. In this paper, we present a comprehensive survey on the controller placement problem (CPP) in SDN. We introduce the CPP in SDN and highlight its significance. We present the classical CPP formulation along with its supporting system model. We also discuss a wide range of the CPP modeling choices and associated metrics. We classify the CPP literature based on the objectives and methodologies. Apart from the primary use-cases of the CPP in data-center networks and wide area networks, we also examine the recent application of the CPP in several new domains such as mobile/cellular networks, 5G, named data networks, wireless mesh networks and VANETs. We conclude our survey with discussion on open issues and future scope of this topic.
Article
Dense deployments of Small Cells are key to fulfilling the capacity requirements of future 5G networks. However, two roadblocks to the adoption of Small Cells are i) the limited availability and the cost of sites with wired backhaul resources, and ii) the complexity of managing a dense deployment of wireless backhaul nodes. To address these challenges we propose SODALITE, a novel system that applies Software Defined Networking (SDN) to a wireless backhaul network. We present how SODALITE can be integrated into 3GPP’s 4G and 5G architectures, and show the feasibility of SODALITE through LTE network testbed experiments. We substantiate the scalability of SODALITE through stochastic studies using real-life traffic traces from an LTE network and discuss the effects of cell densification and 5G system architecture on these studies. Further, a reliable backhauling solution for wireless links is introduced in SODALITE through SDN-enabled mechanisms that are capable of reconfiguring the data plane upon a link failure detection. Its reliability is shown through experiments on an LTE network testbed, and studied thoroughly via rigorous simulations and network emulator evaluations. As a result, we claim that SODALITE is a promising carrier-grade system to manage a wireless Small Cell backhaul.
Conference Paper
In software-defined networking (SDN) systems, the scalability and reliability of the control plane remain major concerns. Existing solutions adopt either multi-controller designs or control devolution back to the data plane. The former requires a flexible yet efficient switch-controller association mechanism to adapt to workload changes and potential failures, while the latter demands timely decision making with low overheads. An integrated design for both is even more challenging. Meanwhile, the dramatic advancement in machine learning techniques has boosted the practice of predictive scheduling to improve the responsiveness in various systems. Nonetheless, so far little work has been conducted for SDN systems. In this paper, we study the joint problem of dynamic switch-controller association and control devolution, while investigating the benefits of predictive scheduling in SDN systems. We propose POSCAD, an efficient, online, and distributed scheme that exploits predictive future information to minimize the total system cost and the average request response time with queueing stability guarantee. Theoretical analysis and trace-driven simulation results show that POSCAD requires only a mild amount of future information to achieve a near-optimal system cost and near-zero average request response time. Further, POSCAD is robust against mis-prediction in reducing the average request response time.
Article
In recent years, with the rapid development of current Internet and mobile communication technologies, the infrastructure, devices and resources in networking systems are becoming more complex and heterogeneous. In order to efficiently organize, manage, maintain and optimize networking systems, more intelligence needs to be deployed. However, due to the inherently distributed feature of traditional networks, machine learning techniques are hard to be applied and deployed to control and operate networks. Software Defined Networking (SDN) brings us new chances to provide intelligence inside the networks. The capabilities of SDN (e.g., logically centralized control, global view of the network, software-based traffic analysis, and dynamic updating of forwarding rules) make it easier to apply machine learning techniques. In this paper, we provide a comprehensive survey on the literature involving machine learning algorithms applied to SDN. First, the related works and background knowledge are introduced. Then, we present an overview of machine learning algorithms. In addition, we review how machine learning algorithms are applied in the realm of SDN, from the perspective of traffic classification, routing optimization, Quality of Service (QoS)/Quality of Experience (QoE) prediction, resource management and security. Finally, challenges and broader perspectives are discussed.
Article
In a virtualized software defined network, PACKET_IN messages of switches must pass through the hypervisor in order to reach the corresponding controller. Hence, the latency experienced by a network element is the sum of the latency from the network element to the hypervisor and the latency from the hypervisor to the controller corresponding to the network element. Therefore, the locations of both the hypervisors and controllers determine the latency of network elements in a virtualized environment. In this paper, we propose a strategy for determining the placement of controllers in a virtualized software defined network (VSDN) while fixing the hypervisor(s) in the physical network. We also propose an approach for jointly optimizing the placement of hypervisors and controllers in a VSDN. The objective is to minimize the worst case latency between the network element and its corresponding controller. Furthermore, we propose a generalized model which can be used not only to optimize the worst case latency, but also to optimize other objectives such as the average latency, the maximum average latency, and the average maximum latency. The proposed problems are formulated as integer linear programs. We evaluated the performance of our proposed strategies using the AT&T network of Internet Topology Zoo and the Internet2 OS3E topology, and compared them with the hypervisor placement problem. Evaluations demonstrate that the proposed methods outperform the existing hypervisor placement approach with respect to the various performance metrics.
Conference Paper
Data center networks carry a mix of flows, some with deadlines and some without. Existing mix-flow transport designs assume prior knowledge of flow sizes, which may not hold in practice. Without such information, mix-flow scheduling becomes particularly challenging due to (1) the lack of precise rate control of deadline flows with minimal impact on non-deadline flows; (2) difficulty in assigning priority to both types of flows. We present Aemon, an information-agnostic mix-flow transport. Aemon relies on a new congestion control mechanism based on urgency (the ratio between the elapsed transmission time and the remaining time to deadline) to strategically modulate the sending rate of deadline flows. In addition, Aemon uses a novel two-level priority scheduling policy to differentiate mix flows. As time passes, a deadline flow's priority level increases as its urgency rises, and a non-deadline flow's priority level decreases as it sends more bytes, similar to PIAS, to approximate a Shortest-Job-First policy. Within the same priority level, non-deadline flows take precedence to avoid possible starvation caused by aggressive deadline flows. Extensive simulations on ns-2 show that Aemon outperforms existing information-agnostic schemes and is only slightly worse than state-of-the-art Karuna.
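The urgency metric defined in the abstract above is a simple ratio, and a small sketch makes its behavior easy to see. The function below follows that definition directly; the argument names and the choice to return infinity once the deadline has passed are assumptions for illustration, since the abstract does not say how Aemon treats expired flows or how urgency maps to sending rates.

```python
def urgency(start, now, deadline):
    """Urgency of a deadline flow: elapsed time divided by remaining time.

    start, now, deadline: timestamps in the same unit, with start <= now.
    Returns float('inf') once the deadline is reached or passed (an assumed
    convention, not taken from the source).
    """
    elapsed = now - start
    remaining = deadline - now
    if remaining <= 0:
        return float("inf")
    return elapsed / remaining

# A flow started at t=0 with a deadline at t=10 grows more urgent over time.
early = urgency(0, 2, 10)
halfway = urgency(0, 5, 10)
late = urgency(0, 8, 10)
```

The values grow slowly at first and sharply near the deadline, which matches the abstract's statement that a deadline flow's priority level increases as its urgency rises.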
Article
Software Defined Networking shifts the control plane of forwarding devices to one or more external entities known as controllers. Determining the optimal location of controllers in the network and the assignment of switches to them is widely known as controller placement problem. In case of controller failures, the switches are disconnected from the controller until they are reassigned to other active controllers with enough spare capacity. However, there is a significant upsurge in the worst case latency after the reassignment due to lack of planning for controller failures. In this paper, we propose a controller placement strategy that not only considers reliability and capacity of controllers but also plans ahead for controller failures to avoid repeated administrative intervention, drastic increase in latency and disconnections. It is formulated as a mixed integer linear program (MILP). The objective is to minimize the maximum, for all switches, of the sum of the latency from the switch to the nearest controller with enough capacity (first reference controller) and the latency from the first reference controller to its closest controller with enough capacity (second reference controller). We also proposed a generalized model which can be used to minimize the average latency and extended it for multiple controller failures. Furthermore, we presented a simulated annealing heuristic that efficiently solves the problem on large scale networks. The proposed formulation and heuristic are evaluated on various networks from the Internet Topology Zoo. Simulation results show that our proposed method performs better than the controller placement that does not plan ahead for failures.
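The objective described above can be evaluated for a given placement with a few lines of code. The sketch below is a simplified illustration: it assumes a precomputed latency matrix `dist`, assumes every controller has enough capacity (the capacity constraints of the paper's MILP are ignored), and needs at least two controllers so that a second reference controller exists. The function name and inputs are hypothetical.

```python
def failure_aware_cost(dist, switches, controllers):
    """Worst-case latency including the hop to a backup controller.

    For each switch: latency to its nearest controller (first reference
    controller) plus the latency from that controller to its own closest
    other controller (second reference controller). Returns the maximum
    over all switches. Capacity limits are ignored in this sketch.
    """
    if len(controllers) < 2:
        raise ValueError("need at least two controllers for a backup")
    worst = 0.0
    for s in switches:
        first = min(controllers, key=lambda c: dist[s][c])
        backups = [c for c in controllers if c != first]
        second = min(backups, key=lambda c: dist[first][c])
        worst = max(worst, dist[s][first] + dist[first][second])
    return worst

# Hypothetical 4-node line topology where latency equals hop distance.
line = [[abs(i - j) for j in range(4)] for i in range(4)]
edge_cost = failure_aware_cost(line, range(4), [0, 3])
center_cost = failure_aware_cost(line, range(4), [1, 2])
```

In this toy case, placing the two controllers next to each other in the middle gives a lower cost than placing them at the ends, because the backup controller is closer when the first one fails.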
Article
Software defined networking is increasingly prevalent in data center networks for it enables centralized network configuration and management. However, since switches are statically assigned to controllers and controllers are statically provisioned, traffic dynamics may cause long response time and incur high maintenance cost. To address these issues, we formulate the dynamic controller assignment problem (DCAP) as an online optimization to minimize the total cost caused by response time and maintenance on the cluster of controllers. By applying the randomized fixed horizon control framework, we decompose DCAP into a series of stable matching problems with transfers, guaranteeing a small loss in competitive ratio. Since the matching problem is NP-hard, we propose a hierarchical two-phase algorithm that integrates key concepts from both matching theory and coalitional games to solve it efficiently. Theoretical analysis proves that our algorithm converges to a near-optimal Nash stable solution within tens of iterations. Extensive simulations show that our online approach reduces total cost by about 46%, and achieves better load balancing among controllers compared with static assignment.
Article
Datacenter networks suffer unpredictable performance due to a lack of application level bandwidth guarantees. A lot of attention has been drawn to solve this problem such as how to provide bandwidth guarantees for virtualized machines (VMs), proportional bandwidth share among tenants, and high network utilization under peak traffic. However, existing solutions fail to cope with highly dynamic traffic in datacenter networks. In this paper, we propose eBA, an efficient solution to bandwidth allocation that provides end-to-end bandwidth guarantee for VMs under large numbers of short flows and massive bursty traffic in datacenters. eBA leverages a novel distributed VM-to-VM rate control algorithm based on the logistic model under the control-theoretic framework. eBA's implementation requires no changes to hardware or applications and can be deployed in standard protocol stack. The theoretical analysis and the experimental results show that eBA not only guarantees the bandwidth for VMs, but also provides fast convergence to efficiency and fairness, as well as smooth response to bursty traffic.
Conference Paper
In this paper, a QoS-aware traffic classification framework for software defined networks is proposed. Instead of identifying specific applications, as most previous work on traffic classification does, our approach classifies network traffic into different classes according to QoS requirements, which provides the crucial information needed to enable fine-grained, QoS-aware traffic engineering. The proposed framework is fully located in the network controller so that real-time, adaptive, and accurate traffic classification can be realized by exploiting the superior computation capacity, the global visibility, and the inherent programmability of the network controller. More specifically, the proposed framework jointly exploits deep packet inspection (DPI) and semi-supervised machine learning so that accurate traffic classification can be realized while requiring minimal communication between the network controller and the SDN switches. Based on a real Internet data set, the simulation results show that the proposed classification framework can provide good performance in terms of classification accuracy and communication costs.
Conference Paper
Software-defined networks (SDNs) have been recognized as the next-generation networking paradigm that decouples data forwarding from centralized control. To realize the merits of dedicated QoS provisioning and fast route (re-)configuration services over decoupled SDNs, various QoS requirements in packet delay, loss, and throughput should be supported by efficient transport tailored to each specific application. In this paper, QoS-aware adaptive routing (QAR) is proposed for the designed multi-layer hierarchical SDNs. Specifically, a distributed hierarchical control plane architecture is employed to minimize signaling delay in large SDNs via a three-level design of controllers, i.e., the super, domain (or master), and slave controllers. Furthermore, the QAR algorithm is proposed with the aid of reinforcement learning and a QoS-aware reward function, achieving time-efficient, adaptive, QoS-provisioning packet forwarding. Simulation results confirm that QAR outperforms the existing learning solution and provides fast convergence with QoS provisioning, facilitating practical implementation in large-scale software service-defined networks.
Conference Paper
Software-Defined Networking (SDN) allows control applications to install fine-grained forwarding policies in the underlying switches. While Ternary Content Addressable Memory (TCAM) enables fast lookups in hardware switches with flexible wildcard rule patterns, the cost and power requirements limit the number of rules the switches can support. To make matters worse, these hardware switches cannot sustain a high rate of updates to the rule table. In this paper, we show how to give applications the illusion of high-speed forwarding, large rule tables, and fast updates by combining the best of hardware and software processing. Our CacheFlow system "caches" the most popular rules in the small TCAM, while relying on software to handle the small amount of "cache miss" traffic. However, we cannot blindly apply existing cache-replacement algorithms, because of dependencies between rules with overlapping patterns. Rather than cache large chains of dependent rules, we "splice" long dependency chains to cache smaller groups of rules while preserving the semantics of the policy. Experiments with our CacheFlow prototype---on both real and synthetic workloads and policies---demonstrate that rule splicing makes effective use of limited TCAM space, while adapting quickly to changes in the policy and the traffic demands.
Article
With wide application of virtualization technology, tenants are able to access isolated cloud services by renting the shared resources in Infrastructure-as-a-Service (IaaS) datacenters. Unlike resources such as CPU and memory, datacenter network, which relies on traditional transport-layer protocols, suffers unfairness due to a lack of virtual machine (VM)-level bandwidth guarantees. In this paper, we model the datacenter bandwidth allocation as a cooperative game, toward VM-based fairness across the datacenter with two main objectives: 1) guarantee bandwidth for VMs based on their base bandwidth requirements, and 2) share residual bandwidth in proportion to the weights of VMs. Through a bargaining game approach, we propose a bandwidth allocation algorithm, Falloc, to achieve the asymmetric Nash bargaining solution (NBS) in datacenter networks, which exactly meets our objectives. The cooperative structure of the algorithm is exploited to develop an online algorithm for practical real-world implementation. We validate Falloc with experiments under diverse scenarios and show that by adapting to different network requirements of VMs, Falloc can achieve fairness among VMs and balance the tradeoff between bandwidth guarantee and proportional bandwidth sharing. Our large-scale trace-driven simulations verify that Falloc achieves high utilization while maintaining fairness among VMs in datacenters.
Article
Several distributed SDN controller architectures have been proposed to mitigate the risks of overload and failure. However, since they statically assign switches to controller instances and store state in distributed data stores (which doubles flow setup latency), they hinder operators' ability to minimize both flow setup latency and controller resource consumption. To address this, we propose a novel approach for assigning SDN switches and partitions of SDN application state to distributed controller instances. We present a new way to partition SDN application state that considers the dependencies between application state and SDN switches. We then formally model the assignment problem as a variant of multi-dimensional bin packing and propose a practical heuristic to solve the problem with strict time constraints. Our preliminary evaluations show that our approach yields a 44% decrease in flow setup latency and a 42% reduction in controller operating costs.
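The abstract above models assignment as a variant of multi-dimensional bin packing but does not spell out its heuristic. As a rough illustration only, the sketch below applies a generic first-fit decreasing rule to two-dimensional demands. The function name, the two resource dimensions (read here as CPU and memory), and the sample switch demands are all assumptions for this example and are not taken from the paper.

```python
def first_fit_decreasing(items, capacity):
    """Pack items with vector demands into as few identical bins as the
    greedy rule finds. Generic heuristic, NOT the paper's algorithm.

    items:    dict mapping an item name to a tuple of demands
    capacity: tuple giving the per-dimension capacity of every bin
    """
    # Sort by the largest normalized demand, biggest items first.
    def size(name):
        return max(d / c for d, c in zip(items[name], capacity))

    bins = []  # each bin: {"load": [per-dimension load], "items": [names]}
    for name in sorted(items, key=size, reverse=True):
        demand = items[name]
        if any(d > c for d, c in zip(demand, capacity)):
            raise ValueError(f"{name} does not fit in an empty bin")
        for b in bins:
            # Place the item in the first bin with room in every dimension.
            if all(l + d <= c for l, d, c in zip(b["load"], demand, capacity)):
                b["load"] = [l + d for l, d in zip(b["load"], demand)]
                b["items"].append(name)
                break
        else:
            # No existing bin fits, so open a new one (a new controller).
            bins.append({"load": list(demand), "items": [name]})
    return bins


# Hypothetical switch demands as (CPU units, memory units).
switch_demands = {"s1": (6, 2), "s2": (5, 5), "s3": (4, 7), "s4": (3, 3)}
controllers = first_fit_decreasing(switch_demands, capacity=(10, 10))
for index, controller in enumerate(controllers):
    print(index, controller["items"], controller["load"])
```

First-fit decreasing is chosen here only because it is short and fast; a heuristic that also accounts for dependencies between application state and switches, as the abstract describes, would need additional constraints.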
Article
In a queuing process, let 1/λ be the mean time between the arrivals of two consecutive units, L be the mean number of units in the system, and W be the mean time spent by a unit in the system. It is shown that, if the three means are finite and the corresponding stochastic processes strictly stationary, and, if the arrival process is metrically transitive with nonzero mean, then L = λW.
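The relation L = λW can be checked numerically on a small sample path. The sketch below is only an illustration: it assumes a finite observation window in which the system starts and ends empty, in which case the identity holds exactly for the sample averages, whereas the abstract's result concerns stationary processes. The function name and the arrival and departure times are made up for this example.

```python
def littles_law_terms(arrivals, departures, horizon):
    """Return (L, lam, W) measured over the window [0, horizon].

    arrivals[i] and departures[i] are the arrival and departure times of
    unit i; every unit is assumed to arrive and depart inside the window.
    """
    n = len(arrivals)
    # +1 when a unit enters the system, -1 when it leaves.
    events = sorted([(t, 1) for t in arrivals] + [(t, -1) for t in departures])

    area = 0.0       # integral of the number in system over time
    in_system = 0
    previous = 0.0
    for time, delta in events:
        area += in_system * (time - previous)
        in_system += delta
        previous = time

    L = area / horizon                        # mean number in the system
    lam = n / horizon                         # mean arrival rate
    W = sum(d - a for a, d in zip(arrivals, departures)) / n  # mean time in system
    return L, lam, W


L, lam, W = littles_law_terms(
    arrivals=[0, 1, 2, 6], departures=[3, 4, 5, 8], horizon=10
)
print(L, lam * W)
```

Both printed values agree because the area under the number-in-system curve equals the sum of the individual times in the system when the window starts and ends empty.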
Conference Paper
Distributed controllers have been proposed for Software Defined Networking to address the issues of scalability and reliability that a centralized controller suffers from. One key limitation of the distributed controllers is that the mapping between a switch and a controller is statically configured, which may result in uneven load distribution among the controllers. To address this problem, we propose ElastiCon, an elastic distributed controller architecture in which the controller pool is dynamically grown or shrunk according to traffic conditions and the load is dynamically shifted across controllers. We propose a novel switch migration protocol for enabling such load shifting, which conforms with the OpenFlow standard. We also build a prototype to demonstrate the efficacy of our design.
Conference Paper
OpenFlow assumes a logically centralized controller, which ideally can be physically distributed. However, current deployments rely on a single controller which has major drawbacks including lack of scalability. We present HyperFlow, a distributed event-based control plane for OpenFlow. HyperFlow is logically centralized but physically distributed: it provides scalability while keeping the benefits of network control centralization. By passively synchronizing network-wide views of OpenFlow controllers, HyperFlow localizes decision making to individual controllers, thus minimizing the control plane response time to data plane requests. HyperFlow is resilient to network partitioning and component failures. It also enables interconnecting independently managed OpenFlow networks, an essential feature missing in current OpenFlow deployments. We have implemented HyperFlow as an application for NOX. Our implementation requires minimal changes to NOX, and allows reuse of existing NOX applications with minor modifications. Our preliminary evaluation shows that, assuming sufficient control bandwidth, to bound the window of inconsistency among controllers by a factor of the delay between the farthest controllers, the network changes must occur at a rate lower than 1000 events per second across the network.
Article
Limiting the overhead of frequent events on the control plane is essential for realizing a scalable Software-Defined Network. One way of limiting this overhead is to process frequent events in the data plane. This requires modifying switches and comes at the cost of visibility in the control plane. Taking an alternative route, we propose Kandoo, a framework for preserving scalability without changing switches. Kandoo has two layers of controllers: (i) the bottom layer is a group of controllers with no interconnection, and no knowledge of the network-wide state, and (ii) the top layer is a logically centralized controller that maintains the network-wide state. Controllers at the bottom layer run only local control applications (i.e., applications that can function using the state of a single switch) near datapaths. These controllers handle most of the frequent events and effectively shield the top layer. Kandoo's design enables network operators to replicate local controllers on demand and relieve the load on the top layer, which is the only potential bottleneck in terms of scalability. Our evaluations show that a network controlled by Kandoo has an order of magnitude lower control channel consumption compared to normal OpenFlow networks.
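The two-layer split described above can be pictured with a toy dispatcher. This is a minimal sketch under stated assumptions and not Kandoo's actual API: the class names, the event dictionaries, and the event type strings are all hypothetical. A local controller handles the event types its local applications cover and relays everything else to a single root controller.

```python
class RootController:
    """Stands in for the logically centralized top layer (hypothetical)."""

    def __init__(self):
        self.handled = []

    def handle(self, event):
        self.handled.append(event)


class LocalController:
    """Bottom-layer controller with no network-wide state (hypothetical)."""

    def __init__(self, root, local_event_types):
        self.root = root
        self.local_event_types = set(local_event_types)
        self.handled = []

    def handle(self, event):
        # Frequent events covered by a local application stay at this layer.
        if event["type"] in self.local_event_types:
            self.handled.append(event)
            return "local"
        # Anything that needs the network-wide view is relayed upward.
        self.root.handle(event)
        return "root"


root = RootController()
local = LocalController(root, local_event_types={"packet_in"})

events = [
    {"switch": "s1", "type": "packet_in"},
    {"switch": "s1", "type": "packet_in"},
    {"switch": "s1", "type": "elephant_flow"},
    {"switch": "s1", "type": "packet_in"},
]
decisions = [local.handle(event) for event in events]
print(decisions, len(local.handled), len(root.handled))
```

In this toy run only one of four events reaches the root, which mirrors the shielding effect the abstract describes; the actual ratio in a deployment depends on the traffic and on which applications can run locally.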