Article

Congestion-Minimizing Network Update in Data Centers


Abstract

The SDN control plane needs to frequently update the data plane as network conditions change. Since each switch updates its flow table independently and asynchronously, transitioning the data plane directly from the initial to the final state may result in serious flash congestion. Prior work strives to find a congestion-free update plan with multiple stages, each with the property that no congestion arises regardless of the update order. Yet a congestion-free update may prevent the network from being fully utilized, and it requires solving a series of linear programs (LPs), which is time-consuming. In this paper, we propose congestion-minimizing update and focus on two general problems: the first is to find the routing at each intermediate stage that minimizes transient congestion for a given number of intermediate stages; the second is to find the minimum number of intermediate stages and an update plan for a given maximum level of transient congestion. We formulate them as two optimization programs and prove their hardness. We propose a set of algorithms to find the update plan in a scalable manner. Extensive experiments with Mininet show that our solution reduces update time by 50% and saves control overhead by 38% compared to prior work.
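
For concreteness, the first problem admits a compact linear-program sketch. The toy below is our own illustration, not the paper's exact formulation: the flow data, variable names, and the PuLP solver are assumptions. Each flow moves from a fixed initial path to a fixed final path; x[f,s] is the fraction it still sends on the initial path at stage s, and because switches update asynchronously, each link is charged the per-flow worst case over two adjacent stages.

import pulp

# Toy instance (illustrative only): two flows, three links.
flows = {"f1": 4.0, "f2": 6.0}                  # flow -> rate
links = {"l1": 10.0, "l2": 10.0, "l3": 10.0}    # link -> capacity
on_init  = {("f1", "l1"): 1, ("f2", "l2"): 1}   # initial-path incidence
on_final = {("f1", "l3"): 1, ("f2", "l1"): 1}   # final-path incidence
S = 3                                           # stages 0 .. S-1

prob = pulp.LpProblem("min_transient_congestion", pulp.LpMinimize)
U = pulp.LpVariable("U", lowBound=0)            # worst-case link utilization
prob += U                                       # objective: minimize U
x = {(f, s): pulp.LpVariable(f"x_{f}_{s}", 0, 1)
     for f in flows for s in range(S)}
for f in flows:
    prob += x[(f, 0)] == 1                      # stage 0: fully on initial path
    prob += x[(f, S - 1)] == 0                  # last stage: fully on final path

def load(f, l, s):
    # traffic that flow f places on link l under the stage-s split
    return flows[f] * (on_init.get((f, l), 0) * x[(f, s)]
                       + on_final.get((f, l), 0) * (1 - x[(f, s)]))

for l, cap in links.items():
    for s in range(S - 1):
        terms = []
        for f in flows:
            # asynchronous worst case: f may still follow stage s or already s+1
            m = pulp.LpVariable(f"m_{f}_{l}_{s}", lowBound=0)
            prob += m >= load(f, l, s)
            prob += m >= load(f, l, s + 1)
            terms.append(m)
        prob += pulp.lpSum(terms) <= U * cap    # bound transient load on l

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print("worst-case transient utilization:", pulp.value(U))

The second problem can then be approached by searching over the number of stages S against a given utilization threshold.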


... In realistic networks, a vast number of flows are typically updated simultaneously to achieve network-wide goals. To change flows consistently, many previous flow-update approaches [1], [3], [14], [15], [17]-[19], [28], [30], [31] adopt the two-phase update technique for each flow owing to its operational simplicity; [16] likewise applies the technique to each segment of a flow. However, the two-phase update technique has two main drawbacks when applied to multiple flow updates in a resource-constrained network. ...
... [1], [3], [14], [16], [28]-[30] focused on avoiding congestion caused by transiently overloaded links, and [1], [3], [14], [16], [30] further aimed to achieve fast updates by minimizing the number of rounds required to complete the updates. [15] focused on congestion-tolerant networks and aimed to minimize congestion while completing the updates within a given time. [31] showed that there exists a tradeoff between update completion time and network throughput, and focused on balancing these two performance objectives. ...
... locks. As mentioned before and stated in many existing network update works [3], [15], [16], [31], limiting the flow rate is a simple but efficient way to remove deadlocks and let updates proceed without congestion. We collect the number N_F^ratelimit of updating flows that need to limit/reduce their rate to remove deadlocks, and the amount Rate_reduce by which the rate of the limited flows is reduced, in each experiment. ...
Article
Full-text available
Flow updates are common in today's networks, and Software-Defined Networking (SDN) enables network operators to reconfigure switches for updating flows easily. However, implementing flow updates requires meeting many different expectations regarding consistency, resource constraints, and performance. To carry updates out as intended, network operators often need to spend significant effort on update management, developing complex network optimizations and customized heuristics on a case-by-case basis. In this paper, we strive to simplify flow updates in SDN networks. To this end, we present Atoman, a framework that uses high-level abstractions to capture various update intents and formulates flow update problems as segment-based update scheduling optimizations to obtain satisfactory update solutions. The captured update intents are translated into constraints and objectives of the update scheduling optimizations. By extracting critical updating flows and employing decomposition techniques, Atoman can efficiently reduce the scale of the problem in each solving round and generate near-optimal update solutions. We conduct extensive simulations to evaluate Atoman, and the results show that Atoman significantly reduces operator effort in managing flow updates and provides efficiency comparable to or better than prior customized solutions.
... Brandt et al. [9] proved that deciding whether a congestion-free update schedule exists is NP-hard. Zheng et al. [10] proposed MCUP and BCUP to find, respectively, an update plan with minimum transient congestion for a given number of stages, and an update plan with the minimum number of intermediate stages for a given transient congestion threshold. Wang et al. [11] divided the global dependencies among updates into local restrictions and provided a heuristic dependency resolution algorithm to schedule updates. ...
Article
Efficient and consistent update of the network routing rules is a challenging task that significantly affects the performance, correctness, and security of Software-Defined Networks (SDN). In this work, we consider the problem of minimizing the makespan of updating the routing rules in SDNs, while guaranteeing three crucial consistency requirements: (1) WayPoint Enforcement, (2) Loop Freedom, and (3) Conflict Freedom. This problem is known to be NP-hard, and thus we focus on designing approximate algorithms that run in polynomial time without incurring TCAM storage overhead. To compute consistent rule-update schedules, we propose two algorithms, called TimeX and RMS. TimeX employs the solution of a linear program (LP) to address the makespan minimization goal systematically. RMS is an LP-independent heuristic that provides higher scalability. We demonstrate and utilize a property of rule-updates, called reversibility, to reduce the makespan in RMS. Extensive simulations show that our algorithms reduce the makespan by 2% to 18% and attain a 4.9× speedup compared to previous studies. Moreover, Mininet experiments reveal that the proposed algorithms can mitigate the transient congestion caused by conflicting flows.
Article
Network update enables Software-Defined Networks (SDNs) to optimize data plane performance. The single update focuses on processing one update event at a time, i.e., updating a set of flows from their initial routes to target routes, but it fails to handle, in time, the continuously arriving update events caused by high-frequency network changes. On the contrary, the continuous update proposed in “Update Algebra” can handle multiple update events concurrently and respond to network condition changes at all times. However, “Update Algebra” only guarantees blackhole-free and loop-free updates; the congestion-free property is not respected. In this paper, we propose Coeus to achieve continuous updates while maintaining consistency, i.e., ensuring the blackhole-free, loop-free, and congestion-free properties simultaneously. Firstly, we establish the continuous update model based on the update operations in update events. With the update model, we dynamically reconstruct the operation dependency graph (ODG) to capture the relationship between update operations and link utilization variations. Then, we develop a composition algorithm to eliminate redundant operations in update events. To further speed up the update procedure, we present a partition algorithm that splits the operation nodes of the ODG into a series of suboperation nodes that can be executed independently. The partition algorithm is proven to be optimal. Finally, extensive evaluations show that Coeus can improve the update speed by at least 179% and reduce redundant operations by at least 52% compared with state-of-the-art approaches when update events arrive three times per second.
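
As a rough illustration of the composition idea (our own sketch; the add/mod/del operation model and all names are assumptions, not Coeus's actual data structures), only the net effect of the operations on each rule needs to survive when update events arrive continuously:

from collections import OrderedDict

def compose(events):
    """events: list of (rule_id, op, payload) with op in {"add", "mod", "del"};
    returns the reduced list with redundant intermediate operations removed."""
    net = OrderedDict()
    for rule, op, payload in events:
        if rule not in net:
            net[rule] = (op, payload)
            continue
        prev_op, _ = net[rule]
        if prev_op == "add" and op == "del":
            del net[rule]                    # add then delete cancels out
        elif prev_op == "add":
            net[rule] = ("add", payload)     # add then modify -> add new payload
        elif prev_op == "del" and op == "add":
            net[rule] = ("mod", payload)     # delete then re-add -> modify
        else:
            net[rule] = (op, payload)        # chains of modifies collapse
    return [(r, op, p) for r, (op, p) in net.items()]

print(compose([("r1", "add", 10), ("r1", "mod", 20),
               ("r2", "mod", 5), ("r2", "del", None)]))
# -> [('r1', 'add', 20), ('r2', 'del', None)]
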
Article
Software-defined networking (SDN) is one of the key technologies for customizing network services. The key to realizing a highly adaptive SDN, which can respond to changing demands or recover from network failures in a short period of time, is to update the configuration effectively. However, inconsistent configuration updates may lead to transient, incorrect network behaviors, and a long reconfiguration time between the current and target configurations may degrade network performance. Most existing works on reducing network reconfiguration time focus on the consistent update between the current and target configurations, and ignore the impact of different target configurations on reconfiguration time. Therefore, reducing the network reconfiguration time requires considering not only the consistent update between the current and target configurations, but also the impact of different target configurations. In this paper, we first analyze the impact of different target configurations of the control plane on the update operations on forwarding rules of the data plane. Then, we formulate the shortest configuration path (SCP) problem to find the path with the shortest configuration time and propose the shortest deployment time path selection (SPS) algorithm to obtain the target configuration in light of different rule deployment times. Subsequently, we propose the consistent scheduling update (CSU) algorithm to handle the consistent conversion from the current configuration to the target configuration. Finally, experimental results demonstrate that our algorithms can reduce the network reconfiguration time by up to 39% compared with previous SDN consistent update methods while maintaining similar consistency.
Conference Paper
Virtualized network functions (VNFs) enable software applications to replace traditional middleboxes, making network service provision more flexible and scalable. This paper focuses on ensuring Service Availability via Failure Elimination (SAFE) using VNF scaling: given the resource requirements of VNF instances, find an optimal and robust instance consolidation strategy that can recover from a single instance failure quickly. To address this problem, we present a framework based on rounding and dynamic programming. First, we discretize the range of resource requirements for VNF instance deployment into several sub-ranges, so that the number of instance types becomes a constant. Second, we further reduce the number of instance types by gathering several small instances into a bigger one. Third, we propose an algorithm built on dynamic programming to solve the instance consolidation problem with a limited number of instance types. We set up a testbed to profile the functional relationship between resources and throughput for different types of VNF instances, and conduct simulations to validate our theoretical results against the profiling results. The simulation results show that our algorithm outperforms the standby deployment model by 27.33% on average in terms of the number of servers required. Furthermore, SAFE has marginal overhead, around 7.22%, compared to an instance consolidation strategy that does not consider VNF backup.
Article
In software-defined networking (SDN), flow migration is required when the topology changes, in order to improve network performance through, e.g., load balancing. However, black holes, loops, and transient congestion may occur during flow migration due to the asynchronous update of switches on the data plane. Therefore, in this paper, we propose a novel segmented update method to shorten the rule update time, and a novel transient congestion avoidance algorithm to minimize the number of delayed updating flows, both of which reduce flow update time. Specifically, we construct three novel models that guarantee no black holes, no loops, and no transient congestion, respectively. The first two models, which avoid black holes and loops, can update multiple nodes in each segment instead of updating nodes one by one as in Cupid. The third model, which avoids transient congestion, minimizes the number of delayed updating flows. We then propose a black-hole avoidance algorithm, a loop avoidance algorithm, and a congestion avoidance algorithm, and combine them into a novel rules update (RU) algorithm that avoids black holes, loops, and transient congestion simultaneously. Simulation results show that our scheme can increase the number of directly updated flows by 75% on a single congested link and reduce the rule update time by 34% compared with existing work.
Conference Paper
Full-text available
Network updates such as policy and routing changes occur frequently in Software Defined Networks (SDN). Updates should be performed consistently, preventing temporary disruptions, and should require as little overhead as possible. Scalability is increasingly becoming an essential requirement in SDN. In this paper we propose to use time-triggered network updates to achieve consistent updates. Our proposed solution requires lower overhead than existing update approaches, without compromising the consistency during the update. We demonstrate that accurate time enables far more scalable consistent updates in SDN than previously available. In addition, it provides the SDN programmer with fine-grained control over the tradeoff between consistency and scalability.
Article
Full-text available
We present Dionysus, a system for fast, consistent network updates in software-defined networks. Dionysus encodes as a graph the consistency-related dependencies among updates at individual switches, and it then dynamically schedules these updates based on runtime differences in the update speeds of different switches. This dynamic scheduling is the key to its speed; prior update methods are slow because they pre-determine a schedule, which does not adapt to runtime conditions. Testbed experiments and data-driven simulations show that Dionysus improves the median update speed by 53--88% in both wide area and data center networks compared to prior methods.
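
The core idea of dynamic scheduling can be shown in a few lines (our simplification; Dionysus's real dependency graphs also carry link-capacity and resource nodes, which this toy omits). Instead of fixing an order in advance, the controller issues every update whose dependencies are met and reacts to whichever switch happens to finish first:

import random

deps = {"s1": set(), "s2": set(),
        "s3": {"s1"}, "s4": {"s1", "s2"}}   # update -> prerequisite updates
pending, done = set(deps), set()
while pending:
    ready = [u for u in sorted(pending) if deps[u] <= done]
    # all ready updates are issued in parallel; the runtime decides which
    # completes first (simulated here by a random choice)
    finished = random.choice(ready)
    done.add(finished)
    pending.remove(finished)
    print("completed:", finished)
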
Conference Paper
Full-text available
In current networks, a domain can effectively run a network architecture only if it is explicitly supported by the network infrastructure. This coupling between architecture and infrastructure means that any significant architectural change involves sizable costs for vendors (for development) and network operators (for deployment), creating a significant barrier to architectural evolution. In this paper we advocate decoupling architecture from infrastructure by leveraging the recent advances in SDN, the re-emergence of software forwarding, and MPLS's distinction between network's core and edge. We sketch our design, called Software-Defined Internet Architecture (SDIA), and show how it would ease the adoption of various new Internet architectures and blur the distinction between architectures and services.
Article
Full-text available
Network-wide migrations of a running network, such as the replacement of a routing protocol or the modification of its configuration, can improve the performance, scalability, manageability, and security of the entire network. However, such migrations are an important source of concerns for network operators as the reconfiguration campaign can lead to long, service-disrupting outages. In this paper, we propose a methodology that addresses the problem of seamlessly modifying the configuration of link-state Interior Gateway Protocols (IGPs). We illustrate the benefits of our methodology by considering several migration scenarios, including the addition and the removal of routing hierarchy in a running IGP, and the replacement of one IGP with another. We prove that a strict operational ordering can guarantee that the migration will not create any service outage. Although finding a safe ordering is NP-complete, we describe techniques that efficiently find such an ordering and evaluate them using several real-world and inferred ISP topologies. Finally, we describe the implementation of a provisioning system that automatically performs the migration by pushing the configurations on the routers in the appropriate order while monitoring the entire migration process.
Conference Paper
Full-text available
Datacenter networks (DCNs) are constantly evolving due to various updates such as switch upgrades and VM migrations. Each update must be carefully planned and executed in order to avoid disrupting many of the mission-critical, interactive applications hosted in DCNs. The key challenge arises from the inherent difficulty in synchronizing the changes to many devices, which may result in unforeseen transient link load spikes or even congestion. We present one primitive, zUpdate, to perform congestion-free network updates under asynchronous switch and traffic matrix changes. We formulate the update problem using a network model and apply our model to a variety of representative update scenarios in DCNs. We develop novel techniques to handle several practical challenges in realizing zUpdate as well as implement the zUpdate prototype on OpenFlow switches and deploy it on a testbed that resembles a real DCN topology. Our results, from both real-world experiments and large-scale trace-driven simulations, show that zUpdate can effectively perform congestion-free updates in production DCNs.
Conference Paper
Full-text available
Configuration changes are a common source of instability in networks, leading to outages, performance disruptions, and security vulnerabilities. Even when the initial and final configurations are correct, the update process itself often steps through intermediate configurations that exhibit incorrect behaviors. This paper introduces the notion of consistent network updates: updates that are guaranteed to preserve well-defined behaviors when transitioning between configurations. We identify two distinct consistency levels, per-packet and per-flow, and we present general mechanisms for implementing them in Software-Defined Networks using switch APIs like OpenFlow. We develop a formal model of OpenFlow networks, and prove that consistent updates preserve a large class of properties. We describe our prototype implementation, including several optimizations that reduce the overhead required to perform consistent updates. We present a verification tool that leverages consistent updates to significantly reduce the complexity of checking the correctness of network control software. Finally, we describe the results of some simple experiments demonstrating the effectiveness of these optimizations on example applications.
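
A minimal sketch of the per-packet mechanism (our illustration under assumed names; real implementations stamp the version into packet metadata such as a VLAN tag): internal switches hold both rule versions keyed by a version tag, and consistency follows because each packet is stamped with exactly one version at ingress:

class Switch:
    def __init__(self):
        self.rules = {}                       # (version, match) -> action
    def install(self, version, match, action):
        self.rules[(version, match)] = action
    def forward(self, version, match):
        return self.rules[(version, match)]

core = Switch()
core.install(1, "10.0.0.0/8", "port1")        # old policy
core.install(2, "10.0.0.0/8", "port2")        # phase 1: add v2 rules internally
ingress_version = 1
print(core.forward(ingress_version, "10.0.0.0/8"))   # -> port1 (old policy)
ingress_version = 2                           # phase 2: flip the ingress stamp
print(core.forward(ingress_version, "10.0.0.0/8"))   # -> port2 (new policy)
# once no v1-stamped packets remain in flight, v1 rules can be removed
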
Conference Paper
Full-text available
We explore the nature of traffic in data centers designed to support the mining of massive data sets. We instrument the servers to collect socket-level logs, with negligible performance impact. In a 1500-server operational cluster, we thus amass roughly a petabyte of measurements over two months, from which we obtain and report detailed views of traffic and congestion conditions and patterns. We further consider whether traffic matrices in the cluster might be obtained instead via tomographic inference from coarser-grained counter data.
Conference Paper
Full-text available
OpenFlow is a great concept, but its original design imposes excessive overheads. It can simplify network and traffic management in enterprise and data center environments, because it enables flow-level control over Ethernet switching and provides global visibility of the flows in the network. However, such fine-grained control and visibility comes with costs: the switch-implementation costs of involving the switch's control-plane too often and the distributed-system costs of involving the OpenFlow controller too frequently, both on flow setups and especially for statistics-gathering. In this paper, we analyze these overheads, and show that OpenFlow's current design cannot meet the needs of high-performance networks. We design and evaluate DevoFlow, a modification of the OpenFlow model which gently breaks the coupling between control and global visibility, in a way that maintains a useful amount of visibility without imposing unnecessary costs. We evaluate DevoFlow through simulations, and find that it can load-balance data center traffic as well as fine-grained solutions, without as much overhead: DevoFlow uses 10--53 times fewer flow table entries at an average switch, and uses 10--42 times fewer control messages.
Conference Paper
Full-text available
Cloud data centers host diverse applications, mixing workloads that require small predictable latency with others requiring large sustained throughput. In this environment, today's state-of-the-art TCP protocol falls short. We present measurements of a 6000 server production cluster and reveal impairments that lead to high application latencies, rooted in TCP's demands on the limited buffer space available in data center switches. For example, bandwidth hungry "background" flows build up queues at the switches, and thus impact the performance of latency sensitive "foreground" traffic. To address these problems, we propose DCTCP, a TCP-like protocol for data center networks. DCTCP leverages Explicit Congestion Notification (ECN) in the network to provide multi-bit feedback to the end hosts. We evaluate DCTCP at 1 and 10Gbps speeds using commodity, shallow buffered switches. We find DCTCP delivers the same or better throughput than TCP, while using 90% less buffer space. Unlike TCP, DCTCP also provides high burst tolerance and low latency for short flows. In handling workloads derived from operational measurements, we found DCTCP enables the applications to handle 10X the current background traffic, without impacting foreground traffic. Further, a 10X increase in foreground traffic does not cause any timeouts, thus largely eliminating incast problems.
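
The sender-side control law is compact enough to state directly (per the paper, with g = 1/16 as a suggested weight): the sender maintains a running estimate alpha of the fraction F of ECN-marked packets per window and cuts its window in proportion to alpha rather than halving it:

g = 1.0 / 16                        # estimation gain suggested in the paper
alpha, cwnd = 0.0, 100.0            # marked-fraction estimate, window (pkts)
for F in [0.0, 0.5, 1.0, 0.1]:      # example marked fractions per window
    alpha = (1 - g) * alpha + g * F
    if F > 0:                       # congestion signaled this window
        cwnd = cwnd * (1 - alpha / 2)
    print(f"alpha={alpha:.3f} cwnd={cwnd:.1f}")

Window growth between congestion signals follows ordinary TCP and is omitted here.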
Conference Paper
Full-text available
Large datacenter operators with sites at multiple locations dimension their key resources according to the peak demand of the geographic area that each site covers. The demand of specific areas follows strong diurnal patterns with high peak to valley ratios that result in poor average utilization across a day. In this paper, we show how to rescue unutilized bandwidth across multiple datacenters and backbone networks and use it for non-real-time applications, such as backups, propagation of bulky updates, and migration of data. Achieving the above is non-trivial since leftover bandwidth appears at different times, for different durations, and at different places in the world. For this purpose, we have designed, implemented, and validated NetStitcher, a system that employs a network of storage nodes to stitch together unutilized bandwidth, whenever and wherever it exists. It gathers information about leftover resources, uses a store-and-forward algorithm to schedule data transfers, and adapts to resource fluctuations. We have compared NetStitcher with other bulk transfer mechanisms using both a testbed and a live deployment on a real CDN. Our testbed evaluation shows that NetStitcher outperforms all other mechanisms and can rescue up to five times additional datacenter bandwidth thus making it a valuable tool for datacenter providers. Our live CDN deployment demonstrates that our solution can perform large data transfers at a much lower cost than naive end-to-end or store-and-forward schemes.
Article
Full-text available
This whitepaper proposes OpenFlow: a way for researchers to run experimental protocols in the networks they use every day. OpenFlow is based on an Ethernet switch, with an internal flow-table, and a standardized interface to add and remove flow entries. Our goal is to encourage networking vendors to add OpenFlow to their switch products for deployment in college campus backbones and wiring closets. We believe that OpenFlow is a pragmatic compromise: on one hand, it allows researchers to run experiments on heterogeneous switches in a uniform way at line-rate and with high port-density; while on the other hand, vendors do not need to expose the internal workings of their switches. In addition to allowing researchers to evaluate their ideas in real-world traffic settings, OpenFlow could serve as a useful campus component in proposed large-scale testbeds like GENI. Two buildings at Stanford University will soon run OpenFlow networks, using commercial Ethernet switches and routers. We will work to encourage deployment at other schools, and we encourage you to consider deploying OpenFlow in your university network too.
Conference Paper
The software-defined networking paradigm introduces interesting opportunities to operate networks in a more flexible yet formally verifiable manner. Despite the logically centralized control, however, a Software-Defined Network (SDN) is still a distributed system, with inherent delays between the switches and the controller. Especially the problem of changing network configurations in a consistent manner, also known as the consistent network update problem, has received much attention over the last years. This paper revisits the problem of how to update an SDN in a transiently consistent, loop-free manner. First, we rigorously prove that computing a maximum (“greedy”) loop-free network update is generally NP-hard; this result has implications for the classic maximum acyclic subgraph problem (the dual feedback arc set problem) as well. Second, we show that for special problem instances, fast and good approximation algorithms exist.
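
To make the setting concrete, here is a toy greedy pass of our own (the switch names and topology are invented): per destination, each switch has one next hop, an update is applied only if the mixed old/new configuration stays loop-free, and rejected updates wait for a later round. Finding the maximum applicable set per round is what the paper proves NP-hard; the greedy pass below only yields some safe set:

def loop_free(next_hop):
    # next_hop: switch -> next switch toward one destination
    for start in next_hop:
        seen, node = set(), start
        while node in next_hop:
            if node in seen:
                return False          # revisited a switch: forwarding loop
            seen.add(node)
            node = next_hop[node]
    return True

old = {"a": "b", "b": "d", "c": "b"}  # current next hops toward destination d
new = {"a": "c", "b": "a", "c": "d"}  # target next hops
applied, installed = dict(old), []
for switch, hop in new.items():
    trial = dict(applied)
    trial[switch] = hop
    if loop_free(trial):              # keep the update only if still loop-free
        applied, installed = trial, installed + [switch]
print("updated this round:", installed)  # rejected switches wait for a later round
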
Conference Paper
With the rise of Software Defined Networks (SDN), there is growing interest in dynamic and centralized traffic engineering, where decisions about forwarding paths are taken dynamically from a network-wide perspective. Frequent path reconfiguration can significantly improve the network performance, but should be handled with care, so as to minimize disruptions that may occur during network updates. In this paper we introduce Time4, an approach that uses accurate time to coordinate network updates. We characterize a set of update scenarios called flow swaps, for which Time4 is the optimal update approach, yielding less packet loss than existing update approaches. We define the lossless flow allocation problem, and formally show that in environments with frequent path allocation, scenarios that require simultaneous changes at multiple network devices are inevitable. We present the design, implementation, and evaluation of a Time4-enabled OpenFlow prototype. The prototype is publicly available as open source. Our work includes an extension to the OpenFlow protocol that has been adopted by the Open Networking Foundation (ONF), and is now included in OpenFlow 1.5. Our experimental results demonstrate the significant advantages of Time4 compared to other network update approaches.
Conference Paper
We propose FLIP, a new algorithm for SDN network updates that preserve forwarding policies. FLIP builds upon the dualism between replacements and additions of switch flow-table rules. It identifies constraints on rule replacements and additions that independently prevent policy violations from occurring during the update. Moreover, it keeps track of alternative constraints, avoiding the same policy violation. Then, it progressively explores the solution space by swapping constraints with their alternatives, until it reaches a satisfiable set of constraints. Extensive simulations show that FLIP outperforms previous proposals. It achieves a much higher success rate than algorithms based on rule replacements only, and massively reduces the memory overhead with respect to techniques solely relying on rule additions.
Conference Paper
Updating network configurations in response to dynamic changes is still a tricky task in SDN. During the update process, in-flight packets might misuse different versions of rules, and “hot” links could be overloaded due to an unplanned update order. For the problem of rule misuse, recently proposed approaches like the two-phase mechanism and the Customizable Consistency Generator (CCG) provide generic and customizable solutions. Yet there is no approach flexible enough to avoid transient congestion on hot links with respect to diverse user requirements, such as guaranteeing update deadlines or managing transient throughput loss; controllers urgently need one. In this paper, we propose CUP, Customizable Update Planner, to fill this gap. Different from prior approaches that adopt fixed designs for a single purpose, such as optimizing update speed (e.g., Dionysus) or avoiding congestion (e.g., zUpdate, SWAN), CUP introduces generic linear programming models to formulate user-specified requirements and the update planning problem. By solving these customized models, CUP is able to plan network updates according to a large fraction of user requirements, such as guaranteeing deadlines, prioritizing operation orders, and managing throughput loss, while avoiding transient congestion. We prototype CUP on Ryu and employ it to arrange updates for networks built upon Mininet. The results confirm the flexibility of CUP and indicate that it always obtains the “best” update plan in accordance with the user's wishes.
Conference Paper
Computer networks have become a critical infrastructure. Especially in shared environments such as datacenters, it is important that correct, consistent, and secure network operation is guaranteed at any time, even during routing policy updates. In particular, at no point in time should it be possible for packets to bypass security-critical waypoints (such as a firewall or IDS) or to be forwarded along loops. This paper studies the problem of how to change routing policies in a transiently consistent manner. Transiently consistent network updates have been proposed as a fast and resource-efficient alternative to per-packet consistent updates. Our main result is a negative one: we show that there are settings where the two basic properties, waypoint enforcement and loop-freedom, cannot be satisfied simultaneously. Even worse, we rigorously prove that deciding whether a waypoint-enforcing, loop-free network update schedule exists is NP-hard. These results hold for both kinds of loop-freedom used in the literature: strong and relaxed loop-freedom. This paper also presents optimized, exact mixed integer programs to compute optimal update schedules. We report on extensive simulation results and initiate the discussion of scenarios where multiple waypoints need to be ensured (also known as service chains).
Conference Paper
We consider the problem of updating arbitrary routes in a software-defined network in a (transiently) loop-free manner. We are interested in fast network updates, i.e., in schedules which minimize the number of interactions (i.e., rounds) between the controller and the network nodes. We first prove that this problem is difficult in general: The problem of deciding whether a k-round schedule exists is NP-complete already for k = 3, and there are problem instances requiring Ω(n) rounds, where n is the network size. Given these negative results, we introduce an attractive, relaxed notion of loop-freedom. We prove that O(log n)-round relaxed loop-free schedules always exist, and can also be computed efficiently.
Conference Paper
Networks are critical for the security of many computer systems. However, their complex and asynchronous nature often renders it difficult to formally reason about network behavior. Accordingly, it is challenging to provide correctness guarantees, especially during network updates. This paper studies how to update networks while maintaining a most basic safety property, Waypoint Enforcement (WPE): each packet is required to traverse a certain checkpoint (for instance, a firewall). Waypoint enforcement is particularly relevant in today's increasingly virtualized and software-defined networks, where new in-network functionality is introduced flexibly. We show that WPE can easily be violated during network updates, even though both the old and the new policy ensure WPE. We then present an algorithm WayUp that guarantees WPE at any time, while completing updates quickly. We also find that in contrast to other transient consistency properties, WPE cannot always be implemented in a wait-free manner, and that WPE may even conflict with Loop-Freedom (LF). Finally, we present an optimal policy update algorithm OptRounds, which requires a minimum number of communication rounds while ensuring both WPE and LF, whenever this is possible.
Conference Paper
We study consistent migration of flows, with special focus on software defined networks. Given a current and a desired network flow configuration, we give the first polynomial-time algorithm to decide if a congestion-free migration is possible. However, if all flows must be integer or are unsplittable, this is NP-hard to decide. A similar problem is providing increased bandwidth to an application, while keeping all other flows in the network, but possibly migrating them consistently to other paths. We show that the maximum increase can be approximated arbitrarily well in polynomial time. Current methods such as RSVP-TE consider unsplittable flows and remove flows of lesser importance in order to increase bandwidth for an application: we prove that deciding which flows need to be removed is an NP-hard optimization problem with no PTAS possible unless P = NP.
Conference Paper
Updating network flows in a real-world setting is a nascent research area, especially with the recent rise of Software Defined Networks. While augmenting s-t flows of a single commodity is a well-understood concept, we study updating flows in a multi-commodity setting: Given a directed network with flows of different commodities, how can the capacity of some commodities be increased, without reducing capacities of other commodities, when moving flows in the network in an orchestrated order? To this end, we show how the notion of augmenting flows can be efficiently extended to multiple commodities for anycast applications.
Article
Networks without effective AQM may again be vulnerable to congestion collapse.
Conference Paper
Software Defined Networks (SDNs) are becoming the leading technology behind many traffic engineering solutions, both for backbone and data-center networks, since they allow a central controller to globally plan the paths of flows according to the operator's objective. Nevertheless, networking devices' forwarding tables are a limited and expensive resource (e.g., in TCAM-based switches) which should thus be considered when configuring the network. In this paper, we concentrate on satisfying global network objectives, such as maximum flow, in environments where the size of the forwarding table in network devices is limited. We formulate this problem as an (NP-hard) optimization problem and present approximation algorithms for it. We show through extensive simulations that practical use of our algorithms (in both data center and backbone scenarios) results in a significant reduction (a factor of 3) in forwarding table size, while having a small effect on the global objective (maximum flow).
Article
Discrete optimization problems are everywhere, from traditional operations research planning (scheduling, facility location and network design); to computer science databases; to advertising issues in viral marketing. Yet most such problems are NP-hard; unless P = NP, there are no efficient algorithms to find optimal solutions. This book shows how to design approximation algorithms: efficient algorithms that find provably near-optimal solutions. The book is organized around central algorithmic techniques for designing approximation algorithms, including greedy and local search algorithms, dynamic programming, linear and semidefinite programming, and randomization. Each chapter in the first section is devoted to a single algorithmic technique applied to several different problems, with more sophisticated treatment in the second section. The book also covers methods for proving that optimization problems are hard to approximate. Designed as a textbook for graduate-level algorithm courses, it will also serve as a reference for researchers interested in the heuristic solution of discrete optimization problems.
Article
Datacenter WAN traffic consists of high priority transfers that have to be carried as soon as they arrive alongside large transfers with pre-assigned deadlines on their completion (ranging from minutes to hours). The ability to offer guarantees to large transfers is crucial for business needs and impacts overall cost-of-business. State-of-the-art traffic engineering solutions only consider the current time epoch and hence cannot provide pre-facto promises for long-lived transfers. We present Tempus, an online traffic engineering scheme that exploits information on transfer size and deadlines to appropriately pack long-running transfers across network paths and time, thereby leaving enough capacity slack for future high-priority requests. Tempus builds on a tailored approximate solution to a mixed packing-covering linear program, which is parallelizable and scales well in both running time and memory usage. Consequently, Tempus is able to quickly and effectively update its solution when new transfers arrive or unexpected changes happen. These updates involve only small edits to existing transfers. Therefore, as experiments on traces from a large production WAN show, Tempus can offer and keep promises to long-lived transfers well in advance of their actual deadline; the promise on minimal transfer size is comparable with an offline optimal solution and outperforms state-of-the-art solutions by 2-3X.
Conference Paper
We present the design, implementation, and evaluation of B4, a private WAN connecting Google's data centers across the planet. B4 has a number of unique characteristics: i) massive bandwidth requirements deployed to a modest number of sites, ii) elastic traffic demand that seeks to maximize average bandwidth, and iii) full control over the edge servers and network, which enables rate limiting and demand measurement at the edge. These characteristics led to a Software Defined Networking architecture using OpenFlow to control relatively simple switches built from merchant silicon. B4's centralized traffic engineering service drives links to near 100% utilization, while splitting application flows among multiple paths to balance capacity against application priority/demands. We describe experience with three years of B4 production deployment, lessons learned, and areas for future work.
Conference Paper
We present SWAN, a system that boosts the utilization of inter-datacenter networks by centrally controlling when and how much traffic each service sends and frequently re-configuring the network's data plane to match current traffic demand. But done simplistically, these re-configurations can also cause severe, transient congestion because different switches may apply updates at different times. We develop a novel technique that leverages a small amount of scratch capacity on links to apply updates in a provably congestion-free manner, without making any assumptions about the order and timing of updates at individual switches. Further, to scale to large networks in the face of limited forwarding table capacity, SWAN greedily selects a small set of entries that can best satisfy current demand. It updates this set without disrupting traffic by leveraging a small amount of scratch capacity in forwarding tables. Experiments using a testbed prototype and data-driven simulations of two production networks show that SWAN carries 60% more traffic than the current practice.
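
For intuition about the scratch-capacity technique (our gloss, not the full algorithm): if every link keeps a slack fraction s of its capacity free, traffic can be migrated in rounds that each shift at most an s-fraction of any link's capacity, and SWAN shows that a congestion-free update sequence with at most ⌈1/s⌉ − 1 intermediate steps always exists; with s = 10%, for example, at most 9 intermediate steps are needed.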
Conference Paper
A consistent update installs a new packet-forwarding policy across the switches of a software-defined network in place of an old policy. While doing so, such an update guarantees that every packet entering the network either obeys the old policy or the new one, but not some combination of the two. In this paper, we introduce new algorithms that trade the time required to perform a consistent update against the rule-space overhead required to implement it. We break an update into k rounds that each transfer part of the traffic to the new configuration. The more rounds used, the slower the update, but the smaller the rule-space overhead. To ensure consistency, our algorithm analyzes the dependencies between rules in the old and new policies to determine which rules to add and remove on each round. In addition, we show how to optimize the rule space used by representing the minimization problem as a mixed integer linear program. Moreover, to ensure the largest flows are moved first, while using rule space efficiently, we extend the mixed integer linear program with additional constraints. Our initial experiments show that a 6-round, optimized incremental update decreases rule-space overhead from 100% to less than 10%. Moreover, if we cap the maximum rule-space overhead at 5% and assume the traffic flow volume follows Zipf's law, we find that 80% of the traffic may be transferred to the new policy in the first round and 99% in the first 3 rounds.
Conference Paper
Networks today rely on middleboxes to provide critical performance, security, and policy compliance capabilities. Achieving these benefits and ensuring that the traffic is directed through the desired sequence of middleboxes requires significant manual effort and operator expertise. In this respect, Software-Defined Networking (SDN) offers a promising alternative. Middleboxes, however, introduce new aspects (e.g., policy composition, resource management, packet modifications) that fall outside the purview of the traditional L2/L3 functions that SDN supports (e.g., access control or routing). This paper presents SIMPLE, an SDN-based policy enforcement layer for efficient middlebox-specific "traffic steering". In designing SIMPLE, we take an explicit stance to work within the constraints of legacy middleboxes and existing SDN interfaces. To this end, we address algorithmic and system design challenges to demonstrate the feasibility of using SDN to simplify middlebox traffic steering. In doing so, we also take a significant step toward addressing industry concerns surrounding the ability of SDN to integrate with existing infrastructure and support L4-L7 capabilities.
Article
Updates to network configurations are notoriously difficult to implement correctly. Even if the old and new configurations are correct, the update process can introduce transient errors such as forwarding loops, dropped packets, and access control violations. The key factor that makes updates difficult to implement is that networks are distributed systems with hundreds or even thousands of nodes, but updates must be rolled out one node at a time. In networks today, the task of determining a correct sequence of updates is usually done manually – a tedious and error-prone process for network operators. This paper presents a new tool for synthesizing network updates automatically. The tool generates efficient updates that are guaranteed to respect invariants specified by the operator. It works by navigating through the (restricted) space of possible solutions, learning from counterexamples to improve scalability and optimize performance. We have implemented our tool in OCaml, and conducted experiments showing that it scales to networks with a thousand switches and tens of switches updating.
Article
Configuration updates are a leading cause of instability in networks today. A key factor that makes updates difficult to implement is that networks are distributed systems with hundreds or thousands of nodes all interacting with each other. Even if the initial and final configurations are correct, naively updating individual nodes can easily cause the network to exhibit incorrect behaviors such as forwarding loops, black holes, and security vulnerabilities. This paper presents a new approach to the network update problem: we automatically generate updates that are guaranteed to preserve important correctness properties. Our system is based on counterexample-guided search and incorporates heuristics that quickly identify correct updates in many situations. We exploit the fact that the search algorithm asks a series of related model checking questions, and develop an efficient incremental LTL model checking algorithm. We describe a prototype implementation of our system in OCaml, and present experimental results showing that it efficiently generates updates for real-world topologies with thousands of nodes.
Article
MPLS was an attempt to simplify network hardware while improving the flexibility of network control. Software-Defined Networking (SDN) was designed to make further progress along both of these dimensions. While a significant step forward in some respects, it was a step backwards in others. In this paper we discuss SDN's shortcomings and propose how they can be overcome by adopting the insight underlying MPLS. We believe this hybrid approach will enable an era of simple hardware and flexible control.
Article
We describe a new protocol for updating OpenFlow networks, which provides the packet consistency condition of [?] and a weak form of the flow consistency condition of [?]. The protocol conserves switch resources, particularly TCAM space, by ensuring that only a single set of rules is present on a switch at any time. The protocol exploits the identity of switch rules with Boolean functions, and the ability of any switch to send packets to a controller for routing. When a network changes from one ruleset (ruleset 1) to another (ruleset 2), the packets affected by the change are computed and sent to the controller. When all switches have been updated to send affected packets to the controller, ruleset 2 is sent to the switches and packets sent to the controller are re-released into the network.
Conference Paper
The effects of data center traffic characteristics on data center traffic engineering are not well understood. In particular, it is unclear how existing traffic engineering techniques perform under various traffic patterns, namely, how the computed routes differ from the optimal routes. Our study reveals that existing traffic engineering techniques perform 15% to 20% worse than the optimal solution. We find that these techniques suffer mainly due to their inability to utilize global knowledge about flow characteristics and to make coordinated decisions for scheduling flows. To this end, we have developed MicroTE, a system that adapts to traffic variations by leveraging the short-term and partial predictability of the traffic matrix. We implement MicroTE within the OpenFlow framework and with minor modifications to the end hosts. In our evaluations, we show that our system performs close to the optimal solution and imposes minimal overhead on the network, making it appropriate for current and future data centers.
Article
We give an approximation-preserving reduction from the vector scheduling problem (VS) to the generalized load balancing problem (GLB). The reduction bridges existing results for the two problems. Specifically, the hardness result for VS also holds for GLB, and any algorithm for GLB can be used to solve VS. Based on this, we obtain two new results. First, GLB does not have constant-factor approximation algorithms that run in polynomial time unless $P=NP$. Second, there is an online algorithm (vectors arriving in an online fashion) that solves VS with approximation bound $e\log(md)$, where $e$ is the base of the natural logarithm, $m$ is the number of partitions, and $d$ is the dimension of the vectors. The algorithm is borrowed from the GLB literature and is very simple in that each vector only needs to minimize the $L_{\ln(md)}$ norm of the resulting load. However, it is unclear whether this algorithm runs in polynomial time, due to the irrational and non-integer nature of $\ln(md)$. We address this issue by rounding $\ln(md)$ up to the next integer. We prove that the resulting algorithm runs in polynomial time with approximation bound $e\log(md)+\frac{e\log(e)}{\ln(md)+1}$, which is in $O(\ln(md))$. This improves on the $O(\ln^2 d)$ bound of the existing polynomial-time algorithm for VS.
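
The online rule described above is simple to sketch (our illustration; the partition count m, dimension d, and data are made up, and we floor the exponent at 2 so the toy sizes behave): each arriving vector goes to the partition that minimizes the L_p norm of the resulting load, with p = ln(md) rounded up to the next integer:

import math

def schedule(vectors, m):
    d = len(vectors[0])
    p = max(2, math.ceil(math.log(m * d)))   # rounded exponent, roughly ln(md)
    load = [[0.0] * d for _ in range(m)]
    def lp_norm(v):
        return sum(x ** p for x in v) ** (1.0 / p)
    for vec in vectors:
        best = min(range(m),
                   key=lambda i: lp_norm([load[i][k] + vec[k] for k in range(d)]))
        for k in range(d):                   # commit vec to partition best
            load[best][k] += vec[k]
    return load

print(schedule([[1, 0], [0, 1], [1, 1], [2, 0]], m=2))
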
Conference Paper
Today's data centers may contain tens of thousands of computers with significant aggregate bandwidth requirements. The network architecture typically consists of a tree of routing and switching elements with progressively more specialized and expensive equipment moving up the network hierarchy. Unfortunately, even when deploying the highest-end IP switches/routers, resulting topologies may only support 50% of the aggregate bandwidth available at the edge of the network, while still incurring tremendous cost. Non-uniform bandwidth among data center nodes complicates application design and limits overall system performance. In this paper, we show how to leverage largely commodity Ethernet switches to support the full aggregate bandwidth of clusters consisting of tens of thousands of elements. Similar to how clusters of commodity computers have largely replaced more specialized SMPs and MPPs, we argue that appropriately architected and interconnected commodity switches may deliver more performance at less cost than available from today's higher-end solutions. Our approach requires no modifications to the end host network interface, operating system, or applications; critically, it is fully backward compatible with Ethernet, IP, and TCP.
Conference Paper
Mininet is a system for rapidly prototyping large networks on the constrained resources of a single laptop. The lightweight approach of using OS-level virtualization features, including processes and network namespaces, allows it to scale to hundreds of nodes. Experiences with our initial implementation suggest that the ability to run, poke, and debug in real time represents a qualitative change in workflow. We share supporting case studies culled from over 100 users, at 18 institutions, who have developed Software-Defined Networks (SDN). Ultimately, we think the greatest value of Mininet will be supporting collaborative network research, by enabling self-contained SDN prototypes which anyone with a PC can download, run, evaluate, explore, tweak, and build upon.
Conference Paper
Many studies show that, when Internet links go up or down, the dynamics of BGP may cause several minutes of packet loss. The loss occurs even when multiple paths between the sender and receiver domains exist, and is unwarranted given the high connectivity of the Internet. Our objective is to ensure that Internet domains stay connected as long as the underlying network is connected. Our solution, R-BGP, works by pre-computing a few strategically chosen failover paths. R-BGP provably guarantees that a domain will not become disconnected from any destination as long as it will have a policy-compliant path to that destination after convergence. Surprisingly, this can be done using a few simple and practical modifications to BGP, and, like BGP, requires announcing only one path per neighbor. Simulations on the AS-level graph of the current Internet show that R-BGP reduces the number of domains that see transient disconnectivity resulting from a link failure from 22% for edge links and 14% for core links down to zero in both cases.
Article
In this paper we describe several forms of the k-partition problem and give integer programming formulations of each case. The dimension of the associated polytopes and some basic facets are identified. We also give several valid and facet-defining inequalities for each of the polytopes.
Article
A significant fraction of network events (such as topology or route changes) and the resulting performance degradation stem from premeditated network management and operational tasks. This paper introduces a general class of Graceful Network State Migration (GNSM) problems, where the goal is to discover the optimal sequence of operations that progressively transition the network from its initial to a desired final state while minimizing the overall performance disruption. We investigate two specific GNSM problems: 1) Link Weight Reassignment Scheduling (LWRS) studies the optimal ordering of link weight updates to migrate from an existing to a new link weight assignment; and 2) Link Maintenance Scheduling (LMS) looks at how to schedule link deactivations and subsequent reactivations for maintenance purposes. LWRS and LMS are both combinatorial optimization problems. We use dynamic programming to find the optimal solutions when the problem size is small, and leverage ant colony optimization to get near-optimal solutions for large problem sizes. Our simulation study reveals that judiciously ordering network operations can achieve significant performance gains. Our GNSM solution framework is generic and applies to similar problems with different operational contexts, underlying network protocols or mechanisms, and performance metrics.
Article
This paper presents a solution aimed at avoiding losses of connectivity when an eBGP peering link is shut down by an operator for maintenance. Currently, shutting down an eBGP session can lead to transient losses of connectivity even though alternate paths are available at the borders of the network. This is very unfortunate, as ISPs face more and more stringent service level agreements, and maintenance operations are predictable, so there is time to adapt to the change and continue meeting the service level agreement.
Enforcing customizable consistency properties in software-defined networks
  • Zhou
Increasing datacenter network utilisation with GRIN
  • Agache