Conference Paper

Max-Min Fairness based Scheduling Optimization Mechanism on Switches

Article
In datacenter networks, bandwidth-demanding elephant flows without deadlines and delay-sensitive mice flows with strict deadlines coexist. They compete with each other for limited network resources, and the effective scheduling of such mix-flows is extremely challenging. We propose a deep reinforcement learning with private link approach (DRL-PLink), which combines software-defined networking and deep reinforcement learning (DRL) to schedule mix-flows. DRL-PLink divides the link bandwidth and establishes corresponding private-links for different types of flows to isolate them, so that the competition among different types of flows decreases accordingly. DRL is used to adaptively and intelligently allocate bandwidth resources for these private-links. Furthermore, to improve the scheduling policy, DRL-PLink introduces clipped double Q-learning, exploration with noise, and prioritized experience replay for DDPG, addressing function approximation error, inducing larger and more varied randomness for exploration, and providing more effective and efficient experience replay, respectively. Experimental results under actual datacenter network workloads (including web search and data mining workloads) indicate that DRL-PLink can effectively schedule mix-flows with small system overhead. Compared with ECMP, pFabric, and Karuna, the average flow completion time of DRL-PLink decreased by 77.79%, 65.61%, and 23.34% respectively, while the deadline meet rate increased by 16.27%, 0.02%, and 0.836% respectively. Additionally, DRL-PLink also achieves good load balance between paths.
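The TD3-style additions the abstract lists (clipped double Q-learning plus noisy target actions for DDPG) can be made concrete. The sketch below is illustrative only, not DRL-PLink's implementation; the actor/critic callables, their scalar outputs, and all hyperparameters are assumptions.

```python
import numpy as np

def td3_target(reward, next_state, done,
               target_actor, target_critic1, target_critic2,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               action_low=-1.0, action_high=1.0):
    """Clipped double-Q target for a TD3-style DDPG variant.

    target_actor(state) -> scalar action; target_criticN(state, action) -> Q.
    All callables and hyperparameters here are hypothetical stand-ins."""
    # Target-policy smoothing: perturb the target action with clipped noise.
    noise = np.clip(np.random.normal(0.0, noise_std), -noise_clip, noise_clip)
    next_action = np.clip(target_actor(next_state) + noise, action_low, action_high)
    # Clipped double Q-learning: take the minimum of two target critics
    # to curb the overestimation bias of a single critic.
    q_next = min(target_critic1(next_state, next_action),
                 target_critic2(next_state, next_action))
    return reward + gamma * (1.0 - done) * q_next

# Toy stubs, purely to exercise the function:
y = td3_target(1.0, 0.5, 0.0,
               target_actor=lambda s: 0.3,
               target_critic1=lambda s, a: 2.0,
               target_critic2=lambda s, a: 1.5)
```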
Article
Today’s data centers act as the primary infrastructure for emerging technologies. QoS imposes requirements for more attentive techniques that can deal with the different characteristics of traffic classes and patterns. In this context, network flows can be classified into elephant flows, which are large and long-lived, and mice flows, which are small and short-lived. Given the characteristics of emerging technologies, e.g., IoT and Big Data, mice flows are dominant; hence, it is crucial to improve Flow Completion Time (FCT) for such delay-sensitive flows. This paper presents Sieve, a new distributed Software-Defined Networking (SDN) based framework. Sieve initially schedules a portion of the flows based on the available bandwidth, regardless of their class. We propose a distributed sampling technique that sends a portion of the packets to the controller. Furthermore, Sieve polls the edge switches periodically to get network information, rather than polling all switches in the network, and it reschedules elephant flows only. The Mininet emulator and mathematical analysis have been employed to validate the proposed solution in a 4-ary Fat-Tree DCN. Sieve reduces FCT by up to around 58% for mice flows and maintains the throughput of elephant flows compared to Equal Cost MultiPath (ECMP) and Hedera.
Conference Paper
Datacenter networked applications often require multiple data transfer flows that semantically constitute a coflow group. A coflow is thus considered completed when all the transfers in the coflow are completed. Hence, application performance is optimized whenever the completion time of a coflow is minimized, rather than that of the individual flows composing it. Currently, popular coflow scheduling algorithms are mostly centralized, and they incur high overheads. The decentralized approach in the "many-to-many" scenario also incurs high communication overheads due to the communication among the local controllers. In this paper, we present a coflow scheduling mechanism that aims to minimize the coflow completion time for coflows that show a many-to-many communication pattern; as a byproduct, communication overhead is also minimized. Our algorithm preserves compatibility with existing commodity switches and network protocols and improves coflow completion times on average by 1.8 times compared to the baseline, as demonstrated via testbed implementation and large-scale simulation.
Article
SDN (software-defined networking) provides effective technical methods for optimal resource management. However, current related schemes suffer from frequent and serious resource conflicts because they mix elephant and mice flows on a shared transmission path. Controllers in SDN therefore have to be smart enough to detect elephant flows at low cost and then reroute elephant and mice flows separately. Unfortunately, existing elephant flow detection schemes suffer from high bandwidth consumption and long detection time, and little literature considers mice-flow scheduling. In this paper, we first propose an Efficient Sampling and Classification Approach (ESCA). We then propose a novel differentiated scheduling approach that sets up paths for elephant and mice flows separately and dynamically. ESCA significantly reduces sampling overhead by estimating the arrival interval of elephant flows and filtering out redundant samples, and it efficiently classifies samples with a new supervised classification algorithm based on correlation among data flows. Building on the low-cost ESCA, we propose a feature-aware flow scheduling approach, DiffSch, that schedules elephant and mice flows in a differentiated manner. Theoretical analysis proves that ESCA outperforms related schemes. Extensive experimental results further demonstrate that ESCA provides accurate detection with fewer sampled packets and shorter detection time, and that DiffSch significantly outperforms related proposals.
Conference Paper
Traffic characterization is one of the major challenges in today’s security industry. The continuous evolution and generation of new applications and services, together with the expansion of encrypted communications, makes it a difficult task. Virtual Private Networks (VPNs) are an example of encrypted communication service that is becoming popular, as a method for bypassing censorship as well as accessing services that are geographically locked. In this paper, we study the effectiveness of flow-based time-related features to detect VPN traffic and to characterize encrypted traffic into different categories, according to the type of traffic (e.g., browsing, streaming, etc.). We use two well-known machine learning techniques (C4.5 and KNN) to test the accuracy of our features. Our results show high accuracy and performance, confirming that time-related features are good classifiers for encrypted traffic characterization.
Article
OpenFlow is a vendor-agnostic API for controlling hardware and software switches. In its current form, OpenFlow is specific to particular protocols, making it hard to add new protocol headers. It is also tied to a specific processing paradigm. In this paper we make a strawman proposal for how OpenFlow should evolve in the future, starting with the definition of an abstract forwarding model for switches. We have three goals: (1) Protocol independence: Switches should not be tied to any specific network protocols. (2) Target independence: Programmers should describe how switches are to process packets in a way that can be compiled down to any target switch that fits our abstract forwarding model. (3) Reconfigurability in the field: Programmers should be able to change the way switches process packets once they are deployed in a network. We describe how to write programs using our abstract forwarding model and our P4 programming language in order to configure switches and populate their forwarding tables.
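The abstract forwarding model the proposal builds on amounts to a parser followed by a pipeline of match-action tables that can be reconfigured in the field. Below is a toy Python rendering of that model (illustrative names only, not P4 syntax):

```python
class Table:
    """One match-action stage: exact-match on a single header field."""
    def __init__(self, field, default_action):
        self.field, self.entries, self.default = field, {}, default_action

    def add_entry(self, value, action):
        self.entries[value] = action  # populated at run time by a controller

    def apply(self, pkt):
        # Look up the packet's field value; fall back to the default action.
        return self.entries.get(pkt.get(self.field), self.default)(pkt)

def forward(port):
    def act(pkt):
        pkt["egress_port"] = port
        return pkt
    return act

def drop(pkt):
    pkt["egress_port"] = None
    return pkt

# A reconfigurable pipeline: supporting a new protocol is just a new field name.
ipv4 = Table("ipv4_dst", drop)
ipv4.add_entry("10.0.0.2", forward(2))
pipeline = [ipv4]

pkt = {"ipv4_dst": "10.0.0.2"}
for table in pipeline:
    pkt = table.apply(pkt)
print(pkt["egress_port"])  # -> 2
```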
Article
Most prior work on congestion in datagram systems focuses on buffer management. We find it illuminating to consider the case of a packet switch with infinite storage. Such a packet switch can never run out of buffers. It can, however, still become congested. The meaning of congestion in an infinite-storage system is explored. We demonstrate the unexpected result that a datagram network with infinite storage, first-in, first-out queueing, at least two packet switches, and a finite packet lifetime will, under overload, drop all packets. By attacking the problem of congestion for the infinite-storage case, we discover new solutions applicable to switches with finite storage.
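The mechanism behind this result admits a one-line argument: with FIFO and unbounded buffers, sustained overload grows the queue without bound, so the waiting time eventually exceeds any finite packet lifetime and every packet expires before reaching the head of the queue. A toy fluid-model sketch of the collapse (all parameters assumed, not from the paper):

```python
def goodput_fifo_infinite_buffer(offered_rate, service_rate, ttl_s, horizon_s):
    """Toy fluid model: the queue grows at (offered - service) under overload;
    once waiting time queue/service_rate exceeds the TTL, packets expire
    before service and goodput collapses toward zero."""
    queue = delivered = 0.0
    for _ in range(int(horizon_s)):       # one-second time steps
        queue += offered_rate
        if queue / service_rate <= ttl_s:
            served = min(queue, service_rate)
            delivered += served           # packets still fresh when served
            queue -= served
        else:
            queue -= min(queue, service_rate)  # served packets already expired
    return delivered / horizon_s

# 50% overload with a 5 s packet lifetime: goodput decays toward zero.
print(goodput_fifo_infinite_buffer(150.0, 100.0, ttl_s=5.0, horizon_s=600))
```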
Conference Paper
We present an admission control scheme which provides per-flow delay and bandwidth guarantees based solely upon simple class-based strict priority queueing. We derive basic properties of the worst-case behaviour in strict priority queueing systems using network calculus. Building upon these properties, a flow admission control scheme is devised. The rationale behind this work is the appealing simplicity as well as the almost ubiquitous availability of strict priority queueing in today's routers and the thus promising applicability of our results for practical purposes in providing quality of service (QoS) in the Internet.
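For intuition, the textbook network-calculus bound for non-preemptive strict priority queueing can be computed directly: if each class j is leaky-bucket constrained by burst sigma_j and rate rho_j, a class sees a rate-latency service curve at the link's residual rate. The sketch below is that standard construction, not the paper's exact admission test; variable names are illustrative.

```python
def strict_priority_delay_bound(C, classes, i, max_low_pkt):
    """Worst-case delay bound (seconds) for priority class i under
    non-preemptive strict priority on a link of capacity C (bits/s).

    classes: list of (sigma, rho) leaky-bucket parameters in bits and bits/s,
             ordered from highest priority (index 0) downwards.
    max_low_pkt: largest packet (bits) of any lower-priority class, which can
                 block class i once because service is non-preemptive.
    """
    sigma_i, _ = classes[i]
    higher = classes[:i]
    residual_rate = C - sum(rho for _, rho in higher)
    assert residual_rate > 0, "higher-priority load must leave spare capacity"
    # Rate-latency service curve: class i drains its own burst plus all
    # higher-priority bursts and one blocking packet at the residual rate.
    return (sigma_i + sum(sigma for sigma, _ in higher) + max_low_pkt) / residual_rate

# Example: 100 Mb/s link, two higher-priority classes ahead of class 2.
print(strict_priority_delay_bound(
    100e6, [(10e3, 20e6), (20e3, 30e6), (50e3, 10e6)], i=2, max_low_pkt=12e3))
```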
Article
This paper considers the potentially negative impacts of an increasing deployment of non-congestion-controlled best-effort traffic on the Internet. These negative impacts range from extreme unfairness against competing TCP traffic to the potential for congestion collapse. To promote the inclusion of end-to-end congestion control in the design of future protocols using best-effort traffic, we argue that router mechanisms are needed to identify and restrict the bandwidth of selected high-bandwidth best-effort flows in times of congestion. The paper discusses several general approaches for identifying those flows suitable for bandwidth regulation: a high-bandwidth flow in times of congestion can be identified as unresponsive, “not TCP-friendly”, or simply using disproportionate bandwidth. A flow that is not “TCP-friendly” is one whose long-term arrival rate exceeds that of any conformant TCP in the same circumstances. An unresponsive flow is one failing to reduce its offered load at a router in response to an increased packet drop rate, and a disproportionate-bandwidth flow is one that uses considerably more bandwidth than other flows in a time of congestion.
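The “TCP-friendly” test can be made concrete: Floyd and Fall bound a conformant TCP's long-term rate by roughly 1.22·MTU/(RTT·sqrt(p)) at steady-state drop rate p, so a flow persistently above this envelope is a candidate for regulation. The sketch applies that formula; the thresholds and units are illustrative.

```python
from math import sqrt

def tcp_friendly_rate(mtu_bytes, rtt_s, drop_rate):
    """Upper bound (bytes/s) on a conformant TCP's long-term arrival rate,
    per the 1.22*MTU/(RTT*sqrt(p)) rule used by Floyd and Fall."""
    assert 0 < drop_rate <= 1 and rtt_s > 0
    return 1.22 * mtu_bytes / (rtt_s * sqrt(drop_rate))

def is_tcp_friendly(measured_rate, mtu_bytes, rtt_s, drop_rate):
    """Flag a best-effort flow whose arrival rate exceeds the TCP envelope."""
    return measured_rate <= tcp_friendly_rate(mtu_bytes, rtt_s, drop_rate)

# Example: 1500-byte MTU, 100 ms RTT, 1% drops -> envelope of ~183 KB/s.
print(tcp_friendly_rate(1500, 0.1, 0.01))
```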
Article
Mobile apps increasingly rely on high-throughput and low-latency content delivery, while the available bandwidth on wireless access links is inherently time-varying. The handoffs between base stations and access modes due to user mobility present additional challenges to delivering a high level of user Quality-of-Experience (QoE). The ability to predict the available bandwidth and the upcoming handoffs would give applications valuable leeway to make proactive adjustments and avoid significant QoE degradation. In this paper, we explore the possibility and accuracy of real-time mobile bandwidth and handoff prediction in 4G/LTE and 5G networks. Towards this goal, we collect long consecutive traces with rich bandwidth, channel, and context information from public transportation systems. We develop Recurrent Neural Network models to mine the temporal patterns of bandwidth evolution in fixed-route mobility scenarios. Our models consistently outperform conventional univariate and multivariate bandwidth prediction models. For next-second bandwidth prediction, in terms of Mean Absolute Error (MAE), our model is on average 15.28% better than the other methods on 4G traces and 15.37% better on 5G traces. For networks where 4G and 5G coexist, we formulate a new problem of handoff prediction between 4G and 5G, which is important for achieving good application performance in realistic 5G scenarios. We develop classification- and regression-based prediction models, which achieve more than 80% accuracy in predicting handoffs between 4G and 5G in a recent 5G dataset.
Article
In this paper, we present a novel bandwidth management scheme that we call adaptive bandwidth binning (ABB). ABB is presented in the context of a DOCSIS cable network, but it has obvious applicability to downstream service on any shared medium access network in which all downstream traffic is scheduled by a single headend device such as a CMTS or wireless base station or access point. ABB is capable of providing approximate weighted max-min fair sharing of downstream bandwidth via a low-overhead scheduler that requires only a small number of permanently allocated queues. A modern delay-based active queue management (AQM) technique is employed to control delays. The performance of ABB is evaluated via ns-2 simulations in which workloads include FTP, HTTP-based adaptive streaming (HAS), and web traffic targeted for different tiered service quality levels. Our results show that ABB is able to provide approximate weighted max-min fair bandwidth allocation among responsive high bandwidth flows while isolating them from low bandwidth and latency sensitive flows. The use of CoDel in each ABB queue is effective in managing latency as required. The use of flow weights in ABB supports service tiering in which subscribers pay more for higher service rates.
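Weighted max-min fairness, the objective ABB approximates (and the one this paper's title refers to), can be computed by progressive filling: divide leftover capacity among unsatisfied flows in proportion to their weights, freeze any flow whose demand is met, and redistribute. A minimal sketch, not ABB's scheduler; the demand/weight inputs are illustrative.

```python
def weighted_max_min(capacity, demands, weights):
    """Weighted max-min fair allocation by progressive filling.

    demands[i]: flow i's demand; weights[i]: its weight. Bottlenecked flows
    end up splitting the leftover capacity in proportion to their weights."""
    n = len(demands)
    alloc = [0.0] * n
    active = set(range(n))            # flows whose demand is not yet met
    remaining = float(capacity)
    while active and remaining > 1e-12:
        w_total = sum(weights[i] for i in active)
        share = remaining / w_total   # fair share per unit weight this round
        satisfied = {i for i in active
                     if demands[i] - alloc[i] <= share * weights[i]}
        if not satisfied:
            # No flow saturates: everyone takes its weighted share; done.
            for i in active:
                alloc[i] += share * weights[i]
            break
        # Satisfy bottlenecked flows exactly, then redistribute the rest.
        for i in satisfied:
            remaining -= demands[i] - alloc[i]
            alloc[i] = demands[i]
            active.remove(i)
    return alloc

# Example: 10 units across demands (8, 4, 2) with weights (2, 1, 1)
print(weighted_max_min(10, [8, 4, 2], [2, 1, 1]))  # ~[5.33, 2.67, 2.0]
```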
Article
Human-to-machine (H2M) communications in emerging tactile-haptic applications are characterized by stringent low-latency transmission requirements. To achieve low-latency transmission over existing optical and wireless access networks, this paper proposes a machine learning-based predictive dynamic bandwidth allocation (DBA) algorithm, termed MLP-DBA, to address the uplink bandwidth contention and latency bottleneck of such networks. The proposed algorithm utilizes an artificial neural network (ANN) at the central office (CO) to predict H2M packet bursts arriving at each optical network unit wireless access point (ONU-AP), thereby enabling the uplink bandwidth demand of each ONU-AP to be estimated. As such, arriving packet bursts at the ONU-APs can be allocated bandwidth for transmission by the CO without having to wait for the following transmission cycles. Extensive simulations show that the ANN-based prediction of H2M packet bursts achieves > 90% accuracy, significantly improving bandwidth demand estimation over existing prediction algorithms. MLP-DBA also makes adaptive bandwidth allocation decisions by classifying each ONU-AP according to its estimated bandwidth, with results showing reduced uplink latency and packet drop ratio compared to conventional predictive DBA algorithms.
Conference Paper
Cloud applications generate a mix of flows with and without deadlines. Scheduling such mix-flows is a key challenge; our experiments show that trivially combining existing schemes for deadline/non-deadline flows is problematic. For example, prioritizing deadline flows hurts flow completion time (FCT) for non-deadline flows, with only minor improvement in deadline miss rate. We present Karuna, the first systematic solution for scheduling mix-flows. Our key insight is that deadline flows should meet their deadlines while minimally impacting the FCT of non-deadline flows. To achieve this goal, we design a novel Minimal-impact Congestion control Protocol (MCP) that handles deadline flows with as little bandwidth as possible. For non-deadline flows, we extend an existing FCT minimization scheme to schedule flows with known and unknown sizes. Karuna requires no switch modifications and is backward compatible with legacy TCP/IP stacks. Our testbed experiments and simulations show that Karuna effectively schedules mix-flows, for example, reducing the 95th percentile FCT of non-deadline flows by up to 47.78% at high load compared to pFabric, while maintaining a low (<5.8%) deadline miss rate.
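The guiding idea — give each deadline flow only the bandwidth it needs — reduces to pacing it at its minimum sustainable rate. A hedged sketch of that pacing rule (not MCP itself, which additionally reacts to congestion signals):

```python
def min_deadline_rate(remaining_bytes, now_s, deadline_s):
    """Smallest sending rate (bytes/s) that still meets the deadline.
    Pacing a deadline flow at this rate leaves the most bandwidth for
    FCT-sensitive non-deadline flows."""
    slack = deadline_s - now_s
    if slack <= 0:
        return float("inf")  # deadline already passed; needs immediate service
    return remaining_bytes / slack
```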
Article
The problem of allocating network resources to the users of an integrated services network is investigated in the context of rate-based flow control. The network is assumed to be a virtual circuit, connection-based packet network. It is shown that the use of generalized processor sharing (GPS), when combined with leaky bucket admission control, allows the network to make a wide range of worst-case performance guarantees on throughput and delay. The scheme is flexible in that different users may be given widely different performance guarantees, and efficient in that each of the servers is work conserving. The authors present a practical packet-by-packet service discipline, PGPS, that closely approximates GPS. This allows them to relate results for GPS to the packet-by-packet scheme in a precise manner. The performance of a single-server GPS system is analyzed exactly from the standpoint of worst-case packet delay and burstiness when the sources are constrained by leaky buckets. The worst-case session backlogs are also determined.
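In packet form, each arrival receives a virtual finish tag and the scheduler always transmits the packet with the smallest tag. Tracking exact GPS virtual time is involved, so the sketch below uses the simpler self-clocked (SCFQ-style) approximation, in which the tag of the packet in service stands in for virtual time; the class layout is illustrative, not PGPS itself.

```python
import heapq

class FairQueue:
    """Packet-by-packet fair queueing with SCFQ-style virtual time
    (a simplified stand-in for PGPS/WFQ's exact GPS virtual-time tracking)."""

    def __init__(self):
        self.vtime = 0.0          # virtual time = tag of packet in service
        self.last_finish = {}     # per-flow finish tag of its latest packet
        self.heap = []            # (finish_tag, seq, flow, length)
        self.seq = 0              # tie-breaker for equal tags

    def enqueue(self, flow, length, weight):
        # F_i^k = max(F_i^{k-1}, V) + L / phi_i
        start = max(self.last_finish.get(flow, 0.0), self.vtime)
        finish = start + length / weight
        self.last_finish[flow] = finish
        heapq.heappush(self.heap, (finish, self.seq, flow, length))
        self.seq += 1

    def dequeue(self):
        # Serve the packet with the smallest virtual finish tag.
        finish, _, flow, length = heapq.heappop(self.heap)
        self.vtime = finish
        return flow, length
```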
Conference Paper
For provisioning large-scale online applications such as web search, social networks, and advertisement systems, data centers face extreme challenges in providing low latency for short flows (that result from end-user actions) and high throughput for background flows (that are needed to maintain data consistency and structure across massively distributed systems). We propose L2DCT, a practical data center transport protocol that targets a reduction in flow completion times for short flows by approximating the Least Attained Service (LAS) scheduling discipline, without requiring any changes in application software or router hardware, and without adversely affecting the long flows. L2DCT can co-exist with TCP and works by adapting flow rates to the extent of network congestion inferred via Explicit Congestion Notification (ECN) marking, a feature widely supported by the installed router base. Though L2DCT is deadline-unaware, our results indicate that, for typical data center traffic patterns and deadlines and over a wide range of traffic load, its deadline miss rate is consistently smaller than that of existing deadline-driven data center transport protocols. L2DCT reduces the mean flow completion time by up to 50% over DCTCP and by up to 95% over TCP. In addition, it reduces the completion time of 99th percentile flows by 37% over DCTCP. We present the design and analysis of L2DCT, evaluate its performance, and discuss an implementation built upon the standard Linux protocol stack.
Article
The growing use of computers for mechanized inventory control and production planning has brought with it the need for explicit forecasts of sales and usage for individual products and materials. These forecasts must be made on a routine basis for thousands of products, so that they must be made quickly, and, both in terms of computing time and information storage, cheaply; they should be responsive to changing conditions. The paper presents a method of forecasting sales which has these desirable characteristics, and which in terms of ability to forecast compares favorably with other, more traditional methods. Several models of the exponential forecasting system are presented, along with several examples of application.
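The core of the method is simple exponential smoothing: the new forecast is a weighted blend of the latest observation and the previous forecast, so it is cheap in both computation and storage (one state value per product). A minimal sketch with an assumed smoothing constant:

```python
def exp_smooth(series, alpha=0.3):
    """Simple exponential smoothing: s_t = alpha*x_t + (1-alpha)*s_{t-1}.
    Returns the one-step-ahead forecast after consuming the series;
    only a single state value per product needs to be stored."""
    s = series[0]
    for x in series[1:]:
        s = alpha * x + (1 - alpha) * s
    return s

print(exp_smooth([10, 12, 11, 13, 14]))  # forecast for the next period
```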
Article
Previous studies of Internet traffic have shown that a very small percentage of flows consume most of the network bandwidth. It is important to understand the characteristics of such flows for traffic monitoring and modeling purposes. Several prior researchers have characterized such flows using different classification schemes: by size as elephant and mice; by duration as tortoise and dragonfly; and by burstiness as alpha and beta traffic. However, it is not clear how these different definitions of flows are related to each other. In this work, using data recorded from two different operational networks, we study these “heavy-hitter” flows in four different dimensions, namely size, duration, rate and burstiness, and examine how they are correlated. This paper makes three contributions: first, we systematically characterize prior definitions for the properties of such heavy-hitter traffic. Second, based on our datasets, we observe that there are strong correlations between some combinations of size, rate and burstiness. Finally, we provide a plausible explanation for the observed correlations. We show that these correlations could be explained by transport and application-level protocol mechanisms.
Article
The paper provides a systematic development of the forecasting expressions for exponential weighted moving averages. Methods for series with no trend, or additive or multiplicative trend are examined. Similarly, the methods cover non-seasonal, and seasonal series with additive or multiplicative error structures. The paper is a reprinted version of the 1957 report to the Office of Naval Research (ONR 52) and is being published here to provide greater accessibility. (C) 2004 Published by Elsevier B.V. on behalf of International Institute of Forecasters.
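The additive-trend case the paper develops extends the single smoothed level above with a second, trend state (Holt's linear method). A hedged sketch with illustrative parameter values and initialization:

```python
def holt_linear(series, alpha=0.3, beta=0.1):
    """Holt's linear method: exponentially smooth a level and an additive trend.
      level_t = alpha*x_t + (1-alpha)*(level + trend)
      trend_t = beta*(level_t - level) + (1-beta)*trend
    Returns the one-step-ahead forecast, level + trend."""
    level, trend = series[0], series[1] - series[0]
    for x in series[1:]:
        prev_level = level
        level = alpha * x + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + trend

print(holt_linear([10, 12, 13, 15, 16]))  # next-period forecast with trend
```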
Conference Paper
Fairness is an essential requirement of any operating system scheduler. Unfortunately, existing fair scheduling algorithms are either inaccurate or inefficient and non-scalable for multiprocessors. This problem is becoming increasingly severe as the hardware industry continues to produce larger scale multi-core processors. This paper presents Distributed Weighted Round-Robin (DWRR), a new scheduling algorithm that solves this problem. With distributed thread queues and small additional overhead to the underlying scheduler, DWRR achieves high efficiency and scalability. Besides conventional priorities, DWRR enables users to specify weights for threads and achieves accurate proportional CPU sharing with constant error bounds. DWRR operates in concert with existing scheduler policies targeting other system attributes, such as latency and throughput. As a result, it provides a practical solution for various production OSes. To demonstrate the versatility of DWRR, we have implemented it in Linux kernels 2.6.22.15 and 2.6.24, which represent two vastly different scheduler designs. Our evaluation shows that DWRR achieves accurate proportional fairness and high performance for a diverse set of workloads.
Conference Paper
Although there is tremendous interest in designing improved networks for data centers, very little is known about the network-level traffic characteristics of today's data centers. In this paper, we conduct an empirical study of the network traffic in 10 data centers belonging to three different categories: university, enterprise campus, and cloud data centers. Our definition of cloud data centers includes not only data centers employed by large online service providers offering Internet-facing applications but also data centers used to host data-intensive (MapReduce style) applications. We collect and analyze SNMP statistics, topology, and packet-level traces. We examine the range of applications deployed in these data centers and their placement, the flow-level and packet-level transmission properties of these applications, and their impact on network and link utilizations, congestion, and packet drops. We describe the implications of the observed traffic patterns for data center internal traffic engineering as well as for recently proposed architectures for data center networks.
Conference Paper
Fair queuing is a technique that allows each flow passing through a network device to have a fair share of network resources. Previous schemes for fair queuing that achieved nearly perfect fairness were expensive to implement: specifically, the work required to process a packet in these schemes was O(log(n)), where n is the number of active flows. This is expensive at high speeds. On the other hand, cheaper approximations of fair queuing that have been reported in the literature exhibit unfair behavior. In this paper, we describe a new approximation of fair queuing, that we call Deficit Round Robin. Our scheme achieves nearly perfect fairness in terms of throughput, requires only O(1) work to process a packet, and is simple enough to implement in hardware. Deficit Round Robin is also applicable to other scheduling problems where servicing cannot be broken up into smaller units, and to distributed queues.
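Deficit Round Robin is simple enough to state in a few lines: each active flow carries a deficit counter that is topped up by a quantum every round and spent on dequeued bytes, giving O(1) work per packet. A minimal sketch, assuming per-flow FIFO queues and illustrative quanta:

```python
from collections import deque

def drr_schedule(queues, quanta, rounds):
    """Deficit Round Robin over per-flow FIFO queues of packet sizes.
    queues: {flow: deque of packet lengths}; quanta: {flow: quantum in bytes}.
    Returns the transmission order as (flow, length) pairs."""
    deficit = {f: 0 for f in queues}
    sent = []
    for _ in range(rounds):
        for f, q in queues.items():
            if not q:
                deficit[f] = 0           # idle flows accumulate no credit
                continue
            deficit[f] += quanta[f]      # one quantum of credit per round
            # Send head-of-line packets while the credit covers them.
            while q and q[0] <= deficit[f]:
                pkt = q.popleft()
                deficit[f] -= pkt
                sent.append((f, pkt))
    return sent

order = drr_schedule({"A": deque([300, 300]), "B": deque([900])},
                     {"A": 300, "B": 300}, rounds=3)
print(order)  # B accumulates credit across rounds to send its large packet
```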
Article
Technical report by Abhay K. Parekh and Robert G. Gallager; research supported by a Vinton Hayes Fellowship. Includes bibliographical references (pp. 47-48).
TCP congestion control
  • M Allman
  • V Paxson
  • E Blanton
Elephant flow detection and load-balanced routing with efficient sampling and classification
  • F Tang
  • H Zhang
  • L T Yang
  • L Chen
Approximating fair queueing on reconfigurable switches
  • N K Sharma
  • M Liu
  • K Atreya
  • A Krishnamurthy