Conference Paper

# Online VNF Chaining and Scheduling with Prediction: Optimality and Trade-Offs

Authors:
• Shenzhen Institute of Artificial Intelligence and Robotics for Society

... Ref. [55] suggested that earlier performance prediction frameworks perform poorly on contemporary architectures and NFVs because they consider memory as a monolithic unit and ignore the fact that the memory subsystem has several components that might individually create congestion. An adjustable trade-off between numerous system metrics is achieved with POSCARS [56], an efficient, distributed, and online method that uses predictive scheduling to maintain stability. POSCARS presents three variants that employ randomized load balancing to reduce the sampling overhead. ...
Article
Network function virtualization (NFV) enables network operators to save costs and gain flexibility by replacing dedicated hardware with software network functions running on commodity servers. There is a high need for network acceleration to achieve performance comparable to hardware, which is vital for the implementation of NFV. The necessity of NFV acceleration stems from the lengthy packet delivery path following virtualization and the unavailability of generic operating system designs to serve network-specific scenarios. Therefore, acceleration approaches either alter the operating system’s processing architecture through kernel bypass or offload packet processing to hardware. A typical classification scheme divides these approaches into two main categories by technology: software and hardware. These two categories alone make it quick and easy to establish a classification system, but they make it difficult to convey the specifics and peculiarities of any acceleration approach during real-world operation. For a more comprehensive classification of NFV acceleration, we refer to the ETSI NFV architectural framework in this research. As the framework clearly illustrates, the technical infrastructure layer of NFV and the corresponding management roles provide a comprehensive and intuitive view of the differences between these acceleration technologies, solutions, and initiatives. Additionally, we conducted an analysis to identify opportunities for improvement in existing solutions and propose new research programs. We expect that NFV will increasingly rely on cloud services in the future. Since cloud services do not offer a choice of hardware, our acceleration method will be primarily software-based.
... However, with the gradual deepening of VNF-FGE research, it has become clear that these stages are related and that better embedding performance can be obtained by considering them together. At present, some works [76][77][78][79] have been devoted to solving the VNF-FGE problem with VNF-CC or VNF-SCH coordination. These works can achieve good results. ...
Article
With the development of heterogeneous network structures, dynamic user requests, and complex service types and application scenarios, current networks may not accommodate the increasingly stringent requirements. As a result, research on beyond fifth generation (B5G) or sixth generation (6G) networks has been put on the agenda. In B5G/6G networks, achieving automatic, flexible, and cost-effective orchestration and management of network resources is a significant but challenging issue. Network function virtualization (NFV), as a promising paradigm to address this issue, has received considerable attention from both industry and academia. NFV leverages virtualization technology to decouple network functions from dedicated hardware appliances and run them as software middleboxes, also called virtual network functions (VNFs), on commodity servers. The demand for a network service becomes a request for running a set of VNFs deployed on the substrate network. The requested network service is orchestrated in the form of a VNF-forwarding graph (VNF-FG). The problem of embedding the VNF-FG into the substrate network is known as VNF-FG embedding (VNF-FGE). The efficiency and the management cost of a network are highly dependent on the optimization of VNF-FGE. This paper mainly presents a survey on solving the VNF-FGE problem. To this end, we present a general formulation and several objectives of the VNF-FGE problem. Meanwhile, we summarize its different application scenarios from four perspectives and divide the approaches into four main categories based on the optimization methods. The main challenges and potential future directions arising from B5G/6G are also discussed.
Article
For NFV systems, the key design space includes the function chaining for network requests and the resource scheduling for servers. The problem is challenging since NFV systems usually require multiple (often conflicting) design objectives and the computational efficiency of real-time decision making with limited information. Furthermore, the benefits of predictive scheduling to NFV systems still remain unexplored. In this article, we propose POSCARS, an efficient predictive and online service chaining and resource scheduling scheme that achieves tunable trade-offs among various system metrics with stability guarantee. Through a careful choice of granularity in system modeling, we acquire a better understanding of the trade-offs in our design space. By a non-trivial transformation, we decouple the complex optimization problem into a series of online sub-problems to achieve the optimality with only limited information. By employing randomized load balancing techniques, we propose three variants of POSCARS to reduce the overheads of decision making. Theoretical analysis and simulations show that POSCARS and its variants require only mild-value of future information to achieve near-optimal system cost with an ultra-low request response time.
Article
The emergence of Network Functions Virtualisation (NFV) is drastically reshaping the arrangement of network functions. Instead of being built on dedicated hardware (network appliances), network functions are now implemented as software components that run on top of general purpose hardware through virtualisation, namely virtualised network functions (VNFs). From this paradigm-shifting technology arise two problems: (i) how to place VNFs in an NFV-enabled network; and (ii) how to chain these VNFs. These problems are jointly referred to as the VNF forwarding graph embedding (VNF-FGE) problem. Having efficient solutions to the VNF-FGE problem is key to the success of NFV because placing and chaining VNFs automatically and efficiently reduces network and computing resources, thus reducing capital expenditure (CAPEX) and operating expenditure (OPEX). In this work, we systematically review the literature on the VNF-FGE problem. We present a novel taxonomy for the classification and study of proposed solutions to this problem. Research challenges that remain unaddressed are also discussed, providing recommendations for future work.
Article
For wireless caching networks, the scheme design for content delivery is non-trivial in the face of the following tradeoff. On one hand, to optimize overall throughput, users can associate their nearby APs with great channel capacities; however, this may lead to unstable queue backlogs on APs and prolong request delays. On the other hand, to ensure queue stability, some users may have to associate APs with inferior channel states, which would incur throughput loss. Moreover, for such systems, how to conduct predictive scheduling to reduce delays and the fundamental limits of its benefits remain unexplored. In this paper, we formulate the problem of online user-AP association and resource allocation for content delivery with predictive scheduling under a fixed content placement as a stochastic network optimization problem. By exploiting its unique structure, we transform the problem into a series of modular maximization sub-problems with matroid constraints. Then we devise PUARA, a Predictive User-AP Association and Resource Allocation scheme which achieves a provably near-optimal throughput with queue stability. Our theoretical analysis and simulation results show that PUARA can not only perform a tunable control between throughput maximization and queue stability, but also incur a notable delay reduction with predicted information.
Article
Most online service providers deploy their own data stream processing systems in the cloud to conduct large-scale and real-time data analytics. However, such systems, e.g., Apache Heron, often adopt naive scheduling schemes to distribute data streams (in the units of tuples) among processing instances, which may result in workload imbalance and system disruption. Hence, there still exists a mismatch between the temporal variations of data streams and such inflexible scheduling scheme designs. Besides, the fundamental benefits of predictive scheduling to data stream processing systems still remain unexplored. In this paper, we focus on the problem of tuple scheduling with predictive service in Apache Heron. With a careful choice in the granularity of system modeling and decision making, we formulate the problem as a stochastic network optimization problem and propose POTUS, an online predictive scheduling scheme that aims to minimize the response time of data stream processing by steering data streams in a distributed fashion. Theoretical analysis and simulation results show that POTUS achieves an ultra-low response time with stability guarantee. Moreover, POTUS only requires mild-value of future information to further reduce the response time, even with mis-prediction.
Preprint
For wireless caching networks, the scheme design for content delivery is non-trivial in the face of the following tradeoff. On one hand, to optimize overall throughput, users can associate their nearby APs with great channel capacities; however, this may lead to unstable queue backlogs on APs and prolong request delays. On the other hand, to ensure queue stability, some users may have to associate APs with inferior channel states, which would incur throughput loss. Moreover, for such systems, how to conduct predictive scheduling to reduce delays and the fundamental limits of its benefits remain unexplored. In this paper, we formulate the problem of online user-AP association and resource allocation for content delivery with predictive scheduling under a fixed content placement as a stochastic network optimization problem. By exploiting its unique structure, we transform the problem into a series of modular maximization sub-problems with matroid constraints. Then we devise PUARA, a Predictive User-AP Association and Resource Allocation scheme which achieves a provably near-optimal throughput with queue stability. Our theoretical analysis and simulation results show that PUARA can not only perform a tunable control between throughput maximization and queue stability but also incur a notable delay reduction with predicted information.
Preprint
Most online service providers deploy their own data stream processing systems in the cloud to conduct large-scale and real-time data analytics. However, such systems, e.g., Apache Heron, often adopt naive scheduling schemes to distribute data streams (in the units of tuples) among processing instances, which may result in workload imbalance and system disruption. Hence, there still exists a mismatch between the temporal variations of data streams and such inflexible scheduling scheme designs. Besides, the fundamental benefits of predictive scheduling to data stream processing systems also remain unexplored. In this paper, we focus on the problem of tuple scheduling with predictive service in Apache Heron. With a careful choice in the granularity of system modeling and decision making, we formulate the problem as a stochastic network optimization problem and propose POTUS, an online predictive scheduling scheme that aims to minimize the response time of data stream processing by steering data streams in a distributed fashion. Theoretical analysis and simulation results show that POTUS achieves an ultra-low response time with queue stability guarantee. Moreover, POTUS only requires mild-value of future information to effectively reduce the response time, even with mis-prediction.
Conference Paper
In this paper we present Metron, a Network Functions Virtualization (NFV) platform that achieves high resource utilization by jointly exploiting the underlying network and commodity servers' resources. This synergy allows Metron to: (i) offload part of the packet processing logic to the network, (ii) use smart tagging to setup and exploit the affinity of traffic classes, and (iii) use tag-based hardware dispatching to carry out the remaining packet processing at the speed of the servers' fastest cache(s), with zero inter-core communication. Metron also introduces a novel resource allocation scheme that minimizes the resource allocation overhead for large-scale NFV deployments. With commodity hardware assistance, Metron deeply inspects traffic at 40 Gbps and realizes stateful network functions at the speed of a 100 GbE network card on a single server. Metron has 2.75-6.5x better efficiency than OpenBox, a state of the art NFV system, while ensuring key requirements such as elasticity, fine-grained load balancing, and flexible traffic steering.
Article
Network function virtualization (NFV) has become a new paradigm for network architectures. By migrating network functions (NF) from dedicated hardware to virtualization platforms, NFV can effectively improve the flexibility to deploy and manage service function chains (SFC). However, resource allocation for requested SFC in NFV-based infrastructures is not trivial as it mainly consists of three phases: virtual network functions (VNF) chain composition, VNFs forwarding graph embedding and VNFs scheduling. The decisions in these three phases can be mutually dependent, which also makes it a tough task. Therefore, a coordinated approach is studied in this paper to jointly optimize NFV resource allocation in these three phases. We apply a general cost model to consider both network costs and service performance. The coordinated NFV-RA is formulated as a mixed-integer linear programming (MILP) problem, and a heuristic-based algorithm (JoraNFV) is proposed to get the near optimal solution. To make the coordinated NFV-RA more tractable, JoraNFV is divided into two sub-algorithms, one-hop optimal traffic scheduling and a multi-path greedy algorithm for VNF chain composition and VNF forwarding graph embedding. Lastly, extensive simulations are performed to evaluate the performance of JoraNFV; the results show that JoraNFV can get a solution within 1.25 times of the optimal solution with reasonable execution time, which indicates that JoraNFV can be used for on-line NFV planning.
Article
Network Function Virtualization (NFV) has drawn significant attention from both industry and academia as an important shift in telecommunication service provisioning. By decoupling Network Functions (NFs) from the physical devices on which they run, NFV has the potential to lead to significant reductions in Operating Expenses (OPEX) and Capital Expenses (CAPEX) and facilitate the deployment of new services with increased agility and faster time-to-value. The NFV paradigm is still in its infancy and there is a large spectrum of opportunities for the research community to develop new architectures, systems and applications, and to evaluate alternatives and trade-offs in developing technologies for its successful deployment. In this paper, after discussing NFV and its relationship with complementary fields of Software Defined Networking (SDN) and cloud computing, we survey the state-of-the-art in NFV, and identify promising research directions in this area. We also overview key NFV projects, standardization efforts, early implementations, use cases and commercial products.
Conference Paper
The virtualization and softwarization of modern computer networks enables the definition and fast deployment of novel network services called service chains: sequences of virtualized network functions (e.g., firewalls, caches, traffic optimizers) through which traffic is routed between source and destination. This paper attends to the problem of admitting and embedding a maximum number of service chains, i.e., a maximum number of source-destination pairs which are routed via a sequence of to-be-allocated, capacitated network functions. We consider an Online variant of this maximum Service Chain Embedding Problem, short OSCEP, where requests arrive over time, in a worst-case manner. Our main contribution is a deterministic O(log L)-competitive online algorithm, under the assumption that capacities are at least logarithmic in L. We show that this is asymptotically optimal within the class of deterministic and randomized online algorithms. We also explore lower bounds for offline approximation algorithms, and prove that the offline problem is APX-hard for unit capacities and small L > 2, and even Poly-APX-hard in general, when there is no bound on L. These approximation lower bounds may be of independent interest, as they also extend to other problems such as Virtual Circuit Routing. Finally, we present an exact algorithm based on 0-1 programming, implying that the general offline SCEP is in NP and by the above hardness results it is NP-complete for constant L.
Article
Today's data centers need efficient traffic management to improve resource utilization in their networks. In this work, we study a joint tenant (e.g., server or virtual machine) placement and routing problem to minimize traffic costs. These two complementary degrees of freedom, placement and routing, are mutually dependent but are often optimized separately in today's data centers. Leveraging and expanding the technique of Markov approximation, we propose an efficient online algorithm in a dynamic environment under changing traffic loads. The algorithm requires a very small number of virtual machine migrations and is easy to implement in practice. Performance evaluation that employs real data center traffic traces under a spectrum of elephant and mice flows demonstrates a consistent and significant improvement over the benchmark achieved by common heuristics.
Conference Paper
We explore the nature of traffic in data centers, designed to support the mining of massive data sets. We instrument the servers to collect socket-level logs, with negligible performance impact. In a 1500 server operational cluster, we thus amass roughly a petabyte of measurements over two months, from which we obtain and report detailed views of traffic and congestion conditions and patterns. We further consider whether traffic matrices in the cluster might be obtained instead via tomographic inference from coarser-grained counter data.
Conference Paper
In software-defined networking (SDN) systems, the scalability and reliability of the control plane still remain major concerns. Existing solutions adopt either multi-controller designs or control devolution back to the data plane. The former requires a flexible yet efficient switch-controller association mechanism to adapt to workload changes and potential failures, while the latter demands timely decision making with low overheads. An integrated design for both is even more challenging. Meanwhile, the dramatic advancement in machine learning techniques has boosted the practice of predictive scheduling to improve the responsiveness in various systems. Nonetheless, so far little work has been conducted for SDN systems. In this paper, we study the joint problem of dynamic switch-controller association and control devolution, while investigating the benefits of predictive scheduling in SDN systems. We propose POSCAD, an efficient, online, and distributed scheme that exploits predictive future information to minimize the total system cost and the average request response time with queueing stability guarantee. Theoretical analysis and trace-driven simulation results show that POSCAD requires only mild-value of future information to achieve a near-optimal system cost and near-zero average request response time. Further, POSCAD is robust against mis-prediction to reduce the average request response time.
Conference Paper
Network function virtualization (NFV) can significantly reduce the operation cost and speed up the deployment for network services to markets. Under NFV, a network service is composed by a chain of ordered virtual functions, or we call a "network function chain." A fundamental question is when given a number of network function chains, on which servers should we place these functions and how should we form a chain on these functions? This is challenging due to the intricate dependency relationship of functions and the intrinsic complex nature of the optimization. In this paper, we formulate the function placement and chaining problem as an integer optimization, where each variable is an indicator whether one service chain can be deployed on a configuration (or a possible function placement of a service chain). While this problem is generally NP-hard, our contribution is to show that it can be mapped to an exponential number of min-cost flow problems. Instead of solving all the min-cost problems, one can select a small number of mapped min-cost problems, which are likely to have a low cost. To achieve this, we relax the integer problem into a fractional linear problem, and theoretically prove that the fractional solutions possess some desirable properties, i.e., the number and the utilization of selected configurations can be upper and lower bounded, respectively. Based on such properties, we determine some "good" configurations selected from the fractional solution and determine the mapped min-cost flow problem, and this helps us to develop efficient algorithms for network function placement and chaining. Via extensive simulations, we show that our algorithms significantly outperform state-of-art algorithms and achieve near-optimal performance.
Article
This new edition presents a thorough discussion of the mathematical theory and computational schemes of Kalman filtering. The filtering algorithms are derived via different approaches, including a direct method consisting of a series of elementary steps, and an indirect method based on innovation projection. Other topics include Kalman filtering for systems with correlated noise or colored noise, limiting Kalman filtering for time-invariant systems, extended Kalman filtering for nonlinear systems, interval Kalman filtering for uncertain systems, and wavelet Kalman filtering for multiresolution analysis of random signals. Most filtering algorithms are illustrated by using simplified radar tracking examples. The style of the book is informal, and the mathematics is elementary but rigorous. The text is self-contained, suitable for self-study, and accessible to all readers with a minimum knowledge of linear algebra, probability theory, and system engineering. Over 100 exercises and problems with solutions help deepen the knowledge. This new edition has a new chapter on filtering communication networks and data processing, together with new exercises and new real-time applications. © 2009, 1999, 1991, 1987 Springer-Verlag Berlin Heidelberg and Springer International Publishing AG 2017.
Article
Network Functions Virtualization (NFV) has enabled operators to dynamically place and allocate resources for network services to match workload requirements. However, unbounded end-to-end (e2e) latency of Service Function Chains (SFCs) resulting from distributed Virtualized Network Function (VNF) deployments can severely degrade performance. In particular, SFC instantiations with inter-data center links can incur high e2e latencies and Service Level Agreement (SLA) violations. These latencies can trigger timeouts and protocol errors with latency-sensitive operations. Traditional solutions to reduce e2e latency involve physical deployment of service elements in close proximity. These solutions are, however, no longer viable in the NFV era. In this paper, we present our solution that bounds the e2e latency in SFCs and inter-VNF control message exchanges by creating micro-service aggregates based on the affinity between VNFs. Our system, Contain-ed, dynamically creates and manages affinity aggregates using light-weight virtualization technologies like containers, allowing them to be placed in close proximity and hence bounding the e2e latency. We have applied Contain-ed to the Clearwater IP Multimedia Subsystem and built a proof-of-concept. Our results demonstrate that, by utilizing application and protocol specific knowledge, affinity aggregates can effectively bound SFC delays and significantly reduce protocol errors and service disruptions.
Article
Forecasting is a common data science task that helps organizations with capacity planning, goal setting, and anomaly detection. Despite its importance, there are serious challenges associated with producing reliable and high quality forecasts – especially when there are a variety of time series and analysts with expertise in time series modeling are relatively rare. To address these challenges, we describe a practical approach to forecasting “at scale” that combines configurable models with analyst-in-the-loop performance analysis. We propose a modular regression model with interpretable parameters that can be intuitively adjusted by analysts with domain knowledge about the time series. We describe performance analyses to compare and evaluate forecasting procedures, and automatically flag forecasts for manual review and adjustment. Tools that help analysts to use their expertise most effectively enable reliable, practical forecasting of business time series.
Article
In many computing and networking applications, arriving tasks have to be routed to one of many servers, with the goal of minimizing queueing delays. When the number of processors is very large, a popular routing algorithm works as follows: select two servers at random and route an arriving task to the least loaded of the two. It is well known that this algorithm dramatically reduces queueing delays compared to an algorithm, which routes to a single randomly selected server. In recent cloud computing applications, it has been observed that even sampling two queues per arriving task can be expensive and can even increase delays due to messaging overhead. So there is an interest in reducing the number of sampled queues per arriving task. In this paper, we show that the number of sampled queues can be dramatically reduced by using the fact that tasks arrive in batches (called jobs). In particular, we sample a subset of the queues such that the size of the subset is slightly larger than the batch size (thus, on average, we only sample slightly more than one queue per task). Once a random subset of the queues is sampled, we propose a new load-balancing method called batch-filling to attempt to equalize the load among the sampled servers. We show that, asymptotically, our algorithm dramatically reduces the sample complexity compared to previously proposed algorithms.
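The baseline routing rule described in this abstract (sample two servers at random, route to the less loaded one) can be illustrated with a minimal sketch. This is not the batch-filling method the paper proposes. It is a simplified static model in which tasks never depart, and the function name `route_tasks`, the server and task counts, and the seed are illustrative assumptions.

```python
import random


def route_tasks(num_servers, num_tasks, choices, seed=0):
    """Route tasks one at a time (hypothetical helper, not from the paper).

    Each task samples `choices` servers uniformly at random (with
    replacement) and joins the least loaded server in the sample.
    Tasks never depart, so this is a static balls-into-bins model.
    """
    rng = random.Random(seed)
    loads = [0] * num_servers
    for _ in range(num_tasks):
        sampled = [rng.randrange(num_servers) for _ in range(choices)]
        target = min(sampled, key=lambda s: loads[s])
        loads[target] += 1
    return loads


# choices=1 is purely random routing; choices=2 is the two-sample rule.
one = route_tasks(10_000, 10_000, choices=1)
two = route_tasks(10_000, 10_000, choices=2)
print("max load, one random server:", max(one))
print("max load, best of two:", max(two))
```

In this static setting the maximum load stands in for queueing delay: sampling a second server typically lowers the worst-case backlog, at the cost of one extra load query per task, which is the messaging overhead the abstract refers to.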
Conference Paper
This manuscript investigates the issue of implementing chains of network functions in a “softwarized” environment where edge network middle-boxes are replaced by software appliances running in virtual machines within a data center. The primary goal is to show that this approach allows space and time diversity in service chaining, with a higher degree of dynamism and flexibility with respect to conventional hardware-based architectures. The manuscript describes implementation alternatives of the virtual function chaining in a SDN scenario, showing that both layer 2 and layer 3 approaches are functionally viable. A proof-of-concept implementation with the Mininet emulation platform is then presented to provide a practical example of the feasibility and degree of complexity of such approaches.
Conference Paper
Network resource virtualization has recently emerged as the future of communication technology, and the advent of Software Defined Networking (SDN) and Network Function Virtualization (NFV) enables its realization. NFV virtualizes traditional physical middle-boxes that implement specific network functions. Since multiple network functions can be virtualized in a single server or data center, the network operator can save Capital Expenditure (CAPEX) and Operational Expenditure (OPEX) through NFV. Since each customer demands different types of VNFs with various applications, the service requirements are different for all VNFs. Therefore, allocating multiple Virtual Network Functions (VNFs) to limited network resources requires efficient resource allocation. We propose an efficient resource allocation strategy of VNFs in a single server by employing a mixed queuing network model while minimizing the customers' waiting time in the system. The problem is formulated as a convex problem. However, this problem cannot be solved directly because of the closed queuing network calculation, so we use an approximation algorithm to solve it. Numerical results of this model show the performance metrics of the mixed queuing network. We also find that the approximation algorithm yields a close-to-optimal solution by comparing its solutions with neighbor solutions.
Article
Diverse proprietary network appliances increase both the capital and operational expense of service providers, meanwhile causing problems of network ossification. Network function virtualization (NFV) is proposed to address these issues by implementing network functions as pure software on commodity and general hardware. NFV allows flexible provisioning, deployment, and centralized management of virtual network functions. Integrated with SDN, the software-defined NFV architecture further offers agile traffic steering and joint optimization of network functions and resources. This architecture benefits a wide range of applications (e.g., service chaining) and is becoming the dominant form of NFV. In this survey, we present a thorough investigation of the development of NFV under the software-defined NFV architecture, with an emphasis on service chaining as its application. We first introduce the software-defined NFV architecture as the state of the art of NFV and present relationships between NFV and SDN. Then, we provide a historic view of the involvement from middlebox to NFV. Finally, we introduce significant challenges and relevant solutions of NFV, and discuss its future research directions by different application domains.
Article
The integration of network function virtualization (NFV) and software defined networks (SDN) seeks to create a more flexible and dynamic software-based network environment. The line between entities involved in forwarding and those involved in more complex middle box functionality in the network is blurred by the use of high-performance virtualized platforms capable of performing these functions. A key problem is how and where network functions should be placed in the network and how traffic is routed through them. An efficient placement and appropriate routing increases system capacity while also minimizing the delay seen by flows. In this paper, we formulate the problem of network function placement and routing as a mixed integer linear programming problem. This formulation not only determines the placement of services and routing of the flows, but also seeks to minimize the resource utilization. We develop heuristics to solve the problem incrementally, allowing us to support a large number of flows and to solve the problem for incoming flows without impacting existing flows.
Article
In online service systems, the delay experienced by a user from the service request to the service completion is one of the most critical performance metrics. To improve user delay experience, recent industrial practice suggests a modern system design mechanism: proactive serving, where the system predicts future user requests and allocates its capacity to serve these upcoming requests proactively. In this paper, we investigate the fundamentals of proactive serving from a theoretical perspective. In particular, we show that proactive serving decreases average delay exponentially (as a function of the prediction window size). Our results provide theoretical foundations for proactive serving and shed light on its application in practical systems.
Article
Network function virtualization was recently proposed to improve the flexibility of network service provisioning and reduce the time to market of new services. By leveraging virtualization technologies and commercial off-the-shelf programmable hardware, such as general-purpose servers, storage, and switches, NFV decouples the software implementation of network functions from the underlying hardware. As an emerging technology, NFV brings several challenges to network operators, such as the guarantee of network performance for virtual appliances, their dynamic instantiation and migration, and their efficient placement. In this article, we provide a brief overview of NFV, explain its requirements and architectural framework, present several use cases, and discuss the challenges and future directions in this burgeoning research area.
Article
In a queuing process, let 1/λ be the mean time between the arrivals of two consecutive units, L be the mean number of units in the system, and W be the mean time spent by a unit in the system. It is shown that, if the three means are finite and the corresponding stochastic processes strictly stationary, and, if the arrival process is metrically transitive with nonzero mean, then L = λW.
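The relation L = λW can be checked numerically on a finite sample path. The sketch below is only an illustration, not the stationary result stated in the abstract: it assumes the system is empty at time 0 and again at the last departure, in which case the identity holds exactly for the empirical averages. The function name `littles_law_terms` and the sample times are hypothetical.

```python
def littles_law_terms(arrivals, departures):
    """Return (L, lam, W) for one finite sample path.

    Assumes (illustrative assumption) that the system is empty at time 0
    and that the observation horizon ends at the last departure.
    arrivals[i] and departures[i] are the times for unit i.
    """
    horizon = max(departures)
    n = len(arrivals)
    lam = n / horizon  # empirical arrival rate
    # Mean time spent in the system per unit.
    W = sum(d - a for a, d in zip(arrivals, departures)) / n

    # Integrate the number-in-system step function with an event sweep.
    events = sorted([(t, 1) for t in arrivals] + [(t, -1) for t in departures])
    area, count, prev = 0.0, 0, 0.0
    for t, delta in events:
        area += count * (t - prev)
        count += delta
        prev = t
    L = area / horizon  # time-average number of units in the system
    return L, lam, W


# Hypothetical sample path: three units.
L, lam, W = littles_law_terms([0.0, 1.0, 2.0], [3.0, 4.0, 6.0])
```

On such a path both L and λW equal the total sojourn time divided by the horizon, which is why the two sides agree up to floating-point rounding.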
Conference Paper
Large-scale data analytics frameworks are shifting towards shorter task durations and larger degrees of parallelism to provide low latency. Scheduling highly parallel jobs that complete in hundreds of milliseconds poses a major challenge for task schedulers, which will need to schedule millions of tasks per second on appropriate machines while offering millisecond-level latency and high availability. We demonstrate that a decentralized, randomized sampling approach provides near-optimal performance while avoiding the throughput and availability limitations of a centralized design. We implement and deploy our scheduler, Sparrow, on a 110-machine cluster and demonstrate that Sparrow performs within 12% of an ideal scheduler.
Article
Motivated by the increasing popularity of learning and predicting human user behavior in communication and computing systems, in this paper, we investigate the fundamental benefit of predictive scheduling, i.e., predicting and pre-serving arrivals, in controlled queueing systems. Based on a lookahead window prediction model, we first establish a novel equivalence between the predictive queueing system with a fully-efficient scheduling scheme and an equivalent queueing system without prediction. This connection allows us to analytically demonstrate that predictive scheduling necessarily improves system delay performance and can drive it to zero with increasing prediction power. We then propose the Predictive Backpressure (PBP) algorithm for achieving optimal utility performance in such predictive systems. PBP efficiently incorporates prediction into stochastic system control and avoids the great complication due to the exponential state space growth in the prediction window size. We show that PBP can achieve a utility performance that is within $O(\epsilon)$ of the optimal, for any $\epsilon>0$, while guaranteeing that the system delay distribution is a shifted-to-the-left version of that under the original Backpressure algorithm. Hence, the average packet delay under PBP is strictly better than that under Backpressure, and vanishes with increasing prediction window size. This implies that the resulting utility-delay tradeoff with predictive scheduling beats the known optimal $[O(\epsilon), O(\log(1/\epsilon))]$ tradeoff for systems without prediction.
Conference Paper
Although there is tremendous interest in designing improved networks for data centers, very little is known about the network-level traffic characteristics of data centers today. In this paper, we conduct an empirical study of the network traffic in 10 data centers belonging to three different categories, including university, enterprise campus, and cloud data centers. Our definition of cloud data centers includes not only data centers employed by large online service providers offering Internet-facing applications but also data centers used to host data-intensive (MapReduce style) applications. We collect and analyze SNMP statistics, topology and packet-level traces. We examine the range of applications deployed in these data centers and their placement, the flow-level and packet-level transmission properties of these applications, and their impact on network and link utilizations, congestion and packet drops. We describe the implications of the observed traffic patterns for data center internal traffic engineering as well as for recently proposed architectures for data center networks.
Conference Paper
Today's data centers may contain tens of thousands of computers with significant aggregate bandwidth requirements. The network architecture typically consists of a tree of routing and switching elements with progressively more specialized and expensive equipment moving up the network hierarchy. Unfortunately, even when deploying the highest-end IP switches/routers, resulting topologies may only support 50% of the aggregate bandwidth available at the edge of the network, while still incurring tremendous cost. Non-uniform bandwidth among data center nodes complicates application design and limits overall system performance. In this paper, we show how to leverage largely commodity Ethernet switches to support the full aggregate bandwidth of clusters consisting of tens of thousands of elements. Similar to how clusters of commodity computers have largely replaced more specialized SMPs and MPPs, we argue that appropriately architected and interconnected commodity switches may deliver more performance at less cost than available from today's higher-end solutions. Our approach requires no modifications to the end host network interface, operating system, or applications; critically, it is fully backward compatible with Ethernet, IP, and TCP.
Book
This text presents a modern theory of analysis, control, and optimization for dynamic networks. Mathematical techniques of Lyapunov drift and Lyapunov optimization are developed and shown to enable constrained optimization of time averages in general stochastic systems. The focus is on communication and queueing systems, including wireless networks with time-varying channels, mobility, and randomly arriving traffic. A simple drift-plus-penalty framework is used to optimize time averages such as throughput, throughput-utility, power, and distortion. Explicit performance-delay tradeoffs are provided to illustrate the cost of approaching optimality. This theory is also applicable to problems in operations research and economics, where energy-efficient and profit-maximizing decisions must be made without knowing the future. Topics in the text include the following:
• Queue stability theory
• Backpressure, max-weight, and virtual queue methods
• Primal-dual methods for non-convex stochastic utility maximization
• Universal scheduling theory for arbitrary sample paths
• Approximate and randomized scheduling theory
• Optimization of renewal systems and Markov decision systems
Detailed examples and numerous problem set questions are provided to reinforce the main concepts.
Article
Industry experience indicates that the ability to incrementally expand data centers is essential. However, existing high-bandwidth network designs have rigid structure that interferes with incremental expansion. We present Jellyfish, a high-capacity network interconnect, which, by adopting a random graph topology, yields itself naturally to incremental expansion. Somewhat surprisingly, Jellyfish is more cost-efficient than a fat-tree: A Jellyfish interconnect built using the same equipment as a fat-tree, supports as many as 25% more servers at full capacity at the scale of a few thousand nodes, and this advantage improves with scale. Jellyfish also allows great flexibility in building networks with different degrees of oversubscription. However, Jellyfish's unstructured design brings new challenges in routing, physical layout, and wiring. We describe and evaluate approaches that resolve these challenges effectively, indicating that Jellyfish could be deployed in today's data centers.
Article
We consider the following natural model: customers arrive as a Poisson stream of rate λn, λ<1, at a collection of n servers. Each customer chooses some constant d servers independently and uniformly at random from the n servers and waits for service at the one with the fewest customers. Customers are served according to the first-in first-out (FIFO) protocol and the service time for a customer is exponentially distributed with mean 1. We call this problem the supermarket model. We wish to know how the system behaves and in particular we are interested in the effect that the parameter d has on the expected time a customer spends in the system in equilibrium. Our approach uses a limiting, deterministic model representing the behavior as n→∞ to approximate the behavior of finite systems. The analysis of the deterministic model is interesting in its own right. Along with a theoretical justification of this approach, we provide simulations that demonstrate that the method accurately predicts system behavior, even for relatively small systems. Our analysis provides surprising implications. Having d=2 choices leads to exponential improvements in the expected time a customer spends in the system over d=1, whereas having d=3 choices is only a constant factor better than d=2. We discuss the possible implications for system design.
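The supermarket model described above can be explored with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' analysis or code: the function name `simulate_supermarket`, the parameter values, and the choice to sample the d servers with replacement are all illustrative. It starts from an empty system, so the averages include a warm-up phase rather than pure equilibrium behavior.

```python
import random
from collections import deque


def simulate_supermarket(n, d, lam, num_customers, seed=0):
    """Average time in system for the supermarket model (illustrative sketch).

    n servers, Poisson arrivals of total rate lam * n, exponential service
    with mean 1, FIFO at each server. Each customer samples d servers
    uniformly at random (with replacement, an assumption of this sketch)
    and joins the one with the fewest customers.
    """
    rng = random.Random(seed)
    # For each server, the pending departure times of customers still present.
    queues = [deque() for _ in range(n)]
    t = 0.0
    total_sojourn = 0.0
    for _ in range(num_customers):
        t += rng.expovariate(lam * n)  # next arrival time
        candidates = [rng.randrange(n) for _ in range(d)]
        # Drop customers that have already departed from the sampled servers.
        for s in set(candidates):
            q = queues[s]
            while q and q[0] <= t:
                q.popleft()
        best = min(candidates, key=lambda s: len(queues[s]))
        q = queues[best]
        # FIFO: service starts when the previous customer at this server leaves.
        start = q[-1] if q else t
        depart = start + rng.expovariate(1.0)
        q.append(depart)
        total_sojourn += depart - t
    return total_sojourn / num_customers


# Hypothetical parameters: 50 servers at load 0.9, comparing d = 1 and d = 2.
mean_d1 = simulate_supermarket(n=50, d=1, lam=0.9, num_customers=20000)
mean_d2 = simulate_supermarket(n=50, d=2, lam=0.9, num_customers=20000)
```

With d=1 each server behaves like an independent single-server queue, so comparing `mean_d1` against `mean_d2` gives a rough feel for the improvement from a second choice that the abstract describes.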
Predictive switch-controller association and control devolution for SDN systems
• X Huang
• S Bian
• Z Shao
• Y Yang