Article

# Online VNF Chaining and Predictive Scheduling: Optimality and Trade-Offs

Authors:
• Shenzhen Institute of Artificial Intelligence and Robotics for Society

## Abstract

For NFV systems, the key design space includes function chaining for network requests and resource scheduling for servers. The problem is challenging because NFV systems usually must balance multiple (often conflicting) design objectives while making real-time decisions efficiently with limited information. Furthermore, the benefits of predictive scheduling to NFV systems remain unexplored. In this article, we propose POSCARS, an efficient predictive and online service chaining and resource scheduling scheme that achieves tunable trade-offs among various system metrics with a stability guarantee. Through a careful choice of granularity in system modeling, we acquire a better understanding of the trade-offs in our design space. By a non-trivial transformation, we decouple the complex optimization problem into a series of online sub-problems to achieve optimality with only limited information. By employing randomized load balancing techniques, we propose three variants of POSCARS to reduce the overheads of decision making. Theoretical analysis and simulations show that POSCARS and its variants require only a modest amount of future information to achieve near-optimal system cost with an ultra-low request response time.
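The abstract does not say which randomized load balancing technique the POSCARS variants use. As a hedged illustration only, the sketch below shows one well-known technique of that family, power-of-d-choices: sample `d` servers at random and send the request to the least loaded of them, so each decision inspects `d` queues instead of all of them. The names `pick_server` and `simulate_arrivals` are hypothetical, and the simulation ignores departures for brevity.

```python
import random


def pick_server(queue_lengths, d=2, rng=random):
    """Sample d distinct servers uniformly and return the least loaded one."""
    candidates = rng.sample(range(len(queue_lengths)), d)
    return min(candidates, key=lambda i: queue_lengths[i])


def simulate_arrivals(num_servers=10, num_requests=1000, d=2, seed=0):
    """Assign requests one by one; returns the final queue lengths.

    Assumption: no request ever departs, so this only shows how evenly
    the arrivals are spread, not steady-state behavior.
    """
    rng = random.Random(seed)
    queues = [0] * num_servers
    for _ in range(num_requests):
        queues[pick_server(queues, d, rng)] += 1
    return queues
```

Setting `d` equal to the number of servers recovers join-the-shortest-queue, while `d = 1` is purely random assignment; intermediate values trade balance quality for decision overhead.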


... In addition, none of the above works studies discrete caching with predictions. Finally, it is worth stressing that employing predictions for improving the performance of communication/computing systems is not a new idea: predictions have been incorporated in stochastic optimization [44], [45] which assume the requests and system perturbations are stationary; and in online learning [46], [47] which do not adapt to predictions' accuracy (considered known). Here, we make no assumptions on the predictions' quality, which can be even adversarial. ...
Preprint
We take a systematic look at the problem of storing whole files in a cache with limited capacity in the context of optimistic learning, where the caching policy has access to a prediction oracle (provided by, e.g., a Neural Network). The successive file requests are assumed to be generated by an adversary, and no assumption is made on the accuracy of the oracle. In this setting, we provide a universal lower bound for prediction-assisted online caching and proceed to design a suite of policies with a range of performance-complexity trade-offs. All proposed policies offer sublinear regret bounds commensurate with the accuracy of the oracle. Our results substantially improve upon all recently-proposed online caching policies, which, being unable to exploit the oracle predictions, offer only $O(\sqrt{T})$ regret. In this pursuit, we design, to the best of our knowledge, the first comprehensive optimistic Follow-the-Perturbed leader policy, which generalizes beyond the caching problem. We also study the problem of caching files with different sizes and the bipartite network caching problem. Finally, we evaluate the efficacy of the proposed policies through extensive numerical experiments using real-world traces.
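To make the idea of an optimistic Follow-the-Perturbed-Leader cache concrete, here is a minimal sketch under stated assumptions. It is not the policy from the paper: the Gaussian perturbation, the scale `eta`, and the way the oracle scores are added to the cumulative counts are illustrative choices, and `optimistic_ftpl_cache` and `run_cache` are hypothetical names.

```python
import random


def optimistic_ftpl_cache(counts, prediction, capacity, eta, rng):
    """Choose which files to cache before the next request arrives.

    counts:     cumulative request count per file (the "leader" statistic)
    prediction: assumed oracle score per file for the next request
    eta:        perturbation scale (assumed tuning parameter)
    """
    scores = [
        c + p + eta * rng.gauss(0.0, 1.0)
        for c, p in zip(counts, prediction)
    ]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:capacity])


def run_cache(requests, predictions, num_files, capacity, eta=1.0, seed=0):
    """Replay a request trace and return the number of cache hits."""
    rng = random.Random(seed)
    counts = [0] * num_files
    hits = 0
    for req, pred in zip(requests, predictions):
        cache = optimistic_ftpl_cache(counts, pred, capacity, eta, rng)
        hits += req in cache
        counts[req] += 1  # the true request is revealed after the decision
    return hits
```

With an all-zero `prediction` the rule falls back to plain perturbed-leader caching; an accurate oracle shifts the ranking toward the file about to be requested.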
... Meanwhile, simulate the VNF placement and traffic routing problem using a Markov decision process [53] to account for dynamic network state changes. For scheduling, Ref. [54] presents a scalable, distributed, and online method for configuring a trade-off between a large number of system parameters while maintaining stability, all while exploiting predictive scheduling power. Ref. [55] suggested that earlier performance prediction frameworks perform poorly on contemporary architectures and NFVs because they consider memory as a monolithic unit and ignore the fact that the memory subsystem has several components that might individually create congestion. ...
Article
Network function virtualization (NFV) enables network operators to save costs and gain flexibility by replacing dedicated hardware with software network functions running on commodity servers. There is a high need for network acceleration to achieve performance comparable to hardware, which is vital for the implementation of NFV. The necessity of NFV acceleration stems from the lengthy packet delivery path following virtualization and the unavailability of generic operating system designs to serve network-specific scenarios. Therefore, acceleration approaches either alter the operating system’s processing architecture through kernel bypass or offload packet processing to hardware. A typical classification scheme divides these approaches into two main categories based on technology: software and hardware. These two categories alone allow a classification system to be established rapidly and easily. However, they make it difficult to convey the specifics and peculiarities of any acceleration approach during real-world operation. For a more comprehensive classification of NFV acceleration, we refer to the ETSI NFV architectural framework in this research. As the framework clearly illustrates, the technical infrastructure layer of NFV and the corresponding management roles provide a comprehensive and intuitive view of the differences between these acceleration technologies, solutions, and initiatives. Additionally, we conducted an analysis to identify opportunities for improvement in existing solutions and propose new research programs. We expect that NFV will increasingly rely on cloud services in the future. Since cloud services do not offer a choice of hardware, our acceleration method will be primarily software-based.
... On the other hand, predictive scheduling is conducive to solving conflicts between service function chains of different network requests and improving server resource scheduling efficiency. Huang et al. [27] proposed POSCARS, an efficient predictive and online service chaining and resource scheduling scheme that achieves tunable tradeoffs among various system metrics with stability guarantee. By a non-trivial transformation, the complex optimization problem was decoupled into a series of online sub-problems to achieve the optimality with only limited information. ...
Article
As an emerging and prospective paradigm, the Industrial Internet of Things (IIoT) enables intelligent manufacturing through the interconnection and interaction of industrial production elements. In this paper, we propose a network slicing orchestration system for remote adaptation and configuration in smart factories. We exploit Software-Defined Networking (SDN) and Network Functions Virtualization (NFV) to slice the physical network into multiple virtual networks. With this scheme, each application can use a dedicated network that meets its requirements despite limited network resources. To optimize network resource allocation and adapt to dynamic network environments, we propose two heuristic algorithms with the assistance of Artificial Intelligence (AI) and a theoretical analysis of the network slicing system. We conduct numerical simulations to evaluate the performance of the proposed algorithms. Our experimental results show the effectiveness and efficiency of our proposed algorithms when multiple network services are concurrently running in the IIoT.
Chapter
Service function chaining (SFC), consisting of a sequence of virtual network functions (VNFs) (e.g., firewalls and load balancers), is an effective service provision technique in modern data center networks. By requiring cloud user traffic to traverse the VNFs in order, SFC improves the security and performance of the cloud user applications. In this paper, we study how to place an SFC inside a data center to minimize the network traffic of the virtual machine (VM) communication. We take a cooperative multi-agent reinforcement learning approach, wherein multiple agents collaboratively figure out the traffic-efficient route for the VM communication. Underlying the SFC placement is a fundamental graph-theoretical problem called the k-stroll problem. Given a weighted graph G(V, E), two nodes s, $t \in V$, and an integer k, the k-stroll problem is to find the shortest path from s to t that visits at least k other nodes in the graph. Our work is the first to take a multi-agent learning approach to solve the k-stroll problem. We compare our learning algorithm with an optimal and exhaustive algorithm and an existing dynamic programming (DP)-based heuristic algorithm. We show that our learning algorithm, although lacking the complete knowledge of the network assumed by existing research, delivers comparable or even better VM communication time while taking two orders of magnitude less execution time. Keywords: Service function chaining, Data centers, Reinforcement learning, k-stroll problem
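The k-stroll definition above can be made concrete with a small exhaustive solver. This is only a sketch of the problem statement, not the learning algorithm or the baselines from the chapter. It assumes an undirected graph with non-negative weights and reads "path" as a walk that may revisit nodes; the function name `k_stroll` is hypothetical. It computes all-pairs shortest distances and then tries every ordered choice of k intermediate nodes, so it is practical only for tiny graphs.

```python
from itertools import permutations


def k_stroll(nodes, edges, s, t, k):
    """Length of the shortest s-t walk visiting at least k other nodes.

    edges: dict mapping (u, v) -> non-negative weight, treated as undirected.
    Returns float("inf") if no such walk exists.
    """
    inf = float("inf")
    dist = {u: {v: (0 if u == v else inf) for v in nodes} for u in nodes}
    for (u, v), w in edges.items():
        dist[u][v] = min(dist[u][v], w)
        dist[v][u] = min(dist[v][u], w)
    # Floyd-Warshall: all-pairs shortest distances.
    for m in nodes:
        for u in nodes:
            for v in nodes:
                if dist[u][m] + dist[m][v] < dist[u][v]:
                    dist[u][v] = dist[u][m] + dist[m][v]
    others = [v for v in nodes if v not in (s, t)]
    best = inf
    # Try every ordered selection of k required stops between s and t.
    for order in permutations(others, k):
        stops = (s, *order, t)
        length = sum(dist[a][b] for a, b in zip(stops, stops[1:]))
        best = min(best, length)
    return best
```

Nodes that happen to lie on a shortest segment between two chosen stops simply count as extra visits, which the "at least k" wording allows.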
Article
Network service provisioning becomes flexible and programmable with the help of Network Function Virtualization (NFV), since NFV abstracts various service functions into software components called Virtual Network Function (VNF) and VNFs can be flexibly and quickly composed to form new services. It is commonly known that sharing the same VNF among different services can improve the resource utilization. However, we should be aware that such sharing also leads to serious resource preemption. In addition, VNF sharing aggravates the generation of the performance bottleneck, which then causes the rate mismatch problem between the upstream and downstream VNFs belonging to the same service chain. In this work, we propose a dynamic and flexible algorithm to jointly address the VNF sharing resource allocation and the rate coordination between the upstream and downstream VNFs. Specifically, 1) the VNFs are shared among different service chains with a fairness factor considered for the purpose of reducing the resource preemption probability and improving the resource utilization; 2) the backpressure indicator of each VNF is defined to judge its pressure condition, based on which we can dynamically adjust the processing rates between it and its downstream or upstream VNFs by maximizing the idle resource utilization. The experimental results indicate that the proposed algorithm outperforms the other methods in terms of the average delay, the flow completion time, the throughput and the backlog, etc. Meanwhile, the proposed algorithm achieves more stable performance than the other methods.
Article
With the growing popularity of immersive interaction applications, e.g., industrial teleoperation and remote-surgery, the service demand of communication network is switching from packet delivery to remote control-based communication. The Tactile Internet (TI) is a promising paradigm of remote control-based wireless communication service, which enables tactile users to perceive, manipulate, or control real and virtual objects in perceived real-time. To support TI, the ultra-reliable and low latency communication service is required. However, the multi-tactile to multi-teleoperator interactive property, in-network computing demand, ordered service function chaining (SFC) requirement, as well as other features of TI, challenge the ultra-low latency provisioning. This paper studies the joint wireless resource allocation and SFC routing and scheduling problem for end-to-end delay reduction. We first formulate the problem as an end-to-end delay minimization problem constrained to SFC and wireless resource. Then, a distributed and cooperative scheme, which consists of the min–max (MM) wireless resource allocation algorithm and the delay-aware scheduling (DAS) of SFC algorithm, thus called MM-DAS, is proposed to address the problem. In MM-DAS, we use the MM algorithm for uplink/downlink communication at wireless edges, while DAS is proposed to solve the virtual network function (VNF) mapping and scheduling at the wireless core, for achieving the goal of providing low end-to-end delay to tactile–teleoperator pairs. The simulation results have illustrated the efficiency of the proposal for low end-to-end delay performance provisioning in TI environments.
Conference Paper
With the evolution of network function virtualization (NFV), diverse network services can be flexibly offered as service function chains (SFCs) consisting of different virtual network functions (VNFs). However, network state and traffic typically exhibit unpredictable variations due to stochastically arriving requests with different quality of service (QoS) requirements. Thus, an adaptive online SFC deployment approach is needed to handle the real-time network variations and various service requests. In this paper, we first introduce a Markov decision process (MDP) model to capture the dynamic network state transitions. In order to jointly minimize the operation cost of NFV providers and maximize the total throughput of requests, we propose NFVdeep, an adaptive, online, deep reinforcement learning approach to automatically deploy SFCs for requests with different QoS requirements. Specifically, we use a serialization-and-backtracking method to effectively deal with the large discrete action space. We also adopt a policy-gradient-based method to improve the training efficiency and convergence to optimality. Extensive experimental results demonstrate that NFVdeep converges fast in the training process and responds rapidly to arriving requests, especially in a large, frequently transferred network state space. Consequently, NFVdeep surpasses the state-of-the-art methods with 32.59% higher accepted throughput and 33.29% lower operation cost on average.
Conference Paper
In this paper we present Metron, a Network Functions Virtualization (NFV) platform that achieves high resource utilization by jointly exploiting the underlying network and commodity servers' resources. This synergy allows Metron to: (i) offload part of the packet processing logic to the network, (ii) use smart tagging to setup and exploit the affinity of traffic classes, and (iii) use tag-based hardware dispatching to carry out the remaining packet processing at the speed of the servers' fastest cache(s), with zero inter-core communication. Metron also introduces a novel resource allocation scheme that minimizes the resource allocation overhead for large-scale NFV deployments. With commodity hardware assistance, Metron deeply inspects traffic at 40 Gbps and realizes stateful network functions at the speed of a 100 GbE network card on a single server. Metron has 2.75-6.5x better efficiency than OpenBox, a state of the art NFV system, while ensuring key requirements such as elasticity, fine-grained load balancing, and flexible traffic steering.
Article
Network Function Virtualization (NFV) provides higher flexibility for network operators and reduces the complexity in network service deployment. Using NFV, Virtual Network Functions (VNF) can be located in various network nodes and chained together in a Service Chain (SC) to provide a specific service. Consolidating multiple VNFs in a smaller number of locations would decrease capital expenditures. However, excessive consolidation of VNFs might cause additional latency penalties due to processing-resource sharing, and this is undesirable, as SCs are bounded by service-specific latency requirements. In this paper, we identify two different types of penalties (referred to as "costs") related to the processing-resource sharing among multiple VNFs: the context switching costs and the upscaling costs. Context switching costs arise when multiple CPU processes (e.g., supporting different VNFs) share the same CPU and thus repeated loading/saving of their context is required. Upscaling costs are incurred by VNFs requiring multi-core implementations, since they suffer a penalty due to the load-balancing needs among CPU cores. These costs affect how the chained VNFs are placed in the network to meet the performance requirement of the SCs. We evaluate their impact while considering SCs with different bandwidth and latency requirements in a scenario of VNF consolidation.
Article
The Network Function Virtualization (NFV) paradigm has gained increasing interest in both academia and industry as it promises scalable and flexible network management and orchestration. In NFV networks, network services are provided as chains of different Virtual Network Functions (VNFs), which are instantiated and executed on dedicated VNF-compliant servers. The problem of composing those chains is referred to as the Service Chain Composition problem. In contrast to centralized solutions that suffer from scalability and privacy issues, in this paper we leverage non-cooperative game theory to achieve a low-complexity distributed solution to the above problem. Specifically, to account for selfish and competitive behavior of users, we formulate the service chain composition problem as an atomic weighted congestion game with unsplittable flows and player-specific cost functions. We show that the game possesses a weighted potential function and admits a Nash Equilibrium (NE). We prove that the price of anarchy (PoA) is upper-bounded, and also propose a distributed and privacy-preserving algorithm which provably converges towards a NE of the game in polynomial time. Finally, through extensive numerical results, we assess the performance of the proposed distributed solution to the service chain composition problem.
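As a rough illustration of how a distributed scheme can reach a Nash equilibrium in a congestion game, the sketch below runs best-response dynamics on a deliberately simplified instance. It is not the paper's algorithm or game: every player places one unsplittable flow of a given weight on a single server, and all players share the same linear cost (server load divided by server speed) rather than player-specific cost functions. The names `cost`, `best_response_dynamics`, and `is_nash`, and the `max_rounds` guard, are assumptions of this sketch.

```python
def cost(player, server, choice, weights, speeds):
    """Cost `player` would see on `server`: total weight there / server speed."""
    load = weights[player] + sum(
        w for p, w in enumerate(weights) if p != player and choice[p] == server
    )
    return load / speeds[server]


def best_response_dynamics(weights, speeds, max_rounds=1000):
    """Let players take turns moving to their cheapest server until nobody moves."""
    choice = [0] * len(weights)  # assumption: everyone starts on server 0
    for _ in range(max_rounds):
        moved = False
        for p in range(len(weights)):
            current = cost(p, choice[p], choice, weights, speeds)
            best = min(
                range(len(speeds)),
                key=lambda s: cost(p, s, choice, weights, speeds),
            )
            # Move only on a strict improvement to avoid cycling on ties.
            if cost(p, best, choice, weights, speeds) < current - 1e-12:
                choice[p] = best
                moved = True
        if not moved:
            break
    return choice


def is_nash(choice, weights, speeds):
    """True if no player can lower its own cost by switching servers alone."""
    return all(
        cost(p, choice[p], choice, weights, speeds)
        <= cost(p, s, choice, weights, speeds) + 1e-12
        for p in range(len(weights))
        for s in range(len(speeds))
    )
```

Each player needs only the current loads and its own weight to decide, which is the sense in which such dynamics are distributed.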
Article
Network function virtualization (NFV) has become a new paradigm for network architectures. By migrating network functions (NF) from dedicated hardware to a virtualization platform, NFV can effectively improve the flexibility to deploy and manage service function chains (SFC). However, resource allocation for requested SFCs in NFV-based infrastructures is not trivial, as it mainly consists of three phases: virtual network function (VNF) chain composition, VNF forwarding graph embedding, and VNF scheduling. The decisions in these three phases can be mutually dependent, which also makes it a tough task. Therefore, a coordinated approach is studied in this paper to jointly optimize NFV resource allocation in these three phases. We apply a general cost model to consider both network costs and service performance. The coordinated NFV-RA is formulated as a mixed-integer linear program (MILP), and a heuristic-based algorithm (JoraNFV) is proposed to get a near-optimal solution. To make the coordinated NFV-RA more tractable, JoraNFV is divided into two sub-algorithms: one-hop optimal traffic scheduling, and a multi-path greedy algorithm for VNF chain composition and VNF forwarding graph embedding. Lastly, extensive simulations are performed to evaluate the performance of JoraNFV. The results show that JoraNFV can get a solution within 1.25 times of the optimal solution with reasonable execution time, which indicates that JoraNFV can be used for on-line NFV planning.
Conference Paper
Network Functions Virtualization (NFV) is incrementally deployed by Internet Service Providers (ISPs) in their carrier networks, by means of Virtual Network Function (VNF) chains, to address customers' demands. The motivation is the increasing manageability, reliability and performance of NFV systems, the gains in energy and space granted by virtualization, at a cost that becomes competitive with respect to legacy physical network function nodes. From a network optimization perspective, the routing of VNF chains across a carrier network implies key novelties making the VNF chain routing problem unique with respect to the state of the art: the bitrate of each demand flow can change along a VNF chain, the VNF processing latency and computing load can be a function of the demands traffic, VNFs can be shared among demands, etc. In this paper, we provide an NFV network model suitable for ISP operations. We define the generic VNF chain routing optimization problem and devise a mixed integer linear programming formulation. By extensive simulation on realistic ISP topologies, we draw conclusions on the trade-offs achievable between legacy Traffic Engineering (TE) ISP goals and novel combined TE-NFV goals.
Conference Paper
An experimental setup of 32 honeypots reported 17M login attempts originating from 112 different countries and over 6000 distinct source IP addresses. Due to their decoupled control and data planes, Software Defined Networks (SDN) can handle this increasing number of attacks by blocking those network connections at the switch level. However, the challenge lies in defining the set of rules on the SDN controller to block malicious network connections. Historical network attack data can be used to automatically identify and block the malicious connections. There are a few existing open-source software tools to monitor and limit the number of login attempts per source IP address one-by-one. However, these solutions cannot efficiently act against a chain of attacks that comprises multiple IP addresses used by each attacker. In this paper, we propose using machine learning algorithms, trained on historical network attack data, to identify the potential malicious connections and potential attack destinations. We use four widely-known machine learning algorithms: C4.5, Bayesian Network (BayesNet), Decision Table (DT), and Naive-Bayes to predict the host that will be attacked based on the historical data. Experimental results show that an average prediction accuracy of 91.68% is attained using Bayesian Networks.
Article
Network functions virtualization (NFV) is a new network architecture framework where network functions that traditionally used dedicated hardware (middleboxes or network appliances) are now implemented in software that runs on top of general-purpose hardware such as high-volume servers. NFV emerges as an initiative from the industry (network operators, carriers, and manufacturers) in order to increase the deployment flexibility and integration of new network services with increased agility within operators' networks and to obtain significant reductions in operating expenditures and capital expenditures. NFV promotes virtualizing network functions such as transcoders, firewalls, and load balancers, among others, which were carried out by specialized hardware devices, and migrating them to software-based appliances. One of the main challenges for the deployment of NFV is the resource allocation of demanded network services in NFV-based network infrastructures. This challenge has been called the NFV resource allocation (NFV-RA) problem. This paper presents a comprehensive state of the art of NFV-RA by introducing a novel classification of the main approaches that pose solutions to solve it. This paper also presents the research challenges that are still subject to future investigation in the NFV-RA realm.
Article
Most online service providers deploy their own data stream processing systems in the cloud to conduct large-scale and real-time data analytics. However, such systems, e.g., Apache Heron, often adopt naive scheduling schemes to distribute data streams (in units of tuples) among processing instances, which may result in workload imbalance and system disruption. Hence, there still exists a mismatch between the temporal variations of data streams and such inflexible scheduling scheme designs. Besides, the fundamental benefits of predictive scheduling to data stream processing systems remain unexplored. In this paper, we focus on the problem of tuple scheduling with predictive service in Apache Heron. With a careful choice in the granularity of system modeling and decision making, we formulate the problem as a stochastic network optimization problem and propose POTUS, an online predictive scheduling scheme that aims to minimize the response time of data stream processing by steering data streams in a distributed fashion. Theoretical analysis and simulation results show that POTUS achieves an ultra-low response time with a stability guarantee. Moreover, POTUS requires only a modest amount of future information to further reduce the response time, even with mis-prediction.
Article
For software-defined networking (SDN) systems, to enhance the scalability and reliability of the control plane, existing solutions adopt either a multi-controller design with static switch-controller association, or static control devolution by delegating certain request processing back to switches. Such solutions can fall short in the face of temporal variations of request traffic, incurring considerable local computation costs on switches and communication costs to controllers. So far, it remains an open problem to develop a joint online scheme that conducts dynamic switch-controller association and dynamic control devolution. In addition, the fundamental benefits of predictive scheduling to SDN systems remain unexplored. In this paper, we identify the non-trivial trade-off in such a joint design and formulate a stochastic network optimization problem which aims to minimize time-averaged total system costs and ensure long-term queue stability. By exploiting the unique problem structure, we devise a predictive online switch-controller association and control devolution (POSCAD) scheme, which solves the problem through a series of online distributed decisions. Theoretical analysis shows that without prediction, POSCAD can achieve near-optimal total system costs with a tunable trade-off for queue stability. With prediction, POSCAD can achieve even better performance with shorter latencies. We conduct extensive simulations to evaluate POSCAD. Notably, with a modest amount of future information, POSCAD incurs a significant reduction in request latencies, even when faced with prediction errors.
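The abstract does not spell out the decision rule. A standard tool for this kind of stochastic network optimization (minimize time-averaged cost subject to queue stability) is the Lyapunov drift-plus-penalty method, and the sketch below illustrates that general idea rather than POSCAD itself. In each slot the arriving requests go to the server minimizing `V * cost + backlog`, where `V` is the tunable weight that trades cost against queue length. All names and the single-dispatcher model are assumptions of this sketch.

```python
def drift_plus_penalty_route(queues, costs, V):
    """Index of the server minimizing V * per-request cost + current backlog."""
    return min(range(len(queues)), key=lambda i: V * costs[i] + queues[i])


def simulate_routing(arrivals, costs, service, V):
    """Route each slot's arrivals, then serve; returns (total cost, final queues).

    arrivals: number of requests arriving in each time slot
    costs:    assumed per-request cost of each server
    service:  number of requests each server can finish per slot
    """
    queues = [0] * len(costs)
    total_cost = 0.0
    for a in arrivals:
        i = drift_plus_penalty_route(queues, costs, V)
        queues[i] += a
        total_cost += a * costs[i]
        queues = [max(q - mu, 0) for q, mu in zip(queues, service)]
    return total_cost, queues
```

With `V = 0` the rule degenerates to join-the-shortest-queue, and a very large `V` always picks the cheapest server regardless of backlog; values in between give the tunable trade-off.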
Conference Paper
Future networks are expected to support low-latency, context-aware and user-specific services in a highly flexible and efficient manner. One approach to support emerging use cases such as, e.g., virtual reality and in-network image processing is to introduce virtualized network functions (vNF)s at the edge of the network, placed in close proximity to the end users to reduce end-to-end latency, time-to-response, and unnecessary utilisation of the core network. While placement of vNFs has been studied before, it has so far mostly focused on reducing the utilisation of server resources (i.e., minimising the number of servers required in the network to run a specific set of vNFs), and not taking network conditions into consideration such as, e.g., end-to-end latency, the constantly changing network dynamics, and user mobility patterns. In this paper, we first formulate the Edge vNF placement problem to allocate vNFs to a distributed edge infrastructure, minimising end-to-end latency from all users to their associated vNFs. Furthermore, we present a way to dynamically reschedule the optimal placement of vNFs based on temporal network-wide latency fluctuations using optimal stopping theory. We evaluate our dynamic scheduler over a simulated nationwide backbone network using real-world ISP latency characteristics. We show that our proposed dynamic placement scheduler minimises vNF migrations compared to other schedulers (e.g., periodic and always-on scheduling of a new placement), and offers Quality of Service guarantees by not exceeding a maximum number of latency violations that can be tolerated by certain applications.
Conference Paper
Recent deployments of Network Function Virtualization (NFV) architectures have gained tremendous traction. While virtualization introduces benefits such as lower costs and easier deployment of network functions, it adds additional layers that reduce transparency into faults at lower layers. To improve fault analysis and prediction for virtualized network functions (VNF), we envision a runtime predictive analysis system that runs in parallel with existing reactive monitoring systems to provide network operators timely warnings against faulty conditions. In this paper, we propose a deep learning based approach to reliably identify anomaly events from NFV system logs, and perform an empirical study using 18 consecutive months in 2016–2018 of real-world deployment data on virtualized provider edge routers. Our deep learning models, combined with customization and adaptation mechanisms, can successfully identify anomalous conditions that correlate with network trouble tickets. Analyzing these anomalies can help operators to optimize trouble ticket generation and processing rules in order to enable fast, or even proactive actions against faulty conditions.
Article
Cloud computing and network slicing are essential concepts of forthcoming 5G mobile systems. Network slices are essentially chunks of virtual computing and connectivity resources, configured and provisioned for particular services according to their characteristics and requirements. The success of cloud computing and network slicing hinges on the efficient allocation of virtual resources (e.g. VCPU, VMDISK) and the optimal placement of Virtualized Network Functions (VNFs) composing the network slices. In this context, this paper elaborates issues that may disrupt the placement of VNFs and VMs. The paper classifies the existing solutions for VM Placement (VMP) based on their nature, whether the placement is dynamic or static, their objectives, and their metrics. The paper then proposes a classification of VNF Placement (VNFP) approaches, first, regarding the general placement and management issues of VNFs, and second, based on the target VNF type.
Article
Dynamic Service Function Chaining (SFC) is a technique that facilitates the enforcement of advanced services and differentiated traffic forwarding policies. It dynamically steers the traffic through an ordered list of Service Functions (SFs). Enabling SFC capabilities in the context of a Software Defined Networking (SDN) architecture is promising, as it takes advantage of the SDN flexibility and automation abilities to structure service chains and improve the delivery time. However, the delivery time depends also on the traffic steering techniques used by an SFC solution. This paper provides a closer look at the current SDN architectures for SFC and provides an analysis of traffic steering techniques used by the current SDN-based SFC approaches. This study presents a comprehensive analysis of these approaches using efficiency criteria. It concludes that the studied solutions are not efficient enough to be deployed in real-life networks, principally due to scalability and flexibility limitations. Accordingly, the paper identifies relevant research challenges.
Conference Paper
Network function virtualization (NFV) can significantly reduce the operation cost and speed up the deployment of network services to markets. Under NFV, a network service is composed of a chain of ordered virtual functions, which we call a "network function chain." A fundamental question is: given a number of network function chains, on which servers should we place these functions, and how should we form a chain on these functions? This is challenging due to the intricate dependency relationship of functions and the intrinsic complex nature of the optimization. In this paper, we formulate the function placement and chaining problem as an integer optimization, where each variable is an indicator of whether one service chain can be deployed on a configuration (or a possible function placement of a service chain). While this problem is generally NP-hard, our contribution is to show that it can be mapped to an exponential number of min-cost flow problems. Instead of solving all the min-cost problems, one can select a small number of mapped min-cost problems, which are likely to have a low cost. To achieve this, we relax the integer problem into a fractional linear problem, and theoretically prove that the fractional solutions possess some desirable properties, i.e., the number and the utilization of selected configurations can be upper and lower bounded, respectively. Based on such properties, we determine some "good" configurations selected from the fractional solution and determine the mapped min-cost flow problem, and this helps us to develop efficient algorithms for network function placement and chaining. Via extensive simulations, we show that our algorithms significantly outperform state-of-the-art algorithms and achieve near-optimal performance.
Article
This new edition presents a thorough discussion of the mathematical theory and computational schemes of Kalman filtering. The filtering algorithms are derived via different approaches, including a direct method consisting of a series of elementary steps, and an indirect method based on innovation projection. Other topics include Kalman filtering for systems with correlated noise or colored noise, limiting Kalman filtering for time-invariant systems, extended Kalman filtering for nonlinear systems, interval Kalman filtering for uncertain systems, and wavelet Kalman filtering for multiresolution analysis of random signals. Most filtering algorithms are illustrated by using simplified radar tracking examples. The style of the book is informal, and the mathematics is elementary but rigorous. The text is self-contained, suitable for self-study, and accessible to all readers with a minimum knowledge of linear algebra, probability theory, and system engineering. Over 100 exercises and problems with solutions help deepen the knowledge. This new edition has a new chapter on filtering communication networks and data processing, together with new exercises and new real-time applications.
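As a rough illustration of the predict and update steps that a Kalman filter alternates between, the following is a minimal scalar sketch. It is not taken from the book: the random-walk state model, the function name `kalman_step`, and the noise variances `q` and `r` are all assumptions chosen for brevity.

```python
def kalman_step(x, p, z, q, r):
    """One predict/update cycle of a scalar Kalman filter.

    Assumed (hypothetical) model: the state is a random walk,
    x_k = x_{k-1} + w with Var(w) = q, observed as z_k = x_k + v
    with Var(v) = r.
    """
    # Predict: the state estimate is carried over, uncertainty grows by q.
    x_pred = x
    p_pred = p + q
    # Update: blend prediction and measurement using the Kalman gain.
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * (z - x_pred)
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new


def kalman_filter(measurements, x0=0.0, p0=1.0, q=0.01, r=1.0):
    """Run the scalar filter over a sequence and return all estimates."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        x, p = kalman_step(x, p, z, q, r)
        estimates.append(x)
    return estimates
```

With equal prior and measurement variances and no process noise, the gain is 0.5, so the updated estimate lies halfway between the prior and the measurement.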
Article
Network Functions Virtualization (NFV) has enabled operators to dynamically place and allocate resources for network services to match workload requirements. However, unbounded end-to-end (e2e) latency of Service Function Chains (SFCs) resulting from distributed Virtualized Network Function (VNF) deployments can severely degrade performance. In particular, SFC instantiations with inter-data center links can incur high e2e latencies and Service Level Agreement (SLA) violations. These latencies can trigger timeouts and protocol errors with latency-sensitive operations. Traditional solutions to reduce e2e latency involve physical deployment of service elements in close proximity. These solutions are, however, no longer viable in the NFV era. In this paper, we present our solution that bounds the e2e latency in SFCs and inter-VNF control message exchanges by creating micro-service aggregates based on the affinity between VNFs. Our system, Contain-ed, dynamically creates and manages affinity aggregates using light-weight virtualization technologies like containers, allowing them to be placed in close proximity and hence bounding the e2e latency. We have applied Contain-ed to the Clearwater IP Multimedia Subsystem and built a proof-of-concept. Our results demonstrate that, by utilizing application and protocol specific knowledge, affinity aggregates can effectively bound SFC delays and significantly reduce protocol errors and service disruptions.
Article
Forecasting is a common data science task that helps organizations with capacity planning, goal setting, and anomaly detection. Despite its importance, there are serious challenges associated with producing reliable and high quality forecasts – especially when there are a variety of time series and analysts with expertise in time series modeling are relatively rare. To address these challenges, we describe a practical approach to forecasting “at scale” that combines configurable models with analyst-in-the-loop performance analysis. We propose a modular regression model with interpretable parameters that can be intuitively adjusted by analysts with domain knowledge about the time series. We describe performance analyses to compare and evaluate forecasting procedures, and automatically flag forecasts for manual review and adjustment. Tools that help analysts to use their expertise most effectively enable reliable, practical forecasting of business time series.
Conference Paper
The softwarization of networks provides a new degree of flexibility in network operation but its software components can result in unexpected runtime performance and erratic network behavior. This challenges the deployment of flexible software functions in performance critical (core) networks. To address this challenge, we present a methodology enabling the prediction of runtime performance and testing of functional behavior of Network Functions. Unlike traditional performance evaluation, e.g., testbed testing or simulation, our methodology can characterize the Network Function performance for any possible workload only by code analysis.
Conference Paper
The goals of load balancing are diverse. We may distribute the load to servers in order to achieve the same utilizations or average latencies. However, these goals are not a perfect fit in virtualized or software-defined networks. First, it is more difficult to assume homogeneous server capacities. Even for two (virtualized) functions with the same capacities, the capacities seen by the customer might be heterogeneous simply because they belong to different providers, are shared by others, or are located differently and so incur different communication costs. Heterogeneous server capacity will blur the aim of keeping the same utilizations. Second, usually the metric of latency in those networks is the (stochastic) bound instead of average value. In this paper, we parameterize the server capacities, and use the stochastic latency bound as the metric to further support inferring load balancing. We also model the load balancing process as a Markov-modulated process and observe the influence of its parameters on achieving balance. The proposed model will benefit the load balancing function implementation and infrastructure design in virtualized or software-defined networks.
Article
In many computing and networking applications, arriving tasks have to be routed to one of many servers, with the goal of minimizing queueing delays. When the number of processors is very large, a popular routing algorithm works as follows: select two servers at random and route an arriving task to the less loaded of the two. It is well known that this algorithm dramatically reduces queueing delays compared to an algorithm that routes to a single randomly selected server. In recent cloud computing applications, it has been observed that even sampling two queues per arriving task can be expensive and can even increase delays due to messaging overhead. So there is an interest in reducing the number of sampled queues per arriving task. In this paper, we show that the number of sampled queues can be dramatically reduced by using the fact that tasks arrive in batches (called jobs). In particular, we sample a subset of the queues such that the size of the subset is slightly larger than the batch size (thus, on average, we only sample slightly more than one queue per task). Once a random subset of the queues is sampled, we propose a new load-balancing method called batch-filling to attempt to equalize the load among the sampled servers. We show that, asymptotically, our algorithm dramatically reduces the sample complexity compared to previously proposed algorithms.
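The two-choice routing rule described above can be sketched as follows. This is a simplified illustration rather than the paper's batch-filling method: it assumes a static setting in which tasks only arrive and never depart, and the function names (`route_random`, `route_two_choices`, `max_load`) are hypothetical.

```python
import random


def route_random(loads, rng):
    """Baseline: send the task to a single uniformly random server."""
    i = rng.randrange(len(loads))
    loads[i] += 1
    return i


def route_two_choices(loads, rng):
    """Sample two distinct servers and pick the less loaded one."""
    a, b = rng.sample(range(len(loads)), 2)
    i = a if loads[a] <= loads[b] else b
    loads[i] += 1
    return i


def max_load(policy, n_servers, n_tasks, seed=0):
    """Route n_tasks with the given policy and return the largest queue.

    Simplifying assumption: tasks are never served, so this measures
    only how evenly arrivals are spread across servers.
    """
    rng = random.Random(seed)
    loads = [0] * n_servers
    for _ in range(n_tasks):
        policy(loads, rng)
    return max(loads)


if __name__ == "__main__":
    print("random     :", max_load(route_random, 1000, 1000))
    print("two choices:", max_load(route_two_choices, 1000, 1000))
```

Running the script typically shows a smaller maximum queue under two-choice routing than under purely random routing, at the cost of two load queries per task. That per-task sampling overhead is what the batch-based approach in the abstract aims to reduce.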
Article
Although network function virtualization (NFV) is a promising approach for providing elastic network functions, it faces several challenges in terms of adaptation to diverse network appliances and reduction of the capital and operational expenses of the service providers. In particular, to deploy service chains, providers must consider different objectives, such as minimizing the network latency or the operational cost, which are coupled objectives that have traditionally been addressed separately. In this paper, the problem of virtual network function (vNF) placement for service chains is studied for the purpose of energy and traffic-aware cost minimization. This problem is formulated as an optimization problem named the joint operational and network traffic cost (OPNET) problem. First, a sampling-based Markov approximation (MA) approach is proposed to solve the combinatorial NP-hard problem, OPNET. Even though the MA approach can yield a near-optimal solution, it requires a long convergence time that can hinder its practical deployment. To overcome this issue, a novel approach that combines the MA with matching theory, named SAMA, is proposed to find an efficient solution for the original problem OPNET. Simulation results show that the proposed framework can reduce the total incurred cost by up to 19% compared to the existing non-coordinated approach.
Conference Paper
This manuscript investigates the issue of implementing chains of network functions in a “softwarized” environment where edge network middle-boxes are replaced by software appliances running in virtual machines within a data center. The primary goal is to show that this approach allows space and time diversity in service chaining, with a higher degree of dynamism and flexibility with respect to conventional hardware-based architectures. The manuscript describes implementation alternatives of the virtual function chaining in an SDN scenario, showing that both layer 2 and layer 3 approaches are functionally viable. A proof-of-concept implementation with the Mininet emulation platform is then presented to provide a practical example of the feasibility and degree of complexity of such approaches.
Conference Paper
Network resource virtualization has recently emerged as the future of communication technology, and the advent of Software Defined Networking (SDN) and Network Function Virtualization (NFV) enables its realization. NFV virtualizes traditional physical middle-boxes that implement specific network functions. Since multiple network functions can be virtualized in a single server or data center, the network operator can save Capital Expenditure (CAPEX) and Operational Expenditure (OPEX) through NFV. Since each customer demands different types of VNFs with various applications, the service requirements are different for all VNFs. Therefore, allocating multiple Virtual Network Functions (VNFs) to limited network resources requires efficient resource allocation. We propose an efficient resource allocation strategy for VNFs in a single server by employing a mixed queuing network model while minimizing the customers' waiting time in the system. The problem is formulated as a convex problem. However, this problem cannot be solved directly because of the closed queuing network calculation, so we use an approximation algorithm to solve it. Numerical results of this model show the performance metrics of the mixed queuing network. We also find that the approximation algorithm yields a near-optimal solution by comparing it with neighboring solutions.
Article
Diverse proprietary network appliances increase both the capital and operational expense of service providers, meanwhile causing problems of network ossification. Network function virtualization (NFV) is proposed to address these issues by implementing network functions as pure software on commodity and general hardware. NFV allows flexible provisioning, deployment, and centralized management of virtual network functions. Integrated with SDN, the software-defined NFV architecture further offers agile traffic steering and joint optimization of network functions and resources. This architecture benefits a wide range of applications (e.g., service chaining) and is becoming the dominant form of NFV. In this survey, we present a thorough investigation of the development of NFV under the software-defined NFV architecture, with an emphasis on service chaining as its application. We first introduce the software-defined NFV architecture as the state of the art of NFV and present relationships between NFV and SDN. Then, we provide a historic view of the evolution from middlebox to NFV. Finally, we introduce significant challenges and relevant solutions of NFV, and discuss its future research directions by different application domains.
Article
The integration of network function virtualization (NFV) and software defined networks (SDN) seeks to create a more flexible and dynamic software-based network environment. The line between entities involved in forwarding and those involved in more complex middle box functionality in the network is blurred by the use of high-performance virtualized platforms capable of performing these functions. A key problem is how and where network functions should be placed in the network and how traffic is routed through them. An efficient placement and appropriate routing increases system capacity while also minimizing the delay seen by flows. In this paper, we formulate the problem of network function placement and routing as a mixed integer linear programming problem. This formulation not only determines the placement of services and routing of the flows, but also seeks to minimize the resource utilization. We develop heuristics to solve the problem incrementally, allowing us to support a large number of flows and to solve the problem for incoming flows without impacting existing flows.
Article
In online service systems, the delay experienced by a user from the service request to the service completion is one of the most critical performance metrics. To improve user delay experience, recent industrial practice suggests a modern system design mechanism: proactive serving, where the system predicts future user requests and allocates its capacity to serve these upcoming requests proactively. In this paper, we investigate the fundamentals of proactive serving from a theoretical perspective. In particular, we show that proactive serving decreases average delay exponentially (as a function of the prediction window size). Our results provide theoretical foundations for proactive serving and shed light on its application in practical systems.