Archived project

Metro-Haul: METRO High bandwidth, 5G Application-aware optical network, with edge storage, compUte and low Latency

Goal: The overall Metro-Haul objective is to architect and design cost-effective, energy-efficient, agile and programmable metro networks that are scalable for 5G access and future requirements, encompassing the design of all-optical metro nodes (including full compute and storage capabilities), which interface effectively with both 5G access and multi-Tbit/s elastic core networks.

Metro-Haul has taken the 5G KPIs and determined their implications for the optical network, with five targets: (i) 100× more 5G capacity supported over the same optical fibre infrastructure; (ii) 10× less energy consumption; (iii) a latency-aware metro network in which latency-sensitive slices are handled at the metro edge, so that the metro network adds no additional latency; (iv) an end-to-end SDN-based management framework enabling fast configuration to set up or reconfigure services handling 5G applications, specifically 1 minute for a simple network path set-up, 10 minutes for full installation of a new VNF, and 1 hour for setting up a new virtual network slice; and (v) a reduction in CAPEX by a factor of 10, plus a reduction in OPEX of at least 20%.

Date: 1 June 2017 - 30 September 2020

Updates: 0
Recommendations: 0
Followers: 80
Reads: 646

Project log

Jorge E. López de Vergara Méndez
added a research item
We report the automated deployment of 5G services across a latency-aware, semidisaggregated, and virtualized metro network. We summarize the key findings in a detailed analysis of end-to-end latency, service setup time, and soft-failure detection time.
Jorge E. López de Vergara Méndez
added a research item
In network management, it is important to model baselines, trends, and regular behaviors to adequately deliver network services. However, their characterization is complex, so network operation and system alarming become a challenge. Several problems exist: Gaussian assumptions cannot be made, time series have different trends, and it is difficult to reduce their dimensionality. To overcome this situation, we propose Deep-FDA, a novel approach for network service modeling that combines functional data analysis (FDA) and neural networks. Specifically, we explore the use of functional clustering and functional depth measurements to characterize network services with time series generated from enriched flow records, showing how this method can detect distinct, separated trends. Moreover, we augment this statistical approach with autoencoder neural networks, improving the classification results. To evaluate and check the applicability of our proposal, we performed experiments with synthetic and real-world data, where we show graphically and numerically the performance of our method compared to other state-of-the-art alternatives. We also exemplify its application in different network management use cases. The results show that FDA and neural networks are complementary, as each can help overcome the drawbacks the other has when applied separately.
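The functional-depth idea can be illustrated with a minimal, hypothetical sketch (not the authors' Deep-FDA implementation): modified band depth scores each time series by how often it lies inside the band spanned by pairs of other curves, so typical service behaviours score high and anomalous trends score low.

```python
from itertools import combinations

def modified_band_depth(curves):
    """Modified band depth: for each curve, the average fraction of time
    points where it lies inside the envelope of every pair of curves."""
    m = len(curves[0])
    pairs = list(combinations(curves, 2))
    depths = []
    for c in curves:
        total = 0.0
        for a, b in pairs:
            # fraction of time points where c is inside the band of (a, b)
            inside = sum(1 for t in range(m)
                         if min(a[t], b[t]) <= c[t] <= max(a[t], b[t]))
            total += inside / m
        depths.append(total / len(pairs))
    return depths

# Three similar daily service profiles plus one anomalous trend
curves = [
    [1.0, 1.1, 1.0, 1.2],  # central behaviour
    [0.8, 0.9, 0.8, 1.0],
    [1.2, 1.3, 1.2, 1.4],
    [3.0, 3.1, 3.0, 3.2],  # outlier
]
depths = modified_band_depth(curves)
```

The central profile receives the highest depth, while the outlying trend scores lower, which is the property the clustering and alarming described above exploit.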
Jorge E. López de Vergara Méndez
added a research item
Periods of low load have been used for the scheduling of non-interactive tasks since the early stages of computing. Nowadays, the scheduling of bulk transfers (i.e., large-volume transfers without precise timing, such as database distribution, resource replication or backups) stands out among such tasks, given its direct effect on both the performance and billing of networks. Through visual inspection of traffic-demand curves of diverse points of presence (PoP), whether a network, link, Internet service provider or Internet exchange point, it becomes apparent that low-use periods of bandwidth demand occur in the early morning, showing a noticeable convex shape. This observation led us to study and model the time when such demands reach their minimum, which we have named the valley time of a PoP, as an approximation to the ideal moment to carry out bulk transfers. After studying and modeling single-PoP scenarios both temporally and spatially, seeking homogeneity in the phenomenon, and extending the analysis to multi-PoP scenarios or paths (a meta-PoP constructed as the aggregation of several single PoPs), we propose a final predictor system for the valley time. This tool works as an oracle for scheduling bulk transfers, with different versions according to time scales and the desired trade-off between precision and complexity. The evaluation of the system, named VTP, has proven its usefulness, with errors below an hour in estimating the occurrence of valley times, and errors around 10% in terms of bandwidth between the predicted and actual valley traffic.
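As an illustration of the underlying idea (hypothetical code, not the VTP system), the valley time of a convex overnight demand curve can be estimated by locating the lowest sample and refining it with a three-point parabolic interpolation:

```python
def valley_time(hours, load):
    """Estimate the valley time of a convex demand curve: find the lowest
    interior sample and refine it with a parabola through its neighbours."""
    i = min(range(1, len(load) - 1), key=load.__getitem__)
    y0, y1, y2 = load[i - 1], load[i], load[i + 1]
    denom = y0 - 2 * y1 + y2
    # Vertex offset (in samples) of the parabola through the three points
    offset = 0.5 * (y0 - y2) / denom if denom else 0.0
    step = hours[1] - hours[0]
    return hours[i] + offset * step

# Synthetic overnight demand with its true minimum at hour 3.6
hours = list(range(9))
load = [(h - 3.6) ** 2 + 5.0 for h in hours]
estimate = valley_time(hours, load)
```

For a truly parabolic valley the interpolation is exact; on real traffic it gives a sub-sample refinement of the argmin, which is the quantity a valley-time oracle needs to schedule bulk transfers.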
Jorge E. López de Vergara Méndez
added a research item
The automation of Network Services (NS) consisting of virtual functions connected through a multilayer packet-over-optical network requires predictable Quality of Service (QoS) performance, measured in terms of throughput and latency, to allow proactive decision-making. QoS is typically guaranteed by overprovisioning capacity dedicated to the NS, which increases costs for customers and network operators, especially when the traffic generated by the users and/or the virtual functions varies greatly over time. This paper presents the PILOT methodology for modeling the performance of connectivity services, in terms of throughput and latency, during commissioning testing. The benefits are twofold: first, an accurate per-connection model allows operators to better operate their networks and reduce the need for overprovisioning; and second, customers can tune their applications to the performance characteristics of the connectivity. PILOT runs in a sandbox domain and constructs a scenario where an efficient traffic flow simulation environment, based on the CURSA-SQ model, is used to generate large amounts of data for Machine Learning (ML) model training and validation. The simulation scenario is tuned using real measurements of the connection (including throughput and latency) obtained from a set of active probes in the operator network. PILOT has been experimentally validated on a distributed testbed connecting UPC and Telefónica premises.
Daniel King
added a research item
As part of the 5G-PPP Initiative, the Software Network Working Group prepared this white paper as a follow-up to the 2018 Cloud-Native transformation white paper, to analyze how 5G-PPP projects interpret Cloud-Native design patterns and to identify adoption barriers. The Working Group conducted a survey to collect technical inputs from 5G-PPP Phase 2 and Phase 3 projects on their supported vertical use cases, adopted virtualization technologies, and followed architecture patterns. The results are twofold. 1) Projects cluster according to their architecture patterns: most project prototypes evolved from ETSI MANO relying exclusively on an OpenStack VIM to including Kubernetes (on bare metal and public cloud) as a new VIM in parallel to OpenStack, while keeping orchestration intelligence centralized in a VNFM-like box; only a few fully exploited Kubernetes as a complete autonomous platform with its own orchestration intelligence, able to host both Containerized Network Functions and classical VM-based VNFs. 2) Analysis: we acknowledge a reluctance to use the fully Cloud-Native design provided by, e.g., Kubernetes. This reluctance was analyzed to extract the underlying reasons motivating projects to select the intermediate step where Kubernetes is considered only as a VIM; these reasons are presented as the barriers to adopting Cloud-Native patterns. The barriers are essentially a lack of standards and technological maturity, together with human resistance to adaptation.
Daniel King
added 2 research items
Operators' network management continuously measures network health by collecting data from the deployed network devices; the data is used mainly for performance reporting and for diagnosing network problems after failures, as well as by human capacity planners to predict future traffic growth. Typically, these network management tools are reactive and require significant human effort and skill to operate effectively. As optical networks evolve to fulfil highly flexible connectivity and dynamicity requirements and to support ultra-low-latency services, they must also provide reliable connectivity and increased network resource efficiency. Therefore, reactive human-based network measurement and management will be a limiting factor in the size and scale of these new networks. Future optical networks must support fully automated management, providing dynamic resource re-optimization to rapidly adapt network resources based on predicted conditions and events; identify service degradation conditions that will eventually impact connectivity and highlight critical devices and links for further inspection; and augment rapid protection schemes if a failure is predicted or detected, and facilitate resource optimization after restoration events. Applying automation techniques to network management requires not only the collection of data from a variety of sources at various time frequencies, but also the capability to extract knowledge and derive insight for performance monitoring, troubleshooting, and maintaining network service continuity. Innovative analytics algorithms must be developed to derive meaningful input for the entities that orchestrate and control network resources; these control elements must also be capable of proactively programming the underlying optical infrastructure.
In this article, we review the emerging requirements for optical network management automation, the capabilities of current optical systems, and the development and standardization status of data models and protocols to facilitate automated network monitoring. Finally, we propose an architecture to provide Monitoring and Data Analytics (MDA) capabilities, present illustrative control loops for advanced network monitoring use cases, and report findings that validate the usefulness of MDA for automated optical network management.
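As an illustration only (hypothetical code, not the proposed MDA architecture), a proactive control loop of the kind described can be sketched as: collect a sliding window of monitored samples, fit a trend, and trigger a re-optimization action when the extrapolated value crosses a degradation threshold.

```python
def mda_control_loop(samples, degrade_threshold=1e-3, window=5):
    """Toy monitoring-and-analytics loop: keep a sliding window of
    degradation samples (e.g. pre-FEC BER), fit a least-squares trend,
    and record a proactive action whenever the value extrapolated one
    window ahead would cross the threshold."""
    actions = []
    win = []
    for t, value in enumerate(samples):
        win.append(value)
        if len(win) > window:
            win.pop(0)
        if len(win) == window:
            xs = list(range(window))
            mean_x = sum(xs) / window
            mean_y = sum(win) / window
            slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, win))
                     / sum((x - mean_x) ** 2 for x in xs))
            predicted = win[-1] + slope * window  # extrapolate one window ahead
            if predicted > degrade_threshold:
                actions.append(t)
    return actions

# A steadily degrading signal triggers proactive actions before the
# threshold itself is reached; a healthy flat signal triggers none.
rising = [1e-4 * (1 + t) for t in range(20)]
alarms = mda_control_loop(rising)
```

Real MDA loops would feed such decisions to the SDN controller for re-routing or re-optimization; the sketch only shows the monitor, analyze, act structure.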
Automating the provisioning of 5G services, deployed over a heterogeneous infrastructure (in terms of domains, technologies, and management platforms), remains a complex task, yet it is driven by the constant need to provide end-to-end connections and network slices while reducing costs and service deployment time. At the same time, such services are increasingly conceived around interconnected functions and require the allocation of computing, storage, and networking resources. The METRO-HAUL 5G research initiative acknowledges the need for automation and strives to develop an orchestration platform for services and resources that extends, integrates, and builds on top of existing approaches, macroscopically adopting Transport Software Defined Networking principles and leveraging the programmability and open control of Transport SDN.
Jorge E. López de Vergara Méndez
added a research item
New techniques, based on Software-Defined Networks, are used to deploy optical paths dynamically. This demonstration shows how active measurements at 100 Gbit/s are performed to check before operation that the performance requirements are met in terms of capacity, delay or packet loss.
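A hedged sketch of such a pre-operation check (hypothetical KPI names and thresholds, not the demonstrated system): the results of the active measurements are compared against the path's requirements before it is handed over to the service.

```python
def path_acceptance(measured, sla):
    """Compare active-measurement results for a freshly provisioned path
    against its requirements. Returns the list of violated KPIs; an empty
    list means the path can go into operation."""
    checks = {
        "capacity_gbps": lambda m, s: m >= s,  # at least the committed rate
        "delay_ms":      lambda m, s: m <= s,  # at most the delay budget
        "loss_ratio":    lambda m, s: m <= s,  # at most the loss budget
    }
    return [kpi for kpi, ok in checks.items()
            if kpi in sla and not ok(measured[kpi], sla[kpi])]

# Hypothetical commissioning test of a 100 Gbit/s optical path
measured = {"capacity_gbps": 99.2, "delay_ms": 0.8, "loss_ratio": 0.0}
sla = {"capacity_gbps": 95.0, "delay_ms": 1.0, "loss_ratio": 1e-6}
violations = path_acceptance(measured, sla)
```

In the demonstration, the measurement itself runs at 100 Gbit/s; the sketch only shows the accept/reject decision that follows it.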
Jorge E. López de Vergara Méndez
added a research item
Many network management actions need a simultaneous consideration of several elements' state. This is becoming an even more complex matter with the advent of reconfigurable deployments, where scaling functions up can prevent performance bottlenecks. Therefore, fine-grained detection of significant burdens arises as a cornerstone to optimize their monitoring and operation. We present AdPRISMA (Advanced distributed Passive Retrieval of Information, and Statistical Multi-point Analysis), a passive monitoring system intended to fit models for network delay measurements with clustering elements to improve the representation of central and extreme behaviors. As distinguishing features, it relies on cost-effective multi-point round-trip time (RTT) passive network measurements, and is able to select a suitable parametric model optimizing the trade-off between fitting and complexity. AdPRISMA can correlate records collected from several vantage points and detect where performance issues are most likely to appear; adjust alarms in terms of the probability of events; and adapt its behavior to dynamic network conditions while presenting a fair identification of anomalous situations. We evaluate AdPRISMA with experiments both in virtual environments and with real-world data to provide evidence of its applicability and of its ability to represent network elements' delay.
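The fitting-versus-complexity trade-off mentioned above can be illustrated with a stdlib-only sketch (hypothetical code, not AdPRISMA itself), using the Akaike Information Criterion to choose between two stand-in candidate families for delay samples:

```python
import math
import random
import statistics

def aic_normal(xs):
    """AIC of a maximum-likelihood normal fit (2 parameters)."""
    mu = statistics.fmean(xs)
    var = statistics.pvariance(xs, mu)
    ll = sum(-0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
             for x in xs)
    return 2 * 2 - 2 * ll

def aic_exponential(xs):
    """AIC of a maximum-likelihood exponential fit (1 parameter)."""
    lam = 1 / statistics.fmean(xs)
    ll = sum(math.log(lam) - lam * x for x in xs)
    return 2 * 1 - 2 * ll

def select_delay_model(xs):
    """Pick the candidate with the lowest AIC, i.e. the best trade-off
    between goodness of fit and number of parameters."""
    scores = {"normal": aic_normal(xs), "exponential": aic_exponential(xs)}
    return min(scores, key=scores.get)

# Heavy-tailed synthetic RTT samples (ms): the exponential family wins.
random.seed(7)
rtt = [random.expovariate(1 / 20.0) for _ in range(500)]
model = select_delay_model(rtt)
```

A real system would consider a richer candidate set (e.g. lognormal, gamma, mixtures), but the selection mechanism is the same: penalized likelihood across parametric families.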
Jorge E. López de Vergara Méndez
added a research item
Presentation given at the MetroHaul workshop within NGON. The presentation shows the experience of defining and maintaining a YANG model at IETF.
Jorge E. López de Vergara Méndez
added a research item
Upcoming 5G networks will change how they operate due to the use of virtualization technologies. Network functions that are necessary for communication will be virtual and will run on top of commodity servers. Among these functions, it will be essential to deploy monitoring probes, which provide information on how the network is behaving, to be analyzed later for self-management purposes. However, to date, network probes have needed to be physical to perform at link rates in high-speed networks, and it is challenging to deploy them in virtual environments. Thus, it is necessary to rely on bare-metal accelerators to deal with existing input/output (I/O) performance problems. To control the costs of implementing these virtual network probes, our approach is to leverage the capabilities that current commercial off-the-shelf network cards provide for virtual environments. Specifically, we have implemented HPCAP40vf, a GPL-licensed driver, available for download, for network capture in virtual machines. This driver handles the communication with an Intel XL710 40 Gbit/s commercial network card to enable a network monitoring application to run within a virtual machine. To store the captured traffic, we rely on NVMe drives due to their high transfer rate, as they are directly connected to the PCIe bus. We have assessed the performance of this approach and compared it with DPDK, in terms of both capturing and storing the network traffic, by measuring the achieved data rates. The evaluation has taken into account two virtualization technologies, namely KVM and Docker, and two access methods to the underlying hardware, namely VirtIO and PCI passthrough. With this methodology, we have identified bottlenecks and determined the optimal solution in each case to reduce the overheads due to virtualization.
This approach can also be applied to the development of other performance-hungry virtual network functions. The obtained results demonstrate the feasibility of our proposed approach: when the capabilities that current commercial network cards provide are correctly used, our virtual network probe can monitor at 40 Gbit/s with full packet capture and storage, and simultaneously track the traffic among other virtual network functions inside the host and with the external network.
Daniel King
added 2 research items
Automating the provisioning of 5G services, deployed over a heterogeneous infrastructure (in terms of domains, technologies, and management platforms), remains a complex task, yet it is driven by the constant need to provide end-to-end connections and network slices while reducing costs and service deployment time. At the same time, such services are increasingly conceived around interconnected functions and require the allocation of computing, storage, and networking resources. The METRO-HAUL 5G research initiative acknowledges the need for automation and strives to develop an orchestration platform for services and resources that extends, integrates, and builds on top of existing approaches, macroscopically adopting Transport Software Defined Networking principles and leveraging the programmability and open control of Transport SDN.
Daniel King
added a research item
Recent advances related to the concepts of Artificial Intelligence (AI) and Machine Learning (ML), with applications across multiple technology domains, have gathered significant attention due, in particular, to the overall performance improvement of such automated systems when compared to methods relying on human operation. Consequently, using AI/ML for managing, operating and optimizing transport networks is increasingly seen as a potential opportunity targeting, notably, large and complex environments. Such AI-assisted automated network operation is expected to facilitate innovation in multiple aspects related to the control and management of future optical networks and is a promising milestone in the evolution towards autonomous networks, where networks self-adjust parameters such as transceiver configuration. To accomplish this goal, current network control, management and orchestration systems need to enable the application of AI/ML techniques. It is arguable that Software-Defined Networking (SDN) principles, favouring centralized control deployments, featured application programming interfaces and the development of a related application ecosystem, are well positioned to facilitate the progressive introduction of such techniques, starting, notably, by allowing efficient and massive monitoring and data collection. In this paper, we present the control, orchestration and management architecture designed to allow the automatic deployment of 5G services (such as ETSI NFV network services) across metropolitan networks, conceived to interface 5G access networks with elastic core optical networks at multi-Tb/s rates. This network segment, referred to as Metro-Haul, is composed of infrastructure nodes that encompass networking, storage and processing resources, which are in turn interconnected by open and disaggregated optical networks.
In particular, we detail subsystems such as the Monitoring and Data Analytics subsystem or the in-operation planning backend that extend current SDN-based network control to account for new use cases.
Jorge E. López de Vergara Méndez
added a research item
Many management actions for networking infrastructures require the simultaneous consideration of the state of several network elements. This is particularly critical in the case of reconfigurable deployments, such as Virtual or Software-Defined Networks, to scale the affected equipment up and prevent performance bottlenecks. In this light, we present dPRISMA (distributed Passive Retrieval of Information, and Statistical Multi-point Analysis), a passive monitoring system intended to fit statistical models for network measurements and raise alarms in the case of extreme behaviors. As distinguishing features, dPRISMA relies on cost-effective multi-point network measurements, and is able to select a suitable parametric model optimizing the trade-off between fitting and complexity. Therefore, it can (i) correlate records collected from several vantage points and detect where performance issues are most likely to appear; (ii) adjust alarms in terms of the probability of events; and (iii) adapt its behavior to dynamic network conditions while presenting a fair identification of anomalous situations. We evaluate dPRISMA with experiments both in virtual environments and with real-world data to provide evidence of its applicability.
Roberto Morro
added a research item
This demo shows how a hierarchical control plane of ONOS SDN controllers orchestrates the dynamic provisioning of end-to-end Carrier Ethernet circuits on a composite network, programming the whole data path from the CPE to the core optical equipment.
Jorge E. López de Vergara Méndez
added 2 research items
This document defines a YANG model for managing flexi-grid optical media channels, complementing the information provided by the flexi-grid TED model. It is also grounded on other defined YANG abstract models.
This document defines a YANG model for managing flexi-grid optical networks. The model described in this document defines a flexi-grid traffic engineering database. A complementary module is referenced to detail the flexi-grid media channels. This module is grounded on other defined YANG abstract models.
Jorge E. López de Vergara Méndez
added a research item
This work presents an FPGA-based architecture designed for the aggregation and subsequent export of TCP session records on links of up to 40 Gbit/s without packet sampling, even at the maximum packet rate. In this way, flow exporters based on special-purpose hardware are offloaded from tasks for which FPGAs offer adequate flexibility and performance, reducing the requirements of the complete system. A functional prototype of the system has been implemented on the NetFPGA-SUME platform, where it was subjected to real traffic. The prototype incorporates an estimation of per-flow retransmissions, in addition to other standard statistics such as the number of bytes and packets of each network connection.
Roberto Morro
added a research item
A first demonstration of ROADM White Box augmented with machine learning capabilities is demonstrated. The white box includes various level of disaggregation, NETCONF/YANG control, telemetry and spectrum-based advanced monitoring functionalities.
Jorge E. López de Vergara Méndez
added a research item
New Ethernet standards, such as 40 GbE or 100 GbE, are already being deployed commercially, along with their corresponding Network Interface Cards (NICs) for servers. However, network measurement solutions are lagging behind: while there are several tools available for monitoring 10 or 20 Gbps networks, higher speeds pose a harder challenge that requires new ideas, different from those applied previously, and so fewer applications are available. In this paper, we show a system capable of capturing, timestamping and storing 40 Gbps network traffic using a tailored network driver together with Non-Volatile Memory express (NVMe) technology and the Storage Performance Development Kit (SPDK) framework. We also present core ideas that can be extended to capture at higher rates: a multicore architecture capable of synchronization with minimal overhead that reduces disordering of the received frames, methods to filter the traffic and discard unwanted frames without being computationally expensive, and the use of an intermediate buffer that allows simultaneous access from several applications to the same data as well as efficient disk writes. Finally, we show a testbed for reliable benchmarking of our solution using custom DPDK traffic generators and replayers, which have been made freely available to the network measurement community.
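The intermediate-buffer idea can be illustrated with a simplified, hypothetical sketch (not the paper's driver code): a single-writer ring buffer with independent per-consumer cursors, so several applications read the same captured frames without blocking the capture path.

```python
class CaptureRing:
    """Single-writer ring buffer with independent per-consumer cursors,
    mimicking an intermediate capture buffer shared by several readers."""

    def __init__(self, capacity):
        self.buf = [None] * capacity
        self.head = 0       # total frames ever written
        self.cursors = {}   # consumer name -> frames already consumed

    def write(self, frame):
        # Overwrites the oldest slot when full: capture must never block.
        self.buf[self.head % len(self.buf)] = frame
        self.head += 1

    def register(self, name):
        self.cursors[name] = self.head

    def read(self, name):
        """Return frames this consumer has not seen yet, dropping any
        that were overwritten before it caught up."""
        start = max(self.cursors[name], self.head - len(self.buf))
        frames = [self.buf[i % len(self.buf)] for i in range(start, self.head)]
        self.cursors[name] = self.head
        return frames

ring = CaptureRing(capacity=4)
ring.register("store")
ring.register("monitor")
for frame in range(3):
    ring.write(frame)
first_batch = ring.read("store")    # both consumers see frames 0-2
also_batch = ring.read("monitor")
for frame in range(3, 8):           # five more frames overflow the ring
    ring.write(frame)
after_overflow = ring.read("store")
```

The real system does this with zero-copy DMA buffers and multicore synchronization; the sketch only shows the sharing semantics (independent cursors, overwrite-on-full).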
Jorge E. López de Vergara Méndez
added a research item
Analyzing multi-Gb/s links is a very tough task. Classical solutions to overcome this problem used different sampling methods. However, keeping traffic information at 100 Gb/s and higher data rates becomes very difficult. Current Big Data-based architectures are very inefficient in terms of resource management and rely on suboptimal programming languages. Our novel high performance lambda-based architecture can overcome these problems.
Jorge E. López de Vergara Méndez
added a research item
In many optical network scenarios, such as Storage Area Network (SAN) replication, keeping latency under control is a cornerstone of providing proper Quality of Service (QoS). Hence, measuring latencies in such optical networks becomes fundamental. However, for short distances, microsecond resolution is required, which in turn demands an ad-hoc hardware implementation of the measurement device. Alternatively, a more cost-effective solution is that of software-based methods, but to date they have not been precise enough at 10 Gbit/s or above. In this paper, we analyze current high-performance packet engines, such as DPDK, and pinpoint the issues involved in measuring latencies in high-speed optical networks. Based on these findings, we propose a software-based solution to measure latency. Furthermore, we also propose an extension that serves to measure bandwidth as well, with the novel concept of a convoy of packet trains.
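The measurement principle can be sketched as follows (illustrative code, not the paper's implementation): the dispersion of a back-to-back packet train yields a capacity estimate, and taking a median over several trains, in the spirit of the convoy concept, damps cross-traffic noise.

```python
import statistics

def train_bandwidth(arrival_ts, frame_bytes):
    """Capacity estimate from a single packet train: bits carried by the
    2nd..Nth packets divided by the first-to-last arrival dispersion."""
    dispersion = arrival_ts[-1] - arrival_ts[0]
    bits = 8 * frame_bytes * (len(arrival_ts) - 1)
    return bits / dispersion  # bit/s

def convoy_bandwidth(trains, frame_bytes):
    """Robust estimate over several trains (a 'convoy'): the median damps
    trains distorted by cross traffic."""
    return statistics.median(train_bandwidth(t, frame_bytes) for t in trains)

# 11 frames of 1500 bytes arriving back-to-back on a 10 Gbit/s path:
# one 1500-byte frame takes 1.2 microseconds on the wire.
ts = [i * 1.2e-6 for i in range(11)]
rate = train_bandwidth(ts, 1500)
```

The paper's contribution is making such timestamps precise enough in software at these rates; the arithmetic above is the standard train-dispersion estimate.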
Jorge E. López de Vergara Méndez
added a research item
The CASTOR architecture to enable cognitive networking is demonstrated. Extended nodes make local decisions, whilst a centralized system alongside the network controller makes network-wide decisions. Interaction with ONOS, Net2Plan, and passive monitoring devices is exhibited.
Ricard Vilalta
added a project reference
Daniel King
added 5 research items
Transport network domains, including Optical Transport Network (OTN) and Wavelength Division Multiplexing (WDM) networks, are typically deployed based on single-vendor or single-technology platforms. They are often managed using proprietary interfaces to dedicated Element Management Systems (EMS), Network Management Systems (NMS) and, increasingly, Software Defined Network (SDN) controllers. A well-defined open interface to each domain management system or controller is required for network operators to facilitate control automation and orchestrate end-to-end services across multi-domain networks. These functions may be enabled using standardized data models (e.g., YANG) and an appropriate protocol (e.g., RESTCONF). This document describes the key use cases and requirements for transport network control and management, reviews proposed and existing IETF transport network data models and their applicability, and highlights gaps and requirements.
The Software Defined Network (SDN) is an established network paradigm, with an architecture and principles that have attracted significant research effort in recent years. An SDN-enabled infrastructure decouples network control from forwarding and enables direct programming. Recently, there has been an increasing effort to introduce SDN support in the transport layers of network operators' WAN infrastructure, such as Layer 0 (WDM & DWDM) and Layer 1 (SONET/SDH & OTN) technologies. We refer to this infrastructure as the "Software Defined Transport Network"; its benefits include network management devolvement, timely connectivity provisioning, improved scalability, and open and flexible programmability using well-defined APIs. This paper outlines the main elements of Software Defined Transport Networks and highlights relevant Application-Based Network Operations (ABNO) enabling technologies. We demonstrate how this technology will benefit network operators and provide an overview of research results and deployment examples. Finally, we identify some of the technology gaps and future research opportunities.
Daniel King
added a project goal