Article

An experimental study of delayed internet routing convergence


Abstract

This paper examines the latency in Internet path failure, failover and repair due to the convergence properties of inter-domain routing. Unlike switches in the public telephony network which exhibit failover on the order of milliseconds, we show that inter-domain routers in the packet switched Internet may take several minutes to reach a consistent view of the network topology after a fault. These delays stem from temporary routing table oscillations formed during operation of the BGP path selection process on Internet backbone routers. During these periods of delayed convergence, end-to-end Internet paths will experience intermittent loss of connectivity, as well as increased packet loss and latency. We present a two-year study of Internet routing convergence through the experimental instrumentation of key portions of the Internet infrastructure, including both passive data collection and fault-injection machines at major Internet exchange points. Based on data from the injection and measurement of several hundred thousand inter-domain routing faults, we describe several unexpected properties of convergence and show that the measured upper bound on Internet inter-domain routing convergence delay is an order of magnitude slower than previously thought. Our analysis also shows that the upper computational bound on the number of router states and control messages exchanged during the process of BGP convergence is exponential with respect to the number of autonomous systems on the Internet. Finally, we demonstrate that much of the observed convergence delay stems from both specific router vendor implementation decisions, as well as ambiguity in the BGP specification.
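The exponential state bound described in the abstract can be illustrated with a toy calculation (this is an illustration of the combinatorics, not the paper's experimental apparatus): in a full mesh of n ASes, the worst-case number of distinct simple AS paths that BGP could explore between a source and a destination after a withdrawal grows factorially. The function name and the clique assumption below are ours, for illustration only.

```python
from math import factorial

def simple_paths_in_clique(n: int) -> int:
    """Count distinct simple paths between two fixed nodes in a full
    mesh of n ASes. Each such path is a candidate route that BGP may
    transiently explore in the worst case after a withdrawal: a path
    uses k of the n-2 remaining ASes as ordered intermediates."""
    intermediates = n - 2
    return sum(factorial(intermediates) // factorial(intermediates - k)
               for k in range(intermediates + 1))

# Growth is roughly e * (n-2)!, i.e. factorial in the number of ASes:
for n in [3, 5, 8, 10]:
    print(n, simple_paths_in_clique(n))
```

Even for ten ASes in a clique, over a hundred thousand distinct candidate paths exist, which is consistent with the abstract's point that worst-case convergence work grows super-polynomially with the number of autonomous systems.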


... Service providers would like to have their services always available for client sessions. However, studies have shown that inter-domain Internet path availability is very poor [1], and that Internet route recovery can take on the order of a few minutes [2]. This in turn reflects on the availability of the composed service. ...
... The motivation for this is that inter-domain network-path failures can happen quite often—studies show that inter-domain Internet paths can have availability as low as 95% [1]. Importantly, when such failures do happen, they can last for several tens of seconds to several minutes [2]. The choice of service instances for service-level path creation/recovery is somewhat like web-mirror selection, but is more complicated, since in general, we may need to select a set of instances for a client session. ...
... 3—the first leg of the service-level path. This is important since unlike the telephone network, Internet paths are known to have much lower availability [1,2]. While the notion of failure is very application-specific, for our purposes, we consider Internet path outages such as those that happen when there is a BGP-level failure [2]. ...
Article
Service composition provides a flexible way to quickly enable new application functionalities in next generation networks. We focus on the scenario where next generation portal providers ‘compose’ the component services of other providers. We have developed an architecture based on an overlay network of service clusters to provide failure-resilient composition of services across the wide-area Internet: our algorithms detect and recover quickly from failures in composed client sessions. In this paper, we present an evaluation of our architecture whose overarching goal is quick recovery of client sessions. The evaluation of an Internet-scale system like ours is challenging. Simulations do not capture true workload conditions and Internet-wide deployments are often infeasible. We have developed an emulation platform for our evaluation—one that allows a realistic and controlled design study. Our experiments show that the control overhead involved in implementing our recovery mechanism is minimal in terms of network as well as processor resources; minimal additional provisioning is required for this. Failure recovery can be effected using alternate service replicas within about 1 s after failure detection on Internet paths. We collect trace data to show that failure detection itself can be tight on wide-area Internet paths—within about 1.8 s. Failure detection and recovery within these time bounds represents a significant improvement over existing Internet path recovery mechanisms that take several tens of seconds to a few minutes.
... Note that a single root cause can lead to multiple BGP updates. For example, previous measurement results show that a link failure can trigger a large number of BGP updates due to path exploration and slow convergence [8], [10]. To quantify the existence of routing convergence, we need to group all updates from each monitor that are associated with the same root cause into a single routing event. ...
... i This process is called path exploration. As illustrated by [8], [10], during path exploration, the paths received by an AS have a general trend. During T down and T long , it goes from more preferred path to less preferred path; during T up and T short , it goes from less preferred path to more preferred path. ...
... 15% of the T down events take more than 2 minutes and 2.13% of them take more than 5 minutes, whereas for other Path Change events, only 6% of the events last more than 2 minutes and 0.66% more than 5 minutes. These results are in accordance with our previous formal analysis results [18], but different from previous measurement work [8], which concluded that the duration of T long events is expected to be similar to that of T down , and noticeably longer than that of T up and T short . In [18] we showed that the T long event duration for a peer of RouteViews should be upper bounded by M (P − J), where M is the MRAI timer value, P is the length of the peer's path to the destination after the event, and J is the distance from the failure location to the destination. ...
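The bound quoted in the snippet above, M(P − J), is straightforward to evaluate. The numbers in the example below (the common 30 s MRAI default, a 6-hop post-event path, a failure 2 hops from the destination) are hypothetical values chosen for illustration, not measurements from the cited work.

```python
def tlong_upper_bound(M: float, P: int, J: int) -> float:
    """Upper bound on T_long event duration for a RouteViews peer,
    per the cited analysis: M is the MRAI timer value in seconds,
    P the length of the peer's path to the destination after the
    event, and J the distance from the failure to the destination."""
    return M * (P - J)

# Hypothetical example: 30 s MRAI, 6-hop post-event path,
# failure 2 hops from the destination.
print(tlong_upper_bound(30, 6, 2))  # 120.0 seconds
```

The bound captures the intuition that each additional AS hop between the failure point and the observer can contribute at most one MRAI interval of additional path exploration.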
Conference Paper
Full-text available
A number of previous measurement studies [10, 12, 17] have shown the existence of path exploration and slow convergence in the global Internet routing system, and a number of protocol enhancements have been proposed to remedy the problem [21, 15, 4, 20, 5]. However, all the previous measurements were conducted over a small number of testing prefixes. There has been no systematic study to quantify the pervasiveness of BGP slow convergence in the operational Internet, nor is there any known effort to deploy any of the proposed solutions. In this paper we present our measurement results from identifying BGP slow convergence events across the entire global routing table. Our data shows that the severity of path exploration and slow convergence varies depending on where prefixes are originated and where the observations are made in the Internet routing hierarchy. In general, routers in tier-1 ISPs observe less path exploration, hence shorter convergence delays than routers in edge ASes, and prefixes originated from tier-1 ISPs also experience less path exploration than those originated from edge ASes. Our data also shows that the convergence time of route fail-over events is similar to that of new route announcements, and significantly shorter than that of route failures, which confirms our earlier analytical results [19]. In addition, we also developed a usage-time based path preference inference method which can be used by future studies of BGP dynamics.
... Convergence problems have been identified in BGP for almost a decade [18, 21]. Two delay mechanisms have been introduced in BGP to reduce the number of routing messages exchanged between routers: MRAI and route flap damping. ...
... Once a routing protocol is designed in such a way that it is guaranteed to converge, even if only under certain conditions [16], the next step in the design should tackle the dynamical properties of the protocol: how many messages need to be exchanged under topological changes or how much time it takes to converge. The latter aspects have been mostly untouched in the Internet, even though we have reasons to worry about the convergence time of BGP [21] and the load in terms of the number of routing messages exchanged. For the first time, we tackle in this paper the problem of defining a timer mechanism that will enforce an ordering of routing messages such that the number of exchanged messages is minimal during the convergence of the protocol. ...
... The MRAI-5 setting leads to far smaller convergence times (average and maximum) than MRAI-30. We can appreciate the particularly bad convergence times of the MRAI-30 timers, consistent with observations in the Internet [21]. MRAI-5 leads to pretty fast network convergence, even smaller than MRPC timers for our settings on average (but not in the worst case). ...
Conference Paper
The behavior of routing protocols during convergence is critical as it impacts end-to-end performance. Network convergence is particularly important in BGP, the current interdomain routing protocol. In order to decrease the amount of exchanged routing messages and transient routes, BGP routers rely on MRAI timers and route flap damping. These timers are intended to limit the exchange of transient routing messages. In practice, these timers have been shown to be partly ineffective at improving convergence, making it even slower in some situations. In this paper, we propose to add a timer mechanism to routing protocols, that enforces an ordering of the routing messages such that path exploration is drastically reduced while controlling convergence time. Our approach is based on known results in generalized path algorithms and endomorphism semi-rings. Our timers, called MRPC (metrics and routing policies compliant), are set independently by each router and depend only on the metrics of the routes received by the router as well as the routing policies of the router. No sharing of information about routing policies between neighboring ASs is required by our solution. Similarly to the case of routing policies that may lead to BGP convergence problems, arbitrary routing policies can also make it impossible to enforce an ordering of the messages that will prevent path exploration to occur. We explain under which conditions path exploration can be avoided with our timers, and provide simulations to understand how they compare to MRAI.
... This paper is not (principally) concerned with modeling Internet routing dynamics. The dynamics are clearly important, but considerable effort has already gone into such modeling, e.g., [14,23,24,25,26,27]. In our first prototype for predicting Internet behavior we model the equilibrium behavior of this system, for the (vastly) predominant case that a stable routing solution exists. It is these equilibrium behaviors that are of most interest for the questions posed earlier. ...
... Improving our understanding of routing dynamics has been a topic of huge interest over the last few years, e.g., [23,24,27,40,41,42]. Most of the attention has been given to the dynamics of the BGP protocol, e.g., to understand why convergence time of BGP can be rather long [23,24,40]. ...
... [23,24,27,40,41,42]. Most of the attention has been given to the dynamics of the BGP protocol, e.g., to understand why convergence time of BGP can be rather long [23,24,40]. Oscillations in BGP [43] can occur; see [44] for a review of their possible causes. ...
Conference Paper
Full-text available
An understanding of the topological structure of the Internet is needed for quite a number of networking tasks, e.g., making decisions about peering relationships, choice of upstream providers, inter-domain traffic engineering. One essential component of these tasks is the ability to predict routes in the Internet. However, the Internet is composed of a large number of independent autonomous systems (ASes) resulting in complex interactions, and until now no model of the Internet has succeeded in producing predictions of acceptable accuracy. We demonstrate that there are two limitations of prior models: (i) they have all assumed that an Autonomous System (AS) is an atomic structure - it is not, and (ii) models have tended to oversimplify the relationships between ASes. Our approach uses multiple quasi-routers to capture route diversity within the ASes, and is deliberately agnostic regarding the types of relationships between ASes. The resulting model ensures that its routing is consistent with the observed routes. Exploiting a large number of observation points, we show that our model provides accurate predictions for unobserved routes, a first step towards developing structural models of the Internet that enable real applications.
... However, it has been observed that in many cases, BGP routers explore a large number of pos-sible routes before converging on a new stable route. Labovitz et al. (2001) found that the delay in Internet inter-domain path fail-over now averages 3 min and some non-trivial percentage of fail-overs trigger routing table oscillations lasting up to 15 min. ...
... Researchers have been looking into it from different angles, all trying to improve, understand, or resolve the BGP convergence problem. Labovitz et al. (2001) examine the latency in Internet path failure/failover and repair due to the convergence properties of inter-domain routing. Pei et al. (2002) show that BGP can take hundreds of seconds to converge after failure, while the delay can be increased for large-scale failures. ...
Article
BGP is a distance-vector inter-Autonomous System (AS) routing protocol that succeeded EGP to eliminate EGP's inefficiencies with respect to flexibility and scalability and to provide a fully featured routing protocol. BGP handles the scalability problem using Classless Inter-Domain Routing (CIDR) and addresses the inefficiency of EGP by accumulating all the possible route information to a destination and running a decision process to select a route to be used and advertised to peers. Recently, BGP has begun to encounter several problems, such as routing table growth, load-balancing problems, BGP hijacking, transit-AS problems, and increasing convergence delay. Convergence delay is the time between the selection of the best path and when the routers settle. Convergence delay has recently become an issue for the Internet and larger networks as it has increased, causing instability in the network. Instability leads to lost packets, delayed delivery, loss of connectivity and long end-to-end delay in the Internet, as well as added overhead on BGP routers. The goal of this research is to study the behavior of different network topologies in terms of BGP convergence delay, and to define a mathematical model representing the relationship between convergence delay and the number of nodes. Simulation results show that the mesh topology has the highest convergence delay. The study of the relation between convergence delay and the number of nodes leads to mathematical equations, some of which represent linear relationships while others represent compound relationships.
... Service-level paths could stretch across the wide-area, and an end-to-end composed session could be long-lived, lasting from several minutes to a few hours. Now, network partitions and routing flaps on the Internet occur often, and could persist for a long time [23]. Hence we are faced with the challenge of providing continued service to the end-user in the presence of such failures. ...
... While this value of two seconds may not be good enough for interactive applications such as two-way telephony, it is at least an order of magnitude better than what is possible today with Internet route recovery -which could take anywhere from 30 seconds to more than ten minutes [23]. And this value of 2 seconds is definitely tolerable for buffered on-demand streaming applications that have buffer-data -typically these have 5-10 seconds worth of buffered data. ...
Article
Application services for the end-user are all important in today's communication networks, and could dictate the success or failure of technology or service providers (39). It is important to develop and deploy application functionality quickly (18). The ability to compose services from independent providers provides a flexible way to quickly build new end-to-end functionality. Such composition of services across the network and across different service providers becomes especially important in the context of growing popularity and heterogeneity in 3G+ access networks and devices (25). In this work, we provide a framework for constructing such composed services on the Internet. Robustness and high-availability are crucial for Internet services. While cluster-based models for resilience to failures have been built for web-servers (27) as well as proxy services (11, 4), these are inadequate in the context of composed services. This is especially so when the application session is long-lived, and failures have to be handled during a session. In the context of composed services, we address the important and challenging issues of resilience to failures, and adapting to changes in overall performance during long-lived sessions. Our framework is based on a connection-oriented overlay network of compute-clusters on the Internet. The overlay network provides the context for composing services over the wide-area, and monitoring for liveness and performance of a session. We have performed initial analyses of the feasibility of network failure detection over the wide-area Internet. And we have a preliminary evaluation of the overhead associated with such an overlay network. We present our plans for further evaluation and refinement of the architecture; and for examining issues related to the creation of the overlay topology.
... This is especially true in the inter-domain case due to the excessively long convergence properties of the Border Gateway Protocol (BGP) [6]. Research by C. Labovitz et al. [7] presents results, supported by a two-year study, demonstrating that the delay in Internet inter-domain path failovers averages three minutes and that some percentage of failover recoveries triggered routing table fluctuations lasting up to fifteen minutes. Furthermore the report states that "Internet path failover has significant deleterious impact on end-to-end performance-measured packet loss growth by a factor of 30 and latency by a factor of four during path restoration". ...
... Recovery can therefore take place within a few milliseconds while the complete domain convergence to the new route may take tens of seconds. For the inter-domain situation studied in the paper, the use of inter-domain routing protocols such as BGP4 makes the recovery latencies much worse [7]. ...
Article
Full-text available
With the fast growth of the Internet and a new widespread interest in optical networks, the unparalleled potential of Multi-Protocol Label Switching (MPLS) is leading to further research and development efforts. One of those areas of research is the Path Protection Mechanism. It is widely accepted that layer three protection and recovery mechanisms are too slow for today's reliability requirements. Failure recovery latencies ranging from several seconds to minutes, for layer three routing protocols, have been widely reported. For this reason, a recovery mechanism at the MPLS layer capable of recovering from failed paths in tens of milliseconds has been sought. In light of this, several MPLS based protection mechanisms have been proposed, such as end-to-end path protection and local repair mechanisms. Those mechanisms are designed for intra-domain recoveries and little or no attention has been given to the case of non-homogenous independent inter-domains. This paper presents a novel solution for the setup and maintenance of independent protection mechanisms within individual domains, merged at the domain boundaries. This innovative solution offers significant advantages including fast recovery across multiple non-homogeneous domains and high scalability. Detailed setup and operation procedures are described. Finally, simulation results using OPNET are presented showing recovery times of a few milliseconds.
... Although routing dynamics and especially BGP [1] dynamics have been extensively studied within the last few years, e.g., [2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17], they are still poorly understood. In his work on developing a signal propagation model for BGP updates, T. Griffin [2] observed that "In practice, BGP updates are perplexing and interpretation is very difficult". ...
... The assumption that BGP converges within some limited time and that it is possible to identify the new best route is addressed next. While, e.g., Griffin [2], Labovitz et al. [15], and Maennel and Feldmann [10], have shown that changes to the AS path can lead to BGP path exploration involving many BGP updates spread across a significant time period, it is possible to identify certain "stable" routes [10,9]. Note that not all possible updates within the BGP path exploration process are indeed observable at all points [2]. ...
Conference Paper
Full-text available
This paper presents a methodology for identifying the autonomous system (or systems) responsible when a routing change is observed and propagated by BGP. The origin of such a routing instability is deduced by examining and correlating BGP updates for many prefixes gathered at many observation points. Although interpreting BGP updates can be perplexing, we find that we can pinpoint the origin to either a single AS or a session between two ASes in most cases. We verify our methodology in two phases. First, we perform simulations on an AS topology derived from actual BGP updates using routing policies that are compatible with inferred peering/customer/provider relationships. In these simulations, in which network and router behavior are "ideal", we inject inter-AS link failures and demonstrate that our methodology can effectively identify most origins of instability. We then develop several heuristics to cope with the limitations of the actual BGP update propagation process and monitoring infrastructure, and apply our methodology and evaluation techniques to actual BGP updates gathered at hundreds of observation points. This approach of relying on data from BGP simulations as well as from measurements enables us to evaluate the inference quality achieved by our approach under ideal situations and how it is correlated with the actual quality and the number of observation points.
... For example, replication through clustering or content-delivery networks is expensive and commonly limited to high-end Web sites. Multi-homing (provisioning a site with multiple ISP links) protects against single-link failure, but it cannot avoid the long BGP fail-over times required to switch away from a bad path [12]. Overlay routing networks, such as RON, have been proposed to monitor path quality and select the best available path via the Internet or a series of RON nodes [2]. ...
... The advantage of operating at the packet level, as we do, is more rapid response to failures. Conversely, it is known that BGP dynamics can result in a relatively long fail-over period [12] and that BGP misconfigurations are common [14]. ...
Conference Paper
Full-text available
Recent work has focused on increasing availability in the face of Internet path failures. To date, proposed solutions have relied on complex routing and path-monitoring schemes, trading scalability for availability among a relatively small set of hosts. This paper proposes a simple, scalable approach to recover from Internet path failures. Our contributions are threefold. First, we conduct a broad measurement study of Internet path failures on a collection of 3,153 Internet destinations consisting of popular Web servers, broadband hosts, and randomly selected nodes. We monitored these destinations from 67 PlanetLab vantage points over a period of seven days, and found availabilities ranging from 99.6% for servers to 94.4% for broadband hosts. When failures do occur, many appear too close to the destination (e.g., last-hop and end-host failures) to be mitigated through alternative routing techniques of any kind. Second, we show that for the failures that can be addressed through routing, a simple, scalable technique, called one-hop source routing, can achieve close to the maximum benefit available with very low overhead. When a path failure occurs, our scheme attempts to recover from it by routing indirectly through a small set of randomly chosen intermediaries. Third, we implemented and deployed a prototype one-hop source routing infrastructure on PlanetLab. Over a three day period, we repeatedly fetched documents from 982 popular Internet Web servers and used one-hop source routing to attempt to route around the failures we observed. Our results show that our prototype successfully recovered from 56% of network failures. However, we also found a large number of server failures that cannot be addressed through alternative routing.
... Further, subsequent recovery by using alternate service instances can be completed in a few hundred milliseconds. Thus, network path failures lasting several tens of seconds to minutes [5] can be completely masked from the end client. ...
Conference Paper
Full-text available
Services are capabilities that enable applications and are of crucial importance to pervasive computing in next-generation networks. Service Composition is the construction of complex services from primitive ones; thus enabling rapid and flexible creation of new services. The presence of multiple independent service providers poses new and significant challenges. Managing trust across providers and verifying the performance of the components in composition become essential issues. Adapting the composed service to network and user dynamics by choosing service providers and instances is yet another challenge. In SAHARA, we are developing a comprehensive architecture for the creation, placement, and management of services for composition across independent providers. In this paper, we present a layered reference model for composition based on a classification of different kinds of composition. We then discuss the different overarching mechanisms necessary for the successful deployment of such an architecture through a variety of case-studies involving composition.
... Currently, the de facto EGP is BGP4 [14] and several of its important extensions [15,16,17,18,19]. After years of experimenting and engineering, BGP is reasonably stable and scalable, although route instability [20] and slow route convergence [21] remain significant problems in the backbone networks today. ...
Article
Resource reservation protocols were originally designed to signal end hosts and network routers to provide quality of service to individual real-time flows. More recently, Internet Service Providers (ISPs) have been using the same signaling mechanisms to set up provider-level Virtual Private Networks (VPNs) in the form of MPLS Label Switched Paths (LSPs). It is likely that the need for reservation signaling protocols will increase, and these protocols will eventually become an indispensable part of Internet service. Therefore, reservation signaling must scale well with the rapidly growing size of the Internet. Over the years, there have been debates over whether or not there is a need for resource reservation. Some people have been advocating over-provisioning as the means to solve link congestion and end-to-end delay problems. The over-provisioning argument is largely driven by the expectation that the bandwidth price will drop drastically. From our investigation, however, we found that many end users have not been benefiting from over-provisioning: the current Internet has bandwidth bottleneck links that can cause long-lasting congestion and delay. At the same time, leased line cost has not been reduced sufficiently in a timely manner for many network providers to deploy high-speed links everywhere
... While telephone networks recover from failures on the order of milliseconds [2], the Internet routing table convergence has been observed to take a much longer time [3]; it is not uncommon to have tens of minutes of downtime before recovery has taken place. Although empirical observations of the dynamics have been made in [4,5], formal models of the process are not very common (there has been recent work on an analytical study of Route Flap Damping [6], and another study of BGP in congested networks [7]). There are other models of BGP that verify correctness, satisfiability etc. [8,9,10], but in this paper we explore the dynamical behavior of BGP. ...
Article
Full-text available
Recent studies [1] have revealed vulnerabilities in the routing infrastructure of the Internet. It has been conjectured that these vulnerabilities could lead to cascading failures. In this paper we develop simple models for the interaction of routers, looking specifically at the clique topology. We construct two related models, and our analysis indicates that it is indeed possible to have cascading failures in systems such as a BGP clique. We encounter phase transitions in both our models, and we explore the dependence of system parameters on the nature and intensity of the phase transitions. Finally, we comment on the insights that we gain from our analysis.
... Routing overlays [2, 12, 22] allow end-hosts to control the path taken by a packet in the Internet. They have been proposed as a means to address the shortcomings of BGP [16], which does not explicitly select low-latency routes [17] or adapt quickly to failures [7]. Routing overlays make a diversity of paths available to applications and end users, paths that may be more reliable, less loaded, shorter, or have higher bandwidth, than those chosen by ISPs; that is, the potential benefit is high. ...
Article
Routing overlays have the potential to circumvent Internet pathologies to construct faster or more reliable paths. We suggest that overlay routing protocols have yet to become ubiquitous because they do not incorporate mechanisms for finding and negotiating with mutually advantageous peers: nodes in the overlay that can benefit equally from each other. We show that mutually advantageous peers exist in the Internet and that one-hop detour routing is sufficient for a latency-reducing overlay. We then simulate such an overlay construction process to show that it is efficient and scalable.
... Although congestion outbursts within seconds are hard to detect and bypass, the delay in Internet inter-domain path failovers averages over three minutes [16]. Our loss rate estimation will filter out measurement noise with smoothing techniques, such as exponentially-weighted moving average (EWMA), and detect these path failovers quickly to have applications circumvent them. ...
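The smoothing step mentioned in the snippet above can be sketched as a plain exponentially-weighted moving average over per-interval loss-rate samples. The smoothing factor alpha and the sample sequences below are illustrative choices, not parameters from the cited work.

```python
def ewma_stream(samples, alpha=0.2):
    """Exponentially-weighted moving average of per-interval loss
    rates. A small alpha damps one-off measurement noise while a
    sustained change (e.g., a path failover) still pulls the
    estimate toward the new level within a few intervals."""
    est = None
    out = []
    for s in samples:
        est = s if est is None else alpha * s + (1 - alpha) * est
        out.append(est)
    return out

# A single lossy interval barely moves the estimate ...
noise = ewma_stream([0.0, 0.0, 1.0, 0.0, 0.0])
# ... while a sustained outage drives it steadily toward 1.0,
# letting the application flag the failover and route around it.
outage = ewma_stream([0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0])
print(noise[-1], outage[-1])
```

This mirrors the trade-off discussed in the snippet: sub-second congestion bursts are filtered out as noise, while multi-minute BGP failovers show up as a persistent shift in the smoothed estimate.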
Conference Paper
Full-text available
Overlay network monitoring enables distributed Internet applications to detect and recover from path outages and periods of degraded performance within seconds. For an overlay network with n end hosts, existing systems either require O(n^2) measurements, and thus lack scalability, or can only estimate the latency but not congestion or failures. Unlike other network tomography systems, we characterize end-to-end losses (this extends to any additive metrics, including latency) rather than individual link losses. We find a minimal basis set of k linearly independent paths that can fully describe all the O(n^2) paths. We selectively monitor and measure the loss rates of these paths, then apply them to estimate the loss rates of all other paths. By extensively studying synthetic and real topologies, we find that for reasonably large n (e.g., 100), k is only in the range of O(n log n). This is explained by the moderately hierarchical nature of Internet routing. Our scheme only assumes the knowledge of underlying IP topology, and any link can become lossy or return to normal. In addition, our technique is tolerant to topology measurement inaccuracies, and is adaptive to topology changes.
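The basis idea in the abstract above can be sketched in miniature: write each end-to-end path as a 0/1 row over the underlying links. Since log-transformed loss rates add along a path, the rank of this routing matrix is the number k of paths that must actually be monitored; the rest follow by linearity. The three-path topology below is a made-up example, not one from the paper.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a 0/1 routing matrix (paths x links), computed by
    Gaussian elimination over exact rationals. The rank k is the
    number of linearly independent paths that must be measured."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                f = m[r][col] / m[rank][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

# Hypothetical 3-path overlay over 3 links: path 3 traverses exactly
# the links of paths 1 and 2 combined, so its (log) loss rate is the
# sum of theirs and measuring two paths suffices.
G = [[1, 1, 0],
     [0, 0, 1],
     [1, 1, 1]]
print(matrix_rank(G))  # 2
```

In the paper's setting the same computation, done at scale on the measured IP topology, is what yields the observation that k grows only as roughly O(n log n) rather than O(n^2).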
... Measurement-driven analysis of live networks is critical to understanding and improving their operation (e.g., [3, 28, 13, 18, 22, 24] ). In the case of wireless, however, very little detailed information is currently available on the performance of real deployments, even though wireless protocols are a subject of intense research and standardization [1, 15, 20, 16, 25]. ...
Conference Paper
Full-text available
We present Wit, a non-intrusive tool that builds on passive monitoring to analyze the detailed MAC-level behavior of operational wireless networks. Wit uses three processing steps to construct an enhanced trace of system activity. First, a robust merging procedure combines the necessarily incomplete views from multiple, independent monitors into a single, more complete trace of wireless activity. Next, a novel inference engine based on formal language methods reconstructs packets that were not captured by any monitor and determines whether each packet was received by its destination. Finally, Wit derives network performance measures from this enhanced trace; we show how to estimate the number of stations competing for the medium. We assess Wit with a mix of real traces and simulation tests. We find that merging and inference both significantly enhance the originally captured trace. We apply Wit to multi-monitor traces from a live network to show how it facilitates 802.11 MAC analyses that would otherwise be difficult or rely on less accurate heuristics.
... Consequently, FRM requires more storage and algorithmic sophistication at routers and can be less efficient in bandwidth consumption than traditional multicast solutions. However this tradeoff – the avoidance of distributed computation and configuration at the cost of optimal efficiency – is one we believe is worth exploring given technology trends [5] that can endow routers with significant memory and processing on the one hand and our continued difficulties taming wide-area routing algorithms on the other [6, 7]. The primary focus of this paper is the design and evaluation of FRM. ...
Conference Paper
This paper revisits a much explored topic in networking - the search for a simple yet fully-general multicast design. The many years of research into multicast routing have led to a generally pessimistic view that the complexity of multicast routing-and inter-domain multicast routing in particular - can only be overcome by restricting the service model (as in single-source) multicast. This paper proposes a new approach to implementing IP multicast that we hope leads to a reevaluation of this commonly held view.
... Further, subsequent recovery by using alternate service instances can be completed in a few hundred milliseconds. Thus, network path failures lasting several tens of seconds to minutes [15] can be completely masked from the end client (Table 1, row 1). ...
Conference Paper
Full-text available
Services are capabilities that enable applications and are of crucial importance to pervasive computing in next-generation networks. Service Composition is the construction of complex services from primitive ones; thus enabling rapid and flexible creation of new services. The presence of multiple independent service providers poses new and significant challenges. Managing trust across providers and verifying the performance of the components in composition become essential issues. Adapting the composed service to network and user dynamics by choosing service providers and instances is yet another challenge. In SAHARA, we are developing a comprehensive architecture for the creation, placement, and management of services for composition across independent providers. In this paper, we present a layered reference model for composition based on a classification of different kinds of composition. We then discuss the different overarching mechanisms necessary for the successful deployment of such an architecture through a variety of case-studies involving composition.
... The goal of the traffic engineering technique presented in this paper is to track an optimal distribution of the outbound interdomain traffic on timescales of several minutes. Even though engineering the interdomain traffic on timescales as short as minutes might not make sense from an interdomain traffic engineering viewpoint, our purpose in relying on such a fine time-granularity is to demonstrate the feasibility of working on the timescales at which BGP convergence takes place [29, 35] to tweak the BGP routes while keeping the burden on BGP very small. The problem we address would thus allow stub ASes to have more reactive BGP routing under changes in the traffic pattern or routing failures, only by tweaking the BGP routes. ...
Article
Full-text available
Today, most multi-connected autonomous systems (AS) need to control the flow of their interdomain traffic for both performance and economical reasons. This is usually done by manually tweaking the BGP configurations of the routers on an error-prone trial-and-error basis. In this paper, we demonstrate that designing systematic BGP-based traffic engineering techniques for stub ASes is possible. Our approach to solve this traffic engineering problem is to allow the network operator to define objective functions on the interdomain traffic. Those objective functions are used by an optimization box placed inside the AS that controls the interdomain traffic by tuning the iBGP messages distributed inside the AS. We show that the utilization of an efficient evolutionary algorithm allows us to both optimize the objective function and limit the number of iBGP messages. By keeping a lifetime on the tweaked routes, we also show that providing stability to the interdomain path followed by the traffic is possible. We evaluate the performance of our solution based on traffic traces from two stub ASes of different sizes. Our simulations show that the interdomain traffic can be efficiently engineered by using no more than a few iBGP advertisements per minute. Our contribution in this paper is to demonstrate that by carefully designing the interdomain traffic engineering technique, stub ASes can engineer their outbound traffic over relatively short timescales, by exclusively tweaking their BGP routes, and with a minimal burden on BGP. Systematic BGP-based traffic engineering for stub ASes is thus possible at a very limited cost in terms of iBGP messages.
... In [2], it has been proved that the standard inter-domain routing protocol (BGP) generally exhibits slow convergence, leading to potential latency in Internet path failure, failover and repair. Moreover, a two-year study in [3] demonstrated that inter-domain failover may take over 3 minutes and can therefore cause routing fluctuations lasting up to 15 minutes. It was also shown that such fluctuations cause critical end-to-end packet loss rates and delays, which may increase by factors of 30 and 4, respectively, during path restoration. ...
Article
Full-text available
This paper suggests a new mechanism for inter-domain recovery within MPLS-based networks. The mechanism is based on efficient collaboration between several entities called PCEs (Path Computation Elements). We denote one PCE per domain. All PCEs communicate in order to ensure propitious end-to-end failure handling, recovery and restoration. Based on the normative instructions described in RFC 5298 and a novel approach presented in RFC 5441, known as the BRPC approach (Backward Recursive PCE-based Computation), the new mechanism offers an opportunity to achieve E2E recovery using up-to-date information, and gives a way to maintain optimal network state in both intra-domain and inter-domain scope, in order to generate a global view of the entire network regardless of the heterogeneity or autonomy of the crossed domains. Simulation results show that the proposed solution is able to resolve the inter-domain recovery issue more efficiently, regardless of AS policies and rules, and is able to overcome the shortcomings and divergences of the inter-domain routing protocol (BGP, Border Gateway Protocol).
Conference Paper
Weather is a leading threat to the stability of our vital infrastructure. Last-mile Internet is no exception. Yet, unlike other vital infrastructure, weather's effect on last-mile Internet outages is not well understood. This work is the first attempt to quantify the effect of weather on residential outages. Investigating outages in residential networks due to weather is challenging because residential Internet is heterogeneous: there are different media types, different protocols, and different providers, in varying contexts of different local climate and geography. Sensitivity to these different factors leads to narrow categories when estimating how weather affects these different links. To address these issues we perform a large-scale study looking at eight years of active outage measurements that were collected across the bulk of the last mile Internet infrastructure in the United States.
Chapter
Full-text available
In order to understand the Internet topology, it is important to analyze the underlying networks’ characteristics. The Internet is enabled by independently operating Autonomous Systems (ASes) that collaborate to provide end-to-end communication. In this paper, we investigate the network characteristics of backbone ASes that provide transit connectivity. We collect router-level probe data sets from all of the public Internet topology measurement platforms and obtain network topologies of the backbone ASes. We then analyze the network characteristics of each AS and perform an in-depth analysis of the high-ranked ASes. Analyzing two snapshots, we observe disassortative network topologies in the majority of AS topologies independent of their network size. Also, most of the top-ranked ASes have a densely connected core and exhibit power-law degree distributions.
Conference Paper
Full-text available
The Border Gateway Protocol (BGP) has been used for decades as the de facto protocol to exchange reachability information among networks in the Internet. However, little is known about how this protocol is used to restrict reachability to selected destinations, e.g., that are under attack. While such a feature, BGP blackholing, has been available for some time, we lack a systematic study of its Internet-wide adoption, practices, and network efficacy, as well as the profile of blackholed destinations. In this paper, we develop and evaluate a methodology to automatically detect BGP blackholing activity in the wild. We apply our method to both public and private BGP datasets. We find that hundreds of networks, including large transit providers, as well as about 50 Internet exchange points (IXPs) offer blackholing service to their customers, peers, and members. Between 2014 and 2017, the number of blackholed prefixes increased by a factor of 6, peaking at 5K concurrently blackholed prefixes by up to 400 Autonomous Systems. We assess the effect of blackholing on the data plane using both targeted active measurements as well as passive datasets, finding that blackholing is indeed highly effective in dropping traffic before it reaches its destination, though it also discards legitimate traffic. We augment our findings with an analysis of the target IP addresses of blackholing. Our tools and insights are relevant for operators considering offering or using BGP blackholing services as well as for researchers studying DDoS mitigation in the Internet.
Conference Paper
BGP, the only inter-domain routing protocol used today, often converges slowly upon outages. While fast-reroute solutions exist, they can only protect from local outages, not remote ones (e.g., a failure in a transit network). To address this problem, we proposed SWIFT, a fast-reroute framework enabling BGP routers to locally restore connectivity upon remote outages by combining fast inference mechanisms in the control-plane with fast data-plane updates. While SWIFT is deployable on a per-router basis, we show in this demonstration that we can deploy SWIFT in Software-Defined Internet Exchange Points (SDXes) with a simple software update. We show that "SWIFTing" an SDX is highly beneficial as it enables the entire fabric to converge within a few seconds instead of the tens of seconds required by the original software.
Article
The Internet was designed to always find a route if there is a policy-compliant path. However, in many cases, connectivity is disrupted despite the existence of an underlying valid path. The research community has focused on short-term outages that occur during route convergence. There has been less progress on addressing avoidable long-lasting outages. Our measurements show that long-lasting events contribute significantly to overall unavailability. To address these problems, we develop LIFEGUARD, a system for automatic failure localization and remediation. LIFEGUARD uses active measurements and a historical path atlas to locate faults, even in the presence of asymmetric paths and failures. Given the ability to locate faults, we argue that the Internet protocols should allow edge ISPs to steer traffic to them around failures, without requiring the involvement of the network causing the failure. Although the Internet does not explicitly support this functionality today, we show how to approximate it using carefully crafted BGP messages. LIFEGUARD employs a set of techniques to reroute around failures with low impact on working routes. Deploying LIFEGUARD on the Internet, we find that it can effectively route traffic around an AS without causing widespread disruption.
Conference Paper
To study the Internet's routing behavior at the granularity of Autonomous Systems (ASes), one needs to understand inter-domain routing policy. Routing policy changes over time, and may cause route oscillation, network congestion, and other problems. However, there are few works on routing policy changes and their impact on BGP's routing behaviors. In this paper, we model inter-domain routing policy as the preference given to neighboring ASes, noted as neighbor preference, and propose an algorithm for quantifying routing policy changes based on neighbor preference. As a further analysis, we study the routing policy changes for the year 2012, and find that generally an AS may experience a routing policy change for at least 20% of prefixes within 6 months. An AS changes its routing policy mainly by exchanging the preference of two neighboring ASes. In most cases, an AS changes the routing policy of a stable fraction of its prefixes, but non-tier-1 ASes may endure a large-scale routing-policy-changing event. We also analyse the main reasons for routing policy changes, and exclude the possibilities of AS business relationship changes and topology changes.
Conference Paper
This paper describes the design and implementation of SecondSite, a cloud-based service for disaster tolerance. SecondSite extends the Remus virtualization-based high availability system by allowing groups of virtual machines to be replicated across data centers over wide-area Internet links. The goal of the system is to commodify the property of availability, exposing it as a simple tick box when configuring a new virtual machine. To achieve this in the wide area, we have had to tackle the related issues of replication traffic bandwidth, reliable failure detection across geographic regions and traffic redirection over a wide-area network without compromising on transparency and consistency.
Conference Paper
Full-text available
The Minimal Route Advertisement Interval (MRAI) plays a prominent role in convergence of the Border Gateway Protocol (BGP). Previous studies have suggested using adaptive MRAI and reusable timers to reduce the BGP convergence time. The adaptive MRAI timers perform well under the normal load of BGP updates. However, a large number of BGP updates may flood Internet routers. We propose a new algorithm, MRAI with Flexible Load Dispersing (FLD-MRAI), which reduces the router's overhead by dispersing the load in case of a large number of BGP updates. We also examine the MRAI timers under the normal load of BGP updates. Since BGP routing policies play a significant role in preserving the Internet routing stability, we evaluate their impact on BGP convergence time and Route Flap Damping (RFD) algorithms. The proposed algorithms are evaluated using the ns-BGP network simulator.
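A minimal sketch of how a plain MRAI timer rate-limits advertisements (not the FLD-MRAI algorithm itself): the router may not send a new advertisement for a prefix to a peer until the timer expires, so updates arriving early are suppressed and flushed at expiry. The 30-second value mirrors the common eBGP default; the event traces are hypothetical.

```python
# Illustrative sketch of MRAI rate limiting for a single prefix/peer.
# Suppressed updates are batched and sent when the timer fires, which is
# exactly the trade-off between update volume and convergence delay.

def advertise_times(update_events, mrai=30.0):
    """Map best-route-change times to the times updates are actually sent."""
    sent = []
    next_allowed = 0.0
    pending = False
    for t in sorted(update_events):
        if pending and next_allowed <= t:
            sent.append(next_allowed)      # flush suppressed update at expiry
            next_allowed += mrai
            pending = False
        if t >= next_allowed:
            sent.append(t)                 # timer idle: send immediately
            next_allowed = t + mrai
        else:
            pending = True                 # suppressed until timer expiry
    if pending:
        sent.append(next_allowed)
    return sent
```

For example, three rapid best-route changes at t = 0, 5 and 10 s collapse into just two transmissions, at 0 and 30 s: fewer updates on the wire, but the final state is delayed until the timer fires.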
Conference Paper
The Internet is composed of a number of independent autonomous systems. BGP is used to disseminate reachability information and establish paths between autonomous systems. Each autonomous system is allowed to select a single route to a destination and then export the selected route to its neighbors. The selection of a single best route imposes restrictions on the use of alternative paths during interdomain link failure and thus incurs packet loss. Packet loss occurs even when multiple paths exist between source and destination, because these paths have not been utilized. To minimize packet loss when multiple paths exist, multipath routing techniques are introduced. Multipath routing techniques ensure the use of alternative paths on link failure. They compute a set of paths which can be used when the primary path is not available, and also provide a way for transit domains to have control over the traffic flow. This can be achieved with little modification to current BGP. This paper highlights different multipath routing techniques and also discusses the overhead incurred by each of them.
Conference Paper
In this paper, we propose a new session maintenance protocol called the managed session protocol (MSP). It provides services such as automatic restart of a transport session when it goes down, multi-session, and multi-homing. As a result, MSP offers many advantages for upper-layer applications (ULP). For instance, the border gateway protocol (BGP) can use MSP to automatically set up multiple sessions between a pair of peers. In addition, MSP can be used without re-compiling or re-linking existing binaries and without making any kernel modifications. We have verified relevant properties focused on the maintenance features of MSP by using model-checking techniques. In addition, we have carried out an implementation of MSP in the framework of an open-source router platform, the extensible open router platform (XORP). We also present a BGP re-implementation using MSP and some performance analysis.
Article
Full-text available
Researchers from the University of Massachusetts developed fluid-based methodologies for characterizing the behavior of large IP networks handling large numbers of TCP and UDP flows. These methodologies provide rapid and efficient estimation of the rates of individual and aggregate flows. These fluid models were also applied to the problems of characterizing the spread of worms and viruses and to the cascade of failures within the BGP routing infrastructure. The resulting fluid models were used to develop novel active queue management mechanisms resulting in more stable TCP performance, and novel rate controllers for the purpose of providing minimum rate guarantees to TCP flow aggregates. Lastly, methodologies and tools were developed to integrate fluid network simulation with packet-level simulation.
Article
Network datasets are necessary for many types of network research. While there has been significant discussion about specific datasets, there has been less about the overall state of network data collection. The goal of this paper is to explore the research questions facing the Internet today, the datasets needed to answer those questions, and the challenges to using those datasets. We suggest several practices that have proven important in use of current data sets, and open challenges to improve use of network data.
Conference Paper
In the hope of stimulating discussion, we present a heuristic decision tree that designers can use to judge how suitable a P2P solution might be for a particular problem. It is based on characteristics of a wide range of P2P systems from the literature, both proposed and deployed. These include budget, resource relevance, trust, rate of system change, and criticality.
Conference Paper
Future military networks will use satellite nodes in conjunction with terrestrial ad-hoc networks in a packet and IP based framework. These long-range networks have a unique combination of characteristics and requirements that have caused network architects to consider the border gateway protocol (BGP) as the routing protocol. In this paper we analyze the performance of BGP in various scenarios relevant to the network under consideration based on both an analysis of protocol behavior as well as detailed simulations and traces of BGP. More specifically, we study the interaction of various BGP and TCP timers amidst node mobility and intermittent links and analyze their impact on routing and convergence times. Finally, through our simulations and analyses, we propose ways to address the routing and convergence time problems associated with link dynamics and BGP, which in essence include guidance/insights to judicious setting of timer values, modifications to BGP configuration, and its control via configuration management
Conference Paper
Full-text available
This paper proposes a novel recovery mechanism from large-scale network failures caused by earthquakes, terrorist attacks, large-scale power outages and software bugs. Our method, which takes advantage of overlay networking technologies, pre-calculates multiple routing configurations to prevent possible simultaneous network failures and selects one configuration immediately after detecting the failures. Through numerical calculation results using actual AS-level topology, we show that our proactive method improves network reachability from 89% to 99%, while keeping the path length sufficiently short, when up to 8% of the nodes in a network are down simultaneously.
Conference Paper
The Border Gateway Protocol, BGP, is currently the de-facto inter-domain routing protocol employed on the Internet. It is responsible for exchanging routing information between autonomous systems (ASes) and selecting routes to destination networks. Each AS defines and applies local policies to select routes and propagate routes to neighboring ASes. However, studies show that conflicting local policies among a collection of ASes can lead to BGP divergence. This paper presents an online method to eliminate BGP divergence. The proposed method uses the AS path history, which is composed of the AS paths contained in the successive best routes, to discover route divergence. When there is a cycle of AS paths in the history, potential BGP divergence has occurred. In order to eliminate the BGP divergence, the proposed method adjusts the preferences of the routes involved based on AS relationships, and advertises best routes according to the relationship between the announcing AS and the receiving AS.
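The history-based check described above can be sketched as a cycle test over the AS paths successively selected as best for a prefix. This is an illustrative simplification of the proposed method, with a hypothetical history, not the paper's algorithm.

```python
# Illustrative sketch: record the AS path of each successively selected
# best route; if the same path reappears after a different path was
# chosen in between, the router may be cycling through the same choices,
# hinting at policy-induced divergence.

def has_path_cycle(history):
    """history: AS paths (tuples) selected as best, oldest first."""
    last_seen = {}
    for i, path in enumerate(history):
        if path in last_seen and i - last_seen[path] > 1:
            return True                 # path recurred with others in between
        last_seen[path] = i
    return False

# Oscillation between two paths is flagged; a plain re-announcement of
# the same path is not.
oscillating = [(1, 2), (3, 2), (1, 2)]
stable      = [(1, 2), (1, 2)]
```

Once such a cycle is detected, the method described above would react by adjusting route preferences according to AS relationships rather than merely logging the event.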
Article
Full-text available
The connectivity of the Internet at the Autonomous System level is influenced by the network operator policies implemented. These in turn impose a direction on the announcement of address advertisements and, consequently, on the paths that can be used to reach such destinations. We propose to use directed graphs to properly represent how destinations propagate through the Internet, and the number of arc-disjoint paths to quantify this network's path diversity. Moreover, in order to understand the effects that policies have on the connectivity of the Internet, numerical analyses of the resulting directed graphs were conducted. Results demonstrate that, even after policies have been applied, there is still path diversity which the Border Gateway Protocol cannot currently exploit.
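By Menger's theorem, the number of arc-disjoint paths between two nodes of a directed graph equals the maximum flow when every arc has unit capacity, so the path-diversity metric above can be computed with a small BFS-based max-flow (Edmonds-Karp) routine. The graph in the example is hypothetical, not data from the paper.

```python
# Illustrative sketch: count arc-disjoint s->t paths as a unit-capacity
# max flow, using BFS augmentation (Edmonds-Karp).
from collections import deque

def arc_disjoint_paths(arcs, s, t):
    """Count arc-disjoint directed s->t paths; arcs is a list of (u, v)."""
    cap, adj = {}, {}
    for u, v in arcs:
        cap[(u, v)] = cap.get((u, v), 0) + 1
        cap.setdefault((v, u), 0)              # residual arc
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    flow = 0
    while True:
        parent, q = {s: None}, deque([s])
        while q and t not in parent:           # BFS for an augmenting path
            u = q.popleft()
            for v in adj.get(u, ()):
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow
        v = t
        while parent[v] is not None:           # push one unit along the path
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1

# Two disjoint routes, a->b->d and a->c->d, exist in this diamond graph.
diamond = [('a', 'b'), ('b', 'd'), ('a', 'c'), ('c', 'd')]
```

Applying this per destination to a policy-directed AS graph would yield the arc-disjoint path counts the abstract uses as its diversity measure.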
Article
This paper proposes a novel overlay architecture to improve availability and performance of end-to-end communication over the Internet. Connectivity and network availability are becoming business-critical resources as the Internet is increasingly utilized as a business necessity. For example, traditional voice and military systems are turning into IP-based network applications. With these applications, even short-lived failures of the Internet infrastructure can generate significant losses. To satisfy these needs, the concept of overlay networks has been widely discussed. However, in the previous studies of overlay networks, a measurable number of path outages were still unavoidable even with use of such overlay networks. We believe that an overlay network’s ability to quickly recover from path outages and congestion is limited unless we ensure path independence at the IP layer. Hence, we develop a simple but effective overlay architecture increasing path independence without degrading performance. The proposed overlay architecture enhances prior studies in the following ways: (1) we deploy overlay nodes considering topology and latency information inside an ISP and also across ISP boundaries; (2) we use a source-based single-hop overlay routing combined with the above topology-aware node deployment; (3) we increase the usage of multi-homing environment at end hosts. In this framework, we develop measurement-based heuristics using extensive data collection from 232 points in 10 ISPs, and 100 PlanetLab nodes. We also validate the proposed framework using real Internet outages to show that our architecture is able to provide a significant amount of resilience to real-world failures.
Conference Paper
Full-text available
The border gateway protocol (BGP) is the routing protocol that glues together the global Internet. BGP makes use of a "managed session" to maintain a bidirectional error-free session over which reachability information is exchanged between autonomous systems (ASes) in the global Internet. Nevertheless, despite its great importance to BGP, this "managed session" suffers from some weaknesses, such as a slow failure-detection mechanism, which may represent a great deal of lost data, and some security problems. Besides, some important services to BGP are missing, such as multi-session and broadcasting. To overcome these weaknesses, session maintenance should be separated from the BGP protocol. Instead, a separate managed session protocol (MSP) should cope with session maintenance. Accordingly, we argue that our approach may reduce BGP design complexity, increase both BGP reliability and robustness, and lead to a routing system that is easier to manage. We also describe our prototype implementation and some evaluation details.
Conference Paper
Full-text available
A proper support for multimedia communications transport has to provide fault-tolerance capabilities such as the preservation of established connections in case of failures. While multi-homing addresses this issue, the currently available solution based on massive BGP route injection presents serious scalability limitations, since it contributes to the exponential growth of the BGP table size. Alternative solutions proposed for IPv6 fail to provide facilities equivalent to the current BGP-based solution. In this paper we present MEX (Multi-homing through EXtension header), a novel proposal for the provision of IPv6 multi-homing capabilities. MEX preserves overall scalability by storing alternative route information in end-hosts while at the same time reducing packet loss by allowing routers to re-route in-course packets. This behavior is enabled by conveying alternative route information within packets inside a newly defined Extension Header. The resulting system provides fault-tolerance capabilities and preserves scalability, while the incurred costs, namely deployment and packet overhead, are only imposed on those that benefit from it. An implementation of the MEX host and router components is also presented.
Conference Paper
Full-text available
This paper presents a new mechanism for improving the convergence properties of path vector routing algorithms, such as BGP. Using a route's path information, we develop two consistency assertions for path vector routing algorithms that are used to compare similar routes and identify infeasible routes. To apply these assertions in BGP, mechanisms to signal failure/policy withdrawal, and traffic engineering are provided. Our approach was implemented and deployed in a BGP testbed and evaluated using simulation. By identifying and ignoring the infeasible routes, we achieved substantial reduction in both BGP convergence time and the total number of intermediate route changes.
Conference Paper
Full-text available
Many large ISP networks today rely on route-reflection [1] to allow their iBGP to scale. Route-reflection was officially introduced to limit the number of iBGP sessions, compared to the n(n-1)/2 sessions required by an iBGP full-mesh. Besides its impact on the number of iBGP sessions, route-reflection has consequences for the diversity of the routes known to the routers inside an AS. In this paper, we quantify the diversity of the BGP routes inside a tier-1 network. Our analysis shows that the use of route-reflection leads to very poor route diversity compared to an iBGP full-mesh. Most routers inside a tier-1 network know only a single external route of eBGP origin. We identify two causes for this lack of diversity. First, some routes are never selected as best by any router inside the network, but are known only to some border routers. Second, among the routes that are selected as best by at least one other router, a few are selected as best by a majority of the routers, preventing the propagation of many routes inside the AS. We show that the main reason for this diversity loss is how BGP chooses the best routes among those available inside the AS.
Conference Paper
Full-text available
A basic function of a Session Maintenance Protocol (SMP) is to keep sessions alive among two or more nodes as far as possible. However, an SMP should also provide other essential services to its mission-critical applications, such as multi-session, multi-homing, a failure-detection mechanism and graceful automatic restart of transport sessions. A new SMP, the Managed Session Protocol (MSP), has been proposed, together with a sub-protocol that operates on top of MSP, the Multiple Managed Session Protocol (MMSP). Together, MSP/MMSP provide services such as graceful automatic restart of transport sessions, preservation of application message boundaries, multi-session and multi-homing. In this paper, we present an implementation and performance analysis of MSP/MMSP. They have been implemented using a C++-based open-source router platform called the eXtensible Open Router Platform (XORP). Two approaches are presented for implementing MSP/MMSP. To compare these two approaches and validate the behavior of MSP/MMSP, experiments have been carried out using a real-life case study: the Border Gateway Protocol (BGP). Furthermore, a performance analysis is presented that provides some insight not only into the two MSP implementation approaches, but also into the effectiveness of MSP/MMSP.
Conference Paper
An endpoint’s IP address can become inaccessible due to an interface failure, severe congestion, or the Border Gateway Protocol’s (BGP) slow route convergence around path outages. In this situation, mission-critical applications may require either a fast failure-detection or multihoming service. A fast failure-detection mechanism allows routing protocols to reroute traffic around problems as soon as they become aware of the link failures. On the other hand, redundancy at the network layer allows a host to remain accessible even if one of its IP addresses becomes unreachable. Even though the Transmission Control Protocol (TCP) provides some level of reliability for its applications, it does not provide these essential services required for mission-critical applications. Accordingly, this paper investigates the main requirements of a session maintenance protocol and presents a reliable approach to transport-session management in order to overcome TCP’s shortcomings.
Conference Paper
Full-text available
The border gateway protocol (BGP) is the routing protocol used to maintain connectivity between autonomous systems in the Internet. Empirical measurements have shown that there can be considerable delay in BGP convergence after routing changes. One contributing factor in this delay is a BGP-specific timer used to limit the rate at which routing messages are transmitted. We use the SSFNet simulator to explore the relationship between convergence time and the configuration of this timer. For each simple network topology simulated, we observe that there is an optimal value for the rate-limiting timer that minimizes convergence time.
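The rate-limiting timer studied here corresponds to BGP's minimum route advertisement interval. As an illustration of the mechanism (a sketch, not the SSFNet implementation; class and method names are assumptions), a per-peer limiter that holds and coalesces early updates might look like:

```python
class MRAILimiter:
    """Per-peer minimum route advertisement interval (sketch).

    At most one update is sent to a peer per interval; updates arriving
    before the interval has elapsed are held, and only the most recent
    held update is sent once the timer expires (latest wins).
    """

    def __init__(self, interval: float):
        self.interval = interval
        self.last_sent = {}  # peer -> time of last transmitted update
        self.pending = {}    # peer -> most recent held update

    def submit(self, peer: str, update, now: float):
        """Offer an update; return it if sent immediately, else None."""
        last = self.last_sent.get(peer)
        if last is None or now - last >= self.interval:
            self.last_sent[peer] = now
            return update
        self.pending[peer] = update  # coalesce: overwrite older held update
        return None

    def tick(self, now: float):
        """Flush held updates whose interval has elapsed; return them."""
        sent = []
        for peer in list(self.pending):
            if now - self.last_sent[peer] >= self.interval:
                sent.append((peer, self.pending.pop(peer)))
                self.last_sent[peer] = now
        return sent
```

The tradeoff the abstract measures falls out of this mechanism: a small interval propagates transient (soon-to-be-withdrawn) paths and inflates message load, while a large interval delays every step of convergence, so an intermediate value minimizes total convergence time.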
Conference Paper
Structured peer-to-peer overlays provide a natural infrastructure for resilient routing via efficient fault detection and precomputation of backup paths. These overlays can respond to faults in a few hundred milliseconds by rapidly shifting between alternate routes. In this paper, we present two adaptive mechanisms for structured overlays and illustrate their operation in the context of Tapestry, a fault-resilient overlay from Berkeley. We also describe a transparent, protocol-independent traffic redirection mechanism that tunnels legacy application traffic through overlays. Our measurements of a Tapestry prototype show it to be a highly responsive routing service, effective at circumventing a range of failures while incurring reasonable cost in maintenance bandwidth and additional routing latency.
Article
Full-text available
Previous measurement studies have shown the existence of path exploration and slow convergence in the global Internet routing system, and a number of protocol enhancements have been proposed to remedy the problem. However, existing measurements were conducted only over a small number of testing prefixes. There has been no systematic study to quantify the pervasiveness of Border Gateway Protocol (BGP) slow convergence in the operational Internet, nor any known effort to deploy any of the proposed solutions. In this paper, we present our measurement results that identify BGP slow convergence events across the entire global routing table. Our data shows that the severity of path exploration and slow convergence varies depending on where prefixes are originated and where the observations are made in the Internet routing hierarchy. In general, routers in tier-1 Internet service providers (ISPs) observe less path exploration, hence they experience shorter convergence delays than routers in edge ASs; prefixes originated from tier-1 ISPs also experience less path exploration than those originated from edge ASs. Furthermore, our data show that the convergence time of route fail-over events is similar to that of new route announcements and is significantly shorter than that of route failures. This observation is contrary to the widely held view from previous experiments but confirms our earlier analytical results. Our effort also led to the development of a path-preference inference method based on the path usage time, which can be used by future studies of BGP dynamics.
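The path-usage-time idea mentioned at the end can be sketched as follows: given a chronological stream of route updates for one prefix, rank the AS paths by how long each was in use. This is a hypothetical reconstruction of the heuristic, not the authors' algorithm; the function name and input format are assumptions.

```python
def infer_path_preference(updates):
    """Rank AS paths by cumulative usage time (most-used first).

    `updates` is a chronological list of (timestamp, path) pairs; each
    path is considered in use until the next update arrives, so the
    final update's path accrues no time (its end time is unknown).
    The path with the longest total usage time is assumed to be the
    most preferred.
    """
    usage = {}
    for (t, path), (t_next, _) in zip(updates, updates[1:]):
        usage[path] = usage.get(path, 0.0) + (t_next - t)
    return sorted(usage, key=usage.get, reverse=True)
```

For example, a path announced for most of the observation window ranks ahead of one seen only briefly during a transient exploration.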
Conference Paper
Full-text available
There is an increasing economic desire, driven by widespread applications like IPTV and conferencing, for a next-generation Internet that grants transparent group communication service to all its stationary and mobile users. In this paper, we present a generic approach to inter-domain multicast, which is guided by an abstract, DHT-inspired overlay but may operate on a future Internet architecture. It is based on the assumption of globally available end-to-end unicast routing between resolvable locators, taken from a name space that allows for aggregation. Our protocol design accounts for this aggregation, leading to forward-path forwarding along bidirectional shared distribution trees in prefix space. The scheme facilitates multipath multicast transport, offers fault-tolerant routing and arbitrary redundancy for packets and paths, and remains mobility agnostic. We present OASIS, its application to IPv6, and evaluate signaling costs analytically based on its k-ary tree structure.
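The shared distribution trees in prefix space described above can be illustrated by the ancestor relation of a k-ary prefix hierarchy: a node forwards toward a group's root by successively shortening the group label, so aggregation falls out of the naming itself. This is a sketch under the assumption of digit-string labels, not the OASIS wire format.

```python
def prefix_ancestors(label: str):
    """Yield the ancestors of `label` in a k-ary prefix tree, from the
    immediate parent up to (but excluding) the empty root label.

    The parent of a node is its label with the last digit removed, so
    longer labels aggregate under shorter shared prefixes.
    """
    while len(label) > 1:
        label = label[:-1]
        yield label
```

For instance, a group labeled "1324" would be reached by forwarding through the aggregation points "132", "13", and "1".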