Figure 3 - uploaded by Andy Bavier
Source publication
Recent efforts to add new services to the Internet have led to increased interest in software-based routers. This is because such routers are easily extended to support new and changing services. This paper describes our experiences implementing a software-based router, with a particular focus on the main difficulty we encountered: how to schedule t...
Contexts in source publication
Context 1
... approach should not be discarded too quickly, however, because it works perfectly well for best effort flows which can live with the FIFO queue that this forwarding process services. The compromise approach is to establish a process for each flow, as illustrated in Figure 3. Since we have removed all protocol processing from the input and output processes, we reduce them to the remaining role they play, which is to READ, CLASSIFY, and ENQUEUE packets in the input process (I), and SELECT, DEQUEUE, and WRITE packets in the output process (O). ...
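The division of labor described in this context can be sketched as a small single-threaded model. This is only an illustration under stated assumptions, not the paper's implementation: the `Router` class, its method names, the `classify` callback, and the `select` policy are all hypothetical, and real processes would run concurrently under a scheduler rather than being called by hand.

```python
from collections import deque


class Router:
    """Toy model: one queue per flow on each side of the per-flow forwarders."""

    def __init__(self, classify):
        self.classify = classify   # hypothetical callback: packet -> flow id
        self.in_queues = {}        # flow id -> packets awaiting forwarding
        self.out_queues = {}       # flow id -> packets awaiting transmission
        self.sent = []             # stands in for the output port

    def input_process(self, packet):
        # READ is implicit (the packet is passed in); then CLASSIFY and ENQUEUE.
        flow = self.classify(packet)
        self.in_queues.setdefault(flow, deque()).append(packet)

    def forwarder(self, flow, process=lambda p: p):
        # Per-flow process: handle one packet of this flow, if any is waiting.
        queue = self.in_queues.get(flow)
        if queue:
            self.out_queues.setdefault(flow, deque()).append(process(queue.popleft()))

    def output_process(self, select):
        # SELECT a flow with pending output, DEQUEUE its head packet, WRITE it.
        ready = [f for f, q in self.out_queues.items() if q]
        if not ready:
            return None
        packet = self.out_queues[select(ready)].popleft()
        self.sent.append(packet)
        return packet


router = Router(classify=lambda p: p["flow"])
for pkt in [{"flow": "a", "id": 1}, {"flow": "b", "id": 2}, {"flow": "a", "id": 3}]:
    router.input_process(pkt)
for flow in ["a", "b", "a"]:
    router.forwarder(flow)

# Stand-in output policy: prefer flow "b" whenever it has a packet queued.
prefer_b = lambda ready: "b" if "b" in ready else ready[0]
order = [router.output_process(prefer_b)["id"] for _ in range(3)]
```

Because the output process chooses among per-flow queues instead of draining a single FIFO, the selection policy alone decides which flow is served next, which is what makes differentiated service possible in this model.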
Context 2
... QoS support and programmability are orthogonal issues. Thus, whereas the process model shown in Figure 3 combines both capabilities, it is possible to support multiple functions without supporting differentiated services. Supporting just extensibility results in the simpler process model in which each forwarding process running some function writes packets directly to an output port. ...
Similar publications
Routers are one of the important entities in computer networks, especially the Internet. Forwarding IP packets is a valuable and vital function in Internet routers. Routers extract the destination IP address from packets and look up those addresses in their own routing table. This task is called IP lookup. Internet address lookup is a challenging problem due...
Exploiting instruction-level parallelism (ILP) is extremely important for achieving high performance in application specific instruction set processors (ASIPs) and embedded processors. Existing techniques deal with either scheduling hardware pipelines to obtain higher throughput or software pipelining, an instruction scheduling technique for iterative...
Wireless sensor networks occupy a prominent role in industrial as well as scientific applications. Lifetime enhancement and coverage are the major factors considered while designing the network. Various research models are evolved by considering the scheduling and routing process to solve the network lifetime issues. However, coverage and connectiv...
Recently Cyber Physical Systems (CPS) have become new research hotspots. Node operating systems (OS) are fundamental systems supporting CPS. When designing CPS especially designing node OS there are still many problems unsolved in aspects of predictability, reliability, robustness, etc. This paper analyzes the needs of node OS and presents two impo...
Vision Transformers (ViTs) have shown impressive performance and have become a unified backbone for multiple vision tasks. But both attention and multi-layer perceptrons (MLPs) in ViTs are not efficient enough due to dense multiplications, resulting in costly training and inference. To this end, we propose to reparameterize the pre-trained ViT with...
Citations
... The master scheduler (running on the master processor) provides only coarse grain adjustments to each scheduling domain to guide their independent scheduling decisions. The scheduler running in each scheduling domain builds upon our existing scheduling work [45], which, in turn, is derived from the WF²Q+ scheduler [5]. We have added hooks to support coarse grain influence from the master processor. ...
Providing fault tolerance and combating denial of service (DoS) attacks have traditionally been the research subjects of the fault tolerant computing community and the security community. This report takes a different perspective, one that applies a unified set of mechanisms and algorithms to the problem of protecting a network system from both failures and DoS attacks. The problem is viewed as a matter of resource allocation and management. Protection against DoS attacks is addressed as a special case of careful scheduling of time and fault tolerance is addressed as a special case of careful scheduling of space. A set of general scheduling mechanisms has been developed for both time and space. Specifically, the focus was on three key aspects of a networked system: (1) the local resources on each network router, (2) the network-wide resources applied to a given information service, and (3) the network bandwidth consumed by end-to-end flows.
... Here meeting packet-times for multi-Gbps and 10Gbps links is usually not possible as described in Section 4.1, and these systems are mostly targeted towards multimedia applications, firewalls, network intrusion and dataplane applications with modest bandwidth requirements. A number of recent projects have developed architectures, analyzed performance bottlenecks and provided flexible extensions for software-based routers [5, 19, 11, 22] on end-systems. The focus has been on extensibility and performance for data forwarding IP-protocol applications, rather than Ethernet frames for multi-Gbps links. ...
... The focus has been on extensibility and performance for data forwarding IP-protocol applications, rather than Ethernet frames for multi-Gbps links. While QoS capabilities with Deficit Round Robin [5] and H-FSC [23] have been studied in [5, 19, 11], [22] focuses on software architectures for high-rate forwarding in network processors without considering the impact of complex scheduling disciplines. ...
... We compare the performance of the ShareStreams architecture with contemporary systems from industry and academic research projects for both the line-card realization and the Endsystem/host-based router realization [11, 19, 5]. The Click modular router described in [11] module using Linux 2.2 with a 700MHz Pentium III. ...
ShareStreams (Scalable Hardware Architectures for Stream Schedulers) is a canonical architecture for realizing a range of scheduling disciplines. This paper discusses the design choices and tradeoffs made in the development of an endsystem/host-based router realization of the ShareStreams architecture. We evaluate the impact of block decisions and aggregation on the ShareStreams architecture. Using processor resources for queuing and data movement, and FPGA hardware for accelerating stream selection and stream priority updates, ShareStreams can easily meet the wire-speeds of 10 Gbit/s links. This allows provision of customized scheduling solutions and interoperability of scheduling disciplines. Our hardware implemented in the Xilinx Virtex I family easily scales from 4 to 32 stream-slots on a single chip. A host-based router prototype with an FPGA PCI card under systems software control can provide scheduling support for a mix of EDF, static-priority and fair-share streams based on user specifications and meet the temporal bounds and packet-time requirements of multi-gigabit links.
... For example, we have measured Linux to be up to six times slower forwarding IP packets than Scout [51]. More importantly, however, such general-purpose systems provide no explicit support for adding new forwarders. ...
... Within our router, there is a fundamental tension between (1) supporting arbitrary classification and forwarding functions, and (2) supporting QoS. Qie et al. [51] show that to support QoS effectively on a uniprocessor software router, one should use separately scheduled threads for classification, forwarding, and scheduling. In fact, to support QoS, it is important that classifiers be able to determine the fate of packets at line speed. ...
... Such processors still require an admission control decision—for example, to ensure that the average cycle demand of the admitted functions does not exceed the processor's capacity—but the dynamic scheduler is able to allocate cycles to different functions based on the actual workload (packet arrival rate) it is experiencing. A proportional share scheduler is a likely implementation since it guarantees that the function (flow) receives at least the cycle rate it requested, and fairly allocates any unused capacity among the active functions [51]. After new forwarder threads are instantiated, they must be scheduled along with all the other classifier, forwarder, and output scheduler threads. ...
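The two steps in this passage, an admission check on total cycle demand followed by proportional sharing among the admitted functions, can be sketched as follows. This is a minimal sketch under assumptions: stride scheduling is swapped in as one well-known proportional share algorithm (the excerpt does not say which algorithm the authors use), and the function names, the integer cycle rates, and the example workload are hypothetical.

```python
import heapq
from fractions import Fraction


def admit(requests, capacity):
    """Admission control: accept only if total requested cycle rate fits capacity."""
    return sum(requests.values()) <= capacity


def stride_schedule(requests, quanta):
    """Count the quanta each active function receives under stride scheduling.

    requests maps a function name to its requested (integer) cycle rate.
    """
    # A function's stride is inversely proportional to its requested rate;
    # exact fractions avoid floating-point ties being broken arbitrarily.
    heap = [(Fraction(1, rate), name) for name, rate in requests.items()]
    heapq.heapify(heap)
    served = {name: 0 for name in requests}
    for _ in range(quanta):
        pass_value, name = heapq.heappop(heap)   # lowest pass value runs next
        served[name] += 1
        heapq.heappush(heap, (pass_value + Fraction(1, requests[name]), name))
    return served


# Hypothetical workload: requested rates in arbitrary cycle units.
requests = {"ospf": 1, "flow_a": 3, "flow_b": 2}
if admit(requests, capacity=8):
    shares = stride_schedule(requests, quanta=6)
```

Only functions present in `requests` compete for quanta, so capacity that an idle function would have used is divided among the active ones in proportion to their requested rates.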
The demand to extend the set of services, such as network address translation, firewalls, proxies, and virtual private networks, that are supported by Internet-connected devices represents an opportunity to extend the traditional domain of Internet routers beyond simple packet forwarding. An important characteristic is the ability for end-users to install custom services on their routers. Routers with this characteristic are extensible.
... Software-based routers have always played a role in the Internet [16], but they are becoming increasingly important as the set of services routers are expected to support (e.g., firewalls, intrusion detection, proxies, level-n switching, packet tagging, overlay networks) continues to grow. Although software-based routers have historically been built from PC-class machines with conventional network interface cards (NICs) [13, 19], the emergence of network processors [8, 10, 25] makes it possible to significantly improve the performance of software-based routers at a modest increase in cost. For example, this paper describes a router, built from a PC using a 733MHz Pentium III and an IXP1200 development board, that demonstrates nearly an order of magnitude improvement in performance over a pure PC-based router, at a cost of roughly US$1500, based on an estimated US$700 for an IXP1200 board produced in low volume. ...
... This section describes our software and hardware architectures. In the case of the software architecture, our starting point is a communication-oriented OS that runs on a Pentium with non-programmable NICs [12, 19], to which we add a device driver and IXP microcode. This section gives a high-level overview of the original Pentium-based system; later sections focus on those aspects of the architecture that are relevant to a multi-level processor hierarchy (i.e., the driver and microcode components). Figure 2 depicts the software architecture for the router. ...
... Regarding scheduling, we run a proportional share scheduler on the Pentium, where deciding what share to allocate to each flow is a policy issue. For example, we allocate sufficient cycles to the OSPF control protocol to ensure that it is able to update the routing table at an acceptable rate, and we allow forwarders that implement per-flow services to reserve both a packet rate and a cycle rate [19]. We eventually plan to run a proportional share scheduler on the StrongARM, since in general it might also run arbitrary forwarders, but we currently implement a simple priority scheme that gives packets being passed up to the Pentium precedence over packets that are to be processed locally. ...
Recent efforts to add new services to the Internet have increased interest in software-based routers that are easy to extend and evolve. This paper describes our experiences using emerging network processors (in particular, the Intel IXP1200) to implement a router. We show it is possible to combine an IXP1200 development board and a PC to build an inexpensive router that forwards minimum-sized packets at a rate of 3.47 Mpps. This is nearly an order of magnitude faster than existing pure PC-based routers, and sufficient to support 1.77 Gbps of aggregate link bandwidth. At lesser aggregate line speeds, our design also allows the excess resources available on the IXP1200 to be used robustly for extra packet processing. For example, with 8 × 100 Mbps links, 240 register operations and 96 bytes of state storage are available for each 64-byte packet. Using a hierarchical architecture we can guarantee line-speed forwarding rates for simple packets with the IXP1200, and still have extra capacity to process exceptional packets with the Pentium. Up to 310 Kpps of the traffic can be routed through the Pentium to receive 1510 cycles of extra per-packet processing.
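The link between the quoted packet rate and the quoted aggregate bandwidth can be checked with a short calculation. This is a sketch under assumptions: it reads the abstract's figures as 3.47 Mpps and 1.77 Gbps, takes a minimum-sized packet to be the 64 bytes mentioned later in the abstract, and ignores any per-frame overhead on the wire. The function name is hypothetical.

```python
MIN_PACKET_BYTES = 64          # minimum-sized packet, as in the abstract
FORWARDING_RATE_PPS = 3_470_000  # 3.47 Mpps


def aggregate_gbps(packets_per_second, packet_bytes=MIN_PACKET_BYTES):
    """Convert a packet rate into aggregate bandwidth in Gbps (payload bits only)."""
    return packets_per_second * packet_bytes * 8 / 1e9


bandwidth = aggregate_gbps(FORWARDING_RATE_PPS)
print(f"{bandwidth:.3f} Gbps")
```

The result is a little under 1.78 Gbps, which agrees with the quoted 1.77 Gbps if that figure was truncated rather than rounded.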
Active Node is a network device capable of forwarding packets and giving them the computation service in the meantime. It plays a critical role in capsule-based active networks to speed up the development of a protocol and facilitate the deployment of a service inside networks. When getting overloaded, however, it becomes a throughput bottleneck to all Active Applications whose packets traverse the Active Node. It can enable the Bottleneck Active Node Detouring (BAND) proposed in this paper to free Active Applications from the penalty of poor throughput because not all Active Applications need the computation service in the bottleneck Active Node. Besides, it can enable the BAND to give Active Applications other benefits identified in this paper.
A TCP forwarder is a network node that establishes and forwards data between a pair of TCP connections. An example of a TCP forwarder is a firewall that places a proxy between a TCP connection to an external host and a TCP connection to an internal host, controlling access to a resource on the internal host. Once the proxy approves the access, it simply forwards data from one connection to the other. We use the term TCP forwarding to describe indirect TCP communication via a proxy in general. This paper characterizes the behavior of TCP forwarding, and illustrates the role TCP forwarding plays in common network services like firewalls and HTTP proxies. We then introduce an optimization technique, called connection splicing, that can be applied to a TCP forwarder, and report the results of a performance study designed to evaluate its impact. Connection splicing improves TCP forwarding performance by a factor of two to four, making it competitive with IP router performance on the same hardware.