David E. Taylor

Washington University in St. Louis, St. Louis, Missouri, United States

Publications (15) · 9.46 Total impact

  • David E. Taylor
    ABSTRACT: Packet classification is an enabling function for a variety of Internet applications including Quality of Service, security, monitoring, and multimedia communications. In order to classify a packet as belonging to a particular flow or set of flows, network nodes must perform a search over a set of filters using multiple fields of the packet as the search key. In general, there have been two major threads of research addressing packet classification: algorithmic and architectural. A few pioneering groups of researchers posed the problem, provided complexity bounds, and offered a collection of algorithmic solutions. Subsequently, the design space has been vigorously explored by many offering new algorithms and improvements upon existing algorithms. Given the inability of early algorithms to meet performance constraints imposed by high speed links, researchers in industry and academia devised architectural solutions to the problem. This thread of research produced the most widely-used packet classification device technology, Ternary Content Addressable Memory (TCAM). New architectural research combines intelligent algorithms and novel architectures to eliminate many of the unfavorable characteristics of current TCAMs. We observe that the community appears to be converging on a combined algorithmic and architectural approach to the problem. Using a taxonomy based on the high-level approach to the problem and a minimal set of running examples, we provide a survey of the seminal and recent solutions to the problem. It is our hope to foster a deeper understanding of the various packet classification techniques while providing a useful framework for discerning relationships and distinctions.
    ACM Computing Surveys 09/2004; DOI:10.1145/1108956.1108958 · 4.04 Impact Factor
  • David E. Taylor, Jonathan S. Turner
    ABSTRACT: Packet classification is the enabling technology for next generation network services and often the primary bottleneck in high-performance routers. Due to the importance and complexity of the problem, a myriad of algorithms and resulting implementations exist. The performance and capacity of many algorithms and classification devices, including TCAMs, depend upon properties of the filter set and query patterns. Unlike microprocessors in the field of computer architecture, there are no standard performance evaluation tools or techniques available to evaluate packet classification algorithms and products. Network service providers are reluctant to distribute copies of real filter databases for security and confidentiality reasons, hence realistic test vectors are a scarce commodity. The small subset of the research community who obtain real databases either limit performance evaluation to the small sample space or employ ad hoc methods of modifying those databases. We present a tool for creating synthetic filter databases that retain characteristics of a seed database and provide systematic mechanisms for varying the number and composition of the filters. We propose a benchmarking methodology based on this tool that provides a mechanism for evaluating packet classification performance on a uniform scale. We seek to initiate a broader discussion within the community that will result in a standard packet classification benchmark.
  • David E. Taylor, Jonathan S. Turner
    ABSTRACT: A wide variety of packet classification algorithms and devices exist in the research literature and commercial market. The existing solutions exploit various design tradeoffs to provide high search rates, power and space efficiency, fast incremental updates, and the ability to scale to large numbers of filters. There remains a need for techniques that achieve a favorable balance among these tradeoffs and scale to support classification on additional fields beyond the standard 5-tuple. We introduce Distributed Crossproducting of Field Labels (DCFL), a novel combination of new and existing packet classification techniques that leverages key observations of the structure of real filter sets and takes advantage of the capabilities of modern hardware technology. Using a collection of real and synthetic filter sets, we provide analyses of DCFL performance and resource requirements on filter sets of various sizes and compositions. An optimized implementation of DCFL can provide over 100 million searches per second and storage for over 200 thousand filters in a current generation FPGA or ASIC without the need for external memory devices.
  • ABSTRACT: IP address lookup is a central processing function of Internet routers. While a wide range of solutions to this problem have been devised, very few simultaneously achieve high lookup rates, good update performance, high memory efficiency and low hardware cost. High performance solutions using Content Addressable Memory (CAM) devices are popular but costly, particularly when applied to large databases. We present an efficient hardware implementation of a previously unpublished IP address lookup architecture, invented by Eatherton and Dittia. Our experimental implementation uses a single commodity SRAM chip and less than 10% of the logic resources of a commercial configurable logic device, operating at 100 MHz. With these quite modest resources, it can perform over 9 million lookups per second, while simultaneously processing thousands of updates per second, on databases with over 100,000 entries. The lookup structure requires only about 10 bytes per address prefix, less than half that required by other methods. The architecture allows performance to be scaled up by using parallel Fast IP Lookup (FIPL) engines, which interleave accesses to a common memory interface. This architecture allows performance to scale up directly with available memory bandwidth. We describe the tree bitmap algorithm, our implementation of it in a dynamically extensible gigabit router being developed at Washington University, and the results of performance experiments designed to assess its performance under realistic operating conditions.
    IEEE Journal on Selected Areas in Communications 10/2003; DOI:10.1109/JSAC.2003.810507 · 4.14 Impact Factor
  • ABSTRACT: Tools and a design methodology have been developed to support partial run-time reconfiguration of FPGA logic on the Field Programmable Port Extender. High-speed Internet packet processing circuits on this platform are implemented as Dynamic Hardware Plugin (DHP) modules that fit within a specific region of an FPGA device. The PARBIT tool has been developed to transform and restructure bitfiles created by standard computer aided design tools into partial bitstreams that program DHPs. The methodology allows the platform to hot-swap application-specific DHP modules without disturbing the operation of the rest of the system.
  • ABSTRACT: Continuing growth in optical link speeds places increasing demands on the performance of Internet routers, while deployment of embedded and distributed network services imposes new demands for flexibility and programmability. IP address lookup has become a significant performance bottleneck for the highest performance routers. Amid the vast array of academic and commercial solutions to the problem, few achieve a favorable balance of performance, efficiency, and cost. New commercial products utilize Content Addressable Memory (CAM) devices to achieve high lookup speeds at an exorbitantly high hardware cost with limited flexibility. In contrast, this paper describes an efficient, scalable lookup engine design, able to achieve high performance with the use of a small portion of a reconfigurable logic device and a commodity Random Access Memory (RAM) device. The Fast Internet Protocol Lookup (FIPL) engine is an implementation of Eatherton and Dittia's previously unpublished Tree Bitmap algorithm [1] targeted to an open-platform research router. FIPL can be scaled to achieve guaranteed worst-case performance of over 9 million lookups per second with a single SRAM operating at the fairly modest clock speed of 100 MHz. Experimental evaluation of FIPL throughput, latency, and update performance is provided using a sample routing table from Mae West [2].
    Proceedings - IEEE INFOCOM 04/2002; DOI:10.1109/INFCOM.2002.1019301
  • ABSTRACT: This paper presents the dynamic hardware plugins (DHP) architecture for implementing multiple networking applications in hardware at programmable routers. By enabling multiple applications to be dynamically loaded into a single hardware device, the DHP architecture provides a scalable mechanism for implementing high-performance programmable routers. The DHP architecture is presented within the context of a programmable router architecture which processes flows in both software and hardware. Implementation options are described as well as the prototype testbed at Washington University in Saint Louis which utilizes the partial reconfiguration capability of modern field programmable gate arrays.
    Computer Networks 02/2002; DOI:10.1016/S1389-1286(01)00289-4 · 1.28 Impact Factor
  • 2002 DARPA Active Networks Conference and Exposition (DANCE 2002), 29-31 May 2002, San Francisco, CA, USA; 01/2002
  • ABSTRACT: Tools and a design methodology have been developed to support partial run-time reconfiguration of FPGA logic on the Field Programmable Port Extender. High-speed Internet packet processing circuits on this platform are implemented as Dynamic Hardware Plugin (DHP) modules that fit within a specific region of an FPGA device. The PARBIT tool has been developed to transform and restructure bitfiles created by standard computer aided design tools into partial bitstreams that program DHPs. The methodology allows the platform to hot-swap application-specific DHP modules without disturbing the operation of the rest of the system.
    Proceedings of the 39th Design Automation Conference, DAC 2002, New Orleans, LA, USA, June 10-14, 2002; 01/2002
  • ABSTRACT: A prototype platform has been developed that allows processing of packets at the edge of a multi-gigabit-per-second network switch. This system, the Field Programmable Port Extender (FPX), enables packet processing functions to be implemented as modular components in reprogrammable hardware. All logic on the FPX is implemented in two Field Programmable Gate Arrays (FPGAs). Packet processing functions in the system are implemented as dynamically loadable modules. Core functionality of the FPX is implemented on an FPGA called the Networking Interface Device (NID). The NID contains the logic to transmit and receive packets over a network, dynamically reprogram hardware modules, and route individual traffic flows. A full, non-blocking switch is implemented on the NID to route packets between the networking interfaces and the modular components. Modular components of the FPX are implemented on a second FPGA called the Reprogrammable Application Device (RAD). Modules are loaded onto t...
    01/2001
  • John W. Lockwood, Jon S. Turner, David E. Taylor
  • ABSTRACT: Field Programmable Gate Arrays (FPGAs) are being used to provide fast Internet Protocol (IP) packet routing and advanced queuing in a highly scalable network switch. A new module, called the Field-programmable Port Extender (FPX), is being built to augment the Washington University Gigabit Switch (WUGS) with reprogrammable logic. FPX modules reside at the edge of the WUGS switching fabric. Physically, the module is inserted between an optical line card and the WUGS gigabit switch backplane. The hardware used for this project allows ports of the switch populated with an FPX to operate at rates up to 2.4 Gigabits/second. The aggregate throughput of the system scales with the number of switch ports. Logic on the FPX module is implemented with two FPGA devices. The first device is used to interface between the switch and the line card, while the second is used to prototype new networking functions and protocols. The logic on the second FPGA can be reprogrammed dynamically via control cells sent over the network. The flexibility of the FPX has made the card of interest for several networking applications. This year, fifty FPX hardware modules will be fabricated and distributed to researchers at eight universities around the country who are interested in experimenting with reprogrammable networks and per-flow queuing mechanisms. The FPX hardware will first be used to implement fast IP lookup algorithms and distributed input queueing.
    01/2000
  • David E. Taylor, Edward W. Spitznagel
    ABSTRACT: Packet switched networks such as the Internet require packet classification at every hop in order to apply services and security policies to traffic flows. The relentless increase in link speeds and traffic volume imposes stringent constraints on packet classification solutions. Ternary Content Addressable Memory (TCAM) devices are favored by most network component and equipment vendors due to the fast and deterministic lookup performance afforded by their use of massive parallelism. While able to keep up with high speed links, TCAMs suffer from exorbitant power consumption, poor scalability to longer search keys and larger filter sets, and inefficient support of multiple matches. The research community has responded with algorithms that seek to meet the lookup rate constraint with greater efficiency through the use of commodity Random Access Memory (RAM) technology. The most promising algorithms efficiently achieve high lookup rates by leveraging the statistical structure of real filter sets. Due to their dependence on filter set characteristics, it is difficult to provision processing and memory resources for implementations that support a wide variety of filter sets. We show how several algorithmic advances may be leveraged to improve the efficiency, scalability, incremental update and multiple match performance of CAM-based packet classification techniques without degrading the lookup performance. Our approach, Label Encoded Content Addressable Memory (LECAM), represents a hybrid technique that utilizes decomposition, label encoding, and a novel Content Addressable Memory (CAM) architecture. By reducing the number of implementation parameters, LECAM provides a vehicle to carry several of the recent algorithmic advances into practice. We provide a thorough overview of CAM technologies and packet classification algorithms, along with a detailed discussion of the scaling issues that arise with longer search keys and larger filter sets. We also provide a comparative analysis of LECAM and standard TCAM using a collection of real and synthetic filter sets of various sizes and compositions.
  • ABSTRACT: This paper describes the architecture of the Smart Port Card (SPC) designed for use with the Washington University Gigabit Switch. The SPC uses an embedded Intel Pentium processor running open-source NetBSD to support network management and active networking applications. The SPC physically connects between a switch port and a normal link adapter, allowing cell streams to be processed as they enter or leave the switch. In addition to the hardware architecture, this paper describes current and future applications for the SPC.
  • ABSTRACT: This document describes the design and functionality of the hardware components implemented in the Field-programmable Port eXtender (FPX) to support the Washington University Network Services Platform (NSP). This includes support for the Multi-Service Router (MSR) and Extreme Networking projects. The functionality of each component is described along with supporting top-level entity diagrams, block dia-