Phil DeMar

Fermi National Accelerator Laboratory (Fermilab), Batavia, Illinois, United States

Publications (24) · 4.12 Total impact

  • ABSTRACT: The LHC is entering its fourth year of production operation. Most Tier1 facilities have been in operation for almost a decade, when development and ramp-up efforts are included. The LHC's distributed computing model is based on the availability of high-capacity, high-performance network facilities for both WAN and LAN data movement, particularly within the Tier1 centers. As a result, the Tier1 centers tend to be on the leading edge of data center networking technology. In this paper, we analyze past and current developments in Tier1 LAN networking and extrapolate where we anticipate networking technology is heading. Our analysis examines the following areas: • Evolution of Tier1 centers to their current state • Evolving data center networking models and how they apply to Tier1 centers • Impact of emerging network technologies (e.g. 10GE-connected hosts, 40GE/100GE links, IPv6) on Tier1 centers • Trends in WAN data movement and the emergence of software-defined WAN network capabilities • Network virtualization
    Journal of Physics Conference Series 12/2012; 396(4):2011-.
  • ABSTRACT: Exascale science translates to big data. In the case of the Large Hadron Collider (LHC), the data is not only immense, it is also globally distributed. Fermilab is host to the LHC Compact Muon Solenoid (CMS) experiment's US Tier-1 Center. It must deal with both scaling and wide-area distribution challenges in processing its CMS data. This poster will describe the ongoing network-related R&D activities at Fermilab as a mosaic of efforts that combine to facilitate big data processing and movement.
    High Performance Computing, Networking, Storage and Analysis (SCC), 2012 SC Companion:; 01/2012
  • Source
    Wenji Wu, Matt Crawford, Phil DeMar
    ABSTRACT: Receive side scaling (RSS) is a network interface card (NIC) technology. It provides the benefits of parallel receive processing in multiprocessing environments. However, existing RSS-enabled NICs lack a critical data steering mechanism that would automatically steer incoming network data to the same core on which its application process resides. This absence causes inefficient cache usage if an application is not running on the core on which RSS has scheduled the received traffic to be processed. In Linux systems, RSS cannot even ensure that packets in a TCP flow are processed by a single core, even if the interrupts for the flow are pinned to a specific core. This results in degraded performance. In this paper, we develop such a data steering mechanism in the NIC for multicore or multiprocessor systems. This data steering mechanism is mainly targeted at TCP, but it can be extended to other transport layer protocols. We term a NIC with such a data steering mechanism "A Transport Friendly NIC" (A-TFN). Experimental results have proven the effectiveness of A-TFN in accelerating TCP/IP performance.
    IEEE Transactions on Parallel and Distributed Systems 06/2011; · 1.80 Impact Factor
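    A minimal C sketch (not the paper's A-TFN code) of the data-steering idea described in the abstract: hash a TCP flow's 4-tuple and look the flow up in a table that records the core where the consuming application runs, so receive processing is delivered to that core rather than to whatever core an RSS hash happens to pick. All names (flow_key, register_flow, steer) are illustrative.
      /* Hypothetical sketch: steer each incoming TCP flow to the core where
       * its consuming application runs. */
      #include <stdint.h>
      #include <stdio.h>

      #define TABLE_SIZE 256

      struct flow_key {                 /* TCP/IPv4 4-tuple */
          uint32_t saddr, daddr;
          uint16_t sport, dport;
      };

      /* Toy hash over the 4-tuple; real NICs typically use a Toeplitz hash. */
      static unsigned flow_hash(const struct flow_key *k)
      {
          uint32_t h = k->saddr ^ k->daddr ^ (((uint32_t)k->sport << 16) | k->dport);
          h ^= h >> 16;
          return h % TABLE_SIZE;
      }

      /* Steering table, filled in when an application's socket is pinned to a
       * core, so the driver can deliver that flow's packets to the same core. */
      static int flow_to_core[TABLE_SIZE];

      static void register_flow(const struct flow_key *k, int app_core)
      {
          flow_to_core[flow_hash(k)] = app_core;
      }

      static int steer(const struct flow_key *k)
      {
          return flow_to_core[flow_hash(k)];
      }

      int main(void)
      {
          struct flow_key f = { 0x0a000001, 0x0a000002, 5001, 80 };
          register_flow(&f, 3);          /* application runs on core 3 */
          printf("deliver this flow's packets to core %d\n", steer(&f));
          return 0;
      }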
  • Source
    Wenji Wu, P. DeMar, M. Crawford
    ABSTRACT: The Intel Ethernet Flow Director is an advanced network interface card (NIC) technology. It provides the benefits of parallel receive processing in multiprocessing environments and can automatically steer incoming network data to the same core on which its application process resides. However, our analysis and experiments show that Flow Director can cause packet reordering in multiprocessing environments. In this paper, we use a simplified model to analyze why Flow Director can cause packet reordering. Our experiments verify our analysis.
    IEEE Communications Letters 03/2011; · 1.16 Impact Factor
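    A toy C sketch (not from the letter) of the reordering mechanism the abstract analyzes: when a flow is re-steered from one receive queue to another mid-stream, packets still sitting in the old queue can be processed after newer packets that arrived on the new queue. Queue sizes and sequence numbers are made up.
      #include <stdio.h>

      #define QLEN 16

      struct queue { int seq[QLEN]; int n; };

      static void enqueue(struct queue *q, int s) { q->seq[q->n++] = s; }

      int main(void)
      {
          struct queue old_q = { {0}, 0 }, new_q = { {0}, 0 };

          /* Packets 1-3 arrive while the flow is steered to the old queue. */
          for (int s = 1; s <= 3; s++) enqueue(&old_q, s);

          /* The application migrates cores; the NIC re-steers the flow, so
           * packets 4-6 land on the new queue before the old one is drained. */
          for (int s = 4; s <= 6; s++) enqueue(&new_q, s);

          /* If the new queue's core runs first, TCP sees this delivery order: */
          printf("delivery order:");
          for (int i = 0; i < new_q.n; i++) printf(" %d", new_q.seq[i]);
          for (int i = 0; i < old_q.n; i++) printf(" %d", old_q.seq[i]);
          printf("   (4,5,6 ahead of 1,2,3: packet reordering)\n");
          return 0;
      }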
  • Source
    Wenji Wu, Phil DeMar, Matt Crawford
    IEEE Communications Letters 01/2011; 15:253-255. · 1.16 Impact Factor
  • ABSTRACT: Network traffic is difficult to monitor and analyze, especially in high-bandwidth networks. Performance analysis, in particular, presents extreme complexity and scalability challenges. GPU (Graphics Processing Unit) technology has been utilized recently to accelerate general purpose scientific and engineering computing. GPUs offer extreme thread-level parallelism with hundreds of simple cores. Their data-parallel execution model can rapidly solve large problems with inherent data parallelism. At Fermilab, we have prototyped a GPU-accelerated network performance monitoring system, called G-NetMon, to support large-scale scientific collaborations. In this work, we explore new opportunities in network traffic monitoring and analysis with GPUs. Our system exploits the data parallelism that exists within network flow data to provide fast analysis of bulk data movement between Fermilab and collaboration sites. Experiments demonstrate that our G-NetMon can rapidly detect sub-optimal bulk data movements.
    CoRR. 01/2011; abs/1108.1785.
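    A plain-C sketch (not G-NetMon itself) of the data parallelism the abstract exploits: each flow record can be processed independently, so on a GPU every record maps naturally to a thread. Here an ordinary loop aggregates per-site transfer rates and flags movements below a threshold; the site indices, record values, and 100 Mbps cut are made up.
      #include <stdio.h>
      #include <stdint.h>

      struct flow_rec {              /* simplified NetFlow-style record */
          int      site_id;          /* remote collaboration site       */
          uint64_t bytes;
          double   duration_s;
      };

      #define NSITES 4
      #define SLOW_MBPS 100.0        /* hypothetical "sub-optimal" threshold */

      int main(void)
      {
          struct flow_rec flows[] = {
              { 0, 8ULL << 30,  60.0 },    /* 8 GiB in 60 s  */
              { 1, 1ULL << 30, 300.0 },    /* 1 GiB in 300 s */
          };
          double mbits[NSITES] = {0}, secs[NSITES] = {0};

          /* Data-parallel step: every record contributes independently. */
          for (size_t i = 0; i < sizeof flows / sizeof flows[0]; i++) {
              mbits[flows[i].site_id] += flows[i].bytes * 8.0 / 1e6;
              secs[flows[i].site_id]  += flows[i].duration_s;
          }

          for (int s = 0; s < NSITES; s++) {
              if (secs[s] == 0.0) continue;
              double rate = mbits[s] / secs[s];
              printf("site %d: %.1f Mbps%s\n", s, rate,
                     rate < SLOW_MBPS ? "  <- sub-optimal bulk movement" : "");
          }
          return 0;
      }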
  • Source
    W Wu, P Demar, A Bobyshev
    ABSTRACT: Large-scale research efforts such as the LHC experiments, ITER, and climate modelling are built upon large, globally distributed collaborations. For reasons of scalability and agility, and to make effective use of existing computing resources, data processing and analysis for these projects are based on distributed computing models. Such projects thus depend on predictable and efficient bulk data movement between collaboration sites. However, the computing and networking resources available to different collaboration sites vary greatly. Large collaboration sites (such as Fermilab and CERN) have created data centres comprising hundreds, and even thousands, of computation nodes to develop massively scaled, highly distributed cluster-computing platforms. These sites are usually well connected to the outside world by high-speed networks with bandwidth greater than 10 Gbps. On the other hand, some small collaboration sites have limited computing resources or poor network connectivity. As a result, bulk data movements across collaboration sites vary greatly. Fermilab is the US-CMS Tier-1 Centre and the main data centre for a few other large-scale research collaborations. Scientific traffic (e.g., CMS) dominates the traffic volumes in both the inbound and outbound directions of Fermilab off-site traffic. Fermilab has deployed a flow-based network traffic collection and analysis system to monitor and analyze the status and patterns of bulk data movement between the Laboratory and its collaboration sites. In this paper, we discuss the current status and patterns of bulk data movement between Fermilab and its collaboration sites.
    Journal of Physics Conference Series 01/2011; 33122663.
  • ABSTRACT: At Fermilab, we have prototyped a GPU-accelerated network performance monitoring system, called G-NetMon, to support large-scale scientific collaborations. In this work, we explore new opportunities in network traffic monitoring and analysis with GPUs. Our system exploits the data parallelism that exists within network flow data to provide fast analysis of bulk data movement between Fermilab and collaboration sites. Experiments demonstrate that our G-NetMon can rapidly detect sub-optimal bulk data movements.
    01/2011;
  • A Bobyshev, P DeMar
    ABSTRACT: Fermilab hosts the US Tier-1 Center for the LHC's Compact Muon Solenoid (CMS) experiment. The Tier-1s are the central points for the processing and movement of LHC data. They sink raw data from the Tier-0 at CERN, process and store it locally, and then distribute the processed data to Tier-2s for simulation studies and analysis. The Fermilab Tier-1 Center is the largest of the CMS Tier-1s, accounting for roughly 35% of the experiment's Tier-1 computing and storage capacity. Providing capacious, resilient network services, both in terms of local network infrastructure and off-site data movement capabilities, presents significant challenges. This article will describe the current architecture, status, and near-term plans for network support of the US-CMS Tier-1 facility.
    Journal of Physics Conference Series 01/2011; 331(1).
  • Source
    ABSTRACT: Computing is now shifting towards multiprocessing. The fundamental goal of multiprocessing is improved performance through the introduction of additional hardware threads or cores (referred to as "cores" for simplicity). Modern network stacks can exploit parallel cores to allow either message-based parallelism or connection-based parallelism as a means to enhance performance. The OpenSolaris network stack has been redesigned and parallelized to better utilize additional cores. Three technologies, named Soft Ring Set, Soft Ring, and Squeue, are introduced in OpenSolaris for stack parallelization. In this paper, we study the OpenSolaris packet receiving process and its core parallelism optimization techniques. Experimental results show that these techniques allow OpenSolaris to achieve better network I/O performance in multiprocessing environments; however, network stack parallelization has also brought extra overhead to the system. An effective and efficient network I/O optimization in multiprocessing environments must cross all levels of the network stack, from network interface to application.
    Local Computer Networks, Annual IEEE Conference on. 10/2010;
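    A small C sketch (not OpenSolaris code) of the connection-based parallelism the abstract describes: each connection is hashed onto one per-core serialization queue, in the spirit of the squeue, so packets of a single connection stay ordered on one core while different connections are serviced in parallel. The queue count and addresses are arbitrary.
      #include <stdio.h>
      #include <stdint.h>

      #define NQUEUES 4   /* one serialization queue (squeue) per core */

      /* A connection always hashes to the same queue, preserving its ordering. */
      static unsigned conn_to_queue(uint32_t saddr, uint16_t sport, uint16_t dport)
      {
          return (saddr ^ ((uint32_t)sport << 16) ^ dport) % NQUEUES;
      }

      int main(void)
      {
          unsigned q1 = conn_to_queue(0x0a000001, 40000, 80);
          unsigned q2 = conn_to_queue(0x0a000002, 40001, 80);
          printf("connection 1 -> squeue %u, connection 2 -> squeue %u\n", q1, q2);
          printf("distinct connections can be processed by different cores in parallel\n");
          return 0;
      }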
  • Source
    ABSTRACT: The TeraPaths, Lambda Station, and Phoebus projects, funded by the US Department of Energy, have successfully developed network middleware services that establish and manage on-demand, true end-to-end, Quality-of-Service (QoS) aware virtual network paths across multiple administrative network domains; select network paths and gracefully reroute traffic over these dynamic paths; and streamline traffic between packet and circuit networks using transparent gateways. These services improve network QoS and performance for applications, playing a critical role in the effective use of emerging dynamic circuit network services. They provide interfaces to applications, such as dCache SRM, translate network service requests into network device configurations, and coordinate with each other to set up end-to-end network paths. The End Site Control Plane Subsystem (ESCPS) builds upon the success of the three projects by combining their individual capabilities into the next generation of network middleware. ESCPS addresses challenges such as cross-domain control plane signalling and interoperability, authentication and authorization in a Grid environment, topology discovery, and dynamic status tracking. The new network middleware will take full advantage of the perfSONAR monitoring infrastructure and the Inter-Domain Control plane efforts, and will be deployed and fully vetted in the Large Hadron Collider data movement environment.
    Journal of Physics Conference Series 05/2010; 219(6):062034.
  • Source
    Wenji Wu, Phil Demar, Matt Crawford
    ABSTRACT: TCP performs poorly in networks with serious packet reordering. Processing reordered packets in the TCP-layer is costly and inefficient, involving interaction of the sender and receiver. Motivated by the interrupt coalescing mechanism that delivers packets upward for protocol processing in blocks, we propose a new strategy, Sorting Reordered Packets with Interrupt Coalescing (SRPIC), to reduce packet reordering in the receiver. SRPIC works in the network device driver; it makes use of the interrupt coalescing mechanism to sort the reordered packets belonging to the same TCP stream in a block of packets before delivering them upward; each sorted block is internally ordered. Experiments have proven the effectiveness of SRPIC against forward path reordering.
    Computer Networks. 01/2009;
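    A minimal C sketch (not the SRPIC driver code) of the core idea: within one block of packets handed up by interrupt coalescing, sort the packets of a TCP stream by sequence number before delivery, so each block is internally ordered. The example ignores sequence-number wraparound, which a real implementation must handle.
      #include <stdio.h>
      #include <stdint.h>
      #include <stdlib.h>

      struct pkt { uint32_t tcp_seq; /* headers and payload omitted */ };

      static int by_seq(const void *a, const void *b)
      {
          uint32_t sa = ((const struct pkt *)a)->tcp_seq;
          uint32_t sb = ((const struct pkt *)b)->tcp_seq;
          return (sa > sb) - (sa < sb);   /* no wraparound handling */
      }

      int main(void)
      {
          /* One coalesced block that arrived with forward-path reordering. */
          struct pkt block[] = { {3000}, {1000}, {4000}, {2000} };
          size_t n = sizeof block / sizeof block[0];

          qsort(block, n, sizeof block[0], by_seq);   /* sort before delivering upward */

          printf("delivering block:");
          for (size_t i = 0; i < n; i++) printf(" %u", block[i].tcp_seq);
          printf("\n");
          return 0;
      }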
  • Source
    ABSTRACT: Lambda Station is an ongoing project of Fermi National Accelerator Laboratory and the California Institute of Technology. The goal of this project is to design, develop, and deploy network services for path selection, admission control, and flow-based forwarding of traffic among data-intensive Grid applications such as those used in High Energy Physics and other communities. Lambda Station deals with the last-mile problem in local area networks, connecting production clusters through a rich array of wide area networks. Selective forwarding of traffic is controlled dynamically at the demand of applications. This paper introduces the motivation for this project, its design principles, and its current status. Integration of the Lambda Station client API with essential Grid middleware, such as the dCache/SRM Storage Resource Manager, is also described. Finally, the results of applying Lambda Station services to development and production clusters at Fermilab and Caltech over advanced networks, such as DOE's UltraScience Net and NSF's UltraLight, are covered.
    Broadband Communications, Networks and Systems, 2006. BROADNETS 2006. 3rd International Conference on; 11/2006
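    An illustrative C sketch (not the Lambda Station implementation) of selective forwarding: traffic matching a flow specification registered by an application is steered onto the alternate "high-impact" path, while everything else stays on the default routed path. The subnet, mask, and path names are invented for the example.
      #include <stdio.h>
      #include <stdint.h>

      enum path { PATH_DEFAULT, PATH_HIGH_IMPACT };

      struct flowspec { uint32_t dst_net, dst_mask; };   /* registered on demand by the app */

      static enum path select_path(uint32_t dst, const struct flowspec *fs)
      {
          if ((dst & fs->dst_mask) == fs->dst_net)
              return PATH_HIGH_IMPACT;        /* e.g. policy-routed onto the circuit */
          return PATH_DEFAULT;
      }

      int main(void)
      {
          /* Hypothetical registration for a bulk transfer to 192.168.100.0/24. */
          struct flowspec transfer = { 0xc0a86400, 0xffffff00 };
          uint32_t dst_in  = 0xc0a86417;      /* inside the registered subnet */
          uint32_t dst_out = 0x08080808;      /* ordinary destination         */
          printf("registered destination   -> %s\n",
                 select_path(dst_in,  &transfer) == PATH_HIGH_IMPACT ? "high-impact path" : "default path");
          printf("unregistered destination -> %s\n",
                 select_path(dst_out, &transfer) == PATH_HIGH_IMPACT ? "high-impact path" : "default path");
          return 0;
      }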
  • Source
    3rd International Conference on Broadband Communications, Networks, and Systems (BROADNETS 2006), 1-5 October 2006, San José, California, USA; 01/2006
  • A. Bobyshev, D. Lamore, P. Demar
    ABSTRACT: In a large campus network such as Fermilab's, with tens of thousands of nodes, scanning initiated from either outside or within the campus network raises security concerns. This scanning may have a very serious impact on network performance, and can even disrupt normal operation of many services. In this paper we introduce a system for detecting and automatically blocking the excessive traffic generated by different kinds of scanning, DoS attacks, and virus-infected computers. The system, called AutoBlocker, is a distributed computing system based on quasi-real-time analysis of network flow data collected from the border router and core switches. AutoBlocker also has an interface to accept alerts from IDS systems (e.g. BRO, SNORT) that are based on other technologies. The system has multiple configurable alert levels for the detection of anomalous behavior and configurable trigger criteria for automated blocking of scans at the core or border routers. It has been in use at Fermilab for about two years and has become a very valuable tool to curtail scan activity within the Fermilab campus network.
    11/2004
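    A simplified C sketch (not AutoBlocker itself) of the detection idea: from flow data, count the distinct destination hosts a source touches within a time window and raise a blocking trigger when a configurable threshold is exceeded. The threshold and addresses here are invented.
      #include <stdio.h>
      #include <stdint.h>

      #define MAX_DSTS       1024
      #define SCAN_THRESHOLD 100    /* distinct destinations per window (hypothetical) */

      static uint32_t seen[MAX_DSTS];
      static int nseen;

      /* Record a destination and return how many distinct ones have been seen. */
      static int note_destination(uint32_t dst)
      {
          for (int i = 0; i < nseen; i++)
              if (seen[i] == dst) return nseen;
          if (nseen < MAX_DSTS) seen[nseen++] = dst;
          return nseen;
      }

      int main(void)
      {
          /* Simulated flow records: one source sweeping 150 hosts of a subnet. */
          int distinct = 0;
          for (uint32_t host = 1; host <= 150; host++)
              distinct = note_destination(0x0a010000u | host);

          if (distinct > SCAN_THRESHOLD)
              printf("source exceeded %d distinct destinations (%d): trigger block\n",
                     SCAN_THRESHOLD, distinct);
          return 0;
      }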
  • Source
    ABSTRACT: The Compact Muon Solenoid (CMS) experiment at CERN's Large Hadron Collider (LHC) is scheduled to come on-line in 2007. Fermilab will act as the CMS Tier-1 centre for the US and make experiment data available to more than 400 researchers in the US participating in the CMS experiment. The US CMS Users Facility group, based at Fermilab, has initiated a project to develop a model for optimizing movement of CMS experiment data between CERN and the various tiers of US CMS data centres, and to design a WAN emulation facility that enables controlled local testing of unmodified or modified CMS applications and TCP implementations under conditions that emulate WAN connectivity. The WAN emulator facility is configurable for latency, jitter, and packet loss. The initial implementation is based on the NISTnet software product. In this paper we describe the status of the project to date and the results of validating and comparing performance measurements obtained in emulated and real environments for different applications, including multi-stream GridFTP. We will also outline short-term and intermediate-term plans, as well as outstanding problems and issues.
    11/2004
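    A conceptual C sketch (not NISTnet) of what such an emulator does to each packet: add a configured base latency plus random jitter, and drop a configurable fraction of packets. The parameter values are arbitrary examples, not measured WAN characteristics.
      #include <stdio.h>
      #include <stdlib.h>

      #define BASE_DELAY_MS 60.0     /* one-way latency to emulate */
      #define JITTER_MS     10.0     /* +/- uniform jitter         */
      #define LOSS_PROB     0.01     /* packet loss probability    */

      int main(void)
      {
          srand(42);
          int sent = 10, delivered = 0;

          for (int i = 0; i < sent; i++) {
              if ((double)rand() / RAND_MAX < LOSS_PROB) {
                  printf("packet %d dropped\n", i);
                  continue;
              }
              double jitter = ((double)rand() / RAND_MAX * 2.0 - 1.0) * JITTER_MS;
              printf("packet %d delayed %.1f ms\n", i, BASE_DELAY_MS + jitter);
              delivered++;
          }
          printf("%d of %d packets delivered\n", delivered, sent);
          return 0;
      }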
  • Source
    ABSTRACT: We present a measurement of the mass difference m(D_s+) − m(D+), where both the D_s+ and D+ are reconstructed in the φπ+ decay channel. This measurement uses 11.6 pb⁻¹ of data collected by CDF II using the new displaced-track trigger. The mass difference is found to be m(D_s+) − m(D+) = 99.41 ± 0.38(stat) ± 0.21(syst) MeV/c².
    Physics Research Publications.
  • Source
    ABSTRACT: The LHC era will start very soon, creating immense data volumes that can demand allocation of an entire network circuit for task-driven applications. Circuit-based alternate network paths are one solution to meeting the LHC's high-bandwidth network requirements. The Lambda Station project is aimed at addressing growing requirements for dynamic allocation of alternate network paths. Lambda Station facilitates the rerouting of designated traffic through site LAN infrastructure onto so-called 'high-impact' wide-area networks. The prototype Lambda Station, developed with a Service Oriented Architecture (SOA) approach in mind, will be presented. Lambda Station has been successfully integrated into the production version of the Storage Resource Manager (SRM) and deployed at the US-CMS Tier-1 center at Fermilab, as well as at the US-CMS Tier-2 site at Caltech. This paper will discuss experiences using the prototype system with production SciDAC applications for data movement between Fermilab and Caltech. The architecture and design principles of the production-version Lambda Station software, currently being implemented as Java-based web services, will also be presented.
  • Source
    ABSTRACT: Fermilab hosts the US Tier-1 center for data storage and analysis of the Large Hadron Collider's (LHC) Compact Muon Solenoid (CMS) experiment. To satisfy operational requirements for the LHC networking model, the networking group at Fermilab, in collaboration with Internet2 and ESnet, is participating in the perfSONAR-PS project. This collaboration has created a collection of network monitoring services targeted at providing continuous network performance measurements across wide-area distributed computing environments. The perfSONAR-PS services are packaged as a bundle, and include a bootable disk capability. We have started on a deployment plan consisting of a decentralized mesh of these network monitoring services at US LHC Tier-1 and Tier-2 sites. The initial deployment will cover all Tier-1 and Tier-2 sites of US ATLAS and US CMS. This paper will outline the basic architecture of each network monitoring service. The service discovery model, interoperability, and basic protocols will be presented. The principal deployment model and available packaging options will be detailed. The current state of deployment and the availability of higher-level user interfaces and analysis tools will also be demonstrated.
  • Source
    Andrey Bobyshev, Phil DeMar