John Heidemann's research while affiliated with University of Southern California and other places

Publications (256)

Preprint
Full-text available
Relevance estimators are algorithms used by major social media platforms to determine what content is shown to users and its presentation order. These algorithms aim to personalize the platforms' experience for users, increasing engagement and, therefore, platform revenue. However, at the large scale of many social media platforms, many have concer...
Preprint
Full-text available
After 50 years, the Internet is still defined as "a collection of interconnected networks". Yet desires of countries for "their own internet" (Internet secession?), country-level firewalling, and persistent peering disputes all challenge the idea of a single set of "interconnected networks". We show that the Internet today has peninsulas of persist...
Preprint
Services on the public Internet are frequently scanned, then subject to brute-force and denial-of-service attacks. We would like to run such services stealthily, available to friends but hidden from adversaries. In this work, we propose a moving target defense named "Chhoyhopper" that utilizes the vast IPv6 address space to conceal publicly availab...
Preprint
Ad platforms such as Facebook, Google and LinkedIn promise value for advertisers through their targeted advertising. However, multiple studies have shown that ad delivery on such platforms can be skewed by gender or race due to hidden algorithmic optimization by the platforms, even when not requested by the advertisers. Building on prior work measu...
Preprint
The Covid-19 pandemic has radically changed our lives. Under different circumstances, people react to it in various ways. One way is to work from home, since lockdowns have been announced in many regions around the world. For some places, however, we don't know if people really work from home due to the lack of information. Since there are lots of unc...
Preprint
DNS is important in nearly all interactions on the Internet. All large DNS operators use IP anycast, announcing servers in BGP from multiple physical locations to reduce client latency and provide capacity. However, DNS is easy to spoof: third parties intercept and respond to queries for benign or malicious purposes. Spoofing is of particular risk...
Article
Operational services run 24×7 and require analytics pipelines to evaluate performance. In mature services such as domain name system (DNS), these pipelines often grow to many stages developed by multiple, loosely coupled teams. Such pipelines pose two problems: first, computation and data storage may be duplicated across components developed by dif...
Article
Distributed Denial-of-Service (DDoS) attacks launched from compromised Internet-of-Things (IoT) devices have shown how vulnerable the Internet is to large-scale DDoS attacks. To understand the risks of these attacks requires learning about these IoT devices: where are they? how many are there? how are they changing? This paper describes three new m...
Preprint
Full-text available
IP Anycast is used for services such as DNS and Content Delivery Networks to provide the capacity to handle Distributed Denial-of-Service (DDoS) attacks. During a DDoS attack service operators may wish to redistribute traffic between anycast sites to take advantage of sites with unused or greater capacity. Depending on site traffic and attack size,...
Chapter
There is a growing interest in carefully observing the reliability of the Internet’s edge. Outage information can inform our understanding of Internet reliability and planning, and it can help guide operations. Active outage detection methods provide results for more than 3M blocks, and passive methods more than 2M, but both are challenged by spars...
Preprint
Full-text available
Machine-learning-based anomaly detection (ML-based AD) has been successful at detecting DDoS events in the lab. However, published evaluations of ML-based AD have only had limited data and have not provided insight into why it works. To address limited evaluation against real-world data, we apply an autoencoder, an existing ML-AD model, to 57 DDoS atta...
Conference Paper
Full-text available
DNS depends on extensive caching for good performance, and every DNS zone owner must set Time-to-Live (TTL) values to control their DNS caching. Today there is relatively little guidance backed by research about how to set TTLs, and operators must balance conflicting demands of caching against agility of configuration. Exactly how TTL value choices...
Article
With the vast amount of content online, it is not surprising that unscrupulous entities "borrow" from the web to provide content for advertisements, link farms, and spam. Our insight is that cryptographic hashing and fingerprinting can efficiently identify content reuse for web-size corpora. We develop two related algorithms, one to automatically *disc...
Presentation
Full-text available
The PAADDoS project’s goal is to bring DDoS defense with reach and to democratize anycast as a defense. Anycast uses Internet routing to associate users with geographically close sites of a replicated service. During DDoS, anycast sites can provide the capacity to absorb an attack, and they can be used to isolate the attack to part of the network.
Conference Paper
DNS has evolved over the last 20 years, improving in security and privacy and broadening the kinds of applications it supports. However, this evolution has been slowed by the large installed base and the wide range of implementations. The impact of changes is difficult to model due to complex interactions between DNS optimizations, caching, and dis...
Conference Paper
The Internet's Domain Name System (DNS) is a frequent target of Distributed Denial-of-Service (DDoS) attacks, but such attacks have had very different outcomes---some attacks have disabled major public websites, while the external effects of other attacks have been minimal. While on one hand the DNS protocol is relatively simple, the system has man...
Conference Paper
DNS backscatter detects internet-wide activity by looking for common reverse DNS lookups at authoritative DNS servers that are high in the DNS hierarchy. Both DNS backscatter and monitoring unused address space (darknets or network telescopes) can detect scanning in IPv4, but with IPv6's vastly larger address space, darknets become much less effect...
Conference Paper
Recent IoT-based DDoS attacks have exposed how vulnerable the Internet can be to millions of insufficiently secured IoT devices. To understand the risks of these attacks requires learning about these IoT devices---where are they, how many are there, how are they changing? In this paper, we propose a new method to find IoT devices in the Internet to beg...
Conference Paper
Today's malware often relies on DNS to enable communication with command-and-control (C&C). As defenses that block C&C traffic improve, malware uses sophisticated techniques to hide this traffic, including "fast flux" names and Domain-Generation Algorithms (DGAs). Detecting this kind of activity requires analysis of DNS queries in network traffic, y...
Chapter
ICMP active probing is the center of many network measurements. Rate limiting to ICMP traffic, if undetected, could distort measurements and create false conclusions. To settle this concern, we look systematically for ICMP rate limiting in the Internet. We create FADER, a new algorithm that can identify rate limiting from user-side traces with mini...
Article
Anycast-based services today are widely used commercially, with several major providers serving thousands of important websites. However, to our knowledge, there has been only limited study of how often anycast fails because routing changes interrupt connections between users and their current anycast site. While the commercial success of anycast C...
Conference Paper
Full-text available
IP anycast provides DNS operators and CDNs with automatic fail-over and reduced latency by breaking the Internet into catchments, each served by a different anycast site. Unfortunately, understanding and predicting changes to catchments as anycast sites are added or removed has been challenging. Current tools such as RIPE Atlas or commercial equiva...
Conference Paper
In the Internet Domain Name System (DNS), services operate authoritative name servers that individuals query through recursive resolvers. Operators strive to provide reliability by operating multiple name servers (NS), each on a separate IP address, and by using IP anycast to allow NSes to provide service from many physical locations. To meet their goa...
Conference Paper
Accurate information about address and block usage in the Internet has many applications in planning address allocation, topology studies, and simulations. Prior studies used active probing, sometimes augmented with passive observation, to study macroscopic phenomena, such as the overall usage of the IPv4 address space. This paper instead studies t...
Technical Report
Full-text available
IP anycast provides DNS operators and CDNs with automatic fail-over and reduced latency by breaking the Internet into catchments, each served by a different anycast site. Unfortunately, understanding and predicting changes to catchments as sites are added or removed has been challenging. Current tools such as RIPE Atlas or commercial equivalents m...
Conference Paper
Full-text available
Anycast is widely used today to provide important services such as DNS and Content Delivery Networks (CDNs). An anycast service uses multiple sites to provide high availability, capacity and redundancy. BGP routing associates users to sites, defining the catchment that each site serves. Although prior work has studied how users associate with anyca...
Conference Paper
Full-text available
Distributed Denial-of-Service (DDoS) attacks continue to be a major threat in the Internet today. DDoS attacks overwhelm target services with requests or other traffic, causing requests from legitimate users to be shut out. A common defense against DDoS is to replicate the service in multiple physical locations or sites. If all sites announce a com...
Technical Report
Full-text available
Anycast is widely used today to provide important services including naming and content, with DNS and Content Delivery Networks (CDNs). An anycast service uses multiple sites to provide high availability, capacity and redundancy, with BGP routing associating users to nearby anycast sites. Routing defines the catchment of the users that each site se...
Conference Paper
Today, Transport-Layer Security (TLS) is the bedrock of Internet security for the web and web-derived applications. TLS depends on the X.509 Public Key Infrastructure (PKI) to authenticate endpoint identity. An essential part of a PKI is the ability to quickly revoke certificates, for example, after a key compromise. Today the Online Certificate St...
Conference Paper
Network-wide activity is when one computer (the originator) touches many others (the targets). Motives for activity may be benign (mailing lists, CDNs, and research scanning), malicious (spammers and scanners for security vulnerabilities), or perhaps indeterminate (ad trackers). Knowledge of malicious activity may help anticipate attacks, and under...
Article
The Domain Name System (DNS) seems ideal for connectionless UDP, yet this choice results in challenges of eavesdropping that compromises privacy, source-address spoofing that simplifies denial-of-service (DoS) attacks on the server and third parties, injection attacks that exploit fragmentation, and reply-size limits that constrain key sizes and po...
Conference Paper
Large web services employ CDNs to improve user performance. CDNs improve performance by serving users from nearby Front-End (FE) Clusters. They also spread users across FE Clusters when one is overloaded or unavailable and others have unused capacity. Our paper is the first to study the dynamics of the user-to-FE Cluster mapping for Google and Akam...
Conference Paper
The DANE (DNS-based Authentication of Named Entities) framework uses DNSSEC to provide a source of trust, and with TLSA it can serve as a root of trust for TLS certificates. This serves to complement traditional certificate authentication methods, which is important given the risks inherent in trusting hundreds of organizations—risks already demons...
Article
As the Internet matures, policy questions loom larger in its operation. When should an ISP, city, or government invest in infrastructure? How do their policies affect use? In this work, we develop a new approach to evaluate how policies, economic conditions and technology correlate with Internet use around the world. First, we develop an adaptive...
Article
DNS is the canonical protocol for connectionless UDP. Yet DNS today is challenged by eavesdropping that compromises privacy, source-address spoofing that results in denial-of-service (DoS) attacks on the server and third parties, injection attacks that exploit fragmentation, and size limitations that constrain policy and operational choices. We pro...
Conference Paper
People’s computing lives are moving into the cloud, making understanding cloud availability increasingly critical. Prior studies of Internet outages have used ICMP-based pings and traceroutes. While these studies can detect network availability, we show that they can be inaccurate at estimating cloud availability. Without care, ICMP probes can unde...
Article
The large volume of data associated with social networks hinders the unaided user from interpreting network content in real time. This problem is compounded by the fact that there are limited tools available for enabling robust visual social network ...
Article
To understand network behavior, researchers and enterprise network operators must interpret large amounts of network data. To understand and manage network events such as outages, route instability, and spam campaigns, they must interpret data that covers a range of networks and evolves over time. We propose a simple clustering algorithm that helps...
Conference Paper
In this paper we present tools and methods to integrate attack measurements from the Internet with controlled experimentation on a network testbed. We show that this approach provides greater fidelity than synthetic models. We compare the statistical properties of real-world attacks with synthetically generated constant bit rate attacks on the test...
Conference Paper
Modern content-distribution networks both provide bulk content and act as "serving infrastructure" for web services in order to reduce user-perceived latency. Serving infrastructures such as Google's are now critical to the online economy, making it imperative to understand their size, geographic distribution, and growth strategies. To this end, we...
Conference Paper
Natural and human factors cause Internet outages---from big events like Hurricane Sandy in 2012 and the Egyptian Internet shutdown in Jan. 2011 to small outages every day that go unpublicized. We describe Trinocular, an outage detection system that uses active probing to understand reliability of edge networks. Trinocular is principled: deriving a...
Conference Paper
IP anycast is a central part of production DNS. While prior work has explored proximity, affinity and load balancing for some anycast services, there has been little attention to third-party discovery and enumeration of components of an anycast service. Enumeration can reveal abnormal service configurations, benign masquerading or hostile hijacking...
Conference Paper
End-to-end reachability is a fundamental service of the Internet. We study network outages caused by natural disasters [2,5], and political upheavals [8]. We propose a new approach to outage detection using active probing. Like prior outage detection methods [3,4], our method uses ICMP echo requests (“pings”) to detect outages, but we probe with gr...
Article
Full-text available
The principles of sensor networks—low-power, wireless, in-situ sensing with many inexpensive sensors—are only recently penetrating into underwater research. Acoustic communication is best suited for underwater communication, with much lower attenuation than RF, but acoustic propagation is five orders-of-magnitude slower than RF, so propagation time...
Conference Paper
Previous measurement-based IP geolocation algorithms have focused on accuracy, studying a few targets with increasingly sophisticated algorithms taking measurements from tens of vantage points (VPs). In this paper, we study how to scale up existing measurement-based geolocation algorithms like Shortest Ping and CBG to cover the whole Internet. We s...
Article
This chapter surveys network-level approaches to conserve energy in sensor networks. We consider protocols for transmission power control, media access control, topology control, and energy-aware routing, surveying relevant literature and describing approaches that have been considered.
Article
Sensor networks have seen a new focus of development by DARPA, NSF, academia and industry since 1999. The key benefits of terrestrial sensor networks stem from inexpensive nodes, placed near what is being sensed, sharing a short-range wireless network. By comparison, underwater sensing today is often expensive, sparsely deployed, and wired, or with...
Article
Full-text available
The underwater acoustic channel has many fundamentally different properties to the conventional radio-based wireless channel. Communication is changed drastically by acoustic propagation speeds that are five orders of magnitude slower than radio. This, coupled with bandwidth limitations, high transmit energy cost, complex multi-path effects, and...
Article
This paper examines the main approaches and challenges in the design and implementation of underwater wireless sensor networks. We summarize key applications and the main phenomena related to acoustic propagation, and discuss how they affect the design and operation of communication systems and networking protocols at various layers. We also provid...
Conference Paper
Full-text available
As desktops and servers become more complicated, they employ an increasing amount of automatic, non-user initiated communication. Such communication can be good (OS updates, RSS feed readers, and mail polling), bad (keyloggers, spyware, and botnet command-and-control), or ugly (adware or unauthorized peer-to-peer applications). Communication in the...
Article
This paper develops parametric methods to detect network anomalies using only aggregate traffic statistics, in contrast to other works requiring flow separation, even when the anomaly is a small fraction of the total traffic. By adopting simple statistical models for anomalous and background traffic in the time domain, one can estimate model parame...
Article
This paper develops parametric methods to detect network anomalies using only aggregate traffic statistics in contrast to other works requiring flow separation, even when the anomaly is a small fraction of the total traffic. By adopting simple statistical models for anomalous and background traffic in the time-domain, one can estimate model paramet...
Conference Paper
An understanding of Internet topology is central to answer various questions ranging from network resilience to peer selection or data center location. While much of prior work has examined AS-level connectivity, meaningful and relevant results from such an abstract view of Internet topology have been limited. For one, semantically, AS relationship...
Conference Paper
An Internet hitlist is a set of addresses that cover and can represent the Internet as a whole. Hitlists have long been used in studies of Internet topology, reachability, and performance, serving as the destinations of traceroute or performance probes. Most early topology studies used manually generated lists of prominent addresses, but evolut...
Conference Paper
Although the Internet is widely used today, we have little information about the edge of the network. Decentralized management, firewalls, and sensitivity to probing prevent easy answers and make measurement difficult. Building on frequent ICMP probing of 1% of the Internet address space, we develop clustering and analysis methods to estimate how I...
Conference Paper
It is well known that spam bots mostly utilize compromised machines with certain address characteristics, such as dynamically allocated addresses, machines in specific geographic areas and IP ranges from ASes with more tolerant spam policies. Such machines tend to be less diligently administered and may exhibit less stability, more volatility, and s...
Article
Full-text available
The key aspect in the design of any contention-based medium access control (MAC) protocol is the mechanism to measure and resolve simultaneous contention. Generally, terrestrial wireless MACs can only observe success or collision of a contention attempt through carrier sense. An implicit estimate of the number of contenders occurs through repeated...
Conference Paper
Prior studies of Internet traffic have considered traffic at different resolutions and time scales: packets and flows for hours or days, aggregate packet statistics for days or weeks, and hourly trends for months. However, little is known about the long-term behavior of individual flows. In this paper, we study individual flows (as defined by the 5...
Article
Today many individual deployments of sensornets are successful, but they will have much greater impact when, rather than standing alone, they share data across deployments so each can build upon the others. We expect data to be shared over the Internet, and as the number of processing and reprocessing steps grows, timely data synchronization is...
Article
Although the Internet is widely used today, there are few sound estimates of network demographics. Decentralized network management means questions about Internet use cannot be answered by a central authority, and firewalls and sensitivity to probing mean that active measurements must be done carefully and validated against known data. Building o...
Article
Network datasets are necessary for many types of network research. While there has been significant discussion about specific datasets, there has been less about the overall state of network data collection. The goal of this paper is to explore the research questions facing the Internet today, the datasets needed to answer those questions, and the...
Article
Persistently saturated links are abnormal conditions that indicate bottlenecks in Internet traffic. Network operators are interested in detecting such links for troubleshooting, to improve capacity planning and traffic estimation, and to detect denial-of-service attacks. Currently bottleneck links can be detected either locally, through SNMP inform...
Article
Full-text available
The 13 papers in this special issue focus on underwater wireless communication networks.
Article
Full-text available
This paper introduces T-Lohi, a new class of distributed and energy-efficient media-access protocols (MAC) for underwater acoustic sensor networks (UWSN). MAC design for UWSN faces significant challenges. For example, acoustic communication suffers from latencies five orders-of-magnitude larger than radio communication, so a naive CSMA MAC would re...
Article
A new class of sensor network applications is mostly-off. Exemplified by Intel’s FabApp, in these applications the network alternates between being off for hours or weeks, then activating to collect data for a few minutes. While configuration of traditional sensornet applications is occasional and so need not be optimized, these applications may sp...
Article
Full-text available
A true hacker is an individual who can achieve miracles by appropriating, modifying, or "kludging" existing resources (devices, hardware, software, or anything within reach) to suit other purposes, often in an ingenious fashion. Gifted hackers can be thought of as transmutation or collage artists. They take everything from streamlined commercial pr...
Conference Paper
This paper develops two parametric methods to detect low-rate denial-of-service attacks and other similar near-periodic traffic, without the need for flow separation. The first method, the periodic attack detector, is based on a previous approach that exploits the near-periodic nature of attack traffic in aggregate traffic by modeling the peak freq...
Article
Full-text available
Most wireless medium access control (MAC) protocols today prevent nearby concurrent communication due to concern that it would corrupt ongoing communication. While recent work has demonstrated that channel capture allows successful concurrent communication, MAC protocols to date have not exploited this approach to improve performance. In this pa...