Alan Mislove's research while affiliated with Northeastern University and other places
What is this page?
This page lists the scientific contributions of an author, who either does not have a ResearchGate profile, or has not yet added these contributions to their profile.
It was automatically created by ResearchGate to create a record of this author's body of work. We create such pages to advance our goal of creating and maintaining the most comprehensive scientific repository possible. In doing so, we process publicly available (personal) data relating to the author as a member of the scientific community.
Publications (126)
Targeted advertising remains an important part of the free web browsing experience, where advertisers' targeting and personalization algorithms together find the most relevant audience for millions of ads every day. However, given the wide use of advertising, this also enables using ads as a vehicle for problematic content, such as scams or clickba...
Machines powered by artificial intelligence increasingly mediate our social, cultural, economic, and political interactions. This chapter frames and surveys the emerging interdisciplinary field of machine behaviour: the scientific study of behaviour exhibited by intelligent machines. It outlines the key research themes, questions, and landmark rese...
Most public blockchain protocols, including the popular Bitcoin and Ethereum blockchains, do not formally specify the order in which miners should select transactions from the pool of pending (or uncommitted) transactions for inclusion in the blockchain. Over the years, informal conventions or "norms" for transaction ordering have, however, emerged...
Blockchain protocols’ primary security goal is consensus: one version of the global ledger that everyone in the network agrees on. Their proofs of security depend on assumptions on how well their peer-to-peer (P2P) overlay networks operate. Yet, surprisingly, little is understood about what factors influence the P2P network properties. In this work...
Every second, the thoughts and feelings of millions of people across the world are recorded in the form of 140-character tweets using Twitter. However, despite the enormous potential presented by this remarkable data source, we still do not have an understanding of the Twitter population itself: Who are the Twitter users? How representative of the...
User tracking has become ubiquitous practice on the Web, allowing services to recommend behaviorally targeted content to users. In this article, we design Alibi, a system that utilizes such readily available personalized content, generated by recommendation engines in real time, as a means to tame Sybil attacks. In particular, by using ads and othe...
The Transport Layer Security (TLS) Public Key Infrastructure (PKI) is essential to the security and privacy of users on the Internet. Despite its importance, prior work from the mid-2010s has shown that mismanagement of the TLS PKI often led to weakened security guarantees, such as compromised certificates going unrevoked and many internet devices...
Today, algorithmic models are shaping important decisions in domains such as credit, employment, or criminal justice. At the same time, these algorithms have been shown to have discriminatory effects. Some organizations have tried to mitigate these effects by removing demographic features from an algorithm's inputs. If an algorithm is not provided...
Political campaigns are increasingly turning to digital advertising to reach voters. These platforms empower advertisers to target messages to platform users with great precision, including through inferences about those users' political affiliations. However, prior work has shown that platforms' ad delivery algorithms can selectively deliver ads w...
The enormous financial success of online advertising platforms is partially due to the precise targeting features they offer. Although researchers and journalists have found many ways that advertisers can target, or exclude, particular groups of users seeing their ads, comparatively little attention has been paid to the implications of the platfo...
The public key infrastructure (PKI) provides the fundamental property of authentication: the means by which users can know with whom they are communicating online. The PKI ensures end-to-end authenticity insofar as it verifies a chain of certificates, but the true final step in end-to-end authentication comes when the user verifies that the website...
Despite its critical role in Internet connectivity, the Border Gateway Protocol (BGP) remains highly vulnerable to attacks such as prefix hijacking, where an Autonomous System (AS) announces routes for IP space it does not control. To address this issue, the Resource Public Key Infrastructure (RPKI) was developed starting in 2008, with deployment b...
Net neutrality has been the subject of considerable public debate over the past decade. Despite the potential impact on content providers and users, there is currently a lack of tools or data for stakeholders to independently audit the net neutrality policies of network providers. In this work, we address this issue by conducting a one-year study o...
Google's Quick UDP Internet Connections (QUIC) protocol, which implements TCP-like properties at the application layer atop a UDP transport, is now used by the vast majority of Chrome clients accessing Google properties but has no formal state machine specification, limited analysis, and ad-hoc evaluations based on snapshots of the protocol impleme...
In this work, we introduce a novel metric for auditing group fairness in ranked lists. Our approach offers two benefits compared to the state of the art. First, we offer a blueprint for modeling of user attention. Rather than assuming a logarithmic loss in importance as a function of the rank, we can account for varying user behaviors through param...
Data brokers such as Acxiom and Experian are in the business of collecting and selling data on people; the data they sell is commonly used to feed marketing as well as political campaigns. Despite the ongoing privacy debate, there is still very limited visibility into data collection by data brokers. Recently, however, online advertising services s...
The Domain Name System (DNS) is the naming system on the Internet. With the DNS Security Extensions (DNSSEC) operators can protect the authenticity of their domain using public key cryptography. DNSSEC, however, can be difficult to configure and maintain: operators need to replace keys to upgrade their algorithm, react to security breaches or follo...
The enormous financial success of online advertising platforms is partially due to the precise targeting features they offer. Although researchers and journalists have found many ways that advertisers can target, or exclude, particular groups of users seeing their ads, comparatively little attention has been paid to the implications of the platfo...
Machines powered by artificial intelligence increasingly mediate our social, cultural, economic and political interactions. Understanding the behaviour of artificial intelligence systems is essential to our ability to control their actions, reap their benefits and minimize their harms. Here we argue that this necessitates a broad scientific researc...
In this work we introduce a novel metric for verifying group fairness in ranked lists. Our approach relies on measuring the amount of attention given to members of a protected group and comparing it to that group's representation in the investigated population. It offers two major developments compared to the state of the art. First, rather than as...
The popularity of online advertising, now with aggregate revenues in the hundreds of billions of dollars each year, is strongly driven by targeting, or the ability of an advertising platform to help an advertiser select exactly which users should see their ad. To enable such targeting, advertising platforms routinely collect detailed data on users, a...
Online advertising platforms such as those of Facebook and Google collect detailed data about users, which they leverage to allow advertisers to target ads to users based on various pieces of user information. While most advertising platforms have transparency mechanisms in place to reveal this collected information to users, these often present an...
Ethereum is the second most valuable cryptocurrency today, with a current market cap of over $68B. What sets Ethereum apart from other cryptocurrencies is that it uses the blockchain to not only store a record of transactions, but also smart contracts and a history of calls made to those contracts. Thus, Ethereum represents a new form of distribute...
TLS, the de facto standard protocol for securing communications over the Internet, relies on a hierarchy of certificates that bind names to public keys. Naturally, ensuring that the communicating parties are using only valid certificates is a necessary first step in order to benefit from the security of TLS. To this end, most certificates and clien...
Transportation network companies (TNCs) provide vehicle-for-hire services. They are distinguished from taxis primarily by the presumption that vehicles are privately owned by drivers. Unlike taxis, which must hold one of approximately 1,800 medallions licensed by the San Francisco Municipal Transportation Agency (SFMTA) to operate in San Francisco,...
In this work, we propose an automated method to find attacks against TCP congestion control implementations that combines the generality of implementation-agnostic fuzzing with the precision of runtime analysis. It uses a model-guided approach to generate abstract attack strategies by leveraging a state machine model of congestion control to find v...
Web security has been and remains a highly relevant field of security research, which has seen many additional features standardized at the IETF over the past years.
This talk covers two papers, which together provide a comprehensive survey of the quantity and quality of adoption of such new security extensions by HTTPS web servers.
The protocols covered ar...
Shaken by severe compromises, the Web’s Public Key Infrastructure has seen the addition of several security mechanisms over recent years. One such mechanism is the Certification Authority Authorization (CAA) DNS record, which gives domain name holders control over which Certification Authorities (CAs) may issue certificates for their domain. First d...
Ridesharing services such as Uber and Lyft have become an important part of the Vehicle For Hire (VFH) market, which used to be dominated by taxis. Unfortunately, ridesharing services are not required to share data like taxi services, which has made it challenging to compare the competitive dynamics of these services, or assess their impact on citi...
A properly managed public key infrastructure (PKI) is critical to ensure secure communication on the Internet. Surprisingly, some of the most important administrative steps (in particular, reissuing new X.509 certificates and revoking old ones) are manual and remained unstudied, largely because it is difficult to measure these manual processes at...
Recently, online targeted advertising platforms like Facebook have been criticized for allowing advertisers to discriminate against users belonging to sensitive groups, i.e., to exclude users belonging to a certain race or gender from receiving their ads. Such criticisms have led, for instance, Facebook to disallow the use of attributes such as eth...
As blockchain technologies and cryptocurrencies increase in popularity, their decentralization poses unique challenges in network partitions. In traditional distributed systems, network partitions are generally a result of bugs or connectivity failures; the typical goal of the system designer is to automatically recover from such issues as seamless...
The Domain Name System (DNS) provides a scalable, flexible name resolution service. Unfortunately, its unauthenticated architecture has become the basis for many security attacks. To address this, DNS Security Extensions (DNSSEC) were introduced in 1997. DNSSEC's deployment requires support from the top-level domain (TLD) registries and registrars,...
Google's QUIC protocol, which implements TCP-like properties at the application layer atop a UDP transport, is now used by the vast majority of Chrome clients accessing Google properties but has no formal state machine specification, limited analysis, and ad-hoc evaluations based on snapshots of the protocol implementation in a small number of envi...
Middleboxes implement a variety of network management policies (e.g., prioritizing or blocking traffic) in their networks. While such policies can be beneficial (e.g., blocking malware) they also raise issues of network neutrality and freedom of speech when used for application-specific differentiation and censorship. There is a poor understanding...
The Domain Name System (DNS) is part of the core of the Internet. Over the past decade, much-needed security features were added to this protocol, with the introduction of the DNS Security Extensions. DNSSEC adds authenticity and integrity to the protocol using digital signatures, and turns the DNS into a public key infrastructure (PKI). At the top...
NLP tasks are often limited by scarcity of manually annotated data. In social media sentiment analysis and related tasks, researchers have therefore used binarized emoticons and specific hashtags as forms of distant supervision. Our paper shows that by extending the distant supervision to a more diverse set of noisy labels, the models can learn ric...
Web search is an integral part of our daily lives. Recently, there has been a trend of personalization in Web search, where different users receive different results for the same search query. The increasing level of personalization is leading to concerns about Filter Bubble effects, where certain users are simply unable to access information that...
Online freelancing marketplaces have grown quickly in recent years. In theory, these sites offer workers the ability to earn money without the obligations and potential social biases associated with traditional employment frameworks. In this paper, we study whether two prominent online freelance marketplaces - TaskRabbit and Fiverr - are impacted b...
SSL and TLS are used to secure the most commonly used Internet protocols. As a result, the ecosystem of SSL certificates has been thoroughly studied, leading to a broad understanding of the strengths and weaknesses of the certificates accepted by most web browsers. Prior work has naturally focused almost exclusively on "valid" certificates: those t...
A variety of network management practices, from bandwidth management to zero-rating, use policies that apply selectively to different categories of Internet traffic (e.g., video, P2P, VoIP). These policies are implemented by middleboxes that must, in real time, assign traffic to a category using a classifier. Despite their important implications fo...
Detecting violations of application-level end-to-end connectivity on the Internet is of significant interest to researchers and end users; recent studies have revealed cases of HTTP ad injection and HTTPS man-in-the-middle attacks. Unfortunately, detecting such end-to-end violations at scale remains difficult, as it generally requires having the co...
The semantics of online authentication in the web are rather straightforward: if Alice has a certificate binding Bob's name to a public key, and if a remote entity can prove knowledge of Bob's private key, then (barring key compromise) that remote entity must be Bob. However, in reality, many websites (and the majority of the most popular ones) are...
The popularity of mobile devices for ubiquitous Internet access has led to exploding demand for relatively scarce cellular bandwidth. As a result, cellular operators increasingly turn to creative ways to manage their customers’ demand on capacity, using traffic shaping, transcoding, and zero-rating. With zero-rating, Internet Service Providers (I...
Cloud computing has evolved to meet user demands, from arbitrary VMs offered by IaaS to the narrow application interfaces of PaaS. Unfortunately, there exists an intermediate point that is not well met by today's offerings: users who wish to run arbitrary, already available binaries (as opposed to rewriting their own application for a PaaS) yet exp...
The rise of e-commerce has unlocked practical applications for algorithmic pricing (also called dynamic pricing algorithms), where sellers set prices using computer algorithms. Travel websites and large, well known e-retailers have already adopted algorithmic pricing strategies, but the tools and techniques are now available to small-scale sellers...
Popular social and e-commerce sites increasingly rely on crowd computing to rate and rank content, users, products and businesses. Today, attackers who create fake (Sybil) identities can easily tamper with these computations. Existing defenses that largely focus on detecting individual Sybil identities have a fundamental limitation: Adaptive attack...
Users today access a multitude of online services, among the most popular of which are online social networks (OSNs), via both web sites and dedicated mobile applications (apps), using a range of devices (traditional PCs, tablets, and smartphones) that are connected via a variety of networks. The resulting infrastructure makes these services conv...
Traffic differentiation, that is, giving better (or worse) performance to certain classes of Internet traffic, is a well-known but poorly understood traffic management policy. There is active discussion on whether and how ISPs should be allowed to differentiate Internet traffic, but little data about current practices to inform this discussion. Previous w...
Recently, Uber has emerged as a leader in the "sharing economy". Uber is a "ride sharing" service that matches willing drivers with customers looking for rides. However, unlike other open marketplaces (e.g., AirBnB), Uber is a black-box: they do not provide data about supply or demand, and prices are set dynamically by an opaque "surge pricing" alg...
To cope with the immense amount of content on the web, search engines often use complex algorithms to personalize search results for individual users. However, personalization of search results has led to worries about the Filter Bubble Effect, where the personalization algorithm decides that some useful information is irrelevant to the user, and t...
Knowing the physical location of a mobile device is crucial for a number of context-aware applications. This information is usually obtained using the Global Positioning System (GPS), or by calculating the position based on proximity of WiFi access points with known location (where the position of the access points is stored in a database at a cent...
Critical to the security of any public key infrastructure (PKI) is the ability to revoke previously issued certificates. While the overall SSL ecosystem is well-studied, the frequency with which certificates are revoked and the circumstances under which clients (e.g., browsers) check whether certificates are revoked are still not well-understood.
I...
The past two decades have seen an upsurge of interest in the collective behaviors of complex systems composed of many agents entrained to each other and to external events. In this paper, we extend the concept of entrainment to the dynamics of human collective attention. We conducted a detailed investigation of the unfolding of human entrainment—as...
Today, many e-commerce websites personalize their content, including Netflix (movie recommendations), Amazon (product suggestions), and Yelp (business reviews). In many cases, personalization provides advantages for users: for example, when a user searches for an ambiguous query such as "router," Amazon may be able to suggest the woodworking tool...
Central to the secure operation of a public key infrastructure (PKI) is the ability to revoke certificates. While much of users' security rests on this process taking place quickly, in practice, revocation typically requires a human to decide to reissue a new certificate and revoke the old one. Thus, having a proper understanding of how often syste...
Advertising is ubiquitous on the Web; numerous ad networks serve billions of ads daily via keyword or search term auctions. Recently, online social networks (OSNs) such as Facebook have created site-specific ad services that differ from traditional ad networks by letting advertisers bid on users rather than keywords. With Facebook's annual ad reven...
Not all of the over one billion users of online social networks (OSNs) are equally valuable to the OSNs. The current business model of monetizing advertisements targeted to users does not appear to be based on any visible grouping of the users. The primary metrics remain CPM (cost per mille, i.e., per thousand impressions) and CPC (cost per click) of ad...
The goal of this research is to detect traffic differentiation in cellular data networks. We define service differentiation as any attempt to change the performance of network traffic traversing an ISP's boundaries. ISPs may implement differentiation policies for a number of reasons, including load balancing, bandwidth management, or business reaso...
The microblogging site Twitter is now one of the most popular Web destinations. Due to the relative ease of data access, there has been significant research based on Twitter data, ranging from measuring the spread of ideas through society to predicting the behavior of real-world phenomena such as the stock market. Unfortunately, relatively little w...
Today, it is the norm for online social network (OSN) users to have accounts on multiple services. For example, a recent study showed that 34% of all Twitter users also use Pinterest. This situation leads to interesting questions such as: Are the activities that users perform on each site disjoint? Alternatively, if users perform the same actions on multipl...
In one embodiment, program code is added to a social network's web pages or site such that the content a first user accesses is locally stored at the first user's system. When another user, who is a friend of the first user, as defined by the social networking site, browses to that same content, the program code fetches it from the first user, inst...
Online social network (OSN) users upload millions of pieces of content to share with others every day. While a significant portion of this content is benign (and is typically shared with all friends or all OSN users), there are certain pieces of content that are highly privacy sensitive. Sharing such sensitive content raises significant privacy con...
With the increasing popularity of Web-based services, users today have access to a broad range of free sites, including social networking, microblogging, and content sharing sites. In order to offer a service for free, service providers typically monetize user content, selling results to third parties such as advertisers. As a result, users have li...
Online content ratings services allow users to find and share content ranging from news articles (Digg) to videos (YouTube) to businesses (Yelp). Generally, these sites allow users to create accounts, declare friendships, upload and rate content, and locate new content by leveraging the aggregated ratings of others. These services are becoming incr...