
Andrew Granville West
- PhD, Univ. of Pennsylvania
- Researcher at Verisign Labs (Verisign Inc.)
About
- 39 Publications
- 12,835 Reads
- 852 Citations
Current institution: Verisign Labs (Verisign Inc.)
Current position: Researcher
Additional affiliations
- January 2013 - July 2017: Independent Researcher (Researcher)
- June 2013 - present: Verisign Labs (Verisign Inc.), Researcher
Education
- August 2007 - May 2013: PhD, Univ. of Pennsylvania
- August 2003 - June 2007
Publications (39)
Web identifiers such as usernames, hashtags, and domain names serve important roles in online navigation, communication, and community building. Therefore the entities that choose such names must ensure that end-users are able to quickly and accurately enter them in applications. Uniqueness requirements, a desire for short strings, and an absence o...
Wikipedia is a collaboratively edited encyclopedia. One of the most popular websites on the Internet, it is known to be a frequently used source of health care information by both professionals and the lay public.
This paper quantifies the production and consumption of Wikipedia's medical content along 4 dimensions. First, we measured the amount of...
Using runtime execution artifacts to identify malware and its associated 'family' is an established technique in the security domain. Many papers in the literature rely on explicit features derived from network, file system, or registry interaction. While effective, use of these fine-granularity data points makes these techniques computationally ex...
Publicly posted URLs sometimes contain a wealth of information about the identities and activities of the users who share them. URLs often utilize query strings -- that is, key-value pairs appended to the URL path -- to pass session parameters and form data. Although often benign and necessary to render the Web page, query strings sometimes contain...
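The query-string stripping this abstract describes can be sketched in a few lines; note that the key list and filtering heuristic below are invented for illustration and are not the paper's actual method:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical denylist of parameter names that often carry session
# or tracking data; a real system would learn or curate this list.
SENSITIVE_KEYS = {"sessionid", "token", "utm_source", "utm_medium", "email"}

def strip_sensitive_params(url):
    """Return the URL with denylisted query-string keys removed."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k.lower() not in SENSITIVE_KEYS]
    return urlunsplit(parts._replace(query=urlencode(kept)))

print(strip_sensitive_params(
    "https://example.com/page?id=7&sessionid=abc123&utm_source=mail"))
# -> https://example.com/page?id=7
```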
Networked machines serving as binary distribution points, C&C channels, or drop sites are a ubiquitous aspect of malware infrastructure. By sandboxing malcode one can extract the network endpoints (i.e., domains and URL paths) contacted during execution. Some endpoints are benign, e.g., connectivity tests. Exclusively malicious destinations, howeve...
Using runtime execution artifacts to identify whether code is malware, and to which malware family it belongs, is an established technique in the security domain. Traditionally, literature has relied on explicit features derived from network, file system, or registry interaction [1]. While effective, the collection and analysis of these fine-granul...
Malicious webpages are a prevalent and severe threat in the Internet security landscape. This fact has motivated numerous static and dynamic techniques for their accurate and efficient detection. Building on this existing literature, this work introduces ADAM, a system that uses machine-learning over network metadata derived from the sandboxed exec...
Being the backbone routing system of the Internet, the operational aspect of the inter-domain routing is highly complex. Building a trustworthy ecosystem for inter-domain routing requires the proper maintenance of trust relationships among tens of thousands of peer IP domains called Autonomous Systems (ASes). ASes today implicitly trust any routing...
Web-based malware is a growing threat to today's Internet security. Attacks of this type are prevalent and lead to serious security consequences. Millions of malicious URLs are used as distribution channels to propagate malware all over the Web. After being infected, victim systems fall in the control of attackers, who can utilize them for various...
In this paper we present Preventive Spatio-Temporal Aggregation (PRESTA), a reputation model that combines spatial and temporal features to produce values that are behavior predictive and useful in partial-knowledge situations. To evaluate its effectiveness, we applied PRESTA in the domain of spam detection. Studying the temporal properties of IP b...
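The core idea of blending temporal history with spatial (group-level) evidence can be sketched as follows; the decay rate, weights, and subnet grouping here are illustrative assumptions, not PreSTA's published parameters:

```python
# Sketch in the spirit of PreSTA: an entity's reputation combines its
# own decayed history with that of its spatial group (e.g., /24 subnet),
# so entities with no individual history still inherit a neighborhood score.
DECAY = 0.9  # per-day multiplicative decay of old evidence (illustrative)

def temporal_score(event_times, now, rate=DECAY):
    """Exponentially decayed count of past bad events (times in seconds)."""
    return sum(rate ** ((now - t) / 86400.0) for t in event_times)

def presta_score(ip_events, subnet_events, now, w_ip=0.7, w_subnet=0.3):
    """Weighted blend of entity-level and group-level histories."""
    return (w_ip * temporal_score(ip_events, now)
            + w_subnet * temporal_score(subnet_events, now))
```

An IP first seen today contributes weight 1.0 to its own score; an event two days old contributes only 0.81, so stale evidence fades while subnet-level behavior keeps new spammers from starting with a clean slate.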
The current design of BGP implicitly assumes the existence of trust between ASes with respect to exchanging valid BGP updates. This assumption of complete trust is problematic given the frequent announcement of invalid -- inaccurate or unnecessary -- updates. This paper presents AS-CRED, a reputation service for ASes which quantifies the level of t...
Collaborative functionality is changing the way information is amassed, refined, and disseminated in online environments. A subclass of these systems characterized by "open collaboration" uniquely allow participants to *modify* content with low barriers-to-entry. A prominent example and our case study, English Wikipedia, exemplifies the vulnerabili...
Reputation management (RM) is employed in distributed and peer-to-peer networks to help users compute a measure of trust in other users based on initial belief, observed behavior, and run-time feedback. These trust values influence how, or with whom, a user will interact. Existing literature on RM focuses primarily on algorithm development, not com...
As evidenced by SourceForge and GitHub, code repositories now integrate Web 2.0 functionality that enables global participation with minimal barriers-to-entry. To prevent detrimental contributions enabled by crowdsourcing, reputation is one proposed solution. Fortunately this is an issue that has been addressed in analogous version control systems...
Spam and other electronic abuses have long been a focus of computer security research. However, recent work in the domain has emphasized an economic analysis of these operations in the hope of understanding and disrupting the profit model of attackers. Such studies do not lend themselves to passive measurement techniques. Instead, researchers have...
When URLs containing application parameters are posted in public settings, privacy can be compromised if those arguments contain personal or tracking data. To this end, we describe a privacy-aware link shortening service that attempts to strip sensitive and non-essential parameters based on difference algorithms and human feedback. Our implementat...
Collaborative models (e.g., wikis) are an increasingly prevalent Web technology. However, the open-access that defines such systems can also be utilized for nefarious purposes. In particular, this paper examines the use of collaborative functionality to add inappropriate hyperlinks to destinations outside the host environment (i.e., link spam). The...
Collaborative environments, such as Wikipedia, often have low barriers-to-entry in order to encourage participation. This accessibility is frequently abused (e.g., vandalism and spam). However, certain inappropriate behaviors are more threatening than others. In this work, we study contributions which are not simply "undone" -- but *deleted* from...
There is much literature on Wikipedia vandalism detection. However, this writing addresses two facets given little treatment to date. First, prior efforts emphasize zero-delay detection, classifying edits the moment they are made. If classification can be delayed (e.g., compiling offline distributions), it is possible to leverage ex post facto evid...
Recent years have seen the emergence of a new programming paradigm for Web applications that emphasizes the reuse of external content, the mashup. Although the mashup paradigm enables the creation of innovative Web applications with emergent features, its openness introduces trust problems. These trust issues are particularly prominent in JavaScrip...
IP blacklists are a well-regarded anti-spam mechanism that capture global spamming patterns. These properties make such lists a practical ground-truth by which to study email spam behaviors. Observing one blacklist for nearly a year-and-a-half, we collected data on roughly *half a billion* listing events. In this paper, that data serves two purpose...
Collaborative functionality is an increasingly prevalent web technology. To encourage participation, these systems usually have low barriers-to-entry and permissive privileges. Unsurprisingly, ill-intentioned users try to leverage these characteristics for nefarious purposes. In this work, a particular abuse is examined -- link spamming -- the addi...
The Border Gateway Protocol (BGP) works by frequently exchanging updates that disseminate reachability information about IP prefixes (i.e., IP address blocks) between Autonomous Systems (ASes) on the Internet. The ideal operation of BGP relies on three major behavioral assumptions (BAs): (1) information contained in the update is legal and correct,...
Wikipedia is an online encyclopedia which anyone can edit. While most edits are constructive, about 7% are acts of vandalism. Such behavior is characterized by modifications made in bad faith, such as introducing spam and other inappropriate content.
In this work, we present the results of an effort to integrate three of the leading approaches to Wikipedia...
IP blacklists are a spam filtering tool employed by a large number of email providers. Centrally maintained and well regarded, blacklists can filter 80+% of spam without having to perform computationally expensive content-based filtering. However, spammers can vary which hosts send spam (often in intelligent ways), and as a result, some percentage...
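The lookup mechanics behind such IP blacklists are simple to sketch: a DNS-based blacklist (DNSBL) is queried by reversing the address's octets and prepending them to the list's zone. The function below only constructs the query name (it performs no network I/O); the zone name is the real Spamhaus zone, used as an example:

```python
import ipaddress

def dnsbl_query_name(ip, zone="zen.spamhaus.org"):
    """Build the DNS name used to look up an IPv4 address on a
    DNS-based blacklist: octets reversed, prepended to the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone

print(dnsbl_query_name("192.0.2.99"))
# -> 99.2.0.192.zen.spamhaus.org
```

A mail server resolves this name; an A-record answer means the address is listed, while NXDOMAIN means it is not, which is why blacklist checks are so much cheaper than content-based filtering.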
Collaborative functionality is increasingly prevalent in Internet applications. Such functionality permits individuals to add -- and sometimes modify -- web content, often with minimal barriers to entry. Ideally, large bodies of knowledge can be amassed and shared in this manner. However, such software also provides a medium for biased individuals,...
The bulk of Wikipedia anti-vandalism tools require natural language processing over the article or diff text. However, our prior work demonstrated the feasibility of using spatio-temporal properties to locate malicious edits. STiki is a real-time, on-Wikipedia tool leveraging this technique.
The associated poster reviews STiki's methodology and per...
STiki is an anti-vandalism tool for Wikipedia. Unlike similar tools, STiki does not rely on natural language processing (NLP) over the article or diff text to locate vandalism. Instead, STiki leverages spatio-temporal properties of revision metadata. The feasibility of utilizing such properties was demonstrated in our prior work, which found they p...
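Metadata-only scoring of the kind STiki performs can be sketched as a small feature extractor over revision properties; the specific features and thresholds below are toy assumptions for illustration, not STiki's actual feature set:

```python
from datetime import datetime, timezone

def metadata_features(edit):
    """Toy spatio-temporal features from revision metadata alone.
    `edit` is a dict with 'timestamp' (aware datetime),
    'is_anonymous' (bool), and 'account_age_days' (float)."""
    ts = edit["timestamp"].astimezone(timezone.utc)
    return {
        "hour_of_day": ts.hour,                       # temporal signal
        "is_weekend": ts.weekday() >= 5,              # temporal signal
        "is_anonymous": edit["is_anonymous"],         # editor-reputation proxy
        "new_account": edit["account_age_days"] < 7,  # experience proxy
    }
```

A classifier trained over such features never needs the article or diff text, which is what makes this approach cheap enough to score every edit in real time.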
Blatantly unproductive edits undermine the quality of the collaboratively-edited encyclopedia, Wikipedia. They not only disseminate dishonest and offensive content, but force editors to waste time undoing such acts of vandalism. Language-processing has been applied to combat these malicious edits, but as with email spam, these filters are evadable...
Border Gateway Protocol (BGP) works by frequently exchanging updates which disseminate reachability information (RI) about IP prefixes (i.e., address blocks) between Autonomous Systems (ASes) on the Internet. The current operation of BGP implicitly trusts the ASes to disseminate valid -- accurate, stable, and routing-policy compliant -- RI. This assump...
TALK OUTLINE:
- Introducing QTM
- QuanTM [1] Model -- TDG: Glue between security and reputation
- Fundamentals of Reputation Management
- PreSTA [3] Reputation Model -- Partial QTM use-case; applicable for fighting spam, Wikipedia
- Conclusions
FUTURE: WIKIPEDIA
- PURPOSE: Build a blacklist of user-names/IPs based on the probability they will vandalize.
- TEMPORAL: Straightforward; vandals are probably repeat offenders. Registered users have IDs indicating when they joined -- are new users more likely to vandalize?
- SPATIAL: Geographical -- based on user location (i.e., Washington, DC). Top...
Quantitative Trust Management (QTM) provides a dynamic interpretation of authorization policies for access control decisions based upon evolving reputations of the entities involved. QuanTM, a QTM system, selectively combines elements from trust management and reputation management to create a novel method for policy evaluation. Trust management...
In this paper we present Preventive Spatio-Temporal Aggregation (PRESTA), a reputation model that combines spatial and temporal features to produce values that are behavior predictive and are useful in partial-knowledge situations where entity-specific data may be unknown or incomplete. To evaluate its effectiveness, we applied PRESTA in the domain...