Martín Abadi's research while affiliated with Google Inc. and other places

Publications (120)

Article
Full-text available
Automatic differentiation plays a prominent role in scientific computing and in modern machine learning, often in the context of powerful programming systems. The relation of the various embodiments of automatic differentiation to the mathematical notion of derivative is not always entirely clear---discrepancies can arise, sometimes inadvertently....
Preprint
Automatic differentiation plays a prominent role in scientific computing and in modern machine learning, often in the context of powerful programming systems. The relation of the various embodiments of automatic differentiation to the mathematical notion of derivative is not always entirely clear---discrepancies can arise, sometimes inadvertently....
Conference Paper
Many recent machine learning models rely on fine-grained dynamic control flow for training and inference. In particular, models based on recurrent neural networks and on reinforcement learning depend on recurrence relations, data-dependent conditional execution, and other features that call for dynamic control flow. These applications benefit from...
Article
Full-text available
The recent, remarkable growth of machine learning has led to intense interest in the privacy of the data on which machine learning relies, and to new techniques for preserving privacy. However, older ideas about privacy may well remain valid and useful. This note reviews two recent works on privacy in the light of the wisdom of some of the early li...
Conference Paper
TensorFlow is a powerful, programmable system for machine learning. This paper aims to provide the basics of a conceptual framework for understanding the behavior of TensorFlow models during training and inference: it describes an operational semantics, of the kind common in the literature on programming languages. More broadly, the paper suggests...
Article
Learning a natural language interface for database tables is a challenging task that involves deep language understanding and multi-step reasoning. The task is often approached by mapping natural language queries to logical forms or programs that provide the desired response when executed on the database. To our knowledge, this paper presents the f...
Article
Full-text available
Some machine learning applications involve training data that is sensitive, such as the medical histories of patients in a clinical trial. A model may inadvertently and implicitly store some of its training data; careful analysis of the model may therefore reveal sensitive information. To address this problem, we demonstrate a generally applicable...
Article
We describe the timely dataflow model for distributed computation and its implementation in the Naiad system. The model supports stateful iterative and incremental computations. It enables both low-latency stream processing and high-throughput batch processing, using a new approach to coordination that combines asynchronous and fine-grained synchro...
Article
Full-text available
We study the interaction of the programming construct “new,” which generates statically scoped names, with communication via messages on channels. This interaction is crucial in security protocols, which are the main motivating examples for our work; it also appears in other programming-language contexts. We define the applied pi calculus, a simple...
Conference Paper
Machine learning techniques based on neural networks are achieving remarkable results in a wide variety of domains. Often, the training of models requires large, representative datasets, which may be crowdsourced and contain sensitive information. The models should not expose private information in these datasets. Addressing this goal, we develop n...
Article
Full-text available
TensorFlow is a machine learning system that operates at large scale and in heterogeneous environments. TensorFlow uses dataflow graphs to represent computation, shared state, and the operations that mutate that state. It maps the nodes of a dataflow graph across many machines in a cluster, and within a machine across multiple computational devices...
Article
Full-text available
TensorFlow is an interface for expressing machine learning algorithms, and an implementation for executing such algorithms. A computation expressed using TensorFlow can be executed with little or no change on a wide variety of heterogeneous systems, ranging from mobile devices such as phones and tablets up to large-scale distributed systems of hund...
Article
While groups are generally helpful for the definition of authorization policies, their use in distributed systems is not straightforward. This paper describes a design for authorization in distributed systems that treats groups as formal languages. The design supports forms of delegation and negative clauses in authorization policies. It also consi...
Conference Paper
This paper studies timely dataflow, a model for data-parallel computing in which each communication event is associated with a virtual time. It defines and investigates the could-result-in relation which is central to this model, then the semantics of timely dataflow graphs.
Conference Paper
This paper presents a formal description and analysis of a technique for distributed rollback recovery. The setting for this work is a model for data-parallel computation with a notion of virtual time. The technique allows the selective undo of work at particular virtual times. A refinement theorem ensures the consistency of rollbacks.
Conference Paper
Differential dataflow is a recent approach to incremental computation that relies on a partially ordered set of differences. In the present paper, we aim to develop its foundations. We define a small programming language whose types are abelian groups equipped with linear inverses, and provide both a standard and a differential denotational semanti...
Conference Paper
We study information flow in a model for data-parallel computing. We show how an extant notion of virtual time can help guarantee information-flow properties. For this purpose, we introduce functions that express dependencies between inputs and outputs at each node in a dataflow graph. Each node may operate over a distinct set of virtual times—so,...
Patent
Search result poisoning attacks may be automatically detected by identifying groups of suspicious uniform resource locators (URLs) containing multiple keywords and exhibiting patterns that deviate from other URLs in the same domain without crawling and evaluating the actual contents of each web page. Suspicious websites are identified and lexical f...
Article
We present a new model for rollback recovery in distributed dataflow systems. We explain existing rollback schemes by assigning a logical time to each event such as a message delivery. If some processors fail during an execution, the system rolls back by selecting a set of logical times for each processor. The effect of events at times within the s...
Technical Report
TensorFlow [1] is an interface for expressing machine learning algorithms, and an implementation for executing such algorithms. A computation expressed using TensorFlow can be executed with little or no change on a wide variety of heterogeneous systems, ranging from mobile devices such as phones and tablets up to large-scale distributed systems of...
Conference Paper
TypeScript is an extension of JavaScript intended to enable easier development of large-scale JavaScript applications. While every JavaScript program is a TypeScript program, TypeScript offers a module system, classes, interfaces, and a rich gradual type system. The intention is that TypeScript provides a smooth transition for JavaScript programmer...
Patent
Full-text available
Trusted user accounts of an application provider are determined. Graphs, such as trees, are created with each node corresponding to a trusted account. Each of the nodes is associated with a vouching quota, or the nodes may share a vouching quota. Untrusted user accounts are determined. For each of these untrusted accounts, a trusted user account th...
Conference Paper
Full-text available
A string of recent attacks against the global public key infrastructure (PKI) has brought to light weaknesses in the certification authority (CA) system. In response, the CA/Browser Forum, a consortium of certification authorities and browser vendors, published in 2011 a set of requirements applicable to all certificates intended for use on the Web...
Patent
Full-text available
A system to automatically classify types of IP addresses associated with a user. Information, such as user names, machine information, IP address, etc., may be obtained from logs. For each user or host in the logs, home IP addresses are identified from IP addresses where the user or host shows a predetermined level of activity. Travel IP addresses...
Conference Paper
We investigate possible improvements in online fraud detection based on information about users and their interactions. We develop, apply, and evaluate our methods in the context of Skype. Specifically, in Skype, we aim to provide tools that identify fraudsters that have eluded the first line of detection systems and have been active for months. Ou...
Conference Paper
Naiad is a distributed system for executing data parallel, cyclic dataflow programs. It offers the high throughput of batch processors, the low latency of stream processors, and the ability to perform iterative and incremental computations. Although existing systems offer some of these features, applications that require all three have relied on mu...
Article
Full-text available
In security, layout randomization is a popular, effective attack mitigation technique. Recent work has aimed to explain it rigorously, focusing on deterministic systems. In this paper, we study layout randomization in the presence of nondeterministic choice. We develop a semantic approach based on denotational models and the induced notions of cont...
Article
Motivated by the problem of avoiding duplication in storage systems, Bellare, Keelveedhi, and Ristenpart have recently put forward the notion of Message-Locked Encryption (MLE) schemes which subsumes convergent encryption and its variants. Such schemes do not rely on permanent secret keys, but rather encrypt messages using keys derived from the mes...
Patent
Full-text available
A system is disclosed for creating and implementing an access control policy framework in a weakly coherent distributed collection. A collection manager may sign certificates forming equivalence classes of replicas that share a specific authority. The collection manager and/or certain privileged replicas may issue certificates that delegate authori...
Patent
Full-text available
A framework identifies malicious queries contained in search logs to uncover relationships between the malicious queries and the potential attacks launched by attackers submitting the malicious queries. A small seed set of malicious queries may be used to identify an IP address in the search logs that submitted the malicious queries. The seed set m...
Conference Paper
Full-text available
With the advent in the 1980's of truly global hierarchical naming (via the Domain Name Service), security researchers realized that the trust relationships needed to authenticate principals would often not follow the naming hierarchy [1,13]. The most successful non-hierarchical authentication schemes are based on X.509 and RFC 5280, as used for exa...
Conference Paper
In this paper, we present a framework, SocialWatch, to detect attacker-created accounts and hijacked accounts for online services at a large scale. SocialWatch explores a set of social graph properties that effectively model the overall social activity and connectivity patterns of online users, including degree, PageRank, and social affinity featur...
Patent
Full-text available
Detection of user accounts associated with spammer attacks may be performed by constructing a social graph of email users. Biggest connected components (BCC) of the social graph may be used to identify legitimate user accounts, as the majority of the users in the biggest connected components are legitimate users. BCC users may be used to identify m...
Conference Paper
Low-level attacks often rely on guessing absolute or relative memory addresses. Layout randomization aims to thwart such attacks. In this paper, we study layout randomization in a setting in which arrays and functions can be stored in memory. Our results relate layout randomization to language-level protection mechanisms, namely to the use of abstr...
Patent
Full-text available
An IP (Internet Protocol) address is a directly observable identifier of host network traffic in the Internet and a host's IP address can dynamically change. Analysis of traffic (e.g., network activity or application request) logs may be performed and a host tracking graph may be generated that shows hosts and their bindings to IP addresses over ti...
Article
In this paper, we present a unified declarative platform for specify-ing, implementing, analyzing and auditing large-scale secure infor-mation systems. Our proposed system builds upon techniques from logic-based trust management systems, declarative networking, and data analysis via provenance. First, we propose the Secure Network Datalog (SeNDlog)...
Chapter
Tracking the progress of computations can be both important and delicate in distributed systems. In a recent distributed algorithm for this purpose, each processor maintains a delayed view of the pending work, which is represented in terms of points in virtual time. This paper presents a formal specification of that algorithm in the temporal logic...
Article
This paper presents the design and implementation of Souche, a system that recognizes legitimate users early in online services. This early recognition contributes to both usability and security. Souche leverages social connections established over time. Legitimate users help identify other legitimate users through an implicit vouching process, str...
Conference Paper
This paper introduces AC, a set of language constructs for composable asynchronous IO in native languages such as C/C++. Unlike traditional synchronous IO interfaces, AC lets a thread issue multiple IO requests so that they can be serviced concurrently, and so that long-latency operations can be overlapped with computation. Unlike traditional async...
Conference Paper
This paper introduces AC, a set of language constructs for composable asynchronous IO in native languages such as C/C++. Unlike traditional synchronous IO interfaces, AC lets a thread issue multiple IO requests so that they can be serviced concurrently, and so that long-latency operations can be overlapped with computation. Unlike traditional async...
Article
We investigate the integration of two approaches to information security: information flow analysis, in which the dependence between secret inputs and public outputs is tracked through a program, and differential privacy, in which a weak dependence between input and output is permitted but provided only through a relatively small set of known diffe...
Conference Paper
Full-text available
Many malicious activities on the Web today make use of compromised Web servers, because these servers often have high pageranks and provide free resources. Attackers are therefore constantly searching for vulnerable servers. In this work, we aim to understand how attackers find, compromise, and misuse vulnerable servers. Specifically, we present he...
Article
Full-text available
We perform an in-depth study of SEO attacks that spread malware by poisoning search results for popular queries. Such attacks, although recent, appear to be both widespread and effective. They compromise legitimate Web sites and generate a large number of fake pages targeting trendy keywords. We first dissect one exam-ple attack that affects over 5...
Conference Paper
Full-text available
Search engines not only assist normal users, but also pro- vide information that hackers and other malicious enti- ties can exploit in their nefarious activities. With care- fully crafted search queries, attackers can gather infor- mation such as email addresses and misconfigured or even vulnerable servers. We present SearchAudit, a framework that...
Article
We develop a model of concurrent imperative programming with threads. We focus on a small imperative language with cooperative threads which execute without interruption until they terminate or explicitly yield control. We define and study a trace-based denotational semantics for this language; this semantics is fully abstract but mathematically el...
Conference Paper
Layout randomization is a powerful, popular technique for software protection. We present it and study it in programming-language terms. More specifically, we consider layout randomization as part of an implementation for a high-level programming language; the implementation translates this language to a lower-level language in which memory address...
Conference Paper
Full-text available
Today's Internet services increasingly use IP-based geolocation to specialize the content and service provisioning for each user. However, these systems focus almost exclusively on the current position of users and do not attempt to infer or exploit any qualitative context about the location's relationship with the user (e.g., is the user at home?...
Article
Current software attacks often build on exploits that subvert machine-code execution. The enforcement of a basic safety property, control-flow integrity (CFI), can prevent such attacks from arbitrarily controlling program behavior. CFI enforcement is simple and its guarantees can be established formally, even with respect to powerful adversaries. M...
Conference Paper
We examine the role of transactional memory from two perspectives: that of a programming language with atomic actions and that of implementations of the language. We argue that it is difficult to formulate a clean, separate, and generally useful definition of transactional memory. In both programming-language semantics and implementations, the trea...
Conference Paper
Today's Internet is open and anonymous. While it permits free traffic from any host, attackers that generate malicious traffic cannot typically be held accountable. In this paper, we present a system called HostTracker that tracks dynamic bindings between hosts and IP addresses by leveraging application-level data with unreliable IDs. Using a month...
Conference Paper
This paper discusses progress in the verification of security protocols. Focusing on a small, classic example, it stresses the use of program-like represen- tations of protocols, and their automatic analysis in symbolic and computational models.
Conference Paper
We introduce the design and implementation of dynamic sep- aration (DS) as a programming discipline for using transactional memory. Our approach is based on the programmer indicating which objects can be updated in transactions, which can be updated outside transactions, and which are read-only. We introduce explicit operations that identify tran-...
Conference Paper
Abstract—We present a unified declarative platform for specify- ing, implementing, and analyzing secure networked information systems. Our work,builds upon,techniques from logic-based trust management systems, declarative networking, and data analysis via provenance. We make the following contributions. First, we,propose,the Secure Network,Datalog...
Conference Paper
This paper introduces a new way to provide strong atomicity in an implementation of transactional memory. Strong atomicity lets us offer clear semantics to programs, even if they access the same locations inside and outside transactions. It also avoids differ- ences between hardware-implemented transactions and software- implemented ones. Our appro...
Article
We develop a model of concurrent imperative programming with threads. We focus on a small imperative language with cooperative threads which execute without interruption until they terminate or explicitly yield control. We define and study a trace-based denotational semantics for this language; this semantics is fully abstract but mathematically el...
Conference Paper
In authorization, there is often a wish to shift the burden of proof to those making requests, since they may have more resources and more specific knowledge to construct the required proofs. We introduce an extreme instance of this approach, which we call Code-Carrying Authorization (CCA). With CCA, access-control decisions can partly be delegated...
Conference Paper
Dynamic separation is a new programming discipline for systems with transactional memory. We study it formally in the setting of a small calculus with transactions. We provide a precise formulation of dynamic separation and compare it with other programming disciplines. Furthermore, exploiting dynamic separation, we investigate some possible implem...
Article
Some promising recent schemes for XML access control employ encryption for implementing security policies on published data, avoiding data duplication. In this article, we study one such scheme, due to Miklau and Suciu [2003]. That scheme was introduced with some intuitive explanations and goals, but without precise definitions and guarantees for t...
Conference Paper
Software Transactional Memory (STM) is an attractive basis for the development of language features for concurrent programming. However, the semantics of these features can be delicate and problematic. In this paper we explore the tradeoffs between semantic simplicity, the viability of efficient implementation strategies, and the flexibilityof lang...
Article
In this paper, we present a unified declarative platform for specifying, implementing, analyzing and auditing large-scale secure information systems. Our proposed system builds upon techniques from logic-based trust management systems, declarative networking, and data analysis via provenance. First, we propose the Secure Network Datalog (SeNDlog) l...
Conference Paper
Both proofs and trust relations play a role in security deci- sions, in particular in determining whether to execute a piece of code. We have developed a language, called BCIC, for policies that combine proofs and trusted assertions about code. In this paper, using BCIC, we suggest an approach to code auditing that bases auditing decisions on logic...
Article
JFK is a recent, attractive protocol for fast key establishment as part of securing IP communication. In this paper, we formally analyze this protocol in the applied pi calculus (partly in terms of observational equivalences and partly with the assistance of an automatic protocol verifier). We treat JFK's core security properties and also other pro...
Conference Paper
Full-text available
We describe a new design for authorization in operating systems in which applications are first-class entities. In this design, principals reflect application identities. Access control lists are patterns that recognize principals. We present a security model that embodies this design in an experimental operating system, and we describe the impleme...
Article
In this paper, we present a declarative language and system for describing and implementing secure networks. Our proposed language, SeNDlog, is an attempt at unifying Binder, a logic-based language for access control in distributed systems, and Network Datalog (NDlog), a database query language for declarative networks. The contributions of this pa...
Article
There is a new class of attacks (in particular against AES encryption) that allows a party that is able to execute code on a hardware system to learn critical data (such as AES encryption keys) used by other users of the system. The attacks come in several subclasses but they all depend on information being leaked through the timing behavior of mem...
Article
The analysis of security protocols requires precise formulations of the knowledge of protocol participants and attackers. In formal approaches, this knowledge is often treated in terms of message deducibility and indistinguishability relations. In this paper we study the decidability of these two relations. The messages in question may employ funct...
Conference Paper
We define and study a distributed cryptographic implementation for an asynchronous pi calculus. At the source level, we adapt simple type systems designed for establishing formal secrecy properties. We show that those secrecy properties have counterparts in the implementation, not formally but at the level of bitstrings, and with respect to probabi...
Conference Paper
We model networked storage systems with distributed, cryptographi- cally enforced file-access control in an applied pi calculus. The calculus contains cryptographic primitives and supports file-system constructs, including access re- vocation. We establish that the networked storage systems implement simpler, centralized storage specifications with...
Conference Paper
Full-text available
The indistinguishability of two pieces of data (or two lists of pieces of data) can be represented formally in terms of a relation called static equivalence. Static equivalence depends on an underlying equa- tional theory. The choice of an inappropriate equational theory can lead to overly pessimistic or overly optimistic notions of indistinguishab...
Article
This article presents a static race-detection analysis for multithreaded shared-memory programs, focusing on the Java programming language. The analysis is based on a type system that captures many common synchronization patterns. It supports classes with internal synchronization, classes that require client-side synchronization, and thread-local c...
Conference Paper
XFI is a comprehensive protection system that offers both flexible access control and fundamental integrity guarantees, at any privilege level and even for legacy code in commodity systems. For this purpose, XFI com-bines static analysis with inline software guards and a two-stack execution model. We have implemented XFI for Windows on the x86 arch...
Conference Paper
Control-Flow Integrity (CFI) is a property that guarantees program control flow cannot be subverted by a malicious ad- versary, even if the adversary has complete control of data memory. We have shown in prior work how CFI can be en- forced by using inlined software guards that perform safety checks. The first part of this paper shows how modest In...
Conference Paper
Secrecy properties can he guaranteed through a combination of static and dynamic checks. The static checks may include the application of special type systems with notions of secrecy. The dynamic checks can be of many different kinds; in practice, the most important are access-control checks, often ones based on ACLs (access-control lists). In this...
Conference Paper
Control-Flow Integrity (CFI) means that the execution of a program dynamically follows only certain paths, in accordance with a static policy. CFI can prevent attacks that, by exploiting buer over- flows and other vulnerabilities, attempt to control program behavior. This paper develops the basic theory that underlies two practical tech- niques for...
Article
Full-text available
Singularity is a research project in Microsoft Research that started with the question: what would a software platform look like if it was designed from scratch with the primary goal of dependability? Singularity is working to answer this question by building on advances in programming languages and tools to develop a new system architecture and op...
Article
We present the formalization and verification of a recent cryptographic protocol for certified email. Relying on a tool for automatic protocol analysis, we establish the key security properties of the protocol. This case study explores the use of general correspondence assertions in automatic proofs, and aims to demonstrate the considerable power o...
Conference Paper
Full-text available
In the analysis of security protocols, methods and tools for reasoning about protocol behaviors have been quite effective. We aim to expand the scope of those methods and tools. We focus on proving equivalences P≈Q in which P and Q are two processes that differ only in the choice of some terms. These equivalences arise often in applications. We sho...
Conference Paper
In the analysis of security protocols, the knowledge of attackers is often described in terms of message deducibility and indistinguishability relations. In this paper, we pursue the study of these two relations. We establish general decidability theorems for both. These theorems require only loose, abstract conditions on the equational theory for...
Conference Paper
Full-text available
The use of passwords in security protocols is particularly delicate be- cause of the possibility of off-line guessing attacks. We study password-based protocols in the context of a recent line of research that aims to justify symbolic models in terms of more concrete, computational ones. We offer two models for reasoning about the concurrent use of...
Article
Full-text available
A resource may be abused if its users incur little or no cost. For example, e-mail abuse is rampant because sending an e-mail has negligible cost for the sender. It has been suggested that such abuse may be discouraged by introducing an artificial cost in the form of a moderately expensive computation. Thus, the sender of an e-mail might be require...
Conference Paper
Security of mobile code is a major issue in today’s global computing environment. When you download a program from an untrusted source, how can you be sure it will not do something undesirable? In this paper I will discuss a particular approach to this problem called language-based security. In this approach, security information is derived from a...
Conference Paper
Some promising recent schemes for XML access control employ encryption for implementing security policies on published data, avoiding data duplication. In this paper we study one such scheme, due to Miklau and Suciu. That scheme was introduced with some intuitive explanations and goals, but without precise definitions and guarantees for the use of...
Conference Paper
Current software attacks often build on exploits that subvert machine-code execution. The enforcement of a basic safety property, Control-Flow Integrity (CFI), can prevent such attacks from arbitrarily controlling program behavior. CFI enforcement is simple, and its guarantees can be established formally even with respect to powerful adversaries. M...