David Donald Jensen

  • Doctor of Science
  • Professor at University of Massachusetts Amherst

About

186
Publications
57,103
Reads
9,741
Citations
Current institution
University of Massachusetts Amherst
Current position
  • Professor
Additional affiliations
September 1995 - present
University of Massachusetts Amherst
Position
  • Professor (Associate)

Publications

Article
Full-text available
Deep reinforcement learning methods have achieved remarkable performance on challenging control tasks. Observations of the resulting behavior give the impression that the agent has constructed a generalized representation that supports insightful action decisions. We re‐examine what is meant by generalization in RL, and propose several definitions...
Article
Online experiments are an integral part of the design and evaluation of software infrastructure at Internet firms. To handle the growing scale and complexity of these experiments, firms have developed software frameworks for their design and deployment. Ensuring that the results of experiments in these frameworks are trustworthy---referred to as in...
Article
The ability to learn and reason with causal knowledge is a key aspect of intelligent behavior. In contrast to mere statistical association, knowledge of causation enables reasoning about the effects of actions. Causal reasoning is vital for autonomous agents and for a range of applications in science, medicine, business, and government. However, cu...
Preprint
Full-text available
This paper introduces a procedure for testing the identifiability of Bayesian models for causal inference. Although the do-calculus is sound and complete given a causal graph, many practical assumptions cannot be expressed in terms of graph structure alone, such as the assumptions required by instrumental variable designs, regression discontinuity...
Preprint
Full-text available
The ubiquity of mobile devices has led to the proliferation of mobile services that provide personalized and context-aware content to their users. Modern mobile services are distributed between end-devices, such as smartphones, and remote servers that reside in the cloud. Such services thrive on their ability to predict future contexts to pre-fetch...
Preprint
Methods that infer causal dependence from observational data are central to many areas of science, including medicine, economics, and the social sciences. A variety of theoretical properties of these methods have been proven, but empirical evaluation remains a challenge, largely due to the lack of observational data sets for which treatment effect...
Preprint
Full-text available
Latent confounders---unobserved variables that influence both treatment and outcome---can bias estimates of causal effects. In some cases, these confounders are shared across observations, e.g. all students taking a course are influenced by the course's difficulty in addition to any educational interventions they receive individually. This paper sh...
Preprint
Many applications of computational social science aim to infer causal conclusions from non-experimental data. Such observational data often contains confounders, variables that influence both potential causes and potential effects. Unmeasured or latent confounders can bias causal estimates, and this has motivated interest in measuring potential con...
Preprint
Full-text available
Saliency maps have been used to support explanations of deep reinforcement learning (RL) agent behavior over temporally extended sequences. However, their use in the community indicates that the explanations derived from saliency maps are often unfalsifiable and can be highly subjective. We introduce an empirical approach grounded in counterfactual...
Preprint
Full-text available
Causal inference can be formalized as Bayesian inference that combines a prior distribution over causal models and likelihoods that account for both observations and interventions. We show that it is possible to implement this approach using a sufficiently expressive probabilistic programming language. Priors are represented using probabilistic pro...
Preprint
Causal inference is central to many areas of artificial intelligence, including complex reasoning, planning, knowledge-base construction, robotics, explanation, and fairness. An active community of researchers develops and enhances algorithms that learn causal models from data, and this work has produced a series of impressive technical advances. H...
Article
Online experiments have become a ubiquitous aspect of design and engineering processes within Internet firms. As the scale of experiments has grown, so has the complexity of their design and implementation. In response, firms have developed software frameworks for designing and deploying online experiments. Ensuring that experiments in these framew...
Preprint
Full-text available
Online experiments are ubiquitous. As the scale of experiments has grown, so has the complexity of their design and implementation. In response, firms have developed software frameworks for designing and deploying online experiments. Ensuring that experiments in these frameworks are correctly designed and that their results are trustworthy---referr...
Preprint
Full-text available
Evaluation of deep reinforcement learning (RL) is inherently challenging. In particular, learned policies are largely opaque, and hypotheses about the behavior of deep RL agents are difficult to test in black-box environments. Considerable effort has gone into addressing opacity, but almost no effort has been devoted to producing high quality envir...
Preprint
Full-text available
Reproducibility in reinforcement learning is challenging: uncontrolled stochasticity from many sources, such as the learning algorithm, the learned policy, and the environment itself, has led researchers to report the performance of learned agents using aggregate metrics of performance over multiple random seeds for a single environment. Unfortunat...
Article
This is a contribution to the discussion of the paper by Dorie et al. (Statist. Sci. 34 (2019) 43-68), which reports the lessons learned from 2016 Atlantic Causal Inference Conference Competition. My comments strongly support the authors' focus on empirical evaluation, using examples and experience from machine learning research, particularly focus...
Preprint
Full-text available
Deep reinforcement-learning methods have achieved remarkable performance on challenging control tasks. Observations of the resulting behavior give the impression that the agent has constructed a generalized representation that supports insightful action decisions. We re-examine what is meant by generalization in RL, and propose several definitions...
Article
Full-text available
The predominant method for evaluating the quality of causal models is to measure the graphical accuracy of the learned model structure. We present an alternative method for evaluating causal models that directly measures the accuracy of estimated interventional distributions. We contrast such distributional measures with structural measures, such a...
Conference Paper
We present Relational Covariate Adjustment (RCA), a general method for estimating causal effects in relational data. Relational Covariate Adjustment is implemented through two high-level operations: identification of an adjustment set and relational regression adjustment. The former is achieved through an extension of Pearl's back-door criterion to...
Conference Paper
Full-text available
Concerns over personalization in IR have sparked an interest in detection and analysis of controversial topics. Accurate detection would enable many beneficial applications, such as alerting search users to controversy. Wikipedia's broad coverage and rich metadata offer a valuable resource for this problem. We hypothesize that intensities of contro...
Article
Full-text available
Yield and quality improvement is of paramount importance to any manufacturing company. One of the ways of improving yield is through discovery of the root causal factors affecting yield. We propose the use of data-driven interpretable causal models to identify key factors affecting yield. We focus on factors that are measured in different stages of...
Article
Full-text available
The overarching goal of music theory is to explain the inner workings of a musical composition by examining the structure of the composition. Schenkerian music theory supposes that Western tonal compositions can be viewed as hierarchies of musical objects. The process of Schenkerian analysis reveals this hierarchy by identifying connections between...
Article
Many real-world domains are inherently relational and temporal: they consist of heterogeneous entities that interact with each other over time. Effective reasoning about causality in such domains requires representations that explicitly model relational and temporal dependence. In this work, we provide a formalization of temporal relational models....
Article
Full-text available
With the proliferation of network data, researchers are increasingly focusing on questions investigating phenomena occurring on networks. This often includes analysis of peer-effects, i.e., how the connections of an individual affect that individual's behavior. This type of influence is not limited to direct connections of an individual (such as fr...
Conference Paper
Full-text available
Density estimation methods are often regarded as unsuitable for anomaly detection in high-dimensional data due to the difficulty of estimating multivariate probability distributions. Instead, the scores from popular distance- and local-density-based methods, such as local outlier factor (LOF), are used as surrogates for probability densities. We qu...
Article
Full-text available
Recommendation systems for online dating have recently attracted much attention from the research community. In this paper we propose a two-sided matching framework for online dating recommendations and design an LDA model to learn the user preferences from the observed user messaging behavior and user profile features. Experimental results using d...
Article
Propensity score matching (PSM) is a widely used method for performing causal inference with observational data. PSM requires fully specifying the set of confounding variables of treatment and outcome. In the case of relational data, this set may include nonintuitive relational variables, i.e., variables derived from the relational structure of the...
Article
Full-text available
The PC algorithm learns maximally oriented causal Bayesian networks. However, there is no equivalent complete algorithm for learning the structure of relational models, a more expressive generalization of Bayesian networks. Recent developments in the theory and representation of relational models support lifted reasoning about conditional independe...
Conference Paper
Full-text available
This paper reports on methods and results of an applied research project by a team consisting of SAIC and four universities to develop, integrate, and evaluate new approaches to detect the weak signals characteristic of insider threats on organizations' information systems. Our system combines structural and semantic information from a real corpora...
Conference Paper
Full-text available
In this paper, we analyze the task of inferring rare links between pairs of entities that seem too similar to have occurred by chance. Variations of this task appear in such diverse areas as social network analysis, security, fraud detection, and entity resolution. To address the task in a general form, we propose a simple, flexible mixture model i...
Article
Full-text available
Bayesian networks leverage conditional independence to compactly encode joint probability distributions. Many learning algorithms exploit the constraints implied by observed conditional independencies to learn the structure of Bayesian networks. The rules of d-separation provide a theoretical and algorithmic framework for deriving conditional indep...
Article
Full-text available
Advances in technology have made it possible to collect data about individuals and the connections between them, such as email correspondence and friendships. Agencies and researchers who have collected such social network data often have a compelling interest in allowing others to analyze the data. However, in many cases the data describes relatio...
Article
Full-text available
The rules of d-separation provide a framework for deriving conditional independence facts from model structure. However, this theory only applies to simple directed graphical models. We introduce relational d-separation, a theory for deriving conditional independence in relational models. We provide a sound, complete, and computationally efficient...
Article
Full-text available
This paper introduces constraint relaxation, a new strategy for learning the structure of Bayesian networks. Constraint relaxation identifies and "relaxes" possibly inaccurate independence constraints on the structure of the model. We describe a heuristic algorithm for constraint relaxation that combines greedy search in the space of undirected s...
Article
Full-text available
Blocking is a technique commonly used in manual statistical analysis to account for confounding variables. However, blocking is not currently used in automated learning algorithms. These algorithms rely solely on statistical conditioning as an operator to identify conditional independence. In this work, we present relational blocking as a new opera...
Article
The ability to discover low-cost paths in networks has practical consequences for knowledge discovery and social network analysis tasks. Many analytic techniques for networks require finding low-cost paths, but exact methods for search become prohibitive for large networks, and data sets are steadily increasing in size. Short paths can be found eff...
Conference Paper
Full-text available
Hierarchical music analysis, as exemplified by Schenkerian analysis, describes the structure of a musical composition by a hierarchy among its notes. Each analysis defines a set of prolongations, where musical objects persist in time even though others are present. We present a formal model for representing hierarchical music analysis, probabilisti...
Article
We identify privacy risks associated with releasing network datasets and provide an algorithm that mitigates those risks. A network dataset is a graph representing entities connected by edges representing relations such as friendship, communication or shared activity. Maintaining privacy when publishing a network dataset is uniquely challenging bec...
Conference Paper
Full-text available
Testing for marginal and conditional independence is a common task in machine learning and knowledge discovery applications. Prior work has demonstrated that conventional independence tests suffer from dramatically increased rates of Type I errors when naively applied to relational data. We use graphical models to specify the conditions under which...
Conference Paper
Full-text available
Methods for discovering causal knowledge from observational data have been a persistent topic of AI research for several decades. Essentially all of this work focuses on knowledge representations for propositional domains. In this paper, we present several key algorithmic and theoretical innovations that extend causal discovery to relationa...
Article
Full-text available
Social media systems have become increasingly attractive to both users and companies providing those systems. Efficient management of these systems is essential and requires knowledge of cause-and-effect relationships within the system. Online experimentation can be used to discover causal knowledge; however, this ignores the observational data t...
Article
Full-text available
Causal knowledge is frequently pursued by researchers in many fields, such as medicine, economics, and social science, yet very little research in knowledge discovery focuses on discovering causal knowledge. Those researchers rely on a set of methods, called experimental and quasi-experimental designs, that exploit the ontological structure of the...
Article
Methods for discovering causal knowledge from observational data have been a persistent topic of AI research for several decades. Essentially all of this work focuses on knowledge representations for propositional domains. In this paper, we present several key algorithmic and theoretical innovations that extend causal discovery to relational domain...
Conference Paper
Full-text available
We apply statistical relational learning to a database of criminal and terrorist activity to predict attributes and event outcomes. The database stems from a collection of news articles and court records which are carefully annotated with a variety of variables, including categorical and continuous fields. Manual analysis of this data can help info...
Article
Full-text available
Effective and proactive decisions about intelligence gathering depend on accurate models of an adversary. Specifically, such models need to accurately reflect the cause-and-effect dependencies within the systemic behavior of the adversary. Such models can be created based entirely on the knowledge of experts, or they can be created or augmented bas...
Conference Paper
Full-text available
We describe an efficient algorithm for releasing a provably private estimate of the degree distribution of a network. The algorithm satisfies a rigorous property of differential privacy, and is also extremely efficient, running on networks of 100 million nodes in a few seconds. Theoretical analysis shows that the error scales linearly with the numb...
Article
Full-text available
This is a draft copy of a chapter in the forthcoming book Privacy-Aware Knowledge Discovery: Novel Applications and New Techniques to be published by Chapman & Hall/CRC Press.
Article
Full-text available
A persistent goal of research in artificial intelligence has been to enable learning and reasoning with probabilistic models in complex domains. Much of this work has been directed toward systems that complement, rather than replace, human abilities and knowledge. Models that fuse engineering knowledge (knowledge from human sources) with learned in...
Article
Of particular importance is how IRT tests models. The approach estimates the probability that apparent improvement in the accuracy of a given model is due to chance alone. These estimates, provided by randomization testing, protect against constructing models of inappropriate complexity. Below, I discuss the inadequacies of existing machine learn...
Conference Paper
Full-text available
Collective classification techniques jointly infer all class labels of a relational data set, using the inferences about one class label to influence inferences about related class labels. Kou and Cohen recently introduced an efficient relational model based on stacking that, despite its simplicity, has equivalent accuracy to more sophisticated joi...
Article
Full-text available
Bias/variance analysis is a useful tool for investigating the performance of machine learning algorithms. Conventional analysis decomposes loss into errors due to aspects of the learning process, but in relational domains, the inference process used for prediction introduces an additional source of error. Collective inference techniques introduce a...
Article
Many large distributed systems can be characterized as networks where short paths exist between nearly every pair of nodes. These include social, biological, communication, and distribution networks, which often display power-law or small-world structure. A central challenge of distributed systems is directing messages to specific nodes through a s...
Conference Paper
By now, online social networks have become an indispensable part of both online and offline lives of human beings. A large fraction of time spent online by a user is directly influenced by the social networks to which he/she belongs. This calls for a deeper examination of social networks as large-scale dynamic objects that foster efficient person-pe...
Conference Paper
Full-text available
Researchers in the social and behavioral sciences routinely rely on quasi-experimental designs to discover knowledge from large databases. Quasi-experimental designs (QEDs) exploit fortuitous circumstances in non-experimental data to identify situations (sometimes called "natural experiments") that provide the equivalent of experimental control and...
Article
Full-text available
We identify privacy risks associated with releasing network data sets and provide an algorithm that mitigates those risks. A network consists of entities connected by links representing relations such as friendship, communication, or shared activity. Maintaining privacy when publishing networked data is uniquely challenging because an individual's...
Article
This paper introduces a simple evaluation function for multiple instance learning that admits an optimistic pruning strategy. We demonstrate comparable results to state-of-the-art methods using significantly fewer computational resources.
Conference Paper
Full-text available
Research over the past several decades in learning logical and probabilistic models has greatly increased the range of phenomena that machine learning can address. Recent work has extended these boundaries even further by unifying these two powerful learning frameworks. However, new frontiers await. Current techniques are capable of learning only a...
Article
Full-text available
We briefly describe recent research on the automatic identification of quasi-experimental designs, a family of methods used in the medical, social, and economic sciences to discover causal knowledge from observational data. These methods are widely used for manual discovery, but recent advances in knowledge representation and databases have mad...
Article
ative structure captures abstract knowledge about the task; e.g. to pick up an object, we must first find the object, reach to it, and then grasp it. The procedural structure captures knowledge about how to instantiate the abstract policy in a particular setting; e.g. in this case, we must use our left hand to pick up the object and use an envelopi...
Conference Paper
Full-text available
Active inference seeks to maximize classification performance while minimizing the amount of data that must be labeled ex ante. This task is particularly relevant in the context of relational data, where statistical dependencies among instances can be exploited to improve classification accuracy. We show that efficient methods for indexing network...
Chapter
Advanced statistical modeling and knowledge representation techniques for a newly emerging area of machine learning and probabilistic reasoning; includes introductory material, tutorials for different proposed approaches, and applications. Handling inherent uncertainty and exploiting compositional structure are fundamental to understanding and desi...
Article
Full-text available
Commercial datasets are often large, relational, and dynamic. They contain many records of people, places, things, events and their interactions over time. Such datasets are rarely structured appropriately for knowledge discovery, and they often contain variables whose meanings change across different subsets of the data. We describe how these chal...
Conference Paper
Full-text available
We present a family of algorithms to uncover tribes—groups of individuals who share unusual sequences of affiliations. While much work inferring community structure describes large-scale trends, we instead search for small groups of tightly linked individuals who behave anomalously with respect to those trends. We apply the algorithms to a large te...
Conference Paper
Full-text available
Substantial effort is wasted in scientific circles by researchers who rediscover ideas that have already been published in the literature. This problem has been alleviated somewhat by the availability of recent academic work online. However, the kinds of text search systems in popular use today are poor at handling vocabulary mismatch, so a researc...
Conference Paper
Full-text available
Graph clustering has become ubiquitous in the study of relational data sets. We examine two simple algorithms: a new graphical adaptation of the k-medoids algorithm and the Girvan-Newman method based on edge betweenness centrality. We show that they can be effective at discovering the latent groups or communities that are defined by the link st...
Conference Paper
Full-text available
Bias/variance analysis [1] is a useful tool for investigating the performance of machine learning algorithms. Conventional analysis decomposes loss into errors due to aspects of the learning process with an underlying assumption that there is no variation in model predictions due to the inference process used for prediction. This assumption is ofte...
Article
Full-text available
Recently, Tsamardinos et al. [2006] presented an algorithm for Bayesian network structure learning that outperforms many state-of-the-art algorithms in terms of efficiency, structure similarity and likelihood. The Max-Min Hill Climbing algorithm is a hybrid of constraint-based and search-and-score techniques, using greedy hill climbing to search a...
Article
Full-text available
Recent work on graphical models for relational data has demonstrated significant improvements in classification and inference when models represent the dependencies among instances. Despite its use in conventional statistical models, the assumption of instance independence is contradicted by most relational data sets. For example, in citation data...
Article
Full-text available
Statistics on networks have become vital to the study of relational data drawn from areas including bibliometrics, fraud detection, bioinformatics, and the Internet. Calculating many of the most important measures—such as betweenness centrality, closeness centrality, and graph diameter—requires identifying short paths in these networks. However, fi...
Article
A wide variety of data sets produced by individual investigators are now synthesized to address ecological questions that span a range of spatial and temporal scales. It is important to facilitate such syntheses so that "consumers" of data sets can be confident that both input data sets and synthetic products are reliable. Necessary documentation t...
Article
Full-text available
Statistical analysis of relational data is a fundamental and novel problem in machine learning and data mining. Such analysis constructs useful statistical models from data about complex relationships among people, places, things, and events. Supported by this research contract, we uncovered fundamental challenges of statistical learning and infere...
Article
Decision analysis is used to examine whether residential smoke detectors should be required by law. Statistics pertaining to fire incidence, fire death, smoke detector efficacy, and the consequences of fire are examined for accuracy and availability and combined in a decision model. A sensitivity analysis is performed on the model inputs. Included...
Conference Paper
Full-text available
Disruption-tolerant networks (DTNs) attempt to route network messages via intermittently connected nodes. Routing in such environments is difficult because peers have little information about the state of the partitioned network and transfer opportunities between peers are of limited duration. In this paper, we propose MaxProp, a protocol for effec...
Article
Full-text available
In this paper, we describe the syntax and semantics for a probabilistic relational language (PRL). PRL is a recasting of recent work in Probabilistic Relational Models (PRMs) into a logic programming framework. We show how to represent varying degrees ...
Conference Paper
Full-text available
Several information organization, access, and filtering systems can benefit from different kinds of document representations than those used in traditional Information Retrieval (IR). Topic Detection and Tracking (TDT) is an example of such an application. In this paper we demonstrate that named entities serve as better choices of units for docume...
Article
Full-text available
The interactions of professional football coaches and teams in the National Football League (NFL) form a complex social network. This network provides a great opportunity to analyze the influence that coaching mentors have on their protégés. In this paper, we use this social network to identify notable coaches and characterize championship coache...
Conference Paper
Full-text available
The presence of autocorrelation provides a strong motivation for using relational learning and inference techniques. Autocorrelation is a statistical dependence between the values of the same variable on related entities and is a nearly ubiquitous characteristic of relational data sets. Recent research has explored the use of collective inference t...
Article
Full-text available
In this paper, we describe the challenges inherent to the task of link prediction, and we analyze one reason why many link prediction models perform poorly. Specifically, we demonstrate the effects of the extremely large class skew associated with the link prediction task. We then present an alternate task, anomalous link discovery (ALD), and q...
Conference Paper
Full-text available
We use knowledge discovery techniques to guide the creation of efficient overlay networks for peer-to-peer file sharing. An overlay network specifies the logical connections among peers in a network and is distinct from the physical connections of the network. It determines the order in which peers will be queried when a user is searching for a spe...
Conference Paper
Full-text available
We describe an application of relational knowledge discovery to a key regulatory mission of the National Association of Securities Dealers (NASD). NASD is the world's largest private-sector securities regulator, with responsibility for preventing and discovering misconduct among securities brokers. Our goal was to help focus NASD's limited regulato...
Article
Full-text available
This research developed techniques for data analysis of multi-agent systems. The research focused on how to analyze relational data that represent sets of interconnected agents, resources, and locations, as well as the attributes of these objects. Results of the research include new algorithms for knowledge discovery, fundamental discoveries about...
Conference Paper
Full-text available
Encrypting traffic does not prevent an attacker from performing some types of traffic analysis. We present a straightforward traffic analysis attack against encrypted HTTP streams that is surprisingly effective in identifying the source of the traffic. An attacker starts by creating a profile of the statistical characteristics of web requests from...
Article
Full-text available
In this paper, a novel representation is proposed in which experience is summarized by a wealth of control and perception primitives that can be mined to learn combinations of which features are most predictive of task success. Exploiting the inherent relational structure of these primitives and the dependencies between them presents a powerful an...
Conference Paper
Full-text available
We propose a new algorithm for finding a target node in a network whose topology is known only locally. We formulate this task as a problem of decision making under uncertainty and use the statistical properties of the graph to guide this decision. This formulation uses the homophily and degree structure of the network simultaneously, different...
Technical Report
Full-text available
Many routing algorithms for both traditional and ad hoc networks require a complete and contemporaneous path of peers from source to destination. Disruption Tolerant Networks (DTNs) attempt to deliver messages despite a frequently disconnected link layer (e.g., due to peer mobility, limited communication range, and power management limitations). Wh...
Conference Paper
Full-text available
Instance independence is a critical assumption of traditional machine learning methods contradicted by many relational datasets. For example, in scientific literature datasets, there are dependencies among the references of a paper. Recent work on graphical models for relational data has demonstrated significant performance gains for models that ex...
Conference Paper
Full-text available
Procedures for collective inference make simultaneous statistical judgments about the same variables for a set of related data instances. For example, collective inference could be used to simultaneously classify a set of hyperlinked documents or infer the legitimacy of a set of related financial transactions. Several recent studies indicate that c...
Article
Full-text available
An algorithm is presented for fitting an expression composed of continuous and discontinuous primitive functions to real-valued data points. The data modeling problem comes from the need to infer task structure for making coordination decisions for multi-agent systems. The presence of discontinuous primitive functions requires a novel approach.
Article
Knowledge discovery in databases (KDD) is an increasingly widespread activity. KDD processes may entail the use of a large number of data manipulation and analysis techniques, and new techniques are being developed on an ongoing basis. A challenge for the effective use of KDD is coordinating the use of these techniques, which may be highly speciali...
