Article

Causal Inference with Networked Treatment Diffusion

Authors:
Weihua An

Abstract

Treatment interference (i.e., one unit's potential outcomes depend on other units' treatment) is prevalent in social settings. Ignoring treatment interference can lead to biased estimates of treatment effects and incorrect statistical inferences. Some recent studies have started to incorporate treatment interference into causal inference. But treatment interference is often assumed to follow a simple structure (e.g., treatment interference exists only within groups) or measured in a simplistic way (e.g., only based on the number of treated friends). In this paper, I highlight the importance of collecting data on actual treatment diffusion in order to more accurately measure treatment interference. Furthermore, I show that with accurate measures of treatment interference, we can identify and estimate a series of causal effects that were previously unavailable, including the direct treatment effect, treatment interference effect, and treatment effect on interference. I illustrate the methods through a case study of a social network–based smoking prevention intervention.
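The three effects named here can be given a rough potential-outcomes rendering. The notation below is an illustrative sketch only (Z_i and D_i are assumed notation for the assigned treatment and the treatment received through diffusion; the paper's formal definitions may differ):

```latex
% Illustrative notation, not necessarily the paper's exact definitions.
% Z_i: unit i's assigned treatment; D_i: whether i received the treatment
% via diffusion from peers; Y_i(z, d): potential outcome under (z, d).
\begin{align*}
\text{Direct treatment effect:}          \quad & \mathbb{E}[\,Y_i(1,0) - Y_i(0,0)\,] \\
\text{Treatment interference effect:}    \quad & \mathbb{E}[\,Y_i(0,1) - Y_i(0,0)\,] \\
\text{Treatment effect on interference:} \quad & \mathbb{E}[\,D_i(\mathbf{Z}=\mathbf{z}) - D_i(\mathbf{Z}=\mathbf{z}')\,]
\end{align*}
```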


... In such settings, one unit's potential outcome is a function of not only its treatment status but also the treatment status of other related units (Aronow & Samii 2017, Athey et al. 2018a, Tchetgen Tchetgen & VanderWeele 2012, VanderWeele 2015). Such interferences, or interactions, are prevalent in social settings (An 2018, An & VanderWeele 2022, Egami 2021). For example, encouraging an individual to vote through some intervention can increase the turnout for household members (Imai & Jiang 2020). ...
... In some cases, such interactions are the focus of analysis; in other cases, they are considered a nuisance to estimating treatment effects (given the assumption of no interference) (Hong & Raudenbush 2015, Ogburn et al. 2022). Yet ignoring interference can lead to biased estimates of causal effects and incorrect statistical inferences (An 2018, Basse & Airoldi 2018, Lee & Ogburn 2021). ...
... • Causality and Machine Learning: methods for assessing the potential influence of unobserved networks on causal findings. Relatedly, An (2018) emphasizes the importance of collecting data on treatment diffusion to measure treatment interference properly and then to estimate the direct treatment effect, treatment interference effect, and treatment effect on interference. ...
Article
This article reviews recent advances in causal inference relevant to sociology. We focus on a selective subset of contributions aligning with four broad topics: causal effect identification and estimation in general, causal effect heterogeneity, causal effect mediation, and temporal and spatial interference. We describe how machine learning, as an estimation strategy, can be effectively combined with causal inference, which has been traditionally concerned with identification. The incorporation of machine learning in causal inference enables researchers to better address potential biases in estimating causal effects and uncover heterogeneous causal effects. Uncovering sources of effect heterogeneity is key for generalizing to populations beyond those under study. While sociology has long emphasized the importance of causal mechanisms, historical and life-cycle variation, and social contexts involving network interactions, recent conceptual and computational advances facilitate more principled estimation of causal effects under these settings. We encourage sociologists to incorporate these insights into their empirical research.
... Notes 1. To be sure, recent studies propose identification and estimation for network spillover effects (An 2018; Tchetgen, Fulcher, and Shpitser 2021). These studies examine how networks diffuse treatment statuses, but they do not examine indirect network selection effects on higher-order outcomes. ...
... Whereas stochastic actor-oriented models enable evaluation of the coevolution of networks and behaviors, our approach is concerned with estimating the indirect effect of a network selection process on an individual outcome due to a change in structural network context. 3. Potential outcomes have received limited attention in the social networks literature, but recent studies have discussed the utility of potential outcomes perspectives for networks research (An 2018; An et al. 2022; VanderWeele and An 2013). I build on these approaches to link contemporary mediation analysis methods, most of which emphasize potential outcomes, to the statistical network modeling literature. ...
Article
Full-text available
Mediation analysis is increasingly used in the social sciences. Extension to social network data, however, has proved difficult because statistical network models are formulated at a lower level of analysis (the dyad) than many outcomes of interest. This study introduces a general approach for micro-macro mediation analysis in social networks. The author defines the average mediated micro effect (AMME) as the indirect effect of a network selection process on an individual, group, or organizational outcome through its effect on an intervening network variable. The author shows that the AMME can be nonparametrically identified using a wide range of common statistical network and regression modeling strategies under the assumption of conditional independence among multiple mediators. Nonparametric and parametric algorithms are introduced to generically estimate the AMME in a multitude of research designs. The author illustrates the utility of the method with an applied example using cross-sectional National Longitudinal Study of Adolescent to Adult Health data to examine the friendship selection mechanisms that indirectly shape adolescent school performance through their effect on network structure.
... More work is also needed to identify the mechanisms underlying peer effects (Lin 2001, DiMaggio & Garip 2012, An 2015a). In terms of methodological progress, more work is needed to provide standard errors that can account for outcome dependence across units beyond social contagion (Advani & Malde 2018b, An 2018, Lee & Ogburn 2020). Finally, more research is needed to explore nonparametric identification of network effects (Egami 2020; Ogburn et al. 2020a,b). ...
... In particular, contextual confounding and reverse causality (i.e., individuals affect the network structures) are of concern. To improve statistical inferences, one may use a multivariate regression model to account for outcome correlations across units (An 2018). ...
Article
Fueled by recent advances in statistical modeling and the rapid growth of network data, social network analysis has become increasingly popular in sociology and related disciplines. However, a significant amount of work in the field has been descriptive and correlational, which prevents the findings from being more rigorously translated into practices and policies. This article provides a review of the popular models and methods for causal network analysis, with a focus on causal inference threats (such as measurement error, missing data, network endogeneity, contextual confounding, simultaneity, and collinearity) and potential solutions (such as instrumental variables, specialized experiments, and leveraging longitudinal data). It covers major models and methods for both network formation and network effects and for both sociocentric networks and egocentric networks. Lastly, this review also discusses future directions for causal network analysis.
... Modeling contagion effects in social networks using human subject designs is often particularly challenging (see Galea et al., 2009; Hunter-Reel et al., 2009; Latkin & Knowlton, 2015; Leonard, 2015). Although there are methods available to identify potential contagion effects, they often require large samples, complex designs, and/or causal effect analyses with strong assumptions that may be difficult to meet in real-world studies (e.g., An, 2018; Barnett et al., 2019; VanderWeele & Tchetgen, 2011). Moreover, researchers can only control certain variables experimentally, such as whether certain individuals receive alcohol interventions, but they cannot control other important mechanisms that could potentially moderate those effects, such as the level of social influence or social selection present in the networks being studied. ...
... There is a growing body of literature that articulates methods for modeling the effects of interventions on non-targeted social network members in observational studies, cluster randomized trials, and agent-based modeling studies (An, 2018; Benjamin-Chung et al., 2018; Kang & Keele, 2018; Marshall & Galea, 2015). Computer simulations of social networks can potentially complement this work by testing whether alcohol interventions are likely to affect the drinking of non-targeted individuals through effects that are typically observed in social networks, such as drinking-related social influence or social selection in adolescent social networks. ...
Article
Full-text available
Objective: Adolescents' drinking is influenced by their friends' drinking. However, it is unclear whether individually-targeted alcohol interventions reduce drinking in the friends of individuals who receive the intervention. This study used simulations of drinking in simulated longitudinal social networks to test whether individually-targeted alcohol interventions may be expected to spread to non-targeted individuals. Method: Stochastic actor-based models simulated longitudinal social networks where changes in drinking and friendships were modeled using parameters from a meta-analysis of high school 10th grade social networks. Social influence (i.e., how much one's friends' drinking affects their own drinking) and social selection (i.e., how much one's drinking affects who they select as friends) were manipulated at several levels. At the midpoint of each simulation, a randomly-selected heavy-drinking individual was experimentally assigned to an intervention (changing their drinking status to non-drinking) or a control condition (no change in drinking status) and the drinking statuses of that individual's friends were recorded at the end of the simulation. Results: Friends of individuals who received the intervention significantly reduced their drinking, with higher reductions occurring in networks with greater social influence. However, all effect sizes were small (e.g., average per-friend reduction of .07 on a 5-point drinking scale). Conclusions: Individually-targeted alcohol interventions may have small effects on reducing the drinking of non-targeted adolescents, with social influence being a mechanism that drives such effects. Due to small effect sizes, many adolescents may need to receive alcohol interventions to produce measurable effects on drinking outcomes for non-targeted individuals.
... The most typical examples include vaccine interventions with effects propagating across infection networks of individuals who come into contact with one another and informational interventions on individuals connected through their social network. Methods for causal inference with interference have motivated much recent work (Hudgens and Halloran, 2008; Bowers et al., 2013; Liu and Hudgens, 2014; Aronow and Samii, 2017; Karwa and Airoldi, 2018; Tchetgen and VanderWeele, 2012; Liu et al., 2016; Forastiere et al., 2018, 2020; Sävje et al., 2017; An, 2018; Papadogeorgou et al., 2019; An and VanderWeele, 2019). ...
... Unlike the common setting where interference arises due to unit-to-unit outcome dependencies (e.g., one person's vaccination status impacts another person's infection risk), interference in this case arises due to complex exposure patterns governed by the movement of air pollution from an originating source (a power plant) across long distances towards impact on populations. Interference due to complex exposure patterns (or treatment diffusion) has been considered in the causal inference literature, albeit less frequently than interference due to unit-to-unit outcome dependencies and with only emerging focus on explicitly spatial data (Verbitsky-Savitz and Raudenbush, 2012; Graham et al., 2013; An, 2018; An and VanderWeele, 2019; Giffin et al., 2020). To characterize the structure of interference, we deploy a newly developed reduced-complexity atmospheric model, called HYSPLIT Average Dispersion (HyADS), to model the movement of pollution through space and time (Henneman et al., 2019a). ...
Preprint
Full-text available
Evaluating air quality interventions is confronted with the challenge of interference since interventions at a particular pollution source likely impact air quality and health at distant locations and air quality and health at any given location are likely impacted by interventions at many sources. The structure of interference in this context is dictated by complex atmospheric processes governing how pollution emitted from a particular source is transformed and transported across space, and can be cast with a bipartite structure reflecting the two distinct types of units: 1) interventional units on which treatments are applied or withheld to change pollution emissions; and 2) outcome units on which outcomes of primary interest are measured. We propose new estimands for bipartite causal inference with interference that construe two components of treatment: a "key-associated" (or "individual") treatment and an "upwind" (or "neighborhood") treatment. Estimation is carried out using a semi-parametric adjustment approach based on joint propensity scores. A reduced-complexity atmospheric model is deployed to characterize the structure of the interference network by modeling the movement of air parcels through time and space. The new methods are deployed to evaluate the effectiveness of installing flue-gas desulfurization scrubbers on 472 coal-burning power plants (the interventional units) in reducing Medicare hospitalizations among 22,603,597 Medicare beneficiaries residing across 23,675 ZIP codes in the United States (the outcome units).
... Therefore, we believe that network interference should be considered by researchers. Although there is a robust literature about network interference (e.g., see An, 2018; Eckles et al., 2016; Forastiere et al., 2021), there are relatively few sources on how to account for network interference in social science interventions. Thus for the time being, our recommendation is to qualitatively consider the impacts of potential network interference when designing studies. ...
Article
Full-text available
Social network analysis involves the study of relationships among individuals and includes methods that uncover why or how individuals interact or form relationships and how those relationships impact other outcomes. Despite the breadth of methods available to address psychological research questions, social network analysis is not yet a standard practice in psychological research. To promote the use of social network analysis in psychological research, we present an overview of network methods, situating each method within the context of research studies and questions in psychology.
... (1) How to optimize diffusion (Aral, Muchnik, & Sundararajan, 2013; Basse & Airoldi, 2015), which involves whom to choose (centrality and incentives), how many to choose, and whether and how much to train the seeds. (2) How to make better causal inference under interference, namely, how to cleanly separate contagion, spillover, and recursive effects (An, 2018). (3) Ethical issues, such as how to make targeted marketing both more accurate and more ethical, and how to handle unintended tie decay. ...
... We imagine extensions to this model would then account for network interference/peer influence and align with current research in that area (e.g., An, 2018; VanderWeele & An, 2013; Ogburn, VanderWeele et al., 2017; Manski, 2000). ...
Article
Full-text available
For interventions that affect how individuals interact, social network data may aid in understanding the mechanisms through which an intervention is effective. Social networks may even be an intermediate outcome observed prior to the end of the study. In fact, social networks may also mediate the effects of the intervention on the outcome of interest, and Sweet (2019) introduced a statistical model for social networks as mediators in network-level interventions. We build on their approach and introduce a new model in which the network is a mediator using a latent space approach. We investigate our model through a simulation study and a real-world analysis of teacher advice-seeking networks.
Preprint
Full-text available
Social network analysis can answer research questions such as why or how individuals interact or form relationships and how those relationships impact other outcomes. Despite the breadth of methods available to address psychological research questions, social network analysis is not yet a standard practice in psychological research. To promote the use of social network analysis in psychological research, we present an overview of network methods, situating each method within the context of research studies and questions in psychology.
Article
Full-text available
Social theories posit that peers affect students’ academic self-concept (ASC). Most prominently, Big-Fish-Little-Pond, invidious comparison, and relative deprivation theories predict that exposure to academically stronger peers decreases students’ ASC, and exposure to academically weaker peers increases students’ ASC. These propositions have not yet been tested experimentally. We executed a large and pre-registered field experiment that randomized students to deskmates within 195 classrooms of 41 schools (N = 3,022). Our primary experimental analysis found no evidence of an effect of peer achievement on ASC in either direction. Exploratory analyses hinted at a subject-specific deskmate effect on ASC in verbal skills, and that sitting next to a lower-achieving boy increased girls’ ASC (but not that sitting next to a higher-achieving boy decreased girls’ ASC). Critically, however, none of these group-specific results held up to even modest corrections for multiple hypothesis testing. Contrary to theory, our randomized field experiment thus provides no evidence for an effect of peer achievement on students’ ASC.
Preprint
In randomized experiments, interactions between units might generate a treatment diffusion process. This is common when the treatment of interest is an actual object or product that can be shared among peers (e.g., flyers, booklets, videos). For instance, if the intervention of interest is an information campaign realized through the distribution of a video to targeted individuals, some of these treated individuals might share the video they received with their friends. Such a phenomenon is usually unobserved, causing a misallocation of individuals in the two treatment arms: some of the initially untreated units might have actually received the treatment by diffusion. Treatment misclassification can, in turn, introduce a bias in the estimation of the causal effect. Inspired by a recent field experiment on the effect of different types of school incentives aimed at encouraging students to attend cultural events, we present a novel approach to deal with a hidden diffusion process on observed or partially observed networks. Specifically, we develop a simulation-based sensitivity analysis that assesses the robustness of the estimates against the possible presence of a treatment diffusion. We simulate several diffusion scenarios within a plausible range of sensitivity parameters and compare the treatment effect estimated in each scenario with the one obtained while ignoring the diffusion process. Results suggest that even a treatment diffusion parameter of small size may lead to a significant bias in the estimation of the treatment effect.
Article
Full-text available
Social influence occurs when an individual's outcome is affected by another individual's actions. Current approaches in psychology that seek to examine social influence have focused on settings where individuals are nested in predefined groups and do not interact across groups. Such study designs permit using standard estimation methods such as multilevel models for estimating treatment effects but restrict social influence to originate only from individuals within the same group. In more general settings, such as social networks where an individual is free to interact with any other individual, the absence of discernible clusters or scientifically meaningful groups precludes existing estimation methods. In this article, we introduce a new class of methods for assessing social influence in social networks in the context of randomized experiments in psychology. Our proposal builds on the potential outcomes framework from the causal inference literature. In particular, we exploit the concept of (treatment) interference, which occurs between individuals when one individual's outcome is affected by other individuals' treatments. Estimation proceeds using randomization-based approaches that are established in other disciplines and guarantee valid inference by construction. We compared the proposed methods with standard methods empirically using Monte Carlo simulation studies. We illustrated the method using publicly available data from an experiment assessing the effects of an anticonflict intervention among students' peer networks. The R scripts used to implement the proposed methods in the simulation studies and the applied example are freely available online.
Article
Full-text available
Using panel data of school-class networks of 11–13-year-old students, this study investigates effects of schoolwork collaboration-networks on grades and school-related well-being. It suggests propensity score weighting-regression as a method of causal inference for data collected in social contexts, and in studies analyzing node-attributes as outcomes of interest. It is argued that this alternative approach is useful when stochastic actor-based models (SAOMs) show convergence problems in sparse networks. Three methods of causal analysis dealing with the problems of endogeneity bias and interference between observations are discussed in this study: first, SAOMs for the co-evolution of networks and behavior/attitudes are estimated, but this results in a systematic loss of data. Second, propensity score matching compares treated cases with untreated nearest neighbors. However, the stable-unit-treatment-value assumption (SUTVA) requires that the analysis controls for network embeddedness. This is possible by using propensity score weighting-regression, which is a flexible approach to capture treatment diffusion via multiplex networks.
Article
Causal inference under treatment interference is a challenging but important problem. Past studies usually make strong assumptions on the structure of treatment interference in order to estimate causal treatment effects while accounting for the effect of treatment interference. In this article, we view treatment diffusion as a concrete form of treatment interference that is prevalent in social settings and also as an outcome of central interest. Specifically, we analyze data from a smoking prevention intervention conducted with 4,094 students in six middle schools in China. We measure treatment interference by tracing how the distributed intervention brochures are shared by students, which provides information to construct the so-called treatment diffusion networks. Besides providing descriptive analyses, we use exponential random graph models to model the treatment diffusion networks in order to reveal covariates and network processes that significantly correlate with treatment diffusion. We show that the findings provide an empirical basis to evaluate previous assumptions on the structure of treatment interference, are informative for imputing treatment diffusion data that is crucial for making causal inference under treatment interference, and shed light on how to improve designs of future interventions that aim to optimize treatment diffusion.
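As a rough illustration of the descriptive step described above, the sketch below builds a toy treatment diffusion network from hypothetical brochure-sharing reports using networkx. The student IDs, edges, and seed set are invented for illustration; ERGM estimation itself is typically done in R (e.g., the ergm package in statnet), not shown here.

```python
# A minimal sketch (not the paper's actual pipeline): construct a treatment
# diffusion network from hypothetical sharing reports and identify units
# that received the treatment only through diffusion.
import networkx as nx

# Hypothetical edge list: (sharer, receiver) pairs reported by students.
shares = [("s01", "s02"), ("s01", "s03"), ("s04", "s05"), ("s02", "s06")]

G = nx.DiGraph()
G.add_edges_from(shares)

# Assumed seed set: students who got the brochure directly from the
# intervention. Everyone else with in-degree > 0 received it by diffusion.
seeds = {"s01", "s04"}
diffusion_recipients = {v for v in G.nodes if G.in_degree(v) > 0 and v not in seeds}

print("nodes:", G.number_of_nodes(), "edges:", G.number_of_edges())
print("received via diffusion:", sorted(diffusion_recipients))
```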
Article
Full-text available
In this paper we consider how to assign treatment in a randomized experiment in which the correlation among the outcomes is informed by a network available pre-intervention. Working within the potential outcome causal framework, we develop a class of models that posit such a correlation structure among the outcomes. We use these models to develop restricted randomization strategies for allocating treatment optimally, by minimizing the mean squared error of the estimated average treatment effect. Analytical decompositions of the mean squared error, due both to the model and to the randomization distribution, provide insights into aspects of the optimal designs. In particular, the analysis suggests new notions of balance based on specific network quantities, in addition to classical covariate balance. The resulting balanced optimal restricted randomization strategies are still design-unbiased when the model used to derive them does not hold. We illustrate how the proposed treatment allocation strategies improve on allocations that ignore the network structure.
Article
Full-text available
This article reviews and comments on three major expansions of propensity score methods in recent decades. First, it shows how to use generalized propensity scores to tackle multi-categorical or continuous treatment variables through propensity score regression adjustment and propensity score weighting. Second, it reviews the counterfactual framework of causal inference in the analysis of mediation mechanisms and illustrates the decomposition of the causal relationship between variables into causal direct effects and causal indirect effects. Third, it discusses the heterogeneous treatment effect across the distribution of propensity score values in the framework of the stratification-multilevel model. For each methodological breakthrough, this article comments on potential issues that deserve serious attention in the practical application of these methods.
Article
Full-text available
Unmeasured confounding may undermine the validity of causal inference with observational studies. Sensitivity analysis provides an attractive way to partially circumvent this issue by assessing the potential influence of unmeasured confounding on the causal conclusions. However, previous sensitivity analysis approaches often make strong and untestable assumptions such as having a confounder that is binary, or having no interaction between the effects of the exposure and the confounder on the outcome, or having only one confounder. Without imposing any assumptions on the confounder or confounders, we derive a bounding factor and a sharp inequality such that the sensitivity analysis parameters must satisfy the inequality if an unmeasured confounder is to explain away the observed effect estimate or reduce it to a particular level. Our approach is easy to implement and involves only two sensitivity parameters. Surprisingly, our bounding factor, which makes no simplifying assumptions, is no more conservative than a number of previous sensitivity analysis techniques that do make assumptions. Our new bounding factor implies not only the traditional Cornfield conditions that both the relative risk of the exposure on the confounder and that of the confounder on the outcome must satisfy, but also a high threshold that the maximum of these relative risks must satisfy. Furthermore, this new bounding factor can be viewed as a measure of the strength of confounding between the exposure and the outcome induced by a confounder.
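The bounding factor described here lends itself to direct computation. A minimal sketch, assuming the two sensitivity parameters are expressed as relative risks (RR_EU of exposure on the confounder, RR_UD of the confounder on the outcome):

```python
# Bounding-factor arithmetic in the spirit of the article above.
import math

def bounding_factor(rr_eu: float, rr_ud: float) -> float:
    # The sharp bound: an unmeasured confounder with these two relative
    # risks can shift an observed risk ratio by at most this factor.
    return rr_eu * rr_ud / (rr_eu + rr_ud - 1.0)

def explain_away_threshold(rr_obs: float) -> float:
    # Smallest common value RR_EU = RR_UD that could fully explain away
    # an observed risk ratio rr_obs > 1.
    return rr_obs + math.sqrt(rr_obs * (rr_obs - 1.0))

# Example: an observed RR of 2.0 survives a confounder with
# RR_EU = RR_UD = 2 (bounding factor 4/3 < 2); explaining it away
# requires both relative risks to reach about 3.41.
print(bounding_factor(2.0, 2.0))        # 1.333...
print(explain_away_threshold(2.0))      # 3.414...
```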
Article
Full-text available
We consider the problem of how to assign treatment in a randomized experiment, when the correlation among the outcomes is informed by a network available pre-intervention. Working within the potential outcome causal framework, we develop a class of models that posit such a correlation structure among the outcomes, and a strategy for allocating treatment optimally, for the goal of minimizing the integrated mean squared error of the estimated average treatment effect. We provide insights into features of the optimal designs via an analytical decomposition of the mean squared error used for optimization. We illustrate how the proposed treatment allocation strategy improves on allocations that ignore the network structure, with extensive simulations.
Chapter
Full-text available
This chapter discusses the use of directed acyclic graphs (DAGs) for causal inference in the observational social sciences. It focuses on DAGs’ main uses, discusses central principles, and gives applied examples. DAGs are visual representations of qualitative causal assumptions: They encode researchers’ beliefs about how the world works. Straightforward rules map these causal assumptions onto the associations and independencies in observable data. The two primary uses of DAGs are (1) determining the identifiability of causal effects from observed data and (2) deriving the testable implications of a causal model. Concepts covered in this chapter include identification, d-separation, confounding, endogenous selection, and overcontrol. Illustrative applications then demonstrate that conditioning on variables at any stage in a causal process can induce as well as remove bias, that confounding is a fundamentally causal rather than an associational concept, that conventional approaches to causal mediation analysis are often biased, and that causal inference in social networks inherently faces endogenous selection bias. The chapter discusses several graphical criteria for the identification of causal effects of single, time-point treatments (including the famous backdoor criterion), as well as identification criteria for multiple, time-varying treatments.
Article
Full-text available
We study the calculation of exact p-values for a large class of non-sharp null hypotheses about treatment effects in a setting with data from experiments involving members of a single connected network. The class includes null hypotheses that limit the effect of one unit's treatment status on another according to the distance between units; for example, the hypothesis might specify that the treatment status of immediate neighbors has no effect, or that units more than two edges away have no effect. We also consider hypotheses concerning the validity of sparsification of a network (for example based on the strength of ties) and hypotheses restricting heterogeneity in peer effects (so that, for example, only the number or fraction treated among neighboring units matters). Our general approach is to define an artificial experiment, such that the null hypothesis that was not sharp for the original experiment is sharp for the artificial experiment, and such that the randomization analysis for the artificial experiment is validated by the design of the original experiment.
Article
Full-text available
Estimating the effects of interventions in networks is complicated when the units are interacting, such that the outcomes for one unit may depend on the treatment assignment and behavior of many or all other units (i.e., there is interference). When most or all units are in a single connected component, it is impossible to directly experimentally compare outcomes under two or more global treatment assignments since the network can only be observed under a single assignment. Familiar formalism, experimental designs, and analysis methods assume the absence of these interactions, and result in biased estimators of causal effects of interest. While some assumptions can lead to unbiased estimators, these assumptions are generally unrealistic, and we focus this work on realistic assumptions. Thus, in this work, we evaluate methods for designing and analyzing randomized experiments that aim to reduce this bias and thereby reduce overall error. In design, we consider the ability to perform random assignment to treatments that is correlated in the network, such as through graph cluster randomization. In analysis, we consider incorporating information about the treatment assignment of network neighbors. We prove sufficient conditions for bias reduction through both design and analysis in the presence of potentially global interference. Through simulations of the entire process of experimentation in networks, we measure the performance of these methods under varied network structure and varied social behaviors, finding substantial bias and error reductions. These improvements are largest for networks with more clustering and data generating processes with both stronger direct effects of the treatment and stronger interactions between units.
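A minimal sketch of graph cluster randomization, one of the design strategies evaluated above, on a synthetic graph. The graph model, clustering method, and sizes are arbitrary choices for illustration:

```python
# Graph cluster randomization: partition the graph, then assign treatment
# at the cluster level so that network neighbors tend to share an arm.
import random
import networkx as nx

random.seed(0)
G = nx.connected_watts_strogatz_graph(n=200, k=6, p=0.1, seed=0)

# Partition nodes into clusters (here: greedy modularity communities).
clusters = nx.algorithms.community.greedy_modularity_communities(G)

# Randomize whole clusters, not individuals.
assignment = {}
for cluster in clusters:
    z = random.randint(0, 1)
    for node in cluster:
        assignment[node] = z

# Fraction of edges whose endpoints share a treatment arm: higher than
# under i.i.d. node-level assignment, which is the point of the design.
same = sum(assignment[u] == assignment[v] for u, v in G.edges()) / G.number_of_edges()
print(f"within-arm edge fraction: {same:.2f}")
```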
Article
Full-text available
The propensity score is the conditional probability of assignment to a particular treatment given a vector of observed covariates. Both large and small sample theory show that adjustment for the scalar propensity score is sufficient to remove bias due to all observed covariates. Applications include: (i) matched sampling on the univariate propensity score, which is a generalization of discriminant matching, (ii) multivariate adjustment by subclassification on the propensity score where the same subclasses are used to estimate treatment effects for all outcome variables and in all subpopulations, and (iii) visual representation of multivariate covariance adjustment by a two-dimensional plot.
Article
Full-text available
A procedure is described for finding sets of key players in a social network. A key assumption is that the optimal selection of key players depends on what they are needed for. Accordingly, two generic goals are articulated, called KPP-POS and KPP-NEG. KPP-POS is defined as the identification of key players for the purpose of optimally diffusing something through the network by using the key players as seeds. KPP-NEG is defined as the identification of key players for the purpose of disrupting or fragmenting the network by removing the key nodes. It is found that off-the-shelf centrality measures are not optimal for solving either generic problem, and therefore new measures are presented.
Article
Full-text available
An experimental unit is an opportunity to randomly apply or withhold a treatment. There is interference between units if the application of the treatment to one unit may also affect other units. In cognitive neuroscience, a common form of experiment presents a sequence of stimuli or requests for cognitive activity at random to each experimental subject and measures biological aspects of brain activity that follow these requests. Each subject is then many experimental units, and interference between units within an experimental subject is likely, in part because the stimuli follow one another quickly and in part because human subjects learn or become experienced or primed or bored as the experiment proceeds. We use a recent fMRI experiment concerned with the inhibition of motor activity to illustrate and further develop recently proposed methodology for inference in the presence of interference. A simulation evaluates the power of competing procedures.
Article
Full-text available
Interference is said to be present when the exposure or treatment received by one individual may affect the outcomes of other individuals. Such interference can arise in settings in which the outcomes of the various individuals come about through social interactions. When interference is present, causal inference is rendered considerably more complex, and the literature on causal inference in the presence of interference has just recently begun to develop. In this article we summarise some of the concepts and results from the existing literature and extend that literature in considering new results for finite sample inference, new inverse probability weighting estimators in the presence of interference and new causal estimands of interest.
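To make the weighting idea concrete, here is a toy sketch in the spirit of (not identical to) the estimators summarized above. It assumes independent groups, a known Bernoulli(p) assignment, and a binary "any treated group-mate" exposure; the outcome model is invented:

```python
# Toy inverse probability weighting under partial interference.
import numpy as np

rng = np.random.default_rng(1)
p = 0.5
n_groups, group_size = 500, 4

Z = rng.binomial(1, p, size=(n_groups, group_size))
any_treated_peer = (Z.sum(axis=1, keepdims=True) - Z) > 0
# Toy outcome: direct effect 1.0, spillover 0.5, plus noise.
Y = 1.0 * Z + 0.5 * any_treated_peer + rng.normal(0, 1, Z.shape)

def ipw_mean(z: int, g: bool) -> float:
    """Horvitz-Thompson estimate of E[Y(z, g)], g = 'any treated peer'."""
    mask = (Z == z) & (any_treated_peer == g)
    # Exposure probability is known from the Bernoulli(p) design, and own
    # and peer assignments are independent under that design.
    pr_peer = 1 - (1 - p) ** (group_size - 1)
    pr = (p if z else 1 - p) * (pr_peer if g else 1 - pr_peer)
    return (Y * mask).sum() / (mask.size * pr)

print("direct effect:   ", ipw_mean(1, False) - ipw_mean(0, False))  # ~1.0
print("spillover effect:", ipw_mean(0, True) - ipw_mean(0, False))   # ~0.5
```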
Article
Full-text available
Modern social network analysis (the analysis of relational data arising from social systems) is a computationally intensive area of research. Here, we provide an overview of a software package which provides support for a range of network analytic functionality within the R statistical computing environment. General categories of currently supported functionality are described, and brief examples of package syntax and usage are shown.
Book
In this second edition of Counterfactuals and Causal Inference, completely revised and expanded, the essential features of the counterfactual approach to observational data analysis are presented with examples from the social, demographic, and health sciences. Alternative estimation techniques are first introduced using both the potential outcome model and causal graphs; after which, conditioning techniques, such as matching and regression, are presented from a potential outcomes perspective. For research scenarios in which important determinants of causal exposure are unobserved, alternative techniques, such as instrumental variable estimators, longitudinal methods, and estimation via causal mechanisms, are then presented. The importance of causal effect heterogeneity is stressed throughout the book, and the need for deep causal explanation via mechanisms is discussed.
Article
This paper presents a randomization-based framework for estimating causal effects under interference between units motivated by challenges that arise in analyzing experiments on social networks. The framework integrates three components: (i) an experimental design that defines the probability distribution of treatment assignments, (ii) a mapping that relates experimental treatment assignments to exposures received by units in the experiment, and (iii) estimands that make use of the experiment to answer questions of substantive interest. We develop the case of estimating average unit-level causal effects from a randomized experiment with interference of arbitrary but known form. The resulting estimators are based on inverse probability weighting. We provide randomization-based variance estimators that account for the complex clustering that can occur when interference is present. We also establish consistency and asymptotic normality under local dependence assumptions. We discuss refinements including covariate-adjusted effect estimators and ratio estimation. We evaluate empirical performance in realistic settings with a naturalistic simulation using social network data from American schools. We then present results from a field experiment on the spread of anti-conflict norms and behavior among school students.
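A toy sketch of the same general recipe (an exposure mapping plus Horvitz-Thompson weighting), with exposure probabilities approximated by Monte Carlo over re-draws of the known design. The graph, exposure definition, and outcome model are invented for illustration:

```python
# Exposure mapping + Horvitz-Thompson weighting on a network.
import numpy as np
import networkx as nx

rng = np.random.default_rng(2)
G = nx.erdos_renyi_graph(300, 0.02, seed=2)
A = nx.to_numpy_array(G)
n, m = A.shape[0], 150  # complete randomization: m treated of n

def exposures(z):
    # Exposure: (own treatment, at least one treated neighbor).
    return np.stack([z, (A @ z > 0).astype(int)], axis=1)

def draw():
    z = np.zeros(n, int)
    z[rng.choice(n, m, replace=False)] = 1
    return z

# Approximate each unit's exposure probabilities under the design.
sims = np.stack([exposures(draw()) for _ in range(2000)])
z = draw()
e = exposures(z)
y = 1.0 * z + 0.5 * e[:, 1] + rng.normal(0, 1, n)  # toy outcomes

def ht_mean(own, nbr):
    hit = (e[:, 0] == own) & (e[:, 1] == nbr)
    pi = ((sims[:, :, 0] == own) & (sims[:, :, 1] == nbr)).mean(axis=0)
    return (y[hit] / pi[hit]).sum() / n

print("spillover:", ht_mean(0, 1) - ht_mean(0, 0))  # ~0.5, noisily
```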
Article
Interest in social network analysis has exploded in the past few years, partly thanks to the advancements in statistical methods and computing for network analysis. A wide range of methods for network analysis is already covered by existing R packages. However, no comprehensive packages are available to calculate group centrality scores and to identify key players (i.e., those players who constitute the most central group) in a network. These functionalities are important because, for example, many social and health interventions rely on key players to facilitate the intervention. Identifying key players is challenging because players who are individually the most central are not necessarily the most central as a group due to redundancy in their connections. In this paper we develop methods and tools for computing group centrality scores and for identifying key players in social networks. We illustrate the methods using both simulated and empirical examples. The package keyplayer providing the presented methods is available from the Comprehensive R Archive Network (CRAN).
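The redundancy problem described here is why group centrality is usually optimized greedily rather than by ranking individual scores. A minimal sketch of that logic, using simple neighborhood reach as the group score (the keyplayer package itself implements richer measures):

```python
# Greedy key-player selection by marginal gain in group reach.
import networkx as nx

G = nx.karate_club_graph()

def group_reach(G, S):
    # Nodes in S or adjacent to S: a simple group "degree" centrality.
    covered = set(S)
    for s in S:
        covered.update(G.neighbors(s))
    return len(covered)

def greedy_key_players(G, k):
    S = set()
    for _ in range(k):
        best = max((v for v in G if v not in S),
                   key=lambda v: group_reach(G, S | {v}))
        S.add(best)
    return S

print("key players:", greedy_key_players(G, 3))
```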
Article
We consider inference about the causal effect of a treatment or exposure in the presence of interference, i.e., when one individual’s treatment affects the outcome of another individual. In the observational setting where the treatment assignment mechanism is not known, inverse probability-weighted estimators have been proposed when individuals can be partitioned into groups such that there is no interference between individuals in different groups. Unfortunately this assumption, which is sometimes referred to as partial interference, may not hold, and moreover existing weighted estimators may have large variances. In this paper we consider weighted estimators that could be employed when interference is present. We first propose a generalized inverse probability-weighted estimator and two Hájek-type stabilized weighted estimators that allow any form of interference. We derive their asymptotic distributions and propose consistent variance estimators assuming partial interference. Empirical results show that one of the Hájek estimators can have substantially smaller finite-sample variance than the other estimators. The different estimators are illustrated using data on the effects of rotavirus vaccination in Nicaragua.
Article
We consider policy evaluations when the Stable Unit Treatment Value Assumption (SUTVA) is violated due to the presence of interference among units. We propose to explicitly model interference as a function of units’ characteristics. Our approach is applied to the evaluation of a policy implemented in Tuscany (a region in Italy) on small handicraft firms. Results show that the benefits from the policy are reduced when treated firms are subject to high levels of interference. Moreover, the average causal effect is slightly underestimated when interference is ignored. We stress the importance of considering possible interference among units when evaluating and planning policy interventions.
Article
The propensity score is the conditional probability of assignment to a particular treatment given a vector of observed covariates. Previous theoretical arguments have shown that subclassification on the propensity score will balance all observed covariates. Subclassification on an estimated propensity score is illustrated, using observational data on treatments for coronary artery disease. Five subclasses defined by the estimated propensity score are constructed that balance 74 covariates, and thereby provide estimates of treatment effects using direct adjustment. These subclasses are applied within sub-populations, and model-based adjustments are then used to provide estimates of treatment effects within these sub-populations. Two appendixes address theoretical issues related to the application: the effectiveness of subclassification on the propensity score in removing bias, and balancing properties of propensity scores with incomplete data.
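A toy sketch of subclassification on an estimated propensity score with five strata, using simulated data and a logistic-regression score (the clinical application above is of course far richer):

```python
# Propensity score subclassification on simulated confounded data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n = 5000
x = rng.normal(size=(n, 3))
pscore_true = 1 / (1 + np.exp(-(x @ [0.8, -0.5, 0.3])))
t = rng.binomial(1, pscore_true)
y = 2.0 * t + x @ [1.0, 1.0, -1.0] + rng.normal(size=n)  # true effect = 2

ps = LogisticRegression().fit(x, t).predict_proba(x)[:, 1]
strata = np.digitize(ps, np.quantile(ps, [0.2, 0.4, 0.6, 0.8]))

# Within-stratum treated-control contrasts, averaged with stratum weights.
effects, weights = [], []
for s in range(5):
    idx = strata == s
    effects.append(y[idx & (t == 1)].mean() - y[idx & (t == 0)].mean())
    weights.append(idx.mean())
print("stratified estimate:", np.dot(effects, weights))      # ~2
print("naive estimate:", y[t == 1].mean() - y[t == 0].mean())  # biased
```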
Chapter
This chapter reviews theoretical developments and empirical studies related to causal inference on social networks from both experimental and observational studies. Discussion is given to the effect of experimental interventions on outcomes and behaviors and how these effects relate to the presence of social ties, the position of individuals within the network, and the underlying structure and properties of the network. The effects of such experimental interventions on changing the network structure itself and potential feedback between behaviors and network changes are also discussed. With observational data, correlations in behavior or outcomes between individuals with network ties may be due to social influence, homophily, or environmental confounding. With cross-sectional data these three sources of correlation cannot be distinguished. Methods employing longitudinal observational data that can help distinguish between social influence, homophily, and environmental confounding are described, along with their limitations. Proposals are made regarding future research directions and methodological developments that would help put causal inference on social networks on a firmer theoretical footing.
Article
William G. Cochran first presented “observational studies” as a topic defined by principles and methods of statistics. Cochran had been an author of the 1964 United States Surgeon General’s Advisory Committee Report, Smoking and Health, which reviewed a vast literature and concluded: “Cigarette smoking is causally related to lung cancer in men; the magnitude of the effect of cigarette smoking far outweighs all other factors. The data for women, though less extensive, point in the same direction (p. 37).” Though there had been some experiments confined to laboratory animals, the direct evidence linking smoking with human health came from observational or nonexperimental studies.
Book
Did mandatory busing programs in the 1970s increase the school achievement of disadvantaged minority youth? Does obtaining a college degree increase an individual's labor market earnings? Did the use of the butterfly ballot in some Florida counties in the 2000 presidential election cost Al Gore votes? If so, was the number of miscast votes sufficiently large to have altered the election outcome? At their core, these types of questions are simple cause-and-effect questions. Simple cause-and-effect questions are the motivation for much empirical work in the social sciences. This book presents a model and set of methods for causal effect estimation that social scientists can use to address causal questions such as these. The essential features of the counterfactual model of causality for observational data analysis are presented with examples from sociology, political science, and economics.
Article
For two-stage randomized experiments assuming partial interference, exact confidence intervals are proposed for treatment effects on a binary outcome. Empirical studies demonstrate the new intervals have narrower width than previously proposed exact intervals based on the Hoeffding inequality.
Book
An observational study is a nonexperimental investigation of the effects caused by a treatment. Unlike an experiment, in an observational study, the investigator does not control the assignment of treatments, with the consequence that the individuals in different treatment groups may not have been comparable prior to treatment. Analytical adjustments, such as matching, are used to remove overt bias, that is, pretreatment differences that are accurately measured and recorded. There may be pretreatment differences that were not recorded, called hidden biases, and addressing these is a central concern.
Chapter
An observational study is an empirical but nonexperimental investigation of the effects caused by a treatment. In an experiment, such as a clinical trial, the investigator assigns subjects at random to treatment groups ensuring that comparable subjects receive competing treatments. In an observational study, the investigator does not control the assignment of treatments, with the consequence that the individuals in different treatment groups may not have been comparable prior to treatment. Analytical adjustments, such as matching, are used to remove overt bias, that is, pretreatment differences that are accurately measured and recorded. In addition, there may be pretreatment differences that were not recorded, called hidden bias. An observational study is designed to permit detection of the most plausible hidden biases. In addition, the analysis of an observational study investigates the sensitivity of conclusions to hidden biases of plausible magnitudes. Strategies for making studies less sensitive to unmeasured biases are discussed.
Article
In this article, we discuss causal inference when there are multiple versions of treatment. The potential outcomes framework, as articulated by Rubin, makes an assumption of no multiple versions of treatment, and here we discuss an extension of this potential outcomes framework to accommodate causal inference under violations of this assumption. A variety of examples are discussed in which the assumption may be violated. Identification results are provided for the overall treatment effect and the effect of treatment on the treated when multiple versions of treatment are present and also for the causal effect comparing a version of one treatment to some other version of the same or a different treatment. Further identification and interpretative results are given for cases in which the version precedes the treatment as when an underlying treatment variable is coarsened or dichotomized to create a new treatment variable for which there are effectively “multiple versions”. Results are also given for effects defined by setting the version of treatment to a prespecified distribution. Some of the identification results bear resemblance to identification results in the literature on direct and indirect effects. We describe some settings in which ignoring multiple versions of treatment, even when present, will not lead to incorrect inferences.
Article
Estimating peer effects with observational data is very difficult because of contextual confounding, peer selection, simultaneity bias, and measurement error. In this paper, I show that instrumental variables (IVs) can help to address these problems in order to provide causal estimates of peer effects. Based on data collected from over 4,000 students in six middle schools in China, I use the IV methods to estimate peer effects on smoking. My design-based IV approach differs from previous ones in that it helps to construct potentially strong IVs and to directly test possible violation of exogeneity of the IVs. I show that measurement error in smoking can lead to both underestimation and imprecise estimation of peer effects. Based on a refined measure of smoking, I find consistent evidence for peer effects on smoking. If a student’s best friend smoked within the past 30 days, the student was about one fifth (as indicated by the OLS estimate) or 40 percentage points (as indicated by the IV estimate) more likely to smoke in the same time period. The findings are robust to a variety of robustness checks. I also show that sharing cigarettes may be a mechanism for peer effects on smoking. A 10% increase in the number of cigarettes smoked by a student’s best friend is associated with about a 4% increase in the number of cigarettes smoked by the student in the same time period.
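The IV logic generalizes beyond this study. Below is a generic two-stage least squares sketch on simulated data; the instrument, coefficients, and data are invented and this is not the paper's design-based IV construction:

```python
# Generic 2SLS on simulated data with an unobserved confounder.
import numpy as np

rng = np.random.default_rng(4)
n = 10000
u = rng.normal(size=n)                         # unobserved confounder
iv = rng.binomial(1, 0.5, size=n)              # instrument
peer = 0.8 * iv + 0.6 * u + rng.normal(size=n) # endogenous regressor
y = 0.4 * peer + 1.0 * u + rng.normal(size=n)  # true peer effect = 0.4

# Stage 1: regress the endogenous regressor on the instrument.
X1 = np.column_stack([np.ones(n), iv])
peer_hat = X1 @ np.linalg.lstsq(X1, peer, rcond=None)[0]

# Stage 2: regress the outcome on the fitted values.
X2 = np.column_stack([np.ones(n), peer_hat])
print("2SLS peer effect:", np.linalg.lstsq(X2, y, rcond=None)[0][1])  # ~0.4

# OLS for comparison: biased upward by the confounder.
X3 = np.column_stack([np.ones(n), peer])
print("OLS peer effect:", np.linalg.lstsq(X3, y, rcond=None)[0][1])   # ~0.8
```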
Article
Clustered treatment assignment occurs when individuals are grouped into clusters prior to treatment and whole clusters, not individuals, are assigned to treatment or control. In randomized trials, clustered assignments may be required because the treatment must be applied to all children in a classroom, or to all patients at a clinic, or to all radio listeners in the same media market. The most common cluster randomized design pairs 2S clusters into S pairs based on similar pretreatment covariates, then picks one cluster in each pair at random for treatment, the other cluster being assigned to control. Typically, group randomization increases sampling variability and so is less efficient, less powerful, than randomization at the individual level, but it may be unavoidable when it is impractical to treat just a few people within each cluster. Related issues arise in nonrandomized, observational studies of treatment effects, but in this case one must examine the sensitivity of conclusions to bias from nonrandom selection of clusters for treatment. Although clustered assignment increases sampling variability in observational studies, as it does in randomized experiments, it also tends to decrease sensitivity to unmeasured biases, and as the number of cluster pairs increases the latter effect overtakes the former, dominating it when allowance is made for nontrivial biases in treatment assignment. Intuitively, a given magnitude of departure from random assignment can do more harm if it acts on individual students than if it is restricted to act on whole classes, because the bias is unable to pick the strongest individual students for treatment, and this is especially true if a serious effort is made to pair clusters that appeared similar prior to treatment. We examine this issue using an asymptotic measure, the design sensitivity, some inequalities that exploit convexity, simulation, and an application concerned with the flooding of villages in Bangladesh.
Article
Recently, increasing attention has focused on making causal inference when interference is possible. In the presence of interference, treatment may have several types of effects. In this paper, we consider inference about such effects when the population consists of groups of individuals where interference is possible within groups but not between groups. A two stage randomization design is assumed where in the first stage groups are randomized to different treatment allocation strategies and in the second stage individuals are randomized to treatment or control conditional on the strategy assigned to their group in the first stage. For this design, the asymptotic distributions of estimators of the causal effects are derived when either the number of individuals per group or the number of groups grows large. Under certain homogeneity assumptions, the asymptotic distributions provide justification for Wald-type confidence intervals (CIs) and tests. Empirical results demonstrate the Wald CIs have good coverage in finite samples and are narrower than CIs based on either the Chebyshev or Hoeffding inequalities provided the number of groups is not too small. The methods are illustrated by two examples which consider the effects of cholera vaccination and an intervention to encourage voting.
Article
If an experimental treatment is experienced by both treated and control group units, tests of hypotheses about causal effects may be difficult to conceptualize, let alone execute. In this article, we show how counterfactual causal models may be written and tested when theories suggest spillover or other network-based interference among experimental units. We show that the "no interference" assumption need not constrain scholars who have interesting questions about interference. We offer researchers the ability to model theories about how treatment given to some units may come to influence outcomes for other units. We further show how to test hypotheses about these causal effects, and we provide tools to enable researchers to assess the operating characteristics of their tests given their own models, designs, test statistics, and data. The conceptual and methodological framework we develop here is particularly applicable to social networks, but may be usefully deployed whenever a researcher wonders about interference between units. Interference between units need not be an untestable assumption; instead, interference is an opportunity to ask meaningful questions about theoretically interesting phenomena.
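The framework builds on standard randomization-test machinery. The sketch below shows only the basic version, testing the sharp null of no effect on anyone; the article's contribution is to extend this to interference nulls by adjusting outcomes under a hypothesized causal model before re-randomizing. Data and design here are invented:

```python
# Bare-bones randomization test of the sharp null of no effect.
import numpy as np

rng = np.random.default_rng(5)
n, m = 100, 50
z_obs = np.zeros(n, int)
z_obs[rng.choice(n, m, replace=False)] = 1
y_obs = 0.5 * z_obs + rng.normal(size=n)   # toy data with a real effect

def statistic(z, y):
    return y[z == 1].mean() - y[z == 0].mean()

# Under the sharp null, outcomes are fixed regardless of assignment,
# so we can re-randomize and recompute the statistic directly.
t_obs = statistic(z_obs, y_obs)
t_null = []
for _ in range(5000):
    z = np.zeros(n, int)
    z[rng.choice(n, m, replace=False)] = 1
    t_null.append(statistic(z, y_obs))
p = (np.abs(np.array(t_null)) >= abs(t_obs)).mean()
print(f"t = {t_obs:.3f}, p = {p:.4f}")
```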
Article
Peer influence and social interactions can give rise to spillover effects in which the exposure of one individual may affect outcomes of other individuals. Even if the intervention under study occurs at the group or cluster level as in group-randomized trials, spillover effects can occur when the mediator of interest is measured at a lower level than the treatment. Evaluators who choose groups rather than individuals as experimental units in a randomized trial often anticipate that the desirable changes in targeted social behaviors will be reinforced through interference among individuals in a group exposed to the same treatment. In an empirical evaluation of the effect of a school-wide intervention on reducing individual students' depressive symptoms, schools in matched pairs were randomly assigned to the 4Rs intervention or the control condition. Class quality was hypothesized as an important mediator assessed at the classroom level. We reason that the quality of one classroom may affect outcomes of children in another classroom because children interact not simply with their classmates but also with those from other classes in the hallways or on the playground. In investigating the role of class quality as a mediator, failure to account for such spillover effects of one classroom on the outcomes of children in other classrooms can potentially result in bias and problems with interpretation. Using a counterfactual conceptualization of direct, indirect and spillover effects, we provide a framework that can accommodate issues of mediation and spillover effects in group randomized trials. We show that the total effect can be decomposed into a natural direct effect, a within-classroom mediated effect and a spillover mediated effect. We give identification conditions for each of the causal effects of interest and provide results on the consequences of ignoring "interference" or "spillover effects" when they are in fact present. Our modeling approach disentangles these effects. The analysis examines whether the 4Rs intervention has an effect on children's depressive symptoms through changing the quality of other classes as well as through changing the quality of a child's own class.
Article
The most important political processes in Mexican politics, including presidential succession since the 1920s, have been conducted within a network whose political rationale has been stability; every presidential election has been won by a single political party. We analyzed the role of this network in presidential successions, measuring significant relationships with the UCINET IV system and contrasting the computed distributions with historical facts. Applying a structural blockmodel algorithm, we found two well-differentiated sub-networks, one representing a military-based group and the other a financial-based group. Measuring centrality is one of the main objectives of network analysis for understanding concentrations of power and the distribution of influence in a political system. In this article we evaluate the maximum node and clique centrality index concentrations for the core of the Mexican network of power. Centrality and power indexes for the network are presented, and the results are discussed in connection with cohesiveness.
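For readers unfamiliar with these indexes, the sketch below computes standard centrality scores on a toy graph with networkx; the nodes and ties are invented for illustration and bear no relation to the article's UCINET IV analysis.

```python
import networkx as nx

# Toy elite network; edges stand in for documented political ties.
edges = [("president", "finance_bloc"), ("president", "military_bloc"),
         ("finance_bloc", "banker_a"), ("finance_bloc", "banker_b"),
         ("military_bloc", "general_a"), ("military_bloc", "general_b"),
         ("banker_a", "general_a")]
G = nx.Graph(edges)

for name, scores in [("degree", nx.degree_centrality(G)),
                     ("betweenness", nx.betweenness_centrality(G)),
                     ("eigenvector", nx.eigenvector_centrality(G, max_iter=500))]:
    top = max(scores, key=scores.get)
    print(f"{name:>11} centrality peaks at {top} ({scores[top]:.2f})")
```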
Article
This paper is about the logic of interpreting recursive causal theories in sociology. We review the distinction between associations and effects and discuss the decomposition of effects into direct and indirect components. We then describe a general method for decomposing effects into their components by the systematic application of ordinary least squares regression. The method involves successive computation of reduced-form equations, beginning with an equation containing only exogenous variables, then computing equations which add intervening variables in sequence from cause to effect. This generates all the information required to decompose effects into their various direct and indirect parts. This method is a substitute for the often more cumbersome computation of indirect effects from the structural coefficients (direct effects) of the causal model. Finally, we present a way of summarizing this information in tabular form and illustrate the procedures using an empirical example.
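The procedure lends itself to a compact illustration. The Python sketch below uses simulated data with assumed path coefficients: it computes the reduced-form equation containing only the exogenous variable, then adds the intervening variable; the change in the coefficient is the indirect effect.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 5000
x = rng.normal(size=n)                      # exogenous variable
m = 0.6 * x + rng.normal(size=n)            # intervening variable
y = 0.3 * x + 0.5 * m + rng.normal(size=n)  # outcome

def coef_on_x(*regressors):
    X = sm.add_constant(np.column_stack(regressors))
    return sm.OLS(y, X).fit().params[1]     # coefficient on x

total = coef_on_x(x)       # reduced form with the exogenous variable only
direct = coef_on_x(x, m)   # add the intervening variable
print(f"total {total:.2f} = direct {direct:.2f} "
      f"+ indirect {total - direct:.2f}")
# With these assumed paths: total ~ 0.3 + 0.6*0.5 = 0.60, direct ~ 0.30.
```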
Article
This article considers the policy of retaining low-achieving children in kindergarten rather than promoting them to first grade. Under the stable unit treatment value assumption (SUTVA) as articulated by Rubin, each child at risk of retention has two potential outcomes: Y(1) if retained and Y(0) if promoted. But SUTVA is questionable, because a child's potential outcomes will plausibly depend on which school that child attends and also on treatment assignments of other children. We develop a causal model that allows school assignment and peer treatments to affect potential outcomes. We impose an identifying assumption that peer effects can be summarized through a scalar function of the vector of treatment assignments in a school. Using a large, nationally representative sample, we then estimate (1) the effect of being retained in kindergarten rather than being promoted to the first grade in schools having a low retention rate, (2) the retention effect in schools having a high retention rate, and (3) the e...
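A minimal sketch of the identifying assumption, on invented data: summarize peers' treatments by a scalar (here, the school's retention rate) and let outcomes depend on own treatment plus that scalar. The effect sizes and data-generating process are hypothetical, not estimates from the study.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n_schools, n_kids = 200, 50
rows = []
for s in range(n_schools):
    p_retain = rng.uniform(0.02, 0.25)     # school retention propensity
    z = rng.binomial(1, p_retain, n_kids)  # 1 = retained, 0 = promoted
    # Hypothetical scores: own-retention effect -1.0 plus a peer effect -2.0
    # operating only through the scalar school retention rate.
    y = -1.0 * z - 2.0 * z.mean() + rng.normal(0, 1, n_kids)
    rows += [(z[i], z.mean(), y[i]) for i in range(n_kids)]

z_own, school_rate, y = (np.array(v) for v in zip(*rows))
X = sm.add_constant(np.column_stack([z_own, school_rate]))
print(sm.OLS(y, X).fit().params)  # approx [0, -1.0, -2.0]
```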
Article
In the Prospect Study, in 10 pairs of two primary-care practices, one practice was picked at random to receive a “depression care manager” to treat its depressed patients. Randomization inference, properly performed, reflects the assignment of practices, not patients, to treatment or control. Yet, pertinent data describe individual patients: depression outcomes, baseline covariates, compliance with treatment. The methods discussed use only (i) the random assignment of clusters to treatment or control and (ii) the hypothesis about effects being tested or inverted for confidence intervals, so they are randomization inferences in Fisher’s strict sense. There is no assumption that the covariance model generated the data, that compliers resemble noncompliers, that dependence is from additive random cluster effects, that individuals in the same cluster do not interfere with one another, or that units are sampled from a population. We contrast methods of covariance adjustment, never assuming the models are “true,” while obtaining exact randomization inferences. We consider exact inference about effects proportional to doses with noncompliance and effects whose magnitude varies with the degree of improvement that would occur without treatment. A simulation examines power.
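The pair-flip logic of this kind of randomization inference can be carried out exactly, since a 10-pair design admits full enumeration of the 2^10 possible assignments. The data below are simulated rather than the Prospect Study's, and the statistic is a plain mean of pair differences rather than the covariance-adjusted statistics the article develops.

```python
from itertools import product
import numpy as np

rng = np.random.default_rng(4)
n_pairs = 10
# Pair-level contrasts (treated-practice mean outcome minus control-practice
# mean outcome); hypothetical values.
pair_diffs = rng.normal(0.8, 1.0, n_pairs)

stat = pair_diffs.mean()
# Under the sharp null of no effect, either practice in a pair was equally
# likely to be the "treated" one, so each contrast's sign flips at random;
# with 10 pairs, all 2^10 = 1024 assignments can be enumerated exactly.
signs = np.array(list(product([-1, 1], repeat=n_pairs)))
null_stats = (signs * pair_diffs).mean(axis=1)
p = np.mean(np.abs(null_stats) >= abs(stat))
print(f"mean pair difference {stat:.2f}, exact two-sided p = {p:.3f}")
```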
Article
During the past 20 years, social scientists using observational studies have generated a large and inconclusive literature on neighborhood effects. Recent workers have argued that estimates of neighborhood effects based on randomized studies of housing mobility, such as the "Moving to Opportunity" (MTO) demonstration, are more credible. These estimates are based on the implicit assumption of no interference between units; that is, a subject's value on the response depends only on the treatment to which that subject is assigned, not on the treatment assignments of other subjects. For the MTO studies, this assumption is not reasonable. Although little work has been done on the definition and estimation of treatment effects when interference is present, interference is common in studies of neighborhood effects and in many other social settings (e.g., schools and networks), and when data from such studies are analyzed under the "no-interference assumption," very misleading inferences can result. Furthermore, the consequences of interference (e.g., spillovers) should often be of great substantive interest, even though little attention has been paid to this. Using the MTO demonstration as a concrete context, this article develops a framework for causal inference when interference is present and defines a number of causal estimands of interest. The properties of the usual estimators of treatment effects, which are unbiased and/or consistent in randomized studies without interference, are also characterized. When interference is present, the difference between a treatment group mean and a control group mean (unadjusted or adjusted for covariates) estimates not an average treatment effect, but rather the difference between two effects defined on two distinct subpopulations. This result is of great importance, for a researcher who fails to recognize this could easily infer that a treatment is beneficial when in fact it is universally harmful.
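The key result admits a compact restatement in potential-outcomes notation (mine, not the article's): with assignment vector Z, the unadjusted treatment-minus-control mean difference identifies

```latex
\begin{equation*}
E\bigl[\,Y_i(\mathbf{Z}) \mid Z_i = 1\,\bigr] \;-\; E\bigl[\,Y_i(\mathbf{Z}) \mid Z_i = 0\,\bigr],
\end{equation*}
```

a contrast between two distinct subpopulations, each experiencing its own distribution of other units' treatments, rather than the effect of changing one unit's own treatment while holding the others fixed.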
Article
Vaccination of one person may prevent the infection of another either because the vaccine prevents the first from being infected and from infecting the second, or because, even if the first person is infected, the vaccine may render the infection less infectious. We might refer to the first of these mechanisms as a contagion effect and the second as an infectiousness effect. In the simple setting of a randomized vaccine trial with households of size two, we use counterfactual theory under interference to provide formal definitions of a contagion effect and an unconditional infectiousness effect. Using ideas analogous to mediation analysis, we show that the indirect effect (the effect of one person's vaccine on another's outcome) can be decomposed into a contagion effect and an unconditional infectiousness effect on the risk difference, risk ratio, odds ratio, and vaccine efficacy scales. We provide identification assumptions for such contagion and unconditional infectiousness effects and describe a simple statistical technique to estimate these effects when they are identified. We also give a sensitivity analysis technique to assess how inferences would change under violations of the identification assumptions. The concepts and results of this paper are illustrated with hypothetical vaccine trial data.
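On the risk difference scale, the decomposition can be rendered in mediation-style notation (mine, and schematic; the article states it on several scales with precise identification conditions). Person 1 is randomized to vaccine A in {0, 1}, person 1's infection status Y_1 serves as the mediator, and Y_2 is the unvaccinated partner's infection outcome:

```latex
% Schematic rendering of the decomposition; notation mine.
\begin{align*}
\underbrace{E\bigl[Y_2\bigl(1, Y_1(1)\bigr)\bigr] - E\bigl[Y_2\bigl(0, Y_1(0)\bigr)\bigr]}_{\text{indirect effect of person 1's vaccine on person 2}}
  &= \underbrace{E\bigl[Y_2\bigl(1, Y_1(1)\bigr)\bigr] - E\bigl[Y_2\bigl(1, Y_1(0)\bigr)\bigr]}_{\text{contagion effect (vaccine blocks person 1's infection)}} \\
  &\quad + \underbrace{E\bigl[Y_2\bigl(1, Y_1(0)\bigr)\bigr] - E\bigl[Y_2\bigl(0, Y_1(0)\bigr)\bigr]}_{\text{unconditional infectiousness effect (person 1 infected but less infectious)}}
\end{align*}
```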
Article
Interference between units may pose a threat to unbiased causal inference in randomized controlled experiments. Although the assumption of no interference is essential for causal inference, few options are available for testing this assumption. This paper presents the first reliable ex post method for detecting interference between units in randomized experiments. Naive estimators of interference that attempt to exploit the proximity of units may be biased because simple randomization of units into treatment does not imply simple randomization of proximity to treated units. However, through a randomization-based approach, the confounding associated with these naive estimators may be circumvented entirely. With a test statistic of the analyst's choice, a conditional randomization test allows for the calculation of the exact significance of the causal dependence of outcomes on the treatment status of other units. The efficacy and robustness of the method are demonstrated through simulation studies and, using this method, interference between units is detected in a field experiment designed to assess the effect of mailings on voter turnout.
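A simplified sketch of the detection idea (not the article's exact procedure): hold out a random "focal" subset whose assignments supply exposure, use only the outcomes of non-focal control units, and re-randomize to obtain the reference distribution. The network, effect sizes, and the exposure-contrast statistic below are assumptions for illustration.

```python
import numpy as np
import networkx as nx

rng = np.random.default_rng(5)
n = 200
G = nx.erdos_renyi_graph(n, 0.03, seed=5)              # hypothetical network
A = nx.to_numpy_array(G)

z = rng.binomial(1, 0.5, n)
y = rng.normal(0, 1, n) + 0.8 * z + 0.4 * (A @ z > 0)  # true spillover 0.4

focal = rng.random(n) < 0.5     # focal units supply "exposure to treated"
eligible = ~focal & (z == 0)    # outcomes used: non-focal control units

def stat(z_vec):
    zf = np.where(focal, z_vec, 0)       # treated focal units only
    exposed = (A @ zf > 0)[eligible]
    return y[eligible][exposed].mean() - y[eligible][~exposed].mean()

obs = stat(z)
draws = [stat(rng.permutation(z)) for _ in range(2000)]
print("p-value against no interference:", np.mean(np.abs(draws) >= abs(obs)))
```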
Article
The propensity score is the conditional probability of assignment to a particular treatment given a vector of observed covariates. Both large and small sample theory show that adjustment for the scalar propensity score is sufficient to remove bias due to all observed covariates. Applications include: (i) matched sampling on the univariate propensity score, which is a generalization of discriminant matching, (ii) multivariate adjustment by subclassification on the propensity score where the same subclasses are used to estimate treatment effects for all outcome variables and in all subpopulations, and (iii) visual representation of multivariate covariance adjustment by a two-dimensional plot.
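Application (ii), subclassification, is easy to sketch on simulated data: estimate the score with a logistic model, cut it into quintiles, and average the within-stratum treated-minus-control contrasts. The data-generating process below is invented for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(6)
n = 20000
x = rng.normal(size=(n, 3))                             # observed covariates
true_ps = 1.0 / (1.0 + np.exp(-(x @ np.array([0.8, -0.5, 0.3]))))
z = rng.binomial(1, true_ps)
y = x @ np.array([1.0, 1.0, 1.0]) + 2.0 * z + rng.normal(size=n)  # effect = 2

ps = LogisticRegression().fit(x, z).predict_proba(x)[:, 1]
bins = np.digitize(ps, np.quantile(ps, [0.2, 0.4, 0.6, 0.8]))  # 5 subclasses

# Average the within-subclass treated-minus-control contrasts.
contrasts = [y[(bins == b) & (z == 1)].mean() - y[(bins == b) & (z == 0)].mean()
             for b in range(5)]
print("naive:", round(y[z == 1].mean() - y[z == 0].mean(), 2),
      "| subclassified:", round(float(np.mean(contrasts)), 2))  # near 2.0
```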
Article
We assessed the forgetting of friends and its effects on measuring personal and social network characteristics and properties. All 217 residents of a university residence hall first recalled as many of their friends in the hall as they could. Then, on a complete list of hall residents, residents indicated other friends they forgot to recall. On average, residents forgot 20% of their friends. Residents' demographic characteristics are unrelated to the proportion of friends forgotten. However, the number of friends recalled correlates moderately positively with the number of friends forgotten. Recalled and forgotten friends do not differ appreciably in terms of their individual characteristics, although residents on average had modestly closer relationships with recalled friends than forgotten friends. Forgetting also influenced the measurement of some social network structural properties, such as density, number of cliques, centralization, and individuals' centralities. More research is required to determine whether forgetting distorts measurement of structural properties in other settings.
Article
Part I. Introduction:
1. Relations and networks in the social and behavioral sciences
2. Social network data: collection and application
Part II. Mathematical Representations of Social Networks:
3. Notation
4. Graphs and matrices
Part III. Structural and Locational Properties:
5. Centrality, prestige, and related actor and group measures
6. Structural balance, clusterability, and transitivity
7. Cohesive subgroups
8. Affiliations, co-memberships, and overlapping subgroups
Part IV. Roles and Positions:
9. Structural equivalence
10. Blockmodels
11. Relational algebras
12. Network positions and roles
Part V. Dyadic and Triadic Methods:
13. Dyads
14. Triads
Part VI. Statistical Dyadic Interaction Models:
15. Statistical analysis of single relational networks
16. Stochastic blockmodels and goodness-of-fit indices
Part VII. Epilogue:
17. Future directions
Article
Uncontrolled confounding in observational studies gives rise to biased effect estimates. Sensitivity analysis techniques can be useful in assessing the magnitude of these biases. In this paper, we use the potential outcomes framework to derive a general class of sensitivity-analysis formulas for outcomes, treatments, and measured and unmeasured confounding variables that may be categorical or continuous. We give results for additive, risk-ratio and odds-ratio scales. We show that these results encompass a number of more specific sensitivity-analysis methods in the statistics and epidemiology literature. The applicability, usefulness, and limits of the bias-adjustment formulas are discussed. We illustrate the sensitivity-analysis techniques that follow from our results by applying them to 3 different studies. The bias formulas are particularly simple and easy to use in settings in which the unmeasured confounding variable is binary with constant effect on the outcome across treatment levels.
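One of the simplest special cases covered by the general formulas can be stated compactly (notation mine): for a binary unmeasured confounder U whose outcome effect gamma is constant across treatment levels A, the additive-scale bias within levels of the measured covariates C is

```latex
\begin{equation*}
\text{bias} \;=\; \gamma \,\bigl\{\, P(U = 1 \mid A = 1, C = c) \;-\; P(U = 1 \mid A = 0, C = c) \,\bigr\},
\end{equation*}
```

so an adjusted estimate can be bias-corrected by subtracting this quantity for hypothesized values of gamma and the two prevalences.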
Article
This paper develops a formal language for study of treatment response with social interactions, and uses it to obtain new findings on identification of potential outcome distributions. Defining a person's treatment response to be a function of the entire vector of treatments received by the population, I study identification when shape restrictions and distributional assumptions are placed on response functions. An early key result is that the traditional assumption of individualistic treatment response (ITR) is a polar case within the broad class of constant treatment response (CTR) assumptions, the other pole being unrestricted interactions. Important non-polar cases are interactions within reference groups and distributional interactions. I show that established findings on identification under assumption ITR extend to assumption CTR. These include identification with assumption CTR alone and when this shape restriction is strengthened to semi-monotone response. I next study distributional assumptions using instrumental variables. Findings obtained previously under assumption ITR extend when assumptions of statistical independence (SI) are posed in settings with social interactions. However, I find that random assignment of realized treatments generically has no identifying power when some persons are leaders who may affect outcomes throughout the population. Finally, I consider use of models of endogenous social interactions to derive restrictions on response functions. I emphasize that identification of potential outcome distributions differs from the longstanding econometric concern with identification of structural functions. This paper is a revised version of CWP01/10
Article
Many questions about the social organization of medicine and health services involve interdependencies among social actors that may be depicted by networks of relationships. Social network studies have been pursued for some time in social science disciplines, where numerous descriptive methods for analyzing them have been proposed. More recently, interest in the analysis of social network data has grown among statisticians, who have developed more elaborate models and methods for fitting them to network data. This article reviews fundamentals of, and recent innovations in, social network analysis using a physician influence network as an example. After introducing forms of network data, basic network statistics, and common descriptive measures, it describes two distinct types of statistical models for network data: individual-outcome models in which networks enter the construction of explanatory variables, and relational models in which the network itself is a multivariate dependent variable. Complexities in estimating both types of models arise due to the complex correlation structures among outcome measures.
Article
Although published works rarely include causal estimates from more than a few model specifications, authors usually choose the presented estimates from numerous trial runs readers never see. Given the often large variation in estimates across choices of control variables, functional forms, and other modeling assumptions, how can researchers ensure that the few estimates presented are accurate or representative? How do readers know that publications are not merely demonstrations that it is possible to find a specification that fits the author's favorite hypothesis? And how do we evaluate or even define statistical properties like unbiasedness or mean squared error when no unique model or estimator even exists? Matching methods, which offer the promise of causal inference with fewer assumptions, constitute one possible way forward, but crucial results in this fast-growing methodological literature are often grossly misinterpreted. We explain how to avoid these misinterpretations and propose a unified approach that makes it possible for researchers to preprocess data with matching (such as with the easy-to-use software we offer) and then to apply the best parametric techniques they would have used anyway. This procedure makes parametric models produce more accurate and considerably less model-dependent causal inferences.
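The preprocessing recipe can be sketched as follows: match on an estimated propensity score, then fit the parametric model one "would have used anyway" on the matched sample. This toy version uses 1-nearest-neighbor matching with replacement on simulated data; it is an illustrative stand-in, not the authors' software.

```python
import numpy as np
import statsmodels.api as sm
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(7)
n = 5000
x = rng.normal(size=(n, 2))
z = rng.binomial(1, 1.0 / (1.0 + np.exp(-1.5 * x[:, 0])))
y = x[:, 0] ** 2 + 2.0 * z + rng.normal(size=n)   # true effect = 2

# Preprocess: 1-nearest-neighbor matching (with replacement) on the
# estimated propensity score.
ps = LogisticRegression().fit(x, z).predict_proba(x)[:, 1]
nn = NearestNeighbors(n_neighbors=1).fit(ps[z == 0].reshape(-1, 1))
_, idx = nn.kneighbors(ps[z == 1].reshape(-1, 1))
matched = np.concatenate([np.where(z == 1)[0],
                          np.where(z == 0)[0][idx.ravel()]])

# Then apply the parametric model one would have used anyway.
X = sm.add_constant(np.column_stack([z[matched], x[matched]]))
print(sm.OLS(y[matched], X).fit().params[1])      # coefficient on z, near 2
```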