DAG involving time-invariant IV for causal estimation of when . The variables and () are observed and unobserved individual predictors of , respectively, that may also affect tie-formation. While can be conditioned on cannot, necessitating the use of IV-methods. When (one follow-up period), instruments ; when (the case presented here), instruments both and ; and so on until instruments . IV identification is reliant on being observed so that they can be instrumented (if dim) and not being causes of (i.e., they cannot contribute to homophily).
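
The identification logic in the caption can be illustrated with a small simulation. The following is a minimal sketch, not the source publication's estimator: it assumes a single follow-up period, a linear model, and one binomial allele count as the instrument. All names (`simulate`, `ols_slope`, `iv_slope`) and all coefficient values are hypothetical. With one instrument and one instrumented variable, two-stage least squares reduces to the ratio of covariances used here.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def simulate(n=20000, peer_effect=0.5, seed=0):
    """Simulate alter genes, alter traits, and ego outcomes (all values assumed)."""
    rng = random.Random(seed)
    genes, alter_traits, ego_outcomes = [], [], []
    for _ in range(n):
        # Instrument: the alter's allele count, binomial(2, 0.3) by assumption.
        gene = int(rng.random() < 0.3) + int(rng.random() < 0.3)
        # Unobserved factor shared by alter and ego (cannot be conditioned on).
        shared = rng.gauss(0, 1)
        # The gene affects the alter's trait but not the ego's outcome directly.
        alter = 0.8 * gene + shared + rng.gauss(0, 1)
        ego = peer_effect * alter + shared + rng.gauss(0, 1)
        genes.append(gene)
        alter_traits.append(alter)
        ego_outcomes.append(ego)
    return genes, alter_traits, ego_outcomes


def ols_slope(x, y):
    """Naive regression slope of y on x (biased by the shared factor)."""
    return cov(x, y) / cov(x, x)


def iv_slope(z, x, y):
    """IV (Wald) estimate: effect of x on y using instrument z."""
    return cov(z, y) / cov(z, x)


genes, alter_traits, ego_outcomes = simulate()
naive = ols_slope(alter_traits, ego_outcomes)
iv = iv_slope(genes, alter_traits, ego_outcomes)
print(f"true peer effect: 0.50, naive OLS: {naive:.2f}, IV: {iv:.2f}")
```

In this setup the naive regression absorbs the shared unobserved factor, whereas the IV ratio uses only the variation in the alter's trait that is driven by the gene, which by construction is independent of that factor and has no direct path to the ego's outcome.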

Source publication
Article
The identification of causal peer effects (also known as social contagion or induction) from observational data in social networks is challenged by two distinct sources of bias: latent homophily and unobserved confounding. In this paper, we investigate how causal peer effects of traits and behaviors can be identified using genes (or other structura...

Citations

... Past studies on inferring network effects have attempted to develop a better understanding of the role of homophily in network formation, and to test hypotheses about the relative prevalence of homophily and peer influence in social behavior using techniques such as instrumental variables [89], propensity score matching [90], SAOMs [17], [65]-[69], ERGMs [16], and latent space models [91], [92]. In this section, we will illustrate how our simulation strategy builds upon and extends [17], [93] to obtain estimates of homophily and influence based on network and behavioral characteristics across different sampling conditions. ...
Preprint
Evaluating the impact of policy interventions on respondents who are embedded in a social network is often challenging due to the presence of network interference within the treatment groups, as well as between treatment and non-treatment groups throughout the network. In this paper, we propose a modeling strategy that combines existing work on stochastic actor-oriented models (SAOM) and diffusion contagion models with a novel network sampling method based on the identification of independent sets. By assigning respondents from an independent set to the treatment, we are able to block any direct spillover of the treatment, thereby allowing us to isolate the direct effect of the treatment from the indirect network-induced effects. As a result, our method allows for the estimation of both the direct as well as the net effect of a chosen policy intervention, in the presence of network effects in the population. We perform a comparative simulation analysis to show that the choice of sampling technique leads to significantly distinct estimates for both direct and net effects of the policy, as well as for the relevant network effects, such as homophily. Furthermore, using a modified diffusion contagion model, we show that our proposed sampling technique leads to greater and faster spread of the policy-linked behavior through the network. This study highlights the importance of network sampling techniques in improving policy evaluation studies and has the potential to help researchers and policymakers with better planning, designing, and anticipating policy responses in a networked society.
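
The idea of assigning treatment only to an independent set, so that no two treated respondents share a tie, can be sketched as follows. This is a generic greedy heuristic offered as an illustration under stated assumptions, not the sampling method proposed in the preprint; the function name `greedy_independent_set` and the toy network are hypothetical, and the adjacency mapping is assumed to be symmetric (undirected ties).

```python
def greedy_independent_set(adjacency):
    """Return a maximal independent set of an undirected graph.

    `adjacency` maps each node to the set of its neighbours and is assumed
    to be symmetric. Low-degree nodes are visited first, a common heuristic
    that tends to yield larger sets.
    """
    chosen = set()
    blocked = set()
    for node in sorted(adjacency, key=lambda v: (len(adjacency[v]), v)):
        if node in blocked:
            continue
        chosen.add(node)
        # Neither the node nor any of its neighbours can be chosen again.
        blocked.add(node)
        blocked.update(adjacency[node])
    return chosen


# Toy path network a - b - c - d (hypothetical respondents).
network = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"c"},
}
treated = greedy_independent_set(network)
print(sorted(treated))
```

Because no two members of the returned set are tied to each other, treating only those respondents rules out direct treated-to-treated spillover in the toy network; indirect paths through untreated respondents remain.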
... It is plausible that there may be cross-trait social genetic effects given large phenotypic and genotypic correlations between BMI, drinking, and smoking (Liu et al. 2019) and the possibility that genes have 'pleiotropic effects' (having effects on multiple phenotypes; Lee et al. 2012; Visscher et al. 2017). A suggestion to study social influence effects using genes as instrumental variables forms another approach that may give insight here, but suffers from stricter assumptions (O'Malley et al. 2014). ...
Article
Partners resemble each other in health behaviors and outcomes such as alcohol use, smoking, physical activity, and obesity. While this is consistent with social contagion theory suggesting partner influence, it is notoriously difficult to establish causality because of assortative mating and contextual confounding. We offer a novel approach to studying social contagion in health in long-term partnerships by combining genetic data of both partners in married/cohabiting couples with longitudinal data on their health behaviors and outcomes. We examine the influence of the partner’s genetic predisposition for three health outcomes and behaviors (BMI, smoking, and drinking) among married/cohabiting couples. We use longitudinal data from the Health and Retirement Study and the English Longitudinal Study of Ageing with data on health outcomes and genotypes for both partners. Results show that changes over time in BMI, smoking, and drinking depend on the partner’s genetic predispositions to these traits. These findings underline the importance of people’s social surroundings for their health and highlight the potential of targeting health interventions at couples. Supplementary Information The online version contains supplementary material available at 10.1007/s10519-023-10147-w.
... Addressing collider bias in the causal modelling of social network data and learning-based hypotheses presents additional challenges (Lyons, 2011; Shalizi & Thomas, 2011). While some traditional approaches for addressing bias may assist with collider bias [i.e., counterfactual (Elwert & Christakis, 2008) or instrumental variable approaches (O'Malley et al., 2014)], several statistical models unique to network analysis have also been introduced as possible solutions. Shalizi and Thomas (2011) propose a model of tie formation and dissolution, aiming to identify instances where behaviors spread in response to tie formation and cease in response to tie dissolution, eliminating instances where an outcome is maintained by similar characteristics rather than social contagion (Krivitsky & Handcock, 2014). ...
Article
Objectives: We provide a brief overview of collider bias and its implications for criminological research. Methods: Owing to the nature of the topics studied, as well as the common data sources used to carry out much of this research, work in the field may often become vulnerable to a specific methodological problem known as collider bias. Collider bias occurs when exposure variables and outcomes independently cause a third variable, and this variable is included in statistical models. Colliders represent somewhat of a paradox in that there is scholarship discussing the issue, yet it has managed to remain a relatively cryptic threat compared to other sources of bias. Results: We argue that, far from being an obscure concern, colliders almost certainly have pervasive impact in criminal justice and criminology. Conclusion: We close by offering a general set of strategies for addressing the challenges posed by collider bias. While there is no panacea, there are better practices, many of which are underutilized in the disciplines that study crime and its attendant topics.
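
The definition of collider bias given in this abstract can be made concrete with a short simulation. This is an illustrative sketch only, with hypothetical names (`collider_demo`, `pearson`) and assumed parameter values: an exposure and an outcome are generated independently, both cause a third variable, and conditioning on that variable (here, by keeping only cases above a threshold) induces an association that does not exist in the full data.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    return num / den


def collider_demo(n=20000, threshold=1.0, seed=1):
    """Return (correlation in full data, correlation after selecting on the collider)."""
    rng = random.Random(seed)
    # Exposure and outcome are independent by construction.
    exposure = [rng.gauss(0, 1) for _ in range(n)]
    outcome = [rng.gauss(0, 1) for _ in range(n)]
    # Both independently cause the collider.
    collider = [x + y + rng.gauss(0, 0.5) for x, y in zip(exposure, outcome)]
    # Conditioning on the collider: keep only cases above the threshold.
    kept = [(x, y) for x, y, c in zip(exposure, outcome, collider) if c > threshold]
    kept_x = [x for x, _ in kept]
    kept_y = [y for _, y in kept]
    return pearson(exposure, outcome), pearson(kept_x, kept_y)


full_corr, selected_corr = collider_demo()
print(f"full sample: {full_corr:.2f}, selected on collider: {selected_corr:.2f}")
```

The full-sample correlation stays near zero, while the selected subsample shows a clearly negative correlation, even though neither variable affects the other.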
... In sociology and social sciences, the purpose of causal inference is to examine the association between social network and behaviors, also known as peer effects, social contagion or induction [173]-[175]. The peer effect means that the behavior, traits, or characteristics of an individual's peers (those he is connected to or alters) would affect his behavior [175]. Due to contextual confounding, peer selection, simultaneity bias and measurement error, [176] points out that it is very difficult to estimate the peer effects from observational data but instrumental variables (IVs) can help to address these problems. ...
Preprint
Causal inference is the process of using assumptions, study designs, and estimation strategies to draw conclusions about the causal relationships between variables based on data. This allows researchers to better understand the underlying mechanisms at work in complex systems and make more informed decisions. In many settings, we may not fully observe all the confounders that affect both the treatment and outcome variables, complicating the estimation of causal effects. To address this problem, a growing literature in both causal inference and machine learning proposes to use Instrumental Variables (IV). This paper serves as the first effort to systematically and comprehensively introduce and discuss the IV methods and their applications in both causal inference and machine learning. First, we provide the formal definition of IVs and discuss the identification problem of IV regression methods under different assumptions. Second, we categorize the existing work on IV methods into three streams according to the focus on the proposed methods, including two-stage least squares with IVs, control function with IVs, and evaluation of IVs. For each stream, we present both the classical causal inference methods, and recent developments in the machine learning literature. Then, we introduce a variety of applications of IV methods in real-world scenarios and provide a summary of the available datasets and algorithms. Finally, we summarize the literature, discuss the open problems and suggest promising future research directions for IV methods and their applications. We also develop a toolkit of the IV methods reviewed in this survey at https://github.com/causal-machine-learning-lab/mliv.
... Individual outcomes can also be influenced by the macro features of a network, such as cohesion, hierarchy, clustering, and composition. Cohesion describes how densely connected a network is ... The instrumental variables approach is discussed by Bramoullé et al. (2009), O'Malley et al. (2014), and An (2015a). Randomized peer treatment is discussed by An ( , 2015b). ...
Article
Fueled by recent advances in statistical modeling and the rapid growth of network data, social network analysis has become increasingly popular in sociology and related disciplines. However, a significant amount of work in the field has been descriptive and correlational, which prevents the findings from being more rigorously translated into practices and policies. This article provides a review of the popular models and methods for causal network analysis, with a focus on causal inference threats (such as measurement error, missing data, network endogeneity, contextual confounding, simultaneity, and collinearity) and potential solutions (such as instrumental variables, specialized experiments, and leveraging longitudinal data). It covers major models and methods for both network formation and network effects and for both sociocentric networks and egocentric networks. Lastly, this review also discusses future directions for causal network analysis. Expected final online publication date for the Annual Review of Sociology, Volume 48 is July 2022. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
... We found the effect of an omitted confounder would need to have a combined relationship with the outcome (employees' purchases at transaction t) and predictor (coworkers' purchase with the employee at t − 1) larger than any measured for the observed covariates in our models before our statistically significant measures were rendered non-significant (Supplementary Table 10). Finally, we conducted an instrumental variables analysis using purchases by a second-degree coworker (the coworker's coworker) as a putatively exogenous source of variation in the coworker's purchasing, thus eliminating concerns of homophily to the extent that it manifests as unmeasured confounding (Supplementary Fig. 9) 47. For these analyses, we began with the same subset of data used above in Fig. 3 involving the strictest definition of a social tie (tie probability ≥0.6), but to justify the instrumental variable exclusion restriction, we excluded any second-degree alters of a given ego that were also first-degree alters of that same ego during the study period, as well as second-degree alters that were egos in the analysis themselves. ...
... The exception to that caveat comes from our instrumental variables analysis that, if its assumptions hold, would remove all bias due to homophily, confounding or simultaneity. We reiterate our caution in accepting that our available instrument satisfies the exclusion restriction in the presence of homophily, confounding or simultaneity and the independence assumption required for valid instrumental variables estimation 47. Future work could consider using different instrumental variables (including the special case of randomized assignment to peers/networks) in a completely identified, closed network, although these are challenging methods to execute in real-world scenarios. ...
... The data setup for each stage (that is, the timing of purchases included as dependent and independent variables) was identical to that used in the GEE analyses. In the first stage regression, purchases by second-degree alters (the primary alters' alters, Supplementary Fig. 9) were used as an instrument for primary alters' purchases 47. To strengthen the plausibility of the exclusion restriction, only purchases by secondary alters who did not appear elsewhere in the data as either first-degree alters or egos were included as instruments. ...
Article
Unhealthy food choice is an important driver of obesity, but research examining the relationship of food choices and social influence has been limited. We sought to assess associations in the healthfulness of workplace food choices among a large population of diverse employees whose food-related social connections were identified using passively collected data in a validated model. Data were drawn from 3 million encounters where pairs of employees made purchases together in 2015–2016. The healthfulness of food items was defined by ‘traffic light’ labels. Cross-sectional simultaneously autoregressive models revealed that proportions of both healthy and unhealthy items purchased were positively associated between connected employees. Longitudinal generalized estimating equation models also found positive associations between an employee’s current food purchase and the most recent previous food purchase a coworker made together with the employee. These data indicate that workplace interventions to promote healthy eating and reduce obesity should test peer-based strategies.
... Early work by Manski (53) demonstrates unidentifiability of the peer effects in the presence of social confounders under linear outcome models. Recent works in causal inference show that confounding social covariates lead to unidentifiability and biased estimates of causal effects (64), especially on social networks (65,66), and how longitudinal studies (67,68) and design of experiment for specific peer effects (69) provide a way forward. ...
Article
Significance: Hostile influence operations (IOs) that weaponize digital communications and social media pose a rising threat to open democracies. This paper presents a system framework to automate detection of disinformation narratives, networks, and influential actors. The framework integrates natural language processing, machine learning, graph analytics, and network causal inference to quantify the impact of individual actors in spreading the IO narrative. We present a classifier that detects reported IO accounts with 96% precision, 79% recall, and 96% AUPRC, demonstrated on real social media data collected for the 2017 French presidential election and known IO accounts disclosed by Twitter. Our system also discovers salient network communities and high-impact accounts that are independently corroborated by US Congressional reports and investigative journalism.
... Although Instrumental Variable analysis (21-24), Difference-in-Differences (25, 26) and ...
Article
The initial aim of environmental epidemiology is to estimate the causal effects of environmental exposures on health outcomes. However, because most environmental datasets lack sufficient covariates, current methods that do not adequately adjust for confounders inevitably suffer from residual confounding. We propose using the Negative Control Exposure (NCE) model in time-series studies to effectively eliminate unobserved confounders using a post-outcome exposure as an auxiliary variable. Furthermore, we provide rigorous theoretical justification and simulation evidence on the validity and efficacy of the NCE model for both continuous and categorical variables. Finally, the potential of the NCE model is illustrated in two challenging applications. We find that living in areas with higher levels of surrounding greenness after six months carries a lower risk of stroke-specific mortality. Furthermore, the widely established negative association between temperature and the majority of cancer risks, namely that low average air temperatures lead to higher cancer risk, is actually caused by a number of unobserved confounders. The proposed strategy is implemented in an R package called PAV, freely available on GitHub.
... For example, Duncan et al. used a friend's intelligence as an instrument for the friend's occupational and educational aspirations to estimate peer effects on an individual's aspirations [49]. O'Malley et al. used genetic alleles as IVs for friends' BMI to estimate social contagion effects on weight status [50]. An used friends' family smoking status as an IV for friends' smoking status to estimate peer effects on smoking [51]. ...
Article
Contagion effects, sometimes referred to as spillover or influence effects, have long been central to the study of human disease and health networks. Accurate estimation and identification of contagion effects are important in terms of understanding the spread of human disease and health behavior, and they also have various implications for designing effective public health interventions. However, many challenges remain in estimating contagion effects and it is often unclear when it is difficult to correctly estimate contagion effects, or why a particular method would need to be applied. In this review we explain the challenges in estimating contagion effects, and how they can be framed as an omitted variable bias problem. We then discuss how such challenges have been addressed in randomized experiments and traditional statistical analyses, as well as several state-of-the-art statistical methods. Finally, we conclude by summarizing recent advancements and noting remaining challenges, as well as appropriate next steps.
... Early work by Manski (53) demonstrates unidentifiability of the peer effects in the presence of social confounders under linear outcome models. Recent works in causal inference show that confounding social covariates lead to unidentifiability and biased estimates of causal effects (64), especially on social networks (65,66), and how longitudinal studies (67,68) and design of experiment for specific peer effects (69) provide a way forward. ...
Preprint
The weaponization of digital communications and social media to conduct disinformation campaigns at immense scale, speed, and reach presents new challenges to identify and counter hostile influence operations (IO). This paper presents an end-to-end framework to automate detection of disinformation narratives, networks, and influential actors. The framework integrates natural language processing, machine learning, graph analytics, and a novel network causal inference approach to quantify the impact of individual actors in spreading IO narratives. We demonstrate its capability on real-world hostile IO campaigns with Twitter datasets collected during the 2017 French presidential elections, and known IO accounts disclosed by Twitter. Our system detects IO accounts with 96% precision, 79% recall, and 96% area-under-the-PR-curve, maps out salient network communities, and discovers high-impact accounts that escape the lens of traditional impact statistics based on activity counts and network centrality. Results are corroborated with independent sources of known IO accounts from U.S. Congressional reports, investigative journalism, and IO datasets provided by Twitter.