Article

Causality: Models, Reasoning and Inference

... Under the setting of a causal DAG model, causal effects are defined via external interventions, which can be directly estimated from experimental data. When the underlying DAG is given, we are also able to obtain consistent estimates of causal effects from observational data via backdoor adjustment [14]. As the DAG is unknown, we take an additional step to identify candidate backdoor adjustment sets using both experimental and observational data. ...
... In a causal graphical model [14], a DAG G models the causal relationships among a set of random variables X = {X_1, ..., X_p}. A directed edge from X_i to X_j in the causal DAG G means that X_i is a direct cause of X_j, in which case X_i is called a parent of X_j and X_j is called a child of X_i. ...
... where S satisfies the backdoor criterion relative to (X_i, Y) [14] and β_{X_i}(Y ~ X_i + S) denotes the regression coefficient corresponding to the term X_i when regressing Y on (X_i, S), i.e., E(Y | X_i, S) = β_{X_i} X_i + β_S^T S. Therefore, the causal effect of X_i on Y can be estimated using not only the experimental data under intervention on X_i but also observational data, as long as a backdoor adjustment set S is provided. ...
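As a sanity check on the regression form of backdoor adjustment quoted above, here is a minimal Python sketch (the linear Gaussian DAG, coefficients, and variable names are illustrative assumptions, not the cited paper's setup): regressing Y on (X, S), where S is a valid backdoor adjustment set, recovers the causal coefficient that a naive regression of Y on X alone gets wrong.

import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Simulated linear Gaussian DAG: S -> X, S -> Y, X -> Y (true effect of X on Y is 2.0)
S = rng.normal(size=n)
X = 1.5 * S + rng.normal(size=n)
Y = 2.0 * X + 3.0 * S + rng.normal(size=n)

# Naive regression of Y on X alone is confounded by S
naive = np.linalg.lstsq(np.column_stack([X, np.ones(n)]), Y, rcond=None)[0][0]

# Backdoor adjustment: regress Y on (X, S); the coefficient of X estimates the causal effect
adjusted = np.linalg.lstsq(np.column_stack([X, S, np.ones(n)]), Y, rcond=None)[0][0]

print(f"naive: {naive:.2f}, backdoor-adjusted: {adjusted:.2f}")  # roughly 3.4 vs 2.0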
Preprint
Full-text available
The causal bandit problem aims to sequentially learn the intervention that maximizes the expectation of a reward variable within a system governed by a causal graph. Most existing approaches assume prior knowledge of the graph structure, or impose unrealistically restrictive conditions on the graph. In this paper, we assume a Gaussian linear directed acyclic graph (DAG) over arms and the reward variable, and study the causal bandit problem when the graph structure is unknown. We identify backdoor adjustment sets for each arm using sequentially generated experimental and observational data during the decision process, which allows us to estimate causal effects and construct upper confidence bounds. By integrating estimates from both data sources, we develop a novel bandit algorithm, based on modified upper confidence bounds, to sequentially determine the optimal intervention. We establish both case-dependent and case-independent upper bounds on the cumulative regret for our algorithm, which improve upon the bounds of the standard multi-armed bandit algorithms. Our empirical study demonstrates its advantage over existing methods with respect to cumulative regret and computation time.
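For readers unfamiliar with the upper-confidence-bound idea that the abstract builds on, the following generic UCB1 sketch in Python is a standard multi-armed bandit baseline, not the authors' causal bandit algorithm; the Bernoulli arms and their reward probabilities are assumptions for illustration.

import numpy as np

def ucb1(reward_fn, n_arms, horizon, seed=0):
    rng = np.random.default_rng(seed)
    counts = np.zeros(n_arms)   # pulls per arm
    sums = np.zeros(n_arms)     # cumulative reward per arm
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1         # pull each arm once first
        else:
            bonus = np.sqrt(2 * np.log(t) / counts)
            arm = int(np.argmax(sums / counts + bonus))  # optimistic choice
        r = reward_fn(arm, rng)
        counts[arm] += 1
        sums[arm] += r
    return counts

# Example: three Bernoulli arms with success probabilities 0.2, 0.5, 0.7
pulls = ucb1(lambda arm, rng: float(rng.random() < [0.2, 0.5, 0.7][arm]), n_arms=3, horizon=10_000)
print(pulls)  # most pulls should concentrate on the best arm (index 2)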
... The goal of CST is to be both an actionable and meaningful framework. It combines (structural) counterfactuals (Pearl, 2009) with situation testing (Thanh et al., 2011). Counterfactuals answer counterfactual queries, such as the one motivating the CMD, and are generated via structural causal models. ...
... Such an approach goes against how most social scientists interpret the protected attribute and its role as a social construct when proving discrimination (Bonilla-Silva, 1997; Hanna et al., 2020; Rose, 2022; Sen & Wasow, 2016). It is in that regard where structural causal models (Pearl, 2009) and their ability to generate counterfactuals via the abduction, action, and prediction steps (e.g., Chiappa (2019) and Yang et al. (2021)), including counterfactual fairness, have an advantage. This advantage is overlooked by critics of counterfactual reasoning (Hu & Kohler-Hausmann, 2020; Kasirzadeh & Smart, 2021): generating counterfactuals, as long as the structural causal model is properly specified, accounts for the effects of changing the protected attribute on all other attributes. ...
... A structural causal model (SCM) (Pearl, 2009) M = {S, P U } describes how the set of p variables W = X ∪ A is determined based on corresponding sets of structural equations S, and p latent variables U with prior distribution P U . Each W j ∈ W is assigned a value through a deterministic function f j ∈ S of its causal parents W pa(j) ⊆ W \ {W j } and latent variable U j with distribution P (U j ) ∈ P U . ...
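The SCM definition quoted above can be made concrete with a small Python sketch; the three-variable graph, structural equations, and Gaussian noise below are illustrative assumptions. Each variable is assigned by a deterministic function of its parents and a latent noise term, and an intervention do(A := a) simply overrides that variable's assignment.

import numpy as np

rng = np.random.default_rng(0)

def sample(n, do_A=None):
    # Exogenous (latent) variables U with their prior distributions
    u_a, u_x, u_y = rng.normal(size=(3, n))
    # Structural equations: each variable is a function of its parents and its noise term
    A = (u_a > 0).astype(float) if do_A is None else np.full(n, do_A)
    X = 0.8 * A + u_x
    Y = 1.5 * X - 0.5 * A + u_y
    return A, X, Y

_, _, y_obs = sample(50_000)             # observational distribution
_, _, y_do1 = sample(50_000, do_A=1.0)   # interventional distribution under do(A := 1)
print(y_obs.mean(), y_do1.mean())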
Preprint
Full-text available
We present counterfactual situation testing (CST), a causal data mining framework for detecting individual discrimination in a dataset of classifier decisions. CST answers the question "what would have been the model outcome had the individual, or complainant, been of a different protected status?" It extends the legally-grounded situation testing (ST) of Thanh et al. (2011) by operationalizing the notion of fairness given the difference via counterfactual reasoning. ST finds for each complainant similar protected and non-protected instances in the dataset; constructs, respectively, a control and test group; and compares the groups such that a difference in outcomes implies a potential case of individual discrimination. CST, instead, avoids this idealized comparison by establishing the test group on the complainant's generated counterfactual, which reflects how the protected attribute when changed influences other seemingly neutral attributes of the complainant. Under CST we test for discrimination for each complainant by comparing similar individuals within each group but dissimilar individuals across groups. We consider single (e.g., gender) and multidimensional (e.g., gender and race) discrimination testing. For multidimensional discrimination we study multiple and intersectional discrimination and, as feared by legal scholars, find evidence that the former fails to account for the latter kind. Using a k-nearest neighbor implementation, we showcase CST on synthetic and real data. Experimental results show that CST uncovers a higher number of cases than ST, even when the model is counterfactually fair. In fact, CST extends counterfactual fairness (CF) of Kusner et al. (2017) by equipping CF with confidence intervals.
... ii) Angrist et al. [2] introduced instrumental variables based on the Neyman-Rubin framework. iii) Working at about the same time as Rubin, Pearl started focusing on structural (graph) models [35], which culminated in his celebrated techniques for rigorously analyzing causal effects and confounding (Pearl [36]). ...
... We stress that we interpret Figure 1a's equations as capturing the causal dependencies among variables: X and ϵ are drawn randomly (and independent of each other), and their random values determine ("cause") the value of Y in each draw. Correspondingly, the DAG (directed acyclic graph [36]) in Figure 1a captures this causal relation between X and Y in a qualitative way: an edge connects X to Y to denote that the values of X and Y are related; furthermore, the edge is directed from X to Y to denote that changing X directly affects Y, that is, it causes Y to change. Process p 2 in Figure 1b involves a third variable Z. ...
... • In this paper, we use "confounding" to denote bias in the estimate of a causal effect-as we demonstrated in a nutshell in the previous sections. This notion of confounding is customary in modern causal analysis [36], and in related approaches to mitigate confounding, such as instrumental variables [2]. ...
Preprint
Omitted variable bias occurs when a statistical model leaves out variables that are relevant determinants of the effects under study. This results in the model attributing the missing variables' effect to some of the included variables -- hence over- or under-estimating the latter's true effect. Omitted variable bias presents a significant threat to the validity of empirical research, particularly in non-experimental studies such as those prevalent in empirical software engineering. This paper illustrates the impact of omitted variable bias on two case studies in the software engineering domain, and uses them to present methods to investigate the possible presence of omitted variable bias, to estimate its impact, and to mitigate its drawbacks. The analysis techniques we present are based on causal structural models of the variables of interest, which provide a practical, intuitive summary of the key relations among variables. This paper demonstrates a sequence of analysis steps that inform the design and execution of any empirical study in software engineering. An important observation is that it pays off to invest effort investigating omitted variable bias before actually executing an empirical study, because this effort can lead to a more solid study design, and to a significant reduction in its threats to validity.
... Causal Discovery is a field of machine learning and statistics aiming to induce causal knowledge from data [23,36]. There is a large corpus of algorithms and methodologies in the field, spanning tasks like learning causal models, estimating causal effects, and determining optimal interventions. ...
... Two types of graphs are commonly used to represent the causal relationships, the Directed Acyclic Graph (DAG) [23] and the Maximal Ancestral Graph (MAG) [29]. When a DAG is annotated with conditional probability densities, it becomes a quantitative model, namely a Bayesian Network (BN) [23]. A DAG contains only directed edges and, if it is interpreted causally, a directed edge represents direct causality, i.e., if X −→ Y then X is a direct cause of Y (direct in the context of the observed variables). ...
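To illustrate the quoted point that a DAG annotated with conditional probability distributions becomes a Bayesian network, the toy Python sketch below (a made-up three-node rain/sprinkler/wet-grass example, not from the cited work) evaluates a joint probability via the DAG factorization P(x_1, ..., x_p) = Π_i P(x_i | parents(x_i)).

# Toy Bayesian network over binary variables: Rain -> Sprinkler, Rain -> Wet, Sprinkler -> Wet
p_rain = {True: 0.2, False: 0.8}
p_sprinkler = {True: {True: 0.01, False: 0.99}, False: {True: 0.4, False: 0.6}}  # P(S | R)
p_wet = {(True, True): 0.99, (True, False): 0.9, (False, True): 0.9, (False, False): 0.0}  # P(W=1 | R, S)

def joint(rain, sprinkler, wet):
    # Factorization over the DAG: P(R, S, W) = P(R) * P(S | R) * P(W | R, S)
    pw = p_wet[(rain, sprinkler)]
    return p_rain[rain] * p_sprinkler[rain][sprinkler] * (pw if wet else 1 - pw)

print(joint(rain=True, sprinkler=False, wet=True))   # 0.2 * 0.99 * 0.9 = 0.1782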
Chapter
Full-text available
We introduce the concept of Automated Causal Discovery (AutoCD), defined as any system that aims to fully automate the application of causal discovery and causal reasoning methods. AutoCD's goal is to deliver all causal information that an expert human analyst would provide and to answer users' causal queries. To this goal, we introduce ETIA, a system that performs dimensionality reduction, causal structure learning, and causal reasoning. We present the architecture of ETIA, benchmark its performance on synthetic data sets, and present a use case example. The system is general and can be applied to a plethora of causal discovery problems.
... Inferring causality from observational data circumvents these experimental limitations and offers the potential to develop functional models that accurately depict causal relationships from cause to effect. Consequently, these approaches are of considerable practical importance and applicability (Spirtes et al., 2000;Pearl, 2000). ...
... In the literature, the directed acyclic graph (DAG) plays a crucial role in causal inference (Spirtes et al., 2000; Pearl, 2000), which leads to applications in areas such as fairness and accountability (Kusner et al., 2017), medicine (Heckerman et al., 1992), and finance (Sanford & Moosa, 2012). Constraint-based methods are one family of causal discovery methods (Spirtes et al., 2000; Pearl, 2000); they conduct independence tests and determine edge directions. Another family is score-based methods (Chickering, 2003; Nandy et al., 2018; Tsamardinos et al., 2006; Chickering et al., 2004). ...
Article
Full-text available
Discovering causal graphs from observational data is a challenging problem, which has garnered significant attention due to its crucial role in understanding causal relationships. In recent advancements, this problem is cast as a continuous optimization task with structural constraints, through which the great power of gradient-based methods can be exploited to address the causal discovery problem. Despite their statistical validity, these approaches return causal graphs with spurious edges in the presence of latent variables. In this paper, we generalize the gradient-based method to accommodate the existence of latent confounders and latent intermediate variables. Specifically, we propose a causal discovery method based on latent variable reconstruction. This method primarily consists of two stages. In the first stage, we propose a series of causal models that include latent variables and can be applied under different data assumptions. However, due to the influence of latent variables, the causal graph inevitably contains reversed edges. In light of this fact, we propose a method to correct these reversed edges in the second stage via a variational autoencoder. Theoretical results show that under some mild conditions, our method can correctly identify the causal relations. Experiments on both synthetic and real datasets demonstrate the superiority of our method to existing gradient-based learning algorithms in the presence of latent variables.
... For example, another qualitative aspect of a probability distribution is that of functional dependence, which is also exploited across computer science to enable compact representations and simplify probabilistic analysis. Acyclic causal models, for instance, specify a distribution via a probability over contexts (the values of variables whose causes are viewed as outside the model), and a collection of equations (i.e., functional dependencies) [18]. And in deep learning, a popular class of models called normalizing flows [25,12] specify a distribution by composing a fixed distribution over some latent space, say a standard normal distribution, with a function (i.e., a functional dependence) fit to observational data. ...
... The idea is to specify the inputs and outputs of a set of independent mechanisms: processes by which some target variables T are determined as a (possibly randomized) function of some source variables S. This idea generalizes intuition going back to Pearl [18] by allowing, for example, two mechanisms to share a target variable. So at a qualitative level, the modeler specifies not a (directed) graph, but a (directed) hypergraph. ...
... Indeed, compatibility lets us go well beyond capturing dependence and independence. The fact that Pearl [18] views causal models as representing independent mechanisms suggests that there might be a connection to causality. In fact, there is. ...
Preprint
Full-text available
We define what it means for a joint probability distribution to be compatible with a set of independent causal mechanisms, at a qualitative level -- or, more precisely, with a directed hypergraph A, which is the qualitative structure of a probabilistic dependency graph (PDG). When A represents a qualitative Bayesian network, QIM-compatibility with A reduces to satisfying the appropriate conditional independencies. But giving semantics to hypergraphs using QIM-compatibility lets us do much more. For one thing, we can capture functional dependencies. For another, we can capture important aspects of causality using compatibility: we can use compatibility to understand cyclic causal graphs, and to demonstrate structural compatibility, we must essentially produce a causal model. Finally, QIM-compatibility has deep connections to information theory. Applying our notion to cyclic structures helps to clarify a longstanding conceptual issue in information theory.
... The predominant approach to analyzing verbs of causing has been to argue that they convey some version of SUFFICIENCY, which is measured given parameters of a causal situation (Nadathur & Lauer 2020, Lauer & Nadathur 2018, Glass 2023, Schulz 2011). Notably, some of this work has leveraged structural causal models (SCMs; Pearl, 2009) to model and make predictions about how we use these verbs (Baglini & Siegal 2021, Nadathur & Siegal 2022, Schulz 2011). In this paper, we argue that the semantics of causing verbs encode not only sufficiency but also intention and the number of feasible alternative actions. ...
... Notably, much of this aforementioned work makes use of the logics of structural causal models (SCMs) from Pearl (2009), which has previously been used to model causal relations between events as well as their counterfactual values. In our paper, we focus on the constructions X caused/made/forced Y (to) Z and argue that the relationship between the verbs cause, make, and force is structured not by sufficiency, intentionality, or alternatives alone, but by some interactions of (at least) these three. ...
... In our paper, we focus on the constructions X caused/made/forced Y (to) Z and argue that the relationship between the verbs cause, make, and force is structured not by sufficiency, intentionality, or alternatives alone, but by some interactions of (at least) these three. In order to support our argument, we run an experiment in which participants' judgements of when the three verbs are appropriate in describing tic-tac-toe sequences are predicted by measures defined using the logics of structural causal models (Pearl 2009). It appears that the acceptability of the causal verb being used is modulated by several attributes of the causal relata. Specifically, the examples give rise to the question: what are the significant differences between the causing events of (5-a) "mentioning how the habit has helped me", (5-b) "criticizing her physical appearance", and (5-c) "holding her child hostage"? ...
Article
Full-text available
We investigate the semantics of the causal verbs cause, make, and force as used in the construction X {caused/made/forced} Y (to) Z. The predominant approach to analyzing verbs of causing has been to argue that they convey some version of SUFFICIENCY, but it has also been suggested that INTENTION or possible ALTERNATIVES may also factor into the semantics of the verbs. Using sequences of tic-tac-toe states as experimental stimuli, we measure the three possible contributing factors in each stimulus and ask participants whether each verb is appropriate for describing the sequence. We find experimental support for a differentiating semantics of these verbs, in which no single predictor is the sole factor in when each verb is appropriate.
... Causal discovery, the process of identifying cause-and-effect relationships from observational data, is a pivotal challenge in artificial intelligence (AI) and machine learning. Unveiling causal structures enables robust predictions, facilitates counterfactual reasoning, and enhances decision-making processes in complex systems [1]. Traditional methods for causal discovery often rely on statistical tests for independence and structural equation modeling, which may not scale efficiently with high-dimensional data or effectively capture intricate non-linear relationships [2,3]. ...
... impact on the model's predictions but does not necessarily imply a direct causal relationship with the target variable [6,7]. Moreover, correlations captured by the model may be confounded by hidden variables or represent spurious associations [1]. Therefore, leveraging Shapley values for causal discovery requires careful consideration to avoid misleading inferences. ...
... Structural causal models (SCMs) provide a framework for representing and estimating causal relationships through equations that describe how variables influence one another [1]. Within this category, the LiNGAM algorithm [13] exploits non-Gaussianity in data to identify causal structures in linear models, assuming linear, non-Gaussian, acyclic relationships. ...
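To give a flavor of the LiNGAM idea mentioned above (linear, non-Gaussian, acyclic relations), the Python sketch below fits a regression in each direction between two variables and prefers the direction whose residual looks independent of the regressor; the data-generating process and the crude squared-correlation independence proxy are assumptions for illustration, not the cited algorithm.

import numpy as np

rng = np.random.default_rng(0)
n = 50_000

# Ground truth: X -> Y with uniform (non-Gaussian) noise, which makes the direction identifiable
X = rng.uniform(-1, 1, n)
Y = 2.0 * X + rng.uniform(-1, 1, n)

def residual(target, regressor):
    beta = np.cov(target, regressor)[0, 1] / np.var(regressor)
    return target - beta * regressor

def dependence(resid, regressor):
    # Crude independence proxy: correlation between squared residual and squared regressor
    return abs(np.corrcoef(resid ** 2, regressor ** 2)[0, 1])

score_xy = dependence(residual(Y, X), X)  # residual of Y given X vs X (near zero if X -> Y)
score_yx = dependence(residual(X, Y), Y)  # residual of X given Y vs Y (clearly non-zero here)
print("X -> Y" if score_xy < score_yx else "Y -> X", score_xy, score_yx)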
Preprint
Full-text available
Explainability techniques hold significant potential for enhancing the causal discovery process, which is crucial for understanding complex systems in areas like healthcare, economics, and artificial intelligence. However, no causal discovery methods currently incorporate explainability into their models to derive causal graphs. Thus, in this paper we explore this innovative approach, as it offers substantial potential and represents a promising new direction worth investigating. Specifically, we introduce REX, a causal discovery method that leverages machine learning (ML) models coupled with explainability techniques, specifically Shapley values, to identify and interpret significant causal relationships among variables. Comparative evaluations on synthetic datasets comprising continuous tabular data reveal that REX outperforms state-of-the-art causal discovery methods across diverse data generation processes, including non-linear and additive noise models. Moreover, REX was tested on the Sachs single-cell protein-signaling dataset, achieving a precision of 0.952 and recovering key causal relationships with no incorrect edges. Taken together, these results showcase REX's effectiveness in accurately recovering true causal structures while minimizing false positive predictions, its robustness across diverse datasets, and its applicability to real-world problems. By combining ML and explainability techniques with causal discovery, REX bridges the gap between predictive modeling and causal inference, offering an effective tool for understanding complex causal structures. REX is publicly available at https://github.com/renero/causalgraph.
... Causal inference [1] is re-emerging as an important tool in the domain of health sciences for informatics work such as finding effects of a drug or risk factors for a disease. Causality has traditionally been a core concept across all branches of medical science and considered when diagnosing patients based on their symptoms, effects of treatment, and years of historical evidence [2]. ...
... These associations can be easily mistaken as causation, making us susceptible to logical fallacies without knowing the real underlying cause. Causal inference is the science of learning cause from effect [1]. It is an important field of research because it helps us eradicate spurious correlation [7,8]. ...
... Causal inference can be either discovered through observational measurements (seeing) or from measurements after performing some external manipulation/intervention (doing). A causal network [1,9] can be represented with a directed acyclic graph ...
Article
Full-text available
Electronic health records (EHRs) provide a rich source of observational patient data that can be explored to infer underlying causal relationships. These causal relationships can be applied to augment medical decision-making or suggest hypotheses for healthcare research. In this study, we explored a large-scale EHR dataset on patients with asthma or related conditions (N = 14,937). The dataset included integrated data on features representing demographic factors, clinical measures, and environmental exposures. The data were accessed via a service named the Integrated Clinical and Environmental Service (ICEES). We estimated underlying causal relationships from the data to identify significant predictors of asthma attacks. We also performed simulated interventions on the inferred causal network to detect the causal effects, in terms of shifts in probability distribution for asthma attacks.
... We use the term causality for the cause-and-effect relationship between events or factors [Pearl 2009]. A cause, in this context, is an event that precedes another, such that the latter does not occur if the former does not happen. ...

... This type of representation, mapped to specific data structures, is one alternative for use in our proposal. The structural equation model [Pearl 2009, Halpern 2015], on the other hand, has no specific graphical representation. This model characterizes the world through variables and their values, and the influence between variables (and the resulting causal relation) is represented through equations. ...
Conference Paper
Reasoning about cause-and-effect relations is a recurring step in the human decision-making process. It is not uncommon for us to face problems that are consequences of other problems, and that can only be considered solved when their causes are solved as well. In Computer Science, several instances of this scenario can be identified. Although existing proposals are able to resolve consequences, as well as address causes adequately, reasoning about causal relations is implemented manually and in an application-specific way. In this paper, we discuss the use of software agents as a possible alternative capable of handling such a scenario, and we present a proposal for a domain-independent solution using the BDI (Belief-Desire-Intention) architecture.
... The Causal-Effect Score (CES) can be traced back to causality in observational studies (Rubin, 1974;Holland, 1986), where one usually cannot predefine and build control groups, but they have to be recovered from the available data (Gelman and Hill, 2007;Pearl, 2009;Roy and Salimi, 2023). The CES is also based on counterfactual interventions. ...
... 1. It is common to apply interventions on variables of a causal model (Pearl, 2009). In the case of databases, interventions are applied to the lineage of the query, which can act as the model, with random propositional variables X_τ that are made true or false via do(X_τ = 1) or do(X_τ = 0) (Salimi et. ...
Preprint
The Causal Effect (CE) is a numerical measure of causal influence of variables on observed results. Despite being widely used in many areas, only preliminary attempts have been made to use CE as an attribution score in data management, to measure the causal strength of tuples for query answering in databases. In this work, we introduce, generalize and investigate the so-called Causal-Effect Score in the context of classical and probabilistic databases.
... This high-level design naturally aligns with causal effect estimation (Pearl, 2009;Pearl et al., 2016), which evaluates the interventional effects of an action in sequential decision-making. Appendix A provides further discussion on the connection between causal effect estimation and our approach. ...
... From a causal perspective, given a conversation history t h j at turn j, the causal effect of a model response m j on the final conversation trajectory can be expressed using front-door adjustment (Pearl, 2009;Pearl et al., 2016): ...
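The displayed equation is cut off in the snippet above; for reference, the standard front-door adjustment formula from Pearl (2009), stated here in generic variables X (treatment), M (mediator), and Y (outcome) rather than the cited paper's notation, is P(y | do(x)) = Σ_m P(m | x) Σ_{x'} P(y | x', m) P(x').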
Preprint
Large Language Models are typically trained with next-turn rewards, limiting their ability to optimize for long-term interaction. As a result, they often respond passively to ambiguous or open-ended user requests, failing to help users reach their ultimate intents and leading to inefficient conversations. To address these limitations, we introduce CollabLLM, a novel and general training framework that enhances multiturn human-LLM collaboration. Its key innovation is a collaborative simulation that estimates the long-term contribution of responses using Multiturn-aware Rewards. By reinforcement fine-tuning these rewards, CollabLLM goes beyond responding to user requests, and actively uncovers user intent and offers insightful suggestions, a key step towards more human-centered AI. We also devise a multiturn interaction benchmark with three challenging tasks such as document creation. CollabLLM significantly outperforms our baselines with averages of 18.5% higher task performance and 46.3% improved interactivity by LLM judges. Finally, we conduct a large user study with 201 judges, where CollabLLM increases user satisfaction by 17.6% and reduces the time users spend by 10.4%.
... Gaining insight into the dynamics of complex real-world processes is closely tied to understanding the causal mechanisms underlying their function. Causal models (Pearl 2009) allow reasoning about the effect of intervention and distributional change, which makes them especially useful for modeling systems under changing environments or over time. At odds with this, however, is that established methods for discovering causal networks from data (Spirtes et al. 2000;Chickering 2002) often make the simplifying assumption that all samples have a fixed data-generating process. ...
... We make the standard assumptions of Causal Markov Condition, Faithfulness, and Sufficiency (Pearl 2009). ...
Preprint
Full-text available
Understanding causality is challenging and often complicated by changing causal relationships over time and across environments. Climate patterns, for example, shift over time with recurring seasonal trends, while also depending on geographical characteristics such as ecosystem variability. Existing methods for discovering causal graphs from time series either assume stationarity, do not permit both temporal and spatial distribution changes, or are unaware of locations with the same causal relationships. In this work, we therefore unify the three tasks of causal graph discovery in the non-stationary multi-context setting, of reconstructing temporal regimes, and of partitioning datasets and time intervals into those where invariant causal relationships hold. To construct a consistent score that forms the basis of our method, we employ the Minimum Description Length principle. Our resulting algorithm SPACETIME simultaneously accounts for heterogeneity across space and non-stationarity over time. Given multiple time series, it discovers regime changepoints and a temporal causal graph using non-parametric functional modeling and kernelized discrepancy testing. We also show that our method provides insights into real-world phenomena such as river-runoff measured at different catchments and biosphere-atmosphere interactions across ecosystems.
... The Bayesian network approach with qualitative data has limitations, such as the estimation of probability values being subjective and less accurate because it depends on the interpretation of the experts who estimate [62,63]. In addition, qualitative data are not easy to measure and tend to be challenging in determining the probability distribution [64]. Several assumptions are used in the Bayesian network, such as conditional independence, conditional probability, prior probability, graphical structure, etc. [64,65]. ...
... Bayesian networks are one popular type of causal model, which allow for consistent probabilistic reasoning (Pearl, 2009;Spirtes et al., 2000). A Bayesian network represents a probabilistic system using a directed acyclic graph (DAG). ...
... Two d-separated variables in a joint probability distribution might still be numerically independent given some other variables. See Pearl (2009) for further details. We say that X and Y are d-separated by Z if there are no unblocked undirected paths through G that connect them. ...
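To make the d-separation notions quoted above concrete, this small Python sketch (a toy chain X → Z → Y, chosen for illustration) shows numerically that X and Y are correlated marginally but become conditionally uncorrelated once the blocking variable Z is conditioned on, here via a partial correlation.

import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Chain X -> Z -> Y: the only path from X to Y is blocked by conditioning on Z
X = rng.normal(size=n)
Z = 0.9 * X + rng.normal(size=n)
Y = 0.7 * Z + rng.normal(size=n)

def partial_corr(a, b, given):
    # Correlate the residuals of a and b after regressing each on `given`
    beta_a = np.cov(a, given)[0, 1] / np.var(given)
    beta_b = np.cov(b, given)[0, 1] / np.var(given)
    return np.corrcoef(a - beta_a * given, b - beta_b * given)[0, 1]

print(np.corrcoef(X, Y)[0, 1])   # clearly non-zero: X and Y are d-connected marginally
print(partial_corr(X, Y, Z))     # approximately zero: X and Y are d-separated by Z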
Article
There are myriad techniques industry actors use to shape the public understanding of science. While a naive view might assume these techniques typically involve fraud or outright deception, the truth is more nuanced. This paper analyzes industrial distraction, a common technique where industry actors fund and share research that is accurate, often high quality, but nonetheless misleading on important matters of fact. This involves reshaping causal understanding of phenomena with distracting information. Using case studies and causal models, we illustrate how this impacts belief and decision making even for rational learners, informing science policy and debates about misleading content.
... The previous work by Spirtes and Scheines has argued that composite variables do not have clear intervention semantics [6]. Pearl appears to have advanced a similar view: "there is no way a model can predict the effect of an action unless one specifies which variables are affected by the action and how." [7,Ch. 11] This is a strict principle which seems to rule out common patterns of causal reasoning. ...
... We refer to structural causal models in several places. The theory of structural models that we use can all be found in Chapter 1 of [7]. We use the notion of an intervention and d-separation (which we represent with the ⊥ symbol), as well as elementary properties of directed graphical models like parents. ...
Article
Full-text available
We can use structural causal models (SCMs) to help us evaluate the consequences of actions given data. SCMs identify actions with structural interventions. A careful decision maker may wonder whether this identification is justified. We seek such a justification. We begin with decision models, which map actions to distributions over outcomes but avoid additional causal assumptions. We then examine assumptions that could justify causal interventions, with a focus on symmetry. First, we introduce conditionally independent and identical responses (CIIR), a generalisation of the IID assumption to decision models. CIIR justifies identifying actions with interventions, but is often an implausible assumption. We consider an alternative: precedent is the assumption that “what I can do has been done before, and its consequences observed,” and is generally more plausible than CIIR. We show that precedent together with independence of causal mechanisms (ICM) and an observed conditional independence can justify identifying actions with causal interventions. ICM has been proposed as an alternative foundation for causal modelling, but this work suggests that it may in fact justify the interventional interpretation of causal models.
... Such associative relationships are between two variables where a change in one variable is associated with a change in another. In contrast, causal relationships describe the connection between a cause and its effect, where the cause is an event that contributes to the production of another event, the effect [1]. ...
... Specifically, we realize this by employing the LiNGAM causal discovery algorithm. The method is then also extended to tackle the cases where the discovered BP model does not coincide with the CBP model, prescribing how to systematically annotate such discrepancies over the discovered BP model. In our illustrative example this could be the case where the executions of the Email and Archive tasks are both dependent on the execution of the Accept task (as shown in Fig. 7 and detailed henceforth), and the process discovery algorithms yield a result as in Fig. 2, where Email directly-precedes Archive and directly-follows Accept. ...
Article
Full-text available
Unraveling the causal relationships among the execution of process activities is a crucial element in predicting the consequences of process interventions and making informed decisions regarding process improvements. Process discovery algorithms exploit time precedence as their main source of model derivation. Hence, a causal view can supplement process discovery, being a new perspective in which relations reflect genuine cause-effect dependencies among the tasks. This calls for faithful new techniques to discover the causal execution dependencies among the tasks in the process. To this end, our work offers a systematic approach to the unveiling of the causal business process by leveraging an existing causal discovery algorithm over activity timing. In addition, this work delves into a set of conditions under which process mining discovery algorithms generate a model that is incongruent with the causal business process model, and shows how the latter model can be methodologically employed for a sound analysis of the process. Our methodology searches for such discrepancies between the two models in the context of three causal patterns, and derives a new view in which these inconsistencies are annotated over the mined process model. We demonstrate our methodology employing two open process mining algorithms, the IBM Process Mining tool, and the LiNGAM causal discovery technique. We apply it to a synthesized dataset and two open benchmark datasets.
... Separability refers to statistical independence. Under the faithfulness assumption [12], one variable cannot be causally related to another while maintaining independence. Thus, non-separability means that we cannot remove information about some variables from others without affecting the dynamics. ...
... In addition, we find that the value of DMPM rises more at the beginning of the sequence length increase, and tends to stabilize when L is larger than 3400. In the Y to X direction, the changes of TE and DMPM are smoother and show a slowly decreasing trend. Figures 11-16 show the results of the logistic model. Since the model is set as a bidirectional coupling model, we should detect bidirectional causality in the actual results. ...
Article
Full-text available
The study of causal relationships among complex systems helps to deepen the understanding of the real world. In this paper, we propose a new method to quantify the causal relationships of complex systems, named dispersion multiple pattern matching (DMPM). On the basis of phase space reconstruction, we consider not only the behaviour of each point at the current moment, but also, more comprehensively, the change trend of its neighbourhood point set over a certain time range, so as to construct the weighted change pattern matrix. Through the idea of cross-mapping, we obtain the predicted values of the above matrix. In order to measure the similarity between the true and predicted change patterns, the multiple matching method is used to consider both the change amplitude and rate, and finally the causality between the two is quantified. Another advantage of the DMPM is that it not only detects causal relationships in nonlinear systems, but also categorises the causal relationships into positive and negative causality, which helps us to further understand the interactions of the complex systems. We verified the good performance of DMPM after extensive numerical simulations and applied it to multichannel EEG data from attention deficit and hyperactivity disorder (ADHD) and normal subjects, aiming to investigate the differences in causal connectivity between their channels. The method deepens our knowledge of ADHD and also provides new ideas for the study of causality in complex systems.
... This means that a bias is induced in estimating the relationships using standard Random Forests versions by not observing and including covariates that correlate with the other covariates and the response. In the setting of causality, this can be viewed as a confounded causal relationship with unobserved confounders (Pearl, 1980;Peters et al., 2016). A popular approach to deal with unobserved confounding is to use instrumental variables (IV) regression techniques (Bowden et al., 1990;Angrist et al., 1996;Stock & Trebbi, 2003). ...
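As a minimal illustration of the instrumental-variables idea mentioned in the quote above (not the SDForest method of the paper below), the following Python sketch uses a simulated instrument and a Wald/two-stage least squares estimate to recover a causal coefficient that ordinary regression misestimates under unobserved confounding; the data-generating process is an assumption.

import numpy as np

rng = np.random.default_rng(0)
n = 200_000

U = rng.normal(size=n)                        # unobserved confounder
Z = rng.normal(size=n)                        # instrument: affects X, not Y directly, independent of U
X = 1.0 * Z + 1.0 * U + rng.normal(size=n)
Y = 2.0 * X + 2.0 * U + rng.normal(size=n)    # true causal effect of X on Y is 2.0

ols = np.cov(X, Y)[0, 1] / np.var(X)          # biased upward by the confounder U
iv = np.cov(Z, Y)[0, 1] / np.cov(Z, X)[0, 1]  # Wald/2SLS estimate using the instrument
print(f"OLS: {ols:.2f}, IV: {iv:.2f}")        # roughly 2.67 vs 2.0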
Preprint
Full-text available
We introduce a modification of Random Forests to estimate functions when unobserved confounding variables are present. The technique is tailored for high-dimensional settings with many observed covariates. We use spectral deconfounding techniques to minimize a deconfounded version of the least squares objective, resulting in the Spectrally Deconfounded Random Forests (SDForests). We show how the omitted variable bias gets small given some assumptions. We compare the performance of SDForests to classical Random Forests in a simulation study and a semi-synthetic setting using single-cell gene expression data. Empirical results suggest that SDForests outperform classical Random Forests in estimating the direct regression function, even if the theoretical assumptions, requiring linear and dense confounding, are not perfectly met, and that SDForests have comparable performance in the non-confounded case.
... Structural assumptions on the shift. Robustness from the lens of causality takes a step further, by assuming a structural causal model [50] generating the observed data (X, Y ). Infinite robustness methods: The motivation of causal methods for robustness is that the causal function is worst-case optimal to predict the response under interventions of arbitrary direction and strength on the covariates [8,9]. ...
Preprint
In safety-critical applications, machine learning models should generalize well under worst-case distribution shifts, that is, have a small robust risk. Invariance-based algorithms can provably take advantage of structural assumptions on the shifts when the training distributions are heterogeneous enough to identify the robust risk. However, in practice, such identifiability conditions are rarely satisfied -- a scenario so far underexplored in the theoretical literature. In this paper, we aim to fill the gap and propose to study the more general setting when the robust risk is only partially identifiable. In particular, we introduce the worst-case robust risk as a new measure of robustness that is always well-defined regardless of identifiability. Its minimum corresponds to an algorithm-independent (population) minimax quantity that measures the best achievable robustness under partial identifiability. While these concepts can be defined more broadly, in this paper we introduce and derive them explicitly for a linear model for concreteness of the presentation. First, we show that existing robustness methods are provably suboptimal in the partially identifiable case. We then evaluate these methods and the minimizer of the (empirical) worst-case robust risk on real-world gene expression data and find a similar trend: the test error of existing robustness methods grows increasingly suboptimal as the fraction of data from unseen environments increases, whereas accounting for partial identifiability allows for better generalization.
... One is to consider uncovering the hidden causal structures within the algorithm. This entails revealing the cause-and-effect relationships that account for how the algorithm generates its outputs, a pursuit that has roots in logic, philosophy of science, and computer science (Pearl, 2000;Spirtes et al., 2000). Another, not necessarily unrelated, way to open the black box is by explaining how a specific outcome was generated. ...
Article
Full-text available
Achieving trustworthy AI is increasingly considered an essential desideratum to integrate AI systems into sensitive societal fields, such as criminal justice, finance, medicine, and healthcare, among others. For this reason, it is important to spell out clearly its characteristics, merits, and shortcomings. This article is the first survey in the specialized literature that maps out the philosophical landscape surrounding trust and trustworthiness in AI. To achieve our goals, we proceed as follows. We start by discussing philosophical positions on trust and trustworthiness, focusing on interpersonal accounts of trust. This allows us to explain why trust, in its most general terms, is to be understood as reliance plus some “extra factor”. We then turn to the first part of the definition provided, i.e., reliance, and analyze two opposing approaches to establishing AI systems’ reliability. On the one hand, we consider transparency and, on the other, computational reliabilism. Subsequently, we focus on debates revolving around the “extra factor”. To this end, we consider viewpoints that most actively resist the possibility and desirability of trusting AI systems before turning to the analysis of the most prominent advocates of it. Finally, we take up the main conclusions of the previous sections and briefly point at issues that remain open and need further attention.
... graphical rules are derived as compared to and based on the fundamentals of DAGs and SWIGs [12,13]. The core element is to read off conditional independence based on the fundamental definition of d-separation that invites an open-ended search for an identification formula. ...
Preprint
Full-text available
Selection bias is a major obstacle toward valid causal inference in epidemiology. Over the past decade, several simple graphical rules based on causal diagrams have been proposed as the sufficient identification conditions for addressing selection bias and recovering causal effects. However, these simple graphical rules are usually coupled with specific identification strategies and estimators. In this article, we show two important cases of selection bias that cannot be addressed by these simple rules and their estimators: one case where selection is a descendant of a collider of the treatment and the outcome, and the other case where selection is affected by the mediator. To address selection bias in these two cases, we construct identification formulas by the g-computation and the inverse probability weighting (IPW) methods based on single-world intervention graphs (SWIGs). They are generalized to recover the average treatment effect by adjusting for post-treatment upstream causes of selection. We propose two IPW estimators and their variance estimators to recover the average treatment effect in the presence of selection bias in these two cases. We conduct simulation studies to verify the performance of the estimators when the traditional crude selected-sample analysis returns erroneous contradictory conclusions to the truth.
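For readers unfamiliar with inverse probability weighting, the following generic Python sketch shows the basic reweighting step of a textbook IPW estimator of the average treatment effect under confounding (without selection bias); it is not the SWIG-based estimators proposed above, and the simulated data and the use of the true propensity score are assumptions for illustration.

import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Simulated data: confounder C affects both treatment A and outcome Y; true ATE is 1.0
C = rng.normal(size=n)
p = 1 / (1 + np.exp(-C))                # true propensity score P(A = 1 | C)
A = rng.random(n) < p
Y = 1.0 * A + 2.0 * C + rng.normal(size=n)

# IPW: reweight treated units by 1/p and controls by 1/(1 - p) to mimic a randomized trial
ate_ipw = np.mean(A * Y / p) - np.mean((1 - A) * Y / (1 - p))
ate_naive = Y[A].mean() - Y[~A].mean()  # confounded difference in means
print(f"naive: {ate_naive:.2f}, IPW: {ate_ipw:.2f}")  # naive is biased upward; IPW is close to 1.0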
... Recent advances have shown that the rank of a cross-covariance matrix and its statistical test play essential roles in multiple fields of statistics, especially in causal discovery (Sullivant et al., 2010; Spirtes, 2013). From one perspective, Independence and Conditional Independence (CI) are crucial concepts in causal discovery and Bayesian network learning (Pearl et al., 2000; Spirtes et al., 2000; Koller & ...
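As a small numerical illustration of the quoted point, the Python sketch below (the single-latent-factor structure is an assumption for illustration) estimates the cross-covariance matrix between two sets of observed variables and inspects its singular values; the rank deficiency visible there is the kind of constraint that such rank tests exploit.

import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# One latent factor L drives both variable sets, so the cross-covariance matrix has rank 1
L = rng.normal(size=n)
X = np.column_stack([L + rng.normal(size=n) for _ in range(3)])      # first set (3 variables)
Y = np.column_stack([2 * L + rng.normal(size=n) for _ in range(2)])  # second set (2 variables)

cross_cov = (X - X.mean(0)).T @ (Y - Y.mean(0)) / (n - 1)  # 3 x 2 cross-covariance matrix
print(np.linalg.svd(cross_cov, compute_uv=False))  # one large singular value, the rest near zero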
Preprint
Recent advances have shown that statistical tests for the rank of cross-covariance matrices play an important role in causal discovery. These rank tests include partial correlation tests as special cases and provide further graphical information about latent variables. Existing rank tests typically assume that all the continuous variables can be perfectly measured, and yet, in practice many variables can only be measured after discretization. For example, in psychometric studies, the continuous level of certain personality dimensions of a person can only be measured after being discretized into order-preserving options such as disagree, neutral, and agree. Motivated by this, we propose Mixed data Permutation-based Rank Test (MPRT), which properly controls the statistical errors even when some or all variables are discretized. Theoretically, we establish the exchangeability and estimate the asymptotic null distribution by permutations; as a consequence, MPRT can effectively control the Type I error in the presence of discretization while previous methods cannot. Empirically, our method is validated by extensive experiments on synthetic data and real-world data to demonstrate its effectiveness as well as applicability in causal discovery.
... The frameworks developed by Rubin [30] and Pearl [26] offer distinct yet complementary foundations for causal modeling and causal effect estimation. Rubin's framework focuses on counterfactual reasoning to estimate causal effects, while Pearl's causal Bayesian networks and DAGs provide a graphical and algorithmic framework for causal inference. ...
Preprint
We consider the problem of estimating causal effects from observational data in the presence of network confounding. In this context, an individual's treatment assignment and outcomes may be affected by their neighbors within the network. We propose a novel matching technique which leverages hyperdimensional computing to model network information and improve predictive performance. We present results of extensive experiments which show that the proposed method outperforms or is competitive with the state-of-the-art methods for causal effect estimation from network data, including advanced computationally demanding deep learning methods. Further, our technique benefits from simplicity and speed, with roughly an order of magnitude lower runtime compared to state-of-the-art methods, while offering similar causal effect estimation error rates.
... This method aligns with the RCM framework and extends to Pearl's Structural Causal Model, emphasizing latent variables to define counterfactual scenarios reflecting the system's causal structure (Pearl, 2000). To extract latent principal components representing the hidden nature (individual impact, source impact, research behavior, i.e. internationalization) behind the set of covariates, a new set of uncorrelated variables (principal components) is obtained through an orthogonal linear transformation, PC_k = Σ_{i=1}^{p} w_{ki} X_i, where w_{ki} represents the weights (eigenvectors) corresponding to the k-th component, X_i is the i-th original variable, and p is the total number of variables in the dataset. ...
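A minimal numpy sketch of the orthogonal transformation described above (using random illustrative data, not the study's bibliometric covariates): the principal components PC_k are built from the eigenvectors of the covariance matrix and come out mutually uncorrelated.

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))            # 500 observations of p = 5 covariates
Xc = X - X.mean(axis=0)                  # center each variable

cov = np.cov(Xc, rowvar=False)           # p x p covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvectors are the component weights w_k
order = np.argsort(eigvals)[::-1]        # sort components by explained variance
weights = eigvecs[:, order]

PC = Xc @ weights                        # PC_k = sum_i w_ki * X_i for each observation
print(PC.shape, np.round(np.cov(PC, rowvar=False), 2))  # off-diagonal covariances are ~0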
Article
This research aims to assess the causal effects of the global crisis, namely the COVID-19 pandemic, on the dissemination of research in open science by examining changes in research impact using field-normalized measures. This is done by conceptualizing the pandemic-related changes as a coercive force that accelerated the adoption of open science practices, thus improving their recognition in different scientific disciplines. Using a Causal Autoregressive Integrated Moving Average (C-ARIMA) model, the study contrasts counterfactual scenarios to evaluate the potential impact of research in the absence of the pandemic against what was empirically observed. The pandemic considerably increased the FWCI of open science research, reaching a peak in February 2022. The overall effect suggests continued recognition of open science after the pandemic. The average monthly increase in FWCI was 0.335 above the expected growth, with a relative effect ranging from 15 to 23%. The pandemic hastened the influence of open science, with fields like social science, computer science, medicine, and engineering experiencing significant unexpected changes, while business management and accounting, and arts and humanities, did not show a significant effect. Pandemic restrictions and an increased need for transparency were likely major catalysts to improve the visibility and impact of open science in academia.
... Causality has long been recognized as a fundamental cornerstone of advanced knowledge discovery in fields spanning cognitive science, epidemiology, and machine learning [17,24]. Within natural language processing (NLP), numerous lines of inquiry, from causal event extraction [25] and counterfactual generation to specialized question answering, aim to move beyond simple correlations toward why-oriented explanations [13,14]. ...
Preprint
In knowledge-intensive tasks, especially in high-stakes domains like medicine and law, it is critical not only to retrieve relevant information but also to provide causal reasoning and explainability. Large language models (LLMs) have achieved remarkable performance in natural language understanding and generation tasks. However, they often suffer from limitations such as difficulty in incorporating new knowledge, generating hallucinations, and explaining their reasoning process. To address these challenges, integrating knowledge graphs with Graph Retrieval-Augmented Generation (Graph RAG) has emerged as an effective solution. Traditional Graph RAG methods often rely on simple graph traversal or semantic similarity, which do not capture causal relationships or align well with the model's internal reasoning steps. This paper proposes a novel pipeline that filters large knowledge graphs to emphasize cause-effect edges, aligns the retrieval process with the model's chain-of-thought (CoT), and enhances reasoning through multi-stage path improvements. Experiments on medical question-answering tasks show consistent gains, with up to a 10% absolute improvement across multiple large language models (LLMs). This approach demonstrates the value of combining causal reasoning with stepwise retrieval, leading to more interpretable and logically grounded solutions for complex queries.
... In air quality inference, as the geographical scope widens, the challenge of spatial heterogeneity emerges. To tackle this, we leverage causality theory by establishing a Structural Causal Model (SCM) (Pearl et al. 2000). In the SCM shown in Figure 5a, C, X, and Y denote random variables for spatial context, observed data, and target area air pollutants, respectively, with arrows representing causal-effect relationships. ...
Preprint
Monitoring real-time air quality is essential for safeguarding public health and fostering social progress. However, the widespread deployment of air quality monitoring stations is constrained by their significant costs. To address this limitation, we introduce AirRadar, a deep neural network designed to accurately infer real-time air quality in locations lacking monitoring stations by utilizing data from existing ones. By leveraging learnable mask tokens, AirRadar reconstructs air quality features in unmonitored regions. Specifically, it operates in two stages: first capturing spatial correlations and then adjusting for distribution shifts. We validate AirRadar's efficacy using a year-long dataset from 1,085 monitoring stations across China, demonstrating its superiority over multiple baselines, even with varying degrees of unobserved data. The source code can be accessed at https://github.com/CityMind-Lab/AirRadar.
... Estimating causal effects from observational data remains a fundamental challenge across scientific disciplines, including medicine, economics, and public policy (Pearl, 2009;Imbens and Rubin, 2015). While significant progress has been made in static settings, the complexity of longitudinal studies with time-varying treatments and covariates presents unique challenges that require novel methodological approaches. ...
Preprint
Full-text available
We present Dynamic Representation Balancing for Longitudinal Survival Analysis (DRB-LSA), a novel deep learning framework for estimating time-dependent causal effects in longitudinal observational studies with time-varying treatments and covariates. Current methods for longitudinal causal inference often struggle with complex temporal dependencies, time-varying confounding, and the need for interpretable predictions in survival analysis. Our approach addresses these challenges through three key innovations: (1) a dynamic balancing mechanism that adapts to evolving confounding patterns over time, (2) an attention-based representation learning architecture that captures temporal dependencies while maintaining interpretability, and (3) a flexible survival model based on neural mixture of experts that handles time-varying treatments and censoring. We provide theoretical guarantees for our method, establishing bounds on the estimation error of time-dependent conditional average treatment effects (CATE) and proving consistency under mild conditions. Through extensive experiments on both synthetic data and two real-world medical datasets (MIMIC-III and Framingham Heart Study), we demonstrate that DRB-LSA significantly outperforms existing methods, achieving up to 28% reduction in integrated mean squared error and 15% improvement in concordance index. Our method scales efficiently to large datasets and provides interpretable insights through attention weights and feature importance scores, making it particularly valuable for healthcare applications where understanding time-varying treatment effects is crucial for clinical decision-making.
... Insurance applications, on the other hand, typically fall outside this exception due to the complexity of the underlying risks (life trajectories), the low signal-tonoise ratio, and restrictions on the availability of relevant covariates to the insurer. For an introduction to causal inference, we refer to the modern classics Pearl (2009) and Hernán and Robins (2020). ...
Preprint
We describe challenges and opportunities related to risk assessment and mitigation for loss of earning capacity insurance with a special focus on Denmark. The presence of public benefits, claim settlement processes, and prevention initiatives introduces significant intricacy to the risk landscape. Accommodating this requires the development of innovative approaches from researchers and practitioners alike. Actuaries are uniquely positioned to lead the way, leveraging their domain knowledge and mathematical-statistical expertise to develop equitable, data-driven solutions that mitigate risk and enhance societal well-being.
... The R-squared contribution ratio (RSCR) of 0.994 was acceptable (i.e., ≥0.9). It measures the extent to which the model is free of negative determination coefficients (Pearl, 2009). The statistical suppression ratio (SSR) index was also acceptable at 1.0 (i.e., ≥0.7). ...
Article
Full-text available
This study tests the effect of different sales models on a conceptual model based on creativity theory, linking retail salespersons’ functional and relational customer orientation to their performance. Salespersons’ creativity and retailers’ different sales models are considered to be mediating and moderating variables. We collected dyadic data from retail managers and salespeople across 177 retail companies in Brazil and used PLS-SEM to test the study’s hypotheses. The results show that the moderation exerted by distinct sales models offered by a company contributes to the heightened prominence of creative salespersons operating in the retail sector, as they can adapt their sales strategies effectively. Furthermore, the results indicate that functional customer orientation negatively moderates the effect of relational customer orientation on creativity, which suggests that attempting to simultaneously implement both customer orientations is suboptimal in retail selling contexts. Keywords: creativity; sales strategy; customer orientation; salesperson performance; retail.
... The use of mono- and cross-sectional measurement (a cross-sectional survey) introduces limitations regarding the interpretation and validity of the data. Using a cross-sectional survey for data collection precludes causal interpretation of the results (Pearl, 2000), while mono-method measurement increases the risk of common method variance (CMV; Podsakoff et al., 2003, 2012). CMV can increase due to, for example, survey complexity, length, and item ordering. ...
Article
Full-text available
This study uses a person-centered approach to investigate construction workers’ learning at work, focusing on their approaches to learning, self-efficacy beliefs and work engagement and how these vary according to their goal orientation profiles. Survey data were collected from Finnish construction sector employees (N = 1,280) in June 2021. The data were analysed using latent profile analysis (LPA). Analyses revealed four goal orientation profiles: Disengaged (19.1%), Avoidance-oriented (2.1%), Performance-oriented (43.0%) and Mastery-oriented (35.7%). Profiles characterised by high learning orientation (Performance-oriented, Mastery-oriented) placed greater emphasis on a deep approach to learning and showed higher self-efficacy and work engagement. The profiles differed less with regard to the unreflective approach to learning. The results illuminate construction workers’ learning processes in terms of learning strategies and motivation and how they compare to those of the more extensively studied student population. The results also encourage finding ways to support workers’ mastery orientation to benefit not only learning but also well-being by enhancing more reflective learning approaches.
... There has recently been increased interest in causal reasoning from AI researchers and philosophers. The standard framework for reasoning about causal dependencies in both stochastic and deterministic settings is Structural Equation Modelling (SEM) (Pearl 2000;Spirtes, Glymour, and Scheines 2001). Both types of reasoning are crucial for the field of AI: in stochastic domains it is often used in Causal Machine Learning (Peters, Janzing, and Schölkopf 2017) and Causal Discovery, while in deterministic domains it forms the basis for work on Actual Causality (Halpern 2016). ...
Preprint
Full-text available
Structural Equation Models (SEM) are the standard approach to representing causal dependencies between variables in causal models. In this paper we propose a new interpretation of SEMs when reasoning about Actual Causality, in which SEMs are viewed as mechanisms transforming the dynamics of exogenous variables into the dynamics of endogenous variables. This allows us to combine counterfactual causal reasoning with existing temporal logic formalisms, and to introduce a temporal logic, CPLTL, for causal reasoning about such structures. We show that the standard restriction to so-called "recursive" models (with no cycles in the dependency graph) is not necessary in our approach, allowing us to reason about mutually dependent processes and feedback loops. Finally, we introduce new notions of model equivalence for temporal causal models, and show that CPLTL has an efficient model-checking procedure.
... For example, "eating ice cream" is usually associated with "sunburn", while it is actually caused by "hot weather". This separation theoretically requires training and evaluation data with ground-truth cause-effect pairs discovered through interventional experiments [34]. Traditionally, these interventions are carried out via randomized controlled trials, but such trials are universally costly and mostly infeasible for life-like video events. ...
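As an aside, the ice cream / sunburn / hot weather example above can be made concrete with a toy simulation: observationally the two effects of the common cause are correlated, while intervening on ice-cream consumption (as a randomized experiment would) removes the association. All variables and coefficients below are invented for illustration.

```python
# Toy simulation of confounding: hot weather causes both ice-cream consumption and
# sunburn, so the two are correlated even though neither causes the other.
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

hot_weather = rng.normal(size=n)                           # common cause Z
ice_cream = 0.8 * hot_weather + rng.normal(size=n)         # X depends on Z
sunburn = 0.9 * hot_weather + rng.normal(size=n)           # Y depends on Z only

print("observational corr(X, Y):", np.corrcoef(ice_cream, sunburn)[0, 1])

# Intervention do(X = x): set ice cream independently of weather, as in an RCT.
ice_cream_do = rng.normal(size=n)
sunburn_do = 0.9 * hot_weather + rng.normal(size=n)        # Y is unaffected by do(X)

print("interventional corr(X, Y):", np.corrcoef(ice_cream_do, sunburn_do)[0, 1])
```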
Preprint
Full-text available
This paper introduces a new problem, Causal Abductive Reasoning on Video Events (CARVE), which involves identifying causal relationships between events in a video and generating hypotheses about causal chains that account for the occurrence of a target event. To facilitate research in this direction, we create two new benchmark datasets with both synthetic and realistic videos, accompanied by trigger-target labels generated through a novel counterfactual synthesis approach. To explore the challenge of solving CARVE, we present a Causal Event Relation Network (CERN) that examines the relationships between video events in temporal and semantic spaces to efficiently determine the root-cause trigger events. Through extensive experiments, we demonstrate the critical roles of event relational representation learning and interaction modeling in solving video causal reasoning challenges. The introduction of the CARVE task, along with the accompanying datasets and the CERN framework, will advance future research on video causal reasoning and significantly facilitate various applications, including video surveillance, root-cause analysis and movie content management.
... Following Pearl (2000), an SCM provides a mathematical framework for modeling causal relationships between variables. An SCM M = (U, V, F) consists of a set of observable endogenous variables V, a set of unobserved exogenous variables U, and a set of functions F. ...
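As a minimal illustration of the SCM M = (U, V, F) described above, the sketch below uses three hypothetical endogenous variables with linear structural functions and Gaussian exogenous noise; the functions and coefficients are assumptions made purely for illustration.

```python
# Minimal structural causal model M = (U, V, F): exogenous noise U, endogenous
# variables V, and one structural function per endogenous variable.
import numpy as np

rng = np.random.default_rng(2)

def sample_scm(n, do_x=None):
    # Exogenous variables U (unobserved noise terms).
    u_z, u_x, u_y = rng.normal(size=(3, n))
    # Endogenous variables V, each determined by its structural function in F.
    z = u_z
    x = 0.5 * z + u_x if do_x is None else np.full(n, do_x)   # intervention replaces f_X
    y = 2.0 * x - z + u_y
    return z, x, y

_, _, y_do1 = sample_scm(10_000, do_x=1.0)
_, _, y_do0 = sample_scm(10_000, do_x=0.0)

# Average causal effect of X on Y under do(X=1) vs do(X=0); close to 2.0 here.
print("E[Y | do(X=1)] - E[Y | do(X=0)] ≈", y_do1.mean() - y_do0.mean())
```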
Preprint
Full-text available
When applied in healthcare, reinforcement learning (RL) seeks to dynamically match the right interventions to subjects to maximize population benefit. However, the learned policy may disproportionately allocate efficacious actions to one subpopulation, creating or exacerbating disparities in other socioeconomically-disadvantaged subgroups. These biases tend to occur in multi-stage decision making and can be self-perpetuating, which if unaccounted for could cause serious unintended consequences that limit access to care or treatment benefit. Counterfactual fairness (CF) offers a promising statistical tool grounded in causal inference to formulate and study fairness. In this paper, we propose a general framework for fair sequential decision making. We theoretically characterize the optimal CF policy and prove its stationarity, which greatly simplifies the search for optimal CF policies by leveraging existing RL algorithms. The theory also motivates a sequential data preprocessing algorithm to achieve CF decision making under an additive noise assumption. We prove and then validate our policy learning approach in controlling unfairness and attaining optimal value through simulations. Analysis of a digital health dataset designed to reduce opioid misuse shows that our proposal greatly enhances fair access to counseling.
... These datasets included demographic information, clinical measurements (e.g., blood pressure, glucose levels), and longitudinal health outcomes for patients diagnosed with chronic diseases such as diabetes and chronic obstructive pulmonary disease (COPD). To ensure data quality and consistency, all records underwent preprocessing, including normalization, missing data imputation, and outlier detection [22]. ...
Article
Full-text available
Bayesian networks (BNs) are emerging as transformative tools for decision-making in healthcare, particularly in predicting outcomes of chronic diseases such as diabetes and chronic obstructive pulmonary disease (COPD). This study aimed to develop, validate, and evaluate the utility of a Bayesian network model for enhancing clinical decision-making. By integrating diverse patient-specific data, BNs provided accurate predictions for disease progression, hospitalization risks, and treatment outcomes, achieving an area under the curve (AUC) of 0.91, with accuracy, sensitivity, and specificity exceeding 90%. The study employed anonymized datasets from electronic health records and utilized advanced probabilistic modeling tools, including R and Python libraries, to construct the BN. The model outperformed traditional methods like logistic regression and random forests in predictive accuracy and interpretability. Key advantages included the BN's ability to manage uncertainty, infer missing data, and dynamically update predictions in real time. However, challenges such as model complexity, integration barriers with electronic health record systems, and ethical considerations related to data privacy were identified. Future directions emphasize integrating BNs with big data and artificial intelligence technologies to enhance scalability, precision, and clinical adoption. The findings underline the potential of Bayesian networks to revolutionize chronic disease management by fostering personalized care, optimizing resource allocation, and improving patient outcomes.
... Mathematically, a BN can be represented as B = (Pc, G), where G is a directed acyclic graph and Pc denotes the conditional probabilities associated with each probabilistic variable represented by a node of the graph. In this way, BNs establish a connection between evidence-based medicine (EBM) and AI, being applied in causal probabilistic computations to describe evidence-based medical practice (Pearl, 2000). (Brasil, 2012). ...
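As a small illustration of the representation B = (Pc, G) described above, the sketch below encodes a two-node DAG with invented conditional probabilities and answers a diagnostic query by Bayes' rule; it is not the Netica model used in the study.

```python
# Tiny Bayesian network B = (Pc, G): DAG G is RiskFactor -> Disease, and Pc holds
# the prior for RiskFactor and the conditional probability table for Disease.
p_risk = {"high": 0.3, "low": 0.7}                       # P(RiskFactor)
p_disease_given_risk = {                                  # P(Disease | RiskFactor)
    "high": {"yes": 0.40, "no": 0.60},
    "low":  {"yes": 0.10, "no": 0.90},
}

# The joint factorises along the DAG: P(r, d) = P(r) * P(d | r).
joint = {
    (r, d): p_risk[r] * p_disease_given_risk[r][d]
    for r in p_risk
    for d in ("yes", "no")
}

# Diagnostic query by Bayes' rule: P(RiskFactor = high | Disease = yes).
p_d_yes = sum(joint[(r, "yes")] for r in p_risk)
p_high_given_yes = joint[("high", "yes")] / p_d_yes
print(f"P(RiskFactor=high | Disease=yes) = {p_high_given_yes:.3f}")   # 0.12 / 0.19 ≈ 0.632
```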
Chapter
This study reports experiences in health education in two quilombola communities in Rio Grande do Sul, highlighting the use of Bayesian networks (BNs) to assess the risks of type 2 diabetes mellitus (DM2) and systemic arterial hypertension (SAH). It was conducted as a cross-sectional and observational study, using the problematization methodology, with steps that included observation, definition of key points, theorizing, hypothesis formulation, and practical application. The modeling of quality of life was performed using the Netica software, with the implementation of Bayesian networks (BNs), allowing the insertion of probabilities of occurrence of the variables through the network nodes. The profile of the 34 participants revealed a predominance of women (79.4%), aged between 30 and 59 years (55.9%), with an average body mass index (BMI) of 32.5 kg/m². Among them, 51.5% had a diagnosis of SAH and 23.5% of DM2. Inadequate diet was observed, with high sugar consumption (38.2%) and low use of whole foods (3.0%). The BNs showed a sensitivity of 71.42% for DM2 and 76.47% for hypertension, and a specificity of 77.7% and 88.23%, respectively, demonstrating high accuracy. The modeling also identified a significant association between the risks of the diseases and factors such as BMI, age, family history and glucose. Educational strategies contributed to preventing complications and promoting quality of life, while the BNs proved to be promising tools for diagnosis and health education. The study reinforces the importance of inclusive public policies aimed at quilombola communities.
... [10] relates Abstract Dialectical Frameworks to causal reasoning, and, more precisely, to Pearl's causal models [56]. In essence, a causal model describes causal dependencies between exogenous variables (which cannot be directly observed) and endogenous variables, which can be observed. ...
Chapter
Full-text available
Conditionals, i.e., expressions of the logical form "if A, then B", have been a central topic of study ever since logic was on the academic menu. In contemporary logic, there is a consensus that the semantics of conditionals are best obtained by stipulating a subset of possible worlds in which the antecedent is true, and verifying whether the consequent is true in those worlds. Such a subset of possible worlds can represent, for example, the most typical worlds in which the antecedent is true. This idea has proven a fruitful basis, allowing for many systematic characterisation results as well as for making connections to other topics, such as belief revision and modal logic. In formal argumentation, the potential of these semantical ideas has not gone unnoticed in recent years, and many works have attempted to bridge the worlds of conditionals and arguments on the basis of these ideas. In this article, we give a thorough introduction to the semantics of conditionals and survey the adaptations of these semantics in the literature on computational argumentation, including structured argumentation and generalisations of abstract argumentation such as abstract dialectical frameworks. Furthermore, we highlight opportunities for future research on this topic.
... We follow the framework of Causal Fairness Analysis described in (Plečko and Bareinboim, 2024), which is based on the language of structural causal models (SCMs) (Pearl, 2000). Our approach of analyzing and aggregating findings across multiple data sources is also related to the data-fusion paradigm in the causal inference literature (Bareinboim and Pearl, 2016). ...
Preprint
Full-text available
The new era of large-scale data collection and analysis presents an opportunity for diagnosing and understanding the causes of health inequities. In this study, we describe a framework for systematically analyzing health disparities using causal inference. The framework is illustrated by investigating racial and ethnic disparities in intensive care unit (ICU) outcome between majority and minority groups in Australia (Indigenous vs. Non-Indigenous) and the United States (African-American vs. White). We demonstrate that commonly used statistical measures for quantifying inequity are insufficient, and focus on attributing the observed disparity to the causal mechanisms that generate it. We find that minority patients are younger at admission, have worse chronic health, are more likely to be admitted for urgent and non-elective reasons, and have higher illness severity. At the same time, however, we find a protective direct effect of belonging to a minority group, with minority patients showing improved survival compared to their majority counterparts, with all other variables kept equal. We demonstrate that this protective effect is related to the increased probability of being admitted to ICU, with minority patients having an increased risk of ICU admission. We also find that minority patients, while showing improved survival, are more likely to be readmitted to ICU. Thus, due to worse access to primary health care, minority patients are more likely to end up in ICU for preventable conditions, causing a reduction in the mortality rates and creating an effect that appears to be protective. Since the baseline risk of ICU admission may serve as proxy for lack of access to primary care, we developed the Indigenous Intensive Care Equity (IICE) Radar, a monitoring system for tracking the over-utilization of ICU resources by the Indigenous population of Australia across geographical areas.
... We can use (6) to replace the "do" operator of [16] and define: ...
Preprint
Full-text available
We introduce a class of dynamical systems called G-systems equipped with a coupling operation. We use G-systems to define the notions of dependence (borrowed from dependence logic) and causality (borrowed from Pearl) for dynamical systems. As a converse to coupling, we define decomposition or "reducibility". We give a characterization of reducibility in terms of the dependence "atom". We do all this with the motivation of developing mathematical foundations for 4E cognitive science; see the introductory sections.
... Although the first attempts at graphical models date back to even before Granger causality (GC) [11], to the early 20th century, recent advances in Directed Graphical Causal models (DCGM) and the probabilistic view of causality are influenced mostly by J. Pearl's works (e.g., [12]). A conventional method for identifying causal relationships is through interventions or randomized experiments. ...
Preprint
Full-text available
Understanding the causal interactions in simple brain tasks, such as face detection, remains a challenging and ambiguous process for researchers. In this study, we address this issue by employing a novel causal discovery method -- Directed Acyclic Graphs via M-matrices for Acyclicity (DAGMA) -- to investigate the causal structure of the brain's face-selective network and gain deeper insights into its mechanism. Using natural movie stimuli, we extract the causal network of face-selective regions and analyze how frames containing faces influence this network. Our findings reveal that the presence of faces in the stimuli has a causal effect on both the number and the strength of causal connections within the network. Additionally, our results highlight the crucial role of subcortical regions in satisfying causal sufficiency, emphasizing their importance in causal studies of the brain. This study provides a new perspective on understanding the causal architecture of the face-selective network of the brain, motivating further research on neural causality.
Article
Full-text available
Recent evidence suggests that cultural stress might predict Latinos’ mental health outcomes. Yet, how two sources of cultural stress, such as discrimination and negative context of reception, are related to anxiety and depression is not well understood. This study aimed to investigate the impact of discrimination, negative context of reception, and demographic factors on anxiety and depression levels among 1426 Latino adults in the United States. Using two novel online simulators based on Bayesian Belief Networks, we explored how variations in these independent variables influence mental health outcomes. Our findings reveal that discrimination and negative context of reception significantly affect anxiety and depression, with discrimination being a stronger predictor. Generational status also played a key role, with second-generation Latinos experiencing worse mental health compared to the first generation. The item “To what extent do you feel that Americans have something against you?” was identified as the strongest predictor of mental health. The probabilistic machine learning approach allowed for the examination of complex interactions and non-linear relationships, providing deeper insights into the dynamics of cultural stressors and mental health. These findings suggest that addressing discrimination and negative context of reception could be vital for interventions aimed at improving the mental health of Latino populations. The use of online simulation tools in this research offers a novel method for subject-matter experts to explore and understand the intricate relationships between cultural stressors and mental health, potentially informing future prevention strategies.
Preprint
Full-text available
This paper develops a semiparametric Bayesian instrumental variable analysis method for estimating the causal effect of an endogenous variable in the presence of unobserved confounders and measurement errors, using partly interval-censored time-to-event data, where event times are observed exactly for some subjects but left-censored, right-censored, or interval-censored for others. Our method is based on a two-stage Dirichlet process mixture instrumental variable (DPMIV) model which simultaneously models the first-stage random error term for the exposure variable and the second-stage random error term for the time-to-event outcome using a bivariate Gaussian mixture of the Dirichlet process (DPM) model. The DPM model can be broadly understood as a mixture model with an unspecified number of Gaussian components, which relaxes the normal error assumptions and allows the number of mixture components to be determined by the data. We develop an MCMC algorithm for the DPMIV model tailored for partly interval-censored data and conduct extensive simulations to assess the performance of our DPMIV method in comparison with some competing methods. Our simulations revealed that our proposed method is robust under different error distributions and can have superior performance over its parametric counterpart under various scenarios. We further demonstrate the effectiveness of our approach on UK Biobank data to investigate the causal effect of systolic blood pressure on time-to-development of cardiovascular disease from the onset of diabetes mellitus.
Article
Full-text available
The theoretical relationship between social media use and job satisfaction, especially concerning gender-specific mechanisms, remains a subject of ongoing debate in the literature. This divergence reflects our insufficient understanding of the complex relationships among gender, social media use, and job satisfaction. Drawing on Social Role Theory (SRT) and the Theory of Planned Behavior (TPB), this study utilizes 4651 valid samples from the 2020 China Family Panel Studies (CFPS) database to investigate how gender influences interpersonal relationships through social media sharing frequency, thereby enhancing job satisfaction. The findings indicate that women, compared to men, exhibit higher job satisfaction and more frequent social media sharing behavior. Moreover, the frequency of social media sharing positively affects job satisfaction by improving interpersonal relationships. This study employs a chain-mediated causal path analysis to delve into the causal relationships among gender, social media sharing frequency, and interpersonal relationships, effectively addressing previous limitations in handling multiple mediating effects. The findings not only provide new insights into the role of social media in the modern workplace but also offer empirical evidence and practical guidance for organizations on leveraging social media to foster employee relationships and enhance job satisfaction.
Article
Bayesian networks (BNs) are a powerful probabilistic graphical tool for modeling relationships between random variables in an interpretable way. The relationships among variables are represented by the BN structure, a directed acyclic graph, which can be learned from a data set. However, the structural learning process is NP-hard. One popular strategy for learning a BN’s structure is the Search and Score approach, where a quality score is defined as the objective function to be optimized. This approach has led to various algorithms, ranging from linear integer programming to heuristics and metaheuristics. In this paper, we present a novel algorithm for BN structure learning, an adaptation of the Multi-Agent Genetic Algorithm designed to tackle the structural learning problem efficiently. Our algorithm was compared to three others across benchmark problems of varying variable sizes, using a randomized factorial design with different sample sizes. Results show our method outperformed others, especially in detecting edge presence and direction, and proved effective for both small and large-scale BN learning, as confirmed by statistical tests.
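The Search-and-Score strategy described above can be illustrated with a minimal greedy hill-climbing search over a three-variable binary data set, scored with a BIC-style criterion. This is a generic sketch of the approach only, not the Multi-Agent Genetic Algorithm proposed in the article; the data, scores and names are invented for illustration, and the learned graph is only expected to connect A with B and A with C up to Markov equivalence.

```python
# Sketch of Search-and-Score structure learning: greedy hill climbing over DAGs,
# scoring each candidate structure with a BIC-style criterion.
import itertools
import math
import numpy as np

rng = np.random.default_rng(3)
n = 2000
# Synthetic binary data with ground-truth structure A -> B, A -> C.
a = rng.integers(0, 2, n)
b = (a ^ (rng.random(n) < 0.2)).astype(int)   # noisy copy of A
c = (a ^ (rng.random(n) < 0.3)).astype(int)   # noisier copy of A
data = {"A": a, "B": b, "C": c}
names = list(data)

def family_score(child, parents):
    """BIC-style score of one node given its parent set."""
    x = data[child]
    if parents:
        configs = itertools.product([0, 1], repeat=len(parents))
        groups = [np.all([data[p] == cfg[i] for i, p in enumerate(parents)], axis=0)
                  for cfg in configs]
    else:
        groups = [np.ones(n, dtype=bool)]
    ll = 0.0
    for g in groups:
        m = g.sum()
        if m == 0:
            continue
        p1 = (x[g].sum() + 1) / (m + 2)              # smoothed P(child=1 | parent config)
        ll += x[g].sum() * math.log(p1) + (m - x[g].sum()) * math.log(1 - p1)
    return ll - 0.5 * math.log(n) * len(groups)      # penalise the number of parameters

def is_acyclic(edges):
    parents = {v: {p for p, ch in edges if ch == v} for v in names}
    visited, stack = set(), set()
    def dfs(v):
        if v in stack:
            return False                             # back edge -> cycle
        if v in visited:
            return True
        stack.add(v)
        ok = all(dfs(p) for p in parents[v])
        stack.discard(v)
        visited.add(v)
        return ok
    return all(dfs(v) for v in names)

def total_score(edges):
    return sum(family_score(v, sorted({p for p, ch in edges if ch == v})) for v in names)

edges, best = set(), total_score(set())
improved = True
while improved:                                      # greedy hill climbing
    improved = False
    for p, ch in itertools.permutations(names, 2):
        candidate = edges ^ {(p, ch)}                # toggle one directed edge
        if not is_acyclic(candidate):
            continue
        score = total_score(candidate)
        if score > best:
            edges, best, improved = candidate, score, True

print("learned edges:", sorted(edges))
```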
Preprint
Genuine network non-locality (GNN) refers to the existence of quantum correlations in a network with independent sources that cannot be explained by local hidden-variable (LHV) models. Even in the simplest scenario, determining whether these quantum correlations remain genuinely network non-local when derived from entangled states that deviate from their ideal forms is highly challenging due to the non-convex nature of local correlations. Understanding the boundary of these correlations thus becomes a hard problem, but one of clear academic interest, specifically regarding its robustness to noise. To address this problem, we introduce a causal domain-informed learning algorithm called the LHV k-rank neural network, which assesses the rank parameter of the non-ideal combined state produced by the sources. Applied to the triangle network scenario with the three sources generating a class of quantum states known as X states, the neural network reveals that non-locality persists only if the states remain pure. Remarkably, we find that even slight deviations from ideal Bell states due to noise cause GNN to vanish, exhibiting a discrete behavior that has not been witnessed in the standard Bell scenario. This finding thus raises a fundamental question as to whether GNN in the triangle network is exclusive to pure states. Additionally, we explore the case of the three sources producing dissimilar states, indicating that GNN requires all sources to send pure entangled states, with joint entangled measurements as resources. Apart from these results, this work shows that machine learning approaches with domain-specific constraints can greatly benefit the field of quantum foundations.
Article
Full-text available
Despite significant research, discovering causal relationships from fMRI remains a challenge. Popular methods such as Granger Causality and Dynamic Causal Modeling fall short in handling contemporaneous effects and latent common causes. Methods from the causal structure learning literature can address these limitations but often scale poorly with network size and require acyclicity. In this study, we first provide a taxonomy of existing methods and compare their accuracy and efficiency on simulated fMRI from simple topologies. This analysis demonstrates a pressing need for more accurate and scalable methods, motivating the design of Causal discovery for Large-scale Low-resolution Time-series with Feedback (CaLLTiF). CaLLTiF is a constraint-based method that uses conditional independence between contemporaneous and lagged variables to extract causal relationships. On simulated fMRI from the macaque connectome, CaLLTiF achieves significantly higher accuracy and scalability than all tested alternatives. From resting-state human fMRI, CaLLTiF learns causal connectomes that are highly consistent across individuals, show a clear top-down flow of causal effect from attention and default mode to sensorimotor networks, exhibit Euclidean distance-dependence in causal interactions, and are highly dominated by contemporaneous effects. Overall, this work takes a major step in enhancing causal discovery from whole-brain fMRI and defines a new standard for future investigations.
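The constraint-based idea summarized above, testing conditional independence between contemporaneous and lagged variables, can be illustrated with a simple partial-correlation test on a synthetic bivariate time series. This is a generic linear-Gaussian sketch, not the CaLLTiF procedure; the series and coefficients are invented.

```python
# Illustrative partial-correlation CI tests between contemporaneous and lagged
# variables (a generic constraint-based sketch, not the CaLLTiF algorithm).
import numpy as np

rng = np.random.default_rng(4)
T = 5000
x, y = np.zeros(T), np.zeros(T)
for t in range(1, T):
    x[t] = 0.7 * x[t - 1] + rng.normal()
    y[t] = 0.8 * x[t - 1] + 0.3 * y[t - 1] + rng.normal()   # X drives Y with lag 1

def partial_corr(a, b, controls):
    """Correlation of a and b after regressing out the control variables."""
    Z = np.column_stack([np.ones_like(a)] + controls)
    resid_a = a - Z @ np.linalg.lstsq(Z, a, rcond=None)[0]
    resid_b = b - Z @ np.linalg.lstsq(Z, b, rcond=None)[0]
    return np.corrcoef(resid_a, resid_b)[0, 1]

y_now, y_lag1 = y[2:], y[1:-1]
x_lag1, x_lag2 = x[1:-1], x[:-2]

# Dependence survives conditioning -> keep the lagged edge X_{t-1} -> Y_t.
print("pcorr(Y_t, X_{t-1} | Y_{t-1}):", partial_corr(y_now, x_lag1, [y_lag1]))
# Dependence vanishes once the true parents are conditioned on -> no X_{t-2} -> Y_t edge.
print("pcorr(Y_t, X_{t-2} | X_{t-1}, Y_{t-1}):", partial_corr(y_now, x_lag2, [x_lag1, y_lag1]))
```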