Wiktoria Lawniczak’s research while affiliated with University of Leeds and other places


Publications (2)


A hierarchical causal diagram illustrates individual-level causal relationships among five variables (circles are unobserved, i.e., latent; squares are observed; double-edged enclosures are determined variables): $Y$, the outcome; $X$, the exposure; $Z$, a ‘regular’ confounder of the $X$-$Y$ relationship that is observed; $L$, a latent confounder of the $X$-$Y$ relationship that is unobserved but affects individual-level latent variable $N_i$, which manifests as an observed cluster-level feature, $N_j$. The solid single arrows signify causal relationships between variables; dashed lines are bivariate correlations realised among aggregated cluster-level (fully determined) variables; and double-lined arrows indicate deterministic pathways [43]
Table 2 (continued)
A schematic illustration of the algorithm that transforms an individual-level latent variable into a cluster-level measure of cluster size, which is used to produce the data clusters, illustrated using the example of daily mean levels of physical activity ($PA$) in minutes as the exposure and body weight ($Wt$) in kilograms as the outcome. (footer): The algorithm categorises simulated individual-level data into $\boldsymbol{C}$ clusters to convey cross-level associations with causal origins as per the data generating mechanism of Fig. 1.
The process involves: (a) sorting individual-level data by ascending latent variable $N_i$ values; (b) rescaling such that, once rounded, $\widehat{N}_i$ are potential cluster sizes with mean $\boldsymbol{N}/\boldsymbol{C}=1000$ and standard deviation $10$; (c) subset selection into $\boldsymbol{C}$ evenly sized subsets – enclosed in the three ellipses; (d) randomly selecting one $\widehat{N}_i$ value per subset and rounding to generate $\boldsymbol{C}=100$ cluster size values [alternatively, take subgroup means and round]; (e) modifying randomly selected cluster size values by adding or subtracting one to ensure all cluster sizes sum to population size; and (f) regrouping subsets into unequally sized clusters – enclosed in the two new ellipses – based on the ordered values of $N_i$
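Steps (a) to (f) of the caption above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' tool: the function name `assign_clusters`, its parameters, and the choice to rescale by z-scoring the latent variable are all hypothetical, and step (e) is implemented as repeated random adjustments of plus or minus one.

```python
import numpy as np

def assign_clusters(latent, n_clusters=100, size_sd=10.0, seed=0):
    """Hypothetical sketch: turn an individual-level latent variable into
    cluster sizes and cluster labels, following steps (a)-(f)."""
    rng = np.random.default_rng(seed)
    latent = np.asarray(latent, dtype=float)
    n = latent.size

    # (a) sort individuals by ascending latent value
    order = np.argsort(latent, kind="stable")
    sorted_latent = latent[order]

    # (b) rescale to mean N/C and the requested SD (assumed: z-score rescaling)
    z = (sorted_latent - sorted_latent.mean()) / sorted_latent.std()
    scaled = n / n_clusters + size_sd * z

    # (c) split the sorted, rescaled values into C evenly sized subsets
    subsets = np.array_split(scaled, n_clusters)

    # (d) draw one value per subset and round it to a candidate cluster size
    sizes = np.array([int(np.rint(rng.choice(s))) for s in subsets])

    # (e) nudge randomly chosen sizes by +/-1 until they sum to N
    diff = n - int(sizes.sum())
    while diff != 0:
        j = rng.integers(n_clusters)
        step = 1 if diff > 0 else -1
        if sizes[j] + step > 0:      # never create an empty cluster
            sizes[j] += step
            diff -= step

    # (f) regroup the sorted individuals into clusters of those sizes
    labels_sorted = np.repeat(np.arange(n_clusters), sizes)
    labels = np.empty(n, dtype=int)
    labels[order] = labels_sorted    # map labels back to the input order
    return labels, sizes
```

Because individuals are assigned in sorted order, clusters with a higher index contain individuals with higher latent values and, by construction, tend to be larger, which is one way to induce a cross-level association with cluster size.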
of the multilevel and main ecological analyses of simulated data (plotted in black and blue respectively) for all four scenarios for continuous (charts A, C, E, G) and binary outcomes (charts B, D, F, H) – the diamond shaped plots are median estimates (y-axis) plotted against individual-level simulated ‘true’ effect sizes (x-axis); the dotted grey line indicates perfect agreement between simulated and estimated effect sizes; continuous lines are fitted lines to the median estimates. Scenario 1: Estimates of $\rho_7$ with regular confounding only. Scenario 2: Estimates of $\rho_7$ with latent confounding only. Scenario 3: Estimates of $\rho_7$ with regular and latent confounding that are not causally related. Scenario 4: Estimates of $\rho_7$ with regular and latent confounding that are causally related
Plots of multilevel and main ecological estimates of simulated data (plotted in black and orange respectively) for Scenario 4 (where estimates of $\rho_7$ were sought for causally related regular and latent confounding) with additional complexity considerations: (a) low outcome prevalence ($0.1$%); (b) binary $L_i$-confounding ($10$% prevalence) with continuous outcome; and (c) binary $L_i$-confounding with binary outcome (both $10$% prevalence).
The diamond shaped plots are individual simulation cluster-level estimates (y-axis) plotted against the individual-level simulated ‘true’ effect sizes (x-axis); the grey dotted line depicts perfect agreement between simulated and estimated effect sizes; continuous lines are linear fitted lines to all $1000$ estimates. Scenario 4a: Estimates of $\rho_7$ with regular and latent confounding that are causally related with low binary prevalence. Scenario 4b: Estimates of $\rho_7$ with regular and latent confounding that are causally related with binary latent confounding and continuous outcome. Scenario 4c: Estimates of $\rho_7$ with regular and latent confounding that are causally related with binary latent confounding and binary outcome


Simulating hierarchical data to assess the utility of ecological versus multilevel analyses in obtaining individual-level causal effects
  • Article
  • Full-text available

March 2025 · 28 Reads

Lydia Kakampakou · Andreas Hoehn · [...]

Understanding causality, rather than mere association, is vital for researchers wishing to inform policy and decision making, for example when seeking to improve population health outcomes. Yet contemporary causal inference methods have not fully tackled the complexity of data hierarchies, such as the clustering of people within households, neighbourhoods, cities, or regions, even though such hierarchies are the rule rather than the exception. Understanding these hierarchies is important for complex population outcomes, such as non-communicable disease, which are affected by various social determinants at different levels of the data hierarchy. The alternative of analysing aggregated data can introduce well-known biases, such as the ecological fallacy or the modifiable areal unit problem. We devise a hierarchical causal diagram that encodes the multilevel data generating mechanism anticipated when evaluating non-communicable diseases in a population, and use it to inform data simulation. We also provide a flexible tool to generate synthetic population data that captures all multilevel causal structures, including a cross-level effect due to cluster size. This allows us, for the first time, to quantify the ecological fallacy within a formal causal framework and to show that individual-level data are essential to assess causal relationships that affect the individual. This study also illustrates the importance of causally structured synthetic data for use with other methods, such as Agent Based Modelling or Microsimulation Modelling. Many methodological challenges remain for robust causal evaluation of multilevel data, but this study provides a foundation to investigate these.


P58 Assessing the utility of multilevel versus ecological analyses to obtain individual-level causal effect estimates

August 2024 · 7 Reads

Journal of Epidemiology and Community Health

Background Government bodies, private enterprises, and researchers increasingly use ‘big data’ to monitor, evaluate interventions, make future predictions, and seek causal understanding. Such data are often complex in structure (i.e., hierarchical), which creates challenges for methods that work for a single homogeneous population but mislead if applied to data with substructure. Causal insights usually pertain to the individual, yet most datasets are aggregated because of issues surrounding sensitive personal information. It is therefore common to encounter simulation approaches, such as agent-based modelling (ABM), or ecological analyses that evaluate only marginal (i.e., clustered) information. Contemporary causal inference methods are yet to tackle the full complexities of multilevel data structure, beyond longitudinal repeated measures. There is thus a gap in our understanding and methods capabilities surrounding causal analysis of structured data, which this study examines.

Methods We (1) devise a hierarchical causal diagram that encodes a multilevel data generating mechanism with prespecified cross-level causal relationships; (2) simulate multilevel data from the causal diagram and obtain aggregated data; and (3) contrast multilevel and ecological estimates of a simulated individual-level causal effect, to assess the presence and extent of potential biases.

Results Unlike a multilevel analysis of the full data, ecological analyses of cluster-level data do not generally yield robust causal effect estimates. While it is known that ecological analyses invoke the ‘ecological fallacy’ (i.e., where attributing features of clusters to units within clusters may mislead), this study quantifies this for the first time within a formal causal framework. An algorithm to simulate causally structured multilevel data is also demonstrated.

Conclusion Insights into the limitations of common analytical practices were made possible by simulating causally structured hierarchical data, demonstrating the value of causal diagrams in both simulation and causal analysis. Methodological challenges remain for robust causal evaluation of big data, but this study shows how to investigate these challenges. Results reveal the need for individual-level data, analysed with multilevel methods, to achieve robust causal inquiry; ecological analyses do not generally provide sound causal effect estimation. If individual-level data are unavailable, synthetic data (informed by available marginal data) become necessary to answer causal questions. This study provides a tool to generate synthetic population data that reflect multilevel causal structures, which in turn will better inform the use of methods such as ABMs. This study has enormous implications for the use of big data when seeking causal insights.
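The three Methods steps can be illustrated with a toy simulation. The sketch below is not the study's design: the coefficients, sample sizes, equally sized clusters, and the function name `toy_contrast` are all assumptions made for illustration. It also swaps in a cluster fixed-effects (within-cluster) estimator as a simple stand-in for a multilevel model, and considers only a latent confounder $L$ that drives cluster membership.

```python
import numpy as np

def toy_contrast(n=20_000, n_clusters=100, beta=0.5, seed=0):
    """Hypothetical toy example: within-cluster versus ecological estimates
    of an individual-level effect `beta` of X on Y."""
    rng = np.random.default_rng(seed)

    # Step 1-2: simulate individual-level data from an assumed mechanism
    L = rng.normal(size=n)                     # latent confounder (unobserved)
    X = L + rng.normal(size=n)                 # exposure, affected by L
    Y = beta * X + L + rng.normal(size=n)      # outcome, affected by X and L
    N_lat = L + rng.normal(scale=0.1, size=n)  # latent variable driving clustering

    # Form clusters by ordering on the latent variable (equal sizes, assumed)
    order = np.argsort(N_lat)
    cluster = np.empty(n, dtype=int)
    cluster[order] = np.arange(n) * n_clusters // n

    # Aggregate to cluster means (the "ecological" data set)
    counts = np.bincount(cluster, minlength=n_clusters)
    x_bar = np.bincount(cluster, weights=X, minlength=n_clusters) / counts
    y_bar = np.bincount(cluster, weights=Y, minlength=n_clusters) / counts

    def slope(x, y):
        # Ordinary least squares slope of y on x with an intercept
        xc, yc = x - x.mean(), y - y.mean()
        return float(xc @ yc / (xc @ xc))

    # Step 3a: within-cluster estimate (stand-in for a multilevel analysis)
    within = slope(X - x_bar[cluster], Y - y_bar[cluster])
    # Step 3b: ecological estimate from cluster means only
    ecological = slope(x_bar, y_bar)
    return within, ecological

within_est, ecological_est = toy_contrast()
print(f"within-cluster: {within_est:.2f}, ecological: {ecological_est:.2f}")
```

In this toy set-up the within-cluster estimate should land close to `beta`, because clustering on the latent variable removes most of the variation in $L$ within clusters, whereas the ecological slope is expected to sit well above `beta`, because between-cluster variation is driven almost entirely by $L$. The size of that gap depends on the assumed coefficients and says nothing about the magnitudes reported in the study.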