Ioanna Manolopoulou

University College London | UCL · Department of Statistical Science

PhD

About

44
Publications
6,227
Reads
353
Citations

Publications (44)
Preprint
In this extended abstract paper, we address the problem of interpretability and targeted regularization in causal machine learning models. In particular, we focus on the problem of estimating individual causal/treatment effects under observed confounders, which can be controlled for and moderate the effect of the treatment on the outcome of interes...
Article
Full-text available
This article develops a sparsity-inducing version of Bayesian Causal Forests, a recently proposed nonparametric causal regression model that employs Bayesian Additive Regression Trees and is specifically designed to estimate heterogeneous treatment effects using observational data. The sparsity-inducing component we introduce is motivated by empiri...
Article
Full-text available
Understanding the shopping motivations behind market baskets has significant commercial value for the grocery retail industry. The analysis of shopping transactions demands techniques that can cope with the volume and dimensionality of grocery transactional data while delivering interpretable outcomes. Latent Dirichlet allocation (LDA) allows proce...
Preprint
Full-text available
Large observational data are increasingly available in disciplines such as health, economic and social sciences, where researchers are interested in causal questions rather than prediction. In this paper, we examine the problem of estimating heterogeneous treatment effects using non‐parametric regression‐based methods, starting from an empirical st...
Article
Full-text available
Large observational data are increasingly available in disciplines such as health, economic and social sciences, where researchers are interested in causal questions rather than prediction. In this paper, we examine the problem of estimating heterogeneous treatment effects using non‐parametric regression‐based methods, starting from an empirical st...
Preprint
Full-text available
Understanding the customer behaviours behind transactional data has high commercial value in the grocery retail industry. Customers generate millions of transactions every day, choosing and buying products to satisfy specific shopping needs. Product availability may vary geographically due to local demand and local supply, thus driving the importan...
Preprint
Full-text available
This paper develops a sparsity-inducing version of Bayesian Causal Forests, a recently proposed nonparametric causal regression model that employs Bayesian Additive Regression Trees and is specifically designed to estimate heterogeneous treatment effects using observational data. The sparsity-inducing component we introduce is motivated by empirica...
Preprint
Understanding the shopping motivations behind market baskets has high commercial value in the grocery retail industry. Analyzing shopping transactions demands techniques that can cope with the volume and dimensionality of grocery transactional data while keeping interpretable outcomes. Latent Dirichlet Allocation (LDA) provides a suitable framework...
Article
Full-text available
Geographic isolation substantially contributes to species endemism on oceanic islands when speciation involves the colonisation of a new island. However, less is understood about the drivers of speciation within islands. What is lacking is a general understanding of the geographic scale of gene flow limitation within islands, and thus the spatial s...
Article
In studies of the interstellar medium in galaxies, radiative transfer models of molecular emission are useful for relating molecular line observations back to the physical conditions of the gas they trace. However, doing this requires solving a highly degenerate inverse problem. In order to alleviate these degeneracies, the abundances derived from...
Preprint
In studies of the interstellar medium in galaxies, radiative transfer models of molecular emission are useful for relating molecular line observations back to the physical conditions of the gas they trace. However, doing this requires solving a highly degenerate inverse problem. In order to alleviate these degeneracies, the abundances derived from...
Article
Background: The expected value of sample information (EVSI) determines the economic value of any future study with a specific design aimed at reducing uncertainty about the parameters underlying a health economic model. This has potential as a tool for trial design; the cost and value of different designs could be compared to find the trial with t...
Article
Full-text available
The rise of ‘big data’ has led to the frequent need to process and store data sets containing large numbers of high dimensional observations. Because of storage restrictions, these observations might be recorded in a lossy‐but‐sparse manner, with information collapsed onto a few entries which are considered important. This results in informative mi...
Preprint
The rise of "big data" has led to the frequent need to process and store datasets containing large numbers of high dimensional observations. Due to storage restrictions, these observations might be recorded in a lossy-but-sparse manner, with information collapsed onto a few entries which are considered important. This results in informative missing...
Preprint
Full-text available
The field of retail analytics has been transformed by the availability of rich data which can be used to perform tasks such as demand forecasting and inventory management. However, one task which has proved more challenging is the forecasting of demand for products which exhibit very few sales. The sparsity of the resulting data limits the degree t...
Preprint
Full-text available
Background: The Expected Value of Sample Information (EVSI) determines the economic value of any future study with a specific design aimed at reducing uncertainty in a health economic model. This has potential as a tool for trial design; the cost and value of different designs could be compared to find the trial with the greatest net benefit. Howev...
Article
Full-text available
Background: The Expected Value of Sample Information (EVSI) is used to calculate the economic value of a new research strategy. Although this value would be important to both researchers and funders, there are very few practical applications of the EVSI. This is due to computational difficulties associated with calculating the EVSI in practical he...
Article
Full-text available
Preposterior analysis covers a wide range of approaches in many different disciplines and relates to any analysis concerned with understanding the properties of a future posterior distribution before relevant data have been collected. Embedding preposterior analysis in a decision making context implies that we are interested in the hypothetical val...
Article
The Expected Value of Perfect Partial Information (EVPPI) is a decision-theoretic measure of the ‘cost’ of parametric uncertainty in decision making used principally in health economic decision making. Despite this decision-theoretic grounding, the uptake of EVPPI calculations in practice has been slow. This is in part due to the prohibitive comput...
Conference Paper
Multi-class classification problems have been studied for pure nominal and pure ordinal responses. However, there are some cases where the multi-class responses are a mixture of nominal and ordinal. To address this problem we build a hierarchical multinomial probit model with a mixture of both types of responses using latent variables. The nominal...
Article
Full-text available
The Expected Value of Perfect Partial Information (EVPPI) is a decision-theoretic measure of the 'cost' of parametric uncertainty in decision making used principally in health economic decision making. Despite this decision-theoretic grounding, the uptake of EVPPI calculations in practice has been slow. This is in part due to the prohibitive comput...
Article
Full-text available
This article describes the use of flexible Bayesian regression models for estimating a partially identified probability function. Our approach permits efficient sensitivity analysis concerning the posterior impact of priors on the partially identified component of the regression model. The new methodology is illustrated on an important problem wher...
Article
Full-text available
BPEC is an R package for Bayesian Phylogeographic and Ecological Clustering which allows geographical, environmental and phenotypic measurements to be combined with DNA sequences in order to reveal clustered structure resulting from migration events. DNA sequences are modelled using a collapsed version of a simplified coalescent model projected ont...
Article
Full-text available
This article describes the use of flexible Bayesian regression models for estimating a partially identified probability function. Our approach permits efficient sensitivity analysis concerning the posterior impact of priors on the partially identified component of the regression model. The new methodology is illustrated on an important problem wher...
Article
Full-text available
Geographical isolation by oceanic barriers and climatic stability has been postulated as some of the main factors driving diversification within volcanic archipelagos. However, few studies have focused on the effect that catastrophic volcanic events have had on patterns of within-island diversification in geological time. This study employed data f...
Article
Full-text available
Over recent years Value of Information analysis has become more widespread in health-economic evaluations, specifically as a tool to perform Probabilistic Sensitivity Analysis. This is largely due to methodological advancements allowing for the fast computation of a typical summary known as the Expected Value of Partial Perfect Information (EVPPI)....
Article
Full-text available
The Expected Value of Perfect Partial Information (EVPPI) is a decision-theoretic measure of the "cost" of uncertainty in decision making used principally in health economic decision making. Despite having optimal properties in terms of quantifying the value of decision uncertainty, the EVPPI is rarely used in practice. This is due to the prohibiti...
Article
Full-text available
This paper adapts tree-based Bayesian regression models for estimating a partially identified probability function. In doing so, ideas from the recent literature on Bayesian partial identification are applied within a sophisticated applied regression context. Our approach permits efficient sensitivity analysis concerning the posterior impact of pri...
Article
Full-text available
A primary challenge in unsupervised clustering using mixture models is the selection of a family of basis distributions flexible enough to succinctly represent the distributions of the target subpopulations. In this paper we introduce a new family of Gaussian Well distributions (GWDs) for clustering applications where the target subpopulations are...
Article
We develop and analyze models of the spatio-temporal organization of lymphocytes in the lymph nodes and spleen. The spatial dynamics of these immune system white blood cells are influenced by biochemical fields and represent key components of the overall immune response to vaccines and infections. A primary goal is to learn about the structure of t...
Article
Full-text available
Phylogeographic ancestral inference is an issue frequently arising in population ecology that aims to understand the geographical roots and structure of species. Here, we specifically address relatively small scale mtDNA datasets (typically fewer than 500 sequences with fewer than 1000 nucleotides), focusing on ancestral location inference. Our approac...
Article
Full-text available
Phylogeographic methods have attracted a lot of attention in recent years, stressing the need to provide a solid statistical framework for many existing methodologies so as to draw statistically reliable inferences. Here, we take a flexible fully Bayesian approach by reducing the problem to a clustering framework, whereby the population distributio...
Article
Full-text available
One of the challenges in using Markov chain Monte Carlo for model analysis in studies with very large datasets is the need to scan through the whole data at each iteration of the sampler, which can be computationally prohibitive. Several approaches have been developed to address this, typically drawing computationally manageable subsamples of the d...
Article
Full-text available
A risk model of a joint business (insurer/re-insurer) is studied in the Large Deviations (LD) regime. In the model considered, the premium paid by the insurer to the re-insurer changes if the insurer's capital falls below a certain level P. An optimal premium arrangement for both business participants is investigated. By a proper choice of...
Article
Full-text available
In cases where genetic sequence data are collected together with associated physical traits it is natural to want to link patterns observed in the trait values to the underlying genealogy of the individuals. If the traits correspond to specific phenotypes, we may wish to associate specific mutations with changes observed in phenotype dist...
Article
In cases where genetic sequence data are collected together with associated geographical data it is natural to want to link patterns observed in the geographical values to the underlying genealogy of the individuals. We discuss the standard approach to analyses of this sort and propose a new framework which overcomes a number of shortcomings in the...
Article
Full-text available
The immune response to vaccines and microbial pathogens is characterized by the spatial reorganization of leukocytes into microanatomical structures such as germinal centers and granulomas. Data on cellular organization is often provided by immunofluorescence histology, in which antibodies against specific molecules are conjugated (directly or ind...

Project (1)
Project
In recent years, there has been surging interest in the use of Statistical/Machine Learning (ML) tools for causal inference. These tools can leverage large datasets and usually deliver excellent predictive performance. However, they need to be properly adjusted for use in causal settings (e.g. to account for confounding bias). The broad idea of this project is to design and develop Causal ML methods for estimating individual treatment effects, and for policy evaluation/optimization (Reinforcement Learning), with observational data.