Elena Stanghellini

University of Perugia | UNIPG · Department of Economics, Finance and Statistics

Ph.D. in Applied Statistics

77
Publications
12,457
Reads
667
Citations
Introduction
I work primarily on graphical Markov models, from both the theoretical and the applied points of view. On the theoretical side, I have been exploring identification, the distortion induced by unobserved factors, and the decomposition of a total effect into direct and indirect effects. On the applied side, I have been applying graphical Markov models to financial data and epidemiology.
Additional affiliations
October 2011 - October 2013
American Institute of Mathematics
Position
  • Member of a research group
Description
  • Identification of graphical models with latent variables
October 1994 - June 1996
The Open University
Position
  • Research Fellow
Description
  • I worked with a group of colleagues on how to improve the scoring system then in use at Barclays Bank, possibly taking several response variables into account jointly. We successfully explored the use of graphical models.
May 2007 - October 2007
University of Warwick
Position
  • Visiting Fellow

Publications (77)
Article
In statistical analysis, Cochran's formula plays a crucial role in disentangling the relationships between marginal and conditional regression coefficients. However, its results and implications are valid only within the linear case. Despite this, due to its simplicity and interpretability, practitioners often continue to use Cochran's formula also...
Article
Full-text available
This short note is a commentary on the paper by Mathur and Shpitser (2024), with the aim to enlarge the class of graphs for which the conditional Average Treatment Effect is nonparametrically identified, by allowing the outcome to be on the pathway between the treatment and the selection indicator. A first straightforward generalization is possible...
Preprint
Full-text available
With reference to a binary outcome and a binary mediator, we derive identification bounds for natural effects under a reduced set of assumptions. Specifically, no assumptions about confounding are made that involve the outcome; we only assume no unobserved exposure-mediator confounding as well as a condition termed partially constant cross-world de...
Article
Full-text available
In recent years, the leveraged loan market has experienced considerable growth, with the covenant-lite loan being the predominant agreement. The goal of this research is to assess whether the covenant-lite type reduces or increases the probability of default. Mediation analysis allows us to decompose the effect of balance sheet indicators on a defa...
Article
Full-text available
In recent years, attention toward Environmental, Social and Governance (ESG) issues has become increasingly important in the investment decision-making process, prompting interest of investors, companies, regulators and researchers on the possible relationships between financial performances and sustainable variables. With the aim to increase our u...
Article
Full-text available
With reference to a stratified case–control (CC) procedure based on a binary variable of primary interest, we derive the expression of the distortion induced by the sampling design on the parameters of the logistic model of a secondary variable. This is particularly relevant when performing mediation analysis (possibly in a causal framework) with s...
Article
Full-text available
In the post-pandemic era, the exposure to leveraged finance has emerged as a key factor of vulnerability for banks, coping with increasing inflation and interest rates. For this reason, the growth of the leveraged loans market is receiving significant attention from the Authorities (e.g. ECB, 2022). In this paper, we analyze an original sample of l...
Chapter
The attention to sustainable finance has dramatically increased in the recent past. In Europe, the perceived relevance of financial sustainability is mainly due to the European Commission’s commitment to integrate Environmental, Social and Governance (ESG) parameters into all aspects of the financial system. Our objective is to investigate the exis...
Preprint
Full-text available
By exploiting the theory of skew-symmetric distributions, we generalise existing results in sensitivity analysis by providing the analytic expression of the bias induced by marginalization over an unobserved continuous confounder in a logistic regression model. The expression is approximated and mimics Cochran's formula under some simplifying assum...
Article
Full-text available
Background: The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is responsible for the COVID-19 pandemic, so the correct evaluation of viral infection is crucial. According to the Centers for Disease Control and Prevention (CDC), the Real-Time Reverse Transcription PCR (RT-PCR) in respiratory samples is the gold standard for confirm...
Preprint
Full-text available
With reference to a stratified case-control procedure based on a binary variable of primary interest, we derive the expression of the distortion induced by the sampling design on the parameters of the logistic model of a secondary variable. This is particularly relevant when performing mediation analysis (possibly in a causal framework) with strati...
Article
Full-text available
With reference to a single mediator context, this brief report presents a model-based strategy to estimate counterfactual direct and indirect effects when the response variable is ordinal and the mediator is binary. Postulating a logistic regression model for the mediator and a cumulative logit model for the outcome, we present the exact parametric...
Preprint
Full-text available
With reference to a single mediator context, this brief report presents a model-based strategy to estimate counterfactual direct and indirect effects when the response variable is ordinal and the mediator is binary. Postulating a logistic regression model for the mediator and a cumulative logit model for the outcome, the exact parametric formulatio...
Conference Paper
In recent years, the context of the banking system, characterised by expansive monetary policies, has boosted the investments in leveraged loans. The COVID-19 pandemic brought the first real slowdown of the global economy since the financial crisis of 2007-08, and the growth of the leveraged loan market has been subject to significant attention from...
Article
The decomposition of the overall effect of a treatment into direct and indirect effects is here investigated with reference to a recursive system of binary random variables. We show how, for the single mediator context, the marginal effect measured on the log odds scale can be written as the sum of the indirect and direct effects plus a residual te...
Preprint
Full-text available
The decomposition of the overall effect of a treatment into direct and indirect effects is here investigated with reference to a recursive system of binary random variables. We show how, for the single mediator context, the marginal effect measured on the log odds scale can be written as the sum of the indirect and direct effects plus a residual te...
Article
Full-text available
With reference to causal mediation analysis, a parametric expression for natural direct and indirect effects is derived for the setting of a binary outcome with a binary mediator, both modelled via a logistic regression. The proposed effect decomposition operates on the odds ratio scale and does not require the outcome to be rare. It generalizes th...
Preprint
Full-text available
A parametric expression for causal natural direct and indirect effects is derived for the setting of a binary outcome with a binary mediator. The proposed effect decomposition does not require the outcome to be rare and generalizes the existing ones, allowing for interactions between both the exposure and the mediator and confounding covariates. Fu...
Article
Full-text available
Predictors of decline in health in older populations have been investigated in multiple studies before. Most longitudinal studies of aging, however, assume that dropout at follow-up is ignorable (missing at random) given a set of observed characteristics at baseline. The objective of this study was to address non-ignorable dropout in investigating...
Conference Paper
Full-text available
After Basel II, Credit Scoring techniques have assumed increasing importance and active research is taking place in this field. The availability of data and their accuracy always play an important role and become crucial when it must be decided which technique, among the several available alternatives, should be used. Our research interests focus o...
Article
A fundamental research question is how much a variation in a covariate influences a binary response variable in a logistic regression model, either directly or through mediators. We derive the exact formula linking the parameters of marginal and conditional regression models with binary mediators when no conditional independence assumptions can be ma...
Preprint
A fundamental research question is how much a variation in a covariate influences a binary response variable in a logistic regression model, either directly or through mediators. We derive the exact formula linking the parameters of marginal and conditional regression models with binary mediators when no conditional independence assumptions can be ma...
Article
Full-text available
Recent work (Seaman et al., 2013; Mealli & Rubin, 2015) attempts to clarify the not always well-understood difference between realised and everywhere definitions of missing at random (MAR) and missing completely at random. Another branch of the literature (Mohan et al., 2013; Pearl & Mohan, 2013) exploits always-observed covariates to give variable...
Article
Unless strong assumptions are made, nonparametric identification of principal causal effects can only be partial and bounds (or sets) for the causal effects are established. In the presence of a secondary outcome, recent results exist to sharpen the bounds that exploit conditional independence assumptions. More general results, though not embedded...
Article
We explore the sensitivity of time varying confounding adjusted estimates to different dropout mechanisms. We extend the Heckman correction to two time points and explore selection models to investigate situations where the dropout process is driven by unobserved variables and the outcome respectively. The analysis is embedded in a Bayesian framewo...
Article
Full-text available
This paper considers three different techniques applicable in the context of credit scoring when the event under study is rare and therefore we have to cope with unbalanced data. Logistic regression for matched case-control studies, logistic regression for a random balanced data sample and logistic regression for a sample balanced by ROSE (Random...
Conference Paper
Full-text available
When analysing the determinants of bankruptcy of small and medium enterprises, one of the most common problems is that of unbalanced data, as very often the event under study happens in only a small percentage of cases. The aim of this paper is to explore three different statistical methods of coping with unbalanced data and to identify which of th...
Article
Full-text available
The instrumental variable (IV) formula has become widely used to address the issue of identification of a causal effect in linear systems with an unobserved variable that acts as direct confounder. We here propose two alternative formulations to achieve identification when the assumptions underlying the use of IV are violated. Parallel to the IV, t...
Chapter
Full-text available
Unless strong assumptions are made, identification of principal causal effects in causal studies can only be partial and bounds (or sets) for the causal effects are established. In the presence of a secondary outcome, recent results exist to sharpen the bounds that exploit conditional independence assumptions (Mealli and Pacini, J. Am. Stat. Assoc....
Article
Full-text available
When estimating regression models with missing outcomes, scientists usually have to rely either on a missing at random assumption (missing mechanism is independent from the outcome given the observed variables) or on exclusion restrictions (some of the covariates affecting the missingness mechanism do not affect the outcome). Both these hypotheses...
Article
Full-text available
Corporate social responsibility (CSR) is a multidimensional concept that involves several aspects, ranging from environment to social and governance. Companies aiming to comply with CSR standards have to face challenges that vary from one aspect to the other and from one industry to the other. Latent variable models may be usefully employed to prov...
Article
Full-text available
Identifiability of parameters is an essential property for a statistical model to be useful in most settings. However, establishing parameter identifiability for Bayesian networks with hidden variables remains challenging. In the context of finite state spaces, we give algebraic arguments establishing identifiability of some special models on small...
Article
Full-text available
Conditions are presented for local identifiability of discrete undirected graphical models with a binary hidden node. These models can be obtained by extending the latent class model to allow for conditional associations between the observed variables. We establish a necessary and sufficient condition for the model to be locally identified almost e...
Chapter
Full-text available
In applied studies, researchers are often confronted with semicontinuous response models. These are models with a semicontinuous response variable, i.e. a continuous variable that has a lower bound, that we here consider to be zero, and such that a sizable fraction of the observations takes value on this boundary. Semicontinuous response models are...
Conference Paper
Full-text available
Whether parameters of a DAG model with hidden variables can be identified is a difficult question. Here we give algebraic arguments establishing identifiability for two special DAG models with certain restrictions on the size of the finite state spaces of all variables. These results can be used to shed light on many other models. As an illustratio...
Conference Paper
Full-text available
Unless strong assumptions are made, identification of principal causal effects in causal studies can only be partial and bounds (or sets) for the causal effects are established. In the presence of a secondary outcome, recent results exist to sharpen the bounds that exploit conditional independence assumptions (Mealli and Pacini, 2012). More general...
Article
Full-text available
A generalization of the Probit model is presented, with the extended skew-normal cumulative distribution as a link function, which can be used for modelling a binary response variable in the presence of selectivity bias. The estimate of the parameters via ML is addressed, and inference on the parameters expressing the degree of sele...
Article
Full-text available
In a recent observational study on a sample of patients with herniated lumbar discs who underwent physiotherapy, the recovery rate of those who chose physiotherapy rather than surgery, as recommended, was not appreciably different from that of the other patients, although their prognosis was worse. To investigate whether this finding was due to a c...
Article
Health care interventions that use quality of life or health scores often provide data which are skewed and bounded. The scores are typically formed by adding up numerical responses to a number of questions. Different questions might have different weights, but the scores will be bounded, and are often scaled to the range 0–100. If improvement in h...
Article
Full-text available
Correctly measuring default risk is difficult because the event under study is rare. This paper uses the statistical methodology typical of case-control studies. The technique is applied to quantifying the probability of default of Umbrian SMEs on the basis of balance sheet indicators. A rigorous process...
Article
Full-text available
Compliance with Corporate Social Responsibility (CSR) standards may require capacity that varies from one aspect to the other and companies in different industries may encounter different difficulties. Since CSR is a multidimensional concept, latent variable models may be usefully employed to provide a unidimensional measure of the ability of a fir...
Article
Full-text available
Monitoring the incidence of bacterial meningitis is important to plan and evaluate preventive policies. The study's aim was to estimate the incidence of bacterial meningitis by aetiological agent in the period 2001-2005, in Lazio, Italy (5.3 million inhabitants). Data collected from four sources--hospital surveillance of bacterial meningitis, laboratory...
Article
Full-text available
This paper details a method for estimating the unknown parameters of a regression model when the estimates of the dependent variable should be embedded in an input–output table with accounting constraints. Since in regression modelling the dependent variable is usually transformed either to achieve homoscedasticity of the residuals or for a better...
Chapter
Besides the logistic model and discriminant analysis, other statistical methods are currently in use in credit scoring. They are an evolution of the models presented in the previous chapters, so their formulation draws largely on the theoretical concepts already illustrated. Some of these tools...
Chapter
When it receives a loan application, the bank or financial intermediary must assess the risk that the applicant will be unable to meet the contractual obligations. Increasingly often in modern financial intermediaries, the credit analyst forms this judgement with the help of techniques...
Chapter
This chapter introduces the reader to the logistic model in the context of credit scoring. The distribution of interest is that of the classification random variable, denoted here by Y, conditional on the values x = (x1, x2, ..., xp) of the explanatory variables. In this context, the logistic model is a function that links the...
Chapter
As already noted, categorical random variables play a fundamental role in statistical models for credit scoring. The main categorical random variable is the classification variable. However, categorical random variables are also often found among the variables that outline the socio-economic profile of individuals: these...
Chapter
Unlike the logistic model, discriminant analysis originated as a classification tool. In its first formulation, which dates back to Fisher (1936), it is a method for describing, through a one-dimensional function, the difference between two populations and for allocating each observation to its population of origin. Although...
Article
We explore the effects of truncation on the joint distribution of the observable random variables. A general formula for the distortion induced by truncation in the least-squares coefficients is presented. The implications of our derivations are illustrated with an example.
Article
Full-text available
We study criteria for identifiability of path analysis models with one hidden variable. We first derive sufficient criteria for identification of models in which marginalisation is carried out over the hidden variable. The sufficient criteria are based on the structure of the directed acyclic graph associated with the path analysis model and can be...
Article
We discuss the problem of identification of relevant parameters in a DAG model for Gaussian variables with one unobserved variable that acts as a confounder. We first make explicit what we mean by identification and then discuss an example where a modified notion of instrumental variable renders a system with a confounder identifiable.
Article
We present a model to estimate the size of an unknown population from a number of lists that applies when the assumptions of (a) homogeneity of capture probabilities of individuals and (b) marginal independence of lists are violated. This situation typically occurs in epidemiological studies, where the heterogeneity of individuals is severe and res...
Article
Consumer credit has become an enormous business in industrialized countries. Recently, finance agencies have started to develop new products aiming not only to widen their portfolio but also to keep active relationships with good clients already taken on file and to prevent bad clients from becoming a loss for the agency. As a result, models for th...
Article
This paper explores the usefulness of the multivariate skew-normal distribution in the context of graphical models. A slight extension of the family recently discussed by Azzalini & Dalla Valle (1996) and Azzalini & Capitanio (1999) is described, the main motivation being the additional property of closure under conditioning. After considerations o...
Article
Full-text available
We generalize factor analysis models by allowing the concentration matrix of the residuals to have nonzero off-diagonal elements. The resulting model is named graphical factor analysis model. Allowing a structure of associations gives information about the correlation left unexplained by the unobserved variables, which can be used both in the confi...
Article
A bank offering unsecured personal loans may be interested in several related outcome variables, including defaulting on the repayments, early repayment or failing to take up an offered loan. Current predictive models used by banks typically consider such variables individually. However, the fact that they are related to each other, and to many int...
Article
Full-text available
1. Models for consumer credit scoring. With the term "consumer credit" we mean "any of the many forms of commerce under which an individual obtains money or goods or services on condition of a promise to repay the money or to pay for the goods or services, along with a fee (the interest), at some specific future date or dates" (Lewis, 1994, p. 1). Consumer...
Conference Paper
We use a latent class model to analyse the results of an aptitude test in statistics assigned to a group of students at our university. We implement various algorithms for maximum likelihood estimation of the parameters (the EM algorithm, two accelerators of the EM algorithm introduced recently and a plain Fisher-scoring algorithm) and compare thei...
Chapter
We introduce a graphical factor analysis model as a graphical Gaussian model with latent variables satisfying a set of conditional independence constraints. After a brief introduction of the factor analysis model, we generalise the class of such models by allowing the concentration matrix of the residuals to have non-zero off-diagonal elements. The...
Article
We state a sufficient condition for the global identification of a single-factor model when some conditional associations among residuals are allowed. The condition relies on the structure of the conditional independence graph of the observed variables given the latent factor, and can be derived using graphical rules.
Article
Full-text available
Graphical models are a class of statistical tools which have recently undergone extensive theoretical development. They allow one to build models representing the relationships between large numbers of variables, helping to identify paths by which different variables are influenced by others. They look particularly promising for credit-scoring and c...
Article
Full-text available
We study the survival probability of a homogeneous group of economic agents by adopting the reduced-form approach and assuming an affine evolution of the default intensities. We consider both the cases of continuous and discrete-time observations, and propose a modified likelihood function. We discuss the estimation of the parameters of interest and...
Article
Full-text available
Modelling tourist flows provides an important analysis for understanding how regional economic systems work, especially in Italy, where in some regions tourist spending accounts for about 10% of domestic household consumption (for example in Tuscany, Veneto and Emilia Romagna...
