
Alessandra Menafoglio
- PhD
- Professor (Assistant) at Politecnico di Milano
About
87
Publications
9,508
Reads
1,073
Citations
Additional affiliations
November 2016 - present
May 2015 - October 2016
Education
January 2012 - March 2015
October 2009 - December 2011
September 2006 - July 2009
Publications (87)
We address the problem of predicting a target ordinal variable based on observable features consisting of functional profiles. This problem is crucial, especially in decision-making driven by sensor systems, when the goal is to assess an ordinal variable such as the degree of deterioration, quality level, or risk stage of a process, starting from f...
Conformal inference provides a rigorous statistical framework for uncertainty quantification in machine learning, enabling well-calibrated prediction sets with precise coverage guarantees for any classification model. However, its reliance on the idealized assumption of perfect data exchangeability limits its effectiveness in the presence of real-w...
This paper introduces a robust approach to functional principal component analysis (FPCA) for compositional data, particularly density functions. While recent papers have studied density data within the Bayes space framework, there has been limited focus on developing robust methods to effectively handle anomalous observations and large noise. To a...
This work aims at identifying and modelling statistical dependencies between empirical amplification functions of sites in central Italy and the main geological and geophysical characteristics of the region, within a geostatistical analysis framework. The empirical functions, named δS2S, are estimated by decomposing the residuals of the median pre...
In this talk, based on [1], we propose a spatio-temporal analysis of daily death counts in Italy, collected by ISTAT (Italian Statistical Institute), in Italian provinces and municipalities. While in [1] the focus was on the elderly class (70+ years old), we here focus on the middle class (50–69 years old), carrying out analogous analyses and compa...
In this paper, we propose a new robust non-parametric functional analysis of variance method (RoFANOVA) that reduces the weights of outlying curves on the functional analysis of variance. It is implemented through a permutation test based on a test statistic obtained via a functional M-estimator. The performance of the RoFANOVA is demonstrated thro...
In many industrial scenarios, the process quality characteristic of interest can be either a scalar or, in a more general case, a profile, that is, a function defined on a compact set, usually time or space. In the latter case, the quality characteristic can also be referred to as functional quality characteristic. Profiles are usually obtained fro...
https://authors.elsevier.com/a/1h4BYMMTPor9N
A new orthogonal decomposition for bivariate probability densities embedded in Bayes Hilbert spaces is derived. It allows representing a density into independent and interactive parts, the former being built as the product of revised definitions of marginal densities, and the latter capturing the dependence between the two random variables being st...
Modern statistical process monitoring (SPM) applications focus on profile monitoring, i.e., the monitoring of process quality characteristics that can be modeled as profiles, also known as functional data. Despite the large interest in the profile monitoring literature, there is still a lack of software to facilitate its practical application. This...
The identification of the channels through which a given shock spreads to the rest of the economy, determining its final impact, is essential to formulate effective policy interventions. Input-output tables (IOTs) are widely used to detect the network of intersectoral relations of a country - i.e., its sectoral technological structure or domestic s...
In this paper, we propose an adaptive smoothing spline (AdaSS) estimator for the function-on-function linear regression model where each value of the response, at any domain point, depends on the full trajectory of the predictor. The AdaSS estimator is obtained by the optimization of an objective function with two spatially adaptive penalties, base...
The problem of providing data-driven models for sediment transport in a pre-Alpine stream in Italy is addressed. This study is based on a large set of measurements collected from real pebbles, traced along the stream through radio-frequency identification tags after precipitation events. Two classes of data-driven models based on machine learning a...
This work proposes a novel approach to the calibration of regionalized regression models, with particular reference to ground-motion models (GMMs), which are key for probabilistic seismic hazard analysis and earthquake engineering applications. A novel methodology, named multi-source geographically-weighted regression (MS-GWR), is developed, allowi...
We present a novel approach named Physics-based Residual Kriging for the statistical prediction of spatially dependent functional data. It incorporates a physical model—expressed by a partial differential equation—within a Universal Kriging setting through a geostatistical modelization of the residuals with respect to the physical model. The approa...
In this chapter, we review the mathematical framework for spatial prediction (kriging) for complex data. We focus here on the approach developed within the area of Object-Oriented Spatial Statistics which grounds on the foundational idea that the atom of the geostatistical analysis is the entire data point, regardless of its complexity. This is see...
The focus of this chapter is on spatial statistical methods for functional compositions (FCs). The latter constitutes the generalization to the functional setting of multivariate compositional data. Instances of these data are probability density functions. Our work is fully consistent with the observation that data analyses in a modern geostatisti...
The development of data acquisition systems is facilitating the collection of data that are apt to be modelled as functional data. In some applications, the interest lies in the identification of significant differences in group functional means defined by varying experimental conditions, which is known as functional analysis of variance (FANOVA)....
A general and flexible bi-clustering algorithm for the analysis of Hilbert data is presented in the Object Oriented Data Analysis framework. The algorithm, called HC2 (i.e. Hilbert Cheng and Church), is a non-parametric method to bi-cluster Hilbert data indexed in a matrix structure. The Cheng and Church approach is here extended to the general cas...
Georeferenced compositional data are prominent in many scientific fields and in spatial statistics. This work addresses the problem of proposing models and methods to analyze and predict, through kriging, this type of data. To this purpose, a novel class of α-transformations, named the Isometric α-transformation (α-IT), is proposed, which encompass...
The aim of this work is to introduce an approach to null hypothesis significance testing in a functional linear model for spatial data. The proposed method is capable of dealing with the spatial structure of data by building a permutation testing procedure on spatially filtered residuals of a spatial regression model. Indeed, due to the spatial dep...
Georeferenced compositional data are prominent in many scientific fields and in spatial statistics. This work addresses the problem of proposing models and methods to analyze and predict, through kriging, this type of data. To this purpose, a novel class of transformations, named the Isometric $\alpha$-transformation ($\alpha$-IT), is proposed, whi...
With the tools and perspective of Object Oriented Spatial Statistics, we analyze official daily data on mortality from all causes in the provinces and municipalities of Italy for the year 2020, the first of the COVID-19 pandemic. By comparison with mortality data from 2011 to 2019, we assess the local impact of the pandemic as perturbation factor o...
In this article, we implement a new approach to calibrate ground-motion models (GMMs) characterized by spatially varying coefficients, using the calibration dataset of an existing GMM for crustal events in Italy. The model is developed in the methodological framework of the multisource geographically weighted regression (MS-GWR, Caramenti et al., 2...
The assessment of the vulnerability of a community endangered by seismic hazard is of paramount importance for planning a precision policy aimed at the prevention and reduction of its seismic risk. We aim at measuring the vulnerability of the Italian municipalities exposed to seismic hazard, by analysing the open data offered by the Mappa dei Rischi...
The industrial development of new production processes like additive manufacturing (AM) is making available novel types of complex shapes that go beyond traditionally manufactured geometries and 2.5D free-form surfaces. New challenges must be faced to characterize, model and monitor the natural variability of such complex shapes, since previously p...
It often occurs in practice that it is sensible to give different weights to the variables involved in a multivariate data analysis—and the same holds for compositional data as multivariate observations carrying relative information. It can be convenient to apply weights to better accommodate differences in the quality of the measurements, the occu...
On modern ships, the quick development in data acquisition technologies is producing data‐rich environments where variable measurements are continuously streamed and stored during navigation and thus can be naturally modelled as functional data or profiles. Then, both the CO₂ emissions (i.e. the quality characteristic of interest) and the variable p...
We present a numerical model of soil erosion at the basin scale that allows one to describe surface runoff without a priori identifying drainage zones, river beds and other water bodies. The model is based on robust and unconditionally stable numerical techniques and guarantees mass conservation and positivity of the surface and subsurface water la...
In this work, we present a novel downscaling procedure for compositional quantities based on the Aitchison geometry. The method is able to naturally consider compositional constraints, i.e. unit-sum and positivity, accounting for the scale invariance and relative scale of these data. We show that the method can be used in a block sequential Gaussia...
In the presence of increasingly massive and heterogeneous spatial data, geostatistical modeling of distributional observations plays a key role. Choosing the “right” embedding space for these data is of paramount importance for their statistical processing, to account for their nature and inherent constraints. The Bayes space theory is a natural em...
Distributional data, such as age distributions of populations, can be treated as continuous or discrete data, but the main interest is in the relative information, e.g., in terms of ratios (or logratios) between the different age classes. Here we present a unifying framework for the discrete and the continuous case based on the theory of Bayes spac...
Recent advances in satellite technologies, statistical and mathematical models, and computational resources have paved the way for operational use of satellite data in monitoring and forecasting natural hazards. We present a review of the use of satellite data for Earth observation in the context of geohazards preventive monitoring and disaster eva...
The assessment of the vulnerability of a community endangered by seismic hazard is of paramount importance for planning a precision policy aimed at the prevention and reduction of its seismic risk. We aim at measuring the vulnerability of the Italian municipalities exposed to seismic hazard, by analyzing the open data offered by the Mappa dei Rischi...
The problem of bi-clustering functional data, which has recently been addressed in literature, is considered. A definition of ideal functional bi-cluster is given and a novel bi-clustering method, called Functional Cheng and Church (FunCC), is developed. The introduced algorithm searches for non-overlapping and non-exhaustive bi-clusters in a set o...
We address the problem of characterizing spatially variable Natural Background Levels (NBLs) of concentrations of chemical species of environmental concern in a large-scale groundwater body. Assessment of NBLs is critical to identify significant trends of (possibly hazardous) chemical concentrations in aquifer systems, the latter being typically as...
In this work we describe a first version of the simulation tool developed within the SMART-SED project. The two main components of the SMART-SED model consist in a data preprocessing tool and in a robust numerical solver, which does not require a priori identification of river beds and other surface run-off areas, thus being especially useful to pr...
A new orthogonal decomposition for bivariate probability densities embedded in Bayes Hilbert spaces is derived. It allows one to represent a density into independent and interactive parts, the former being built as the product of revised definitions of marginal densities and the latter capturing the dependence between the two random variables being...
In this paper, we propose an adaptive smoothing spline (AdaSS) estimator for the function-on-function linear regression model where each value of the response, at any domain point, depends on the full trajectory of the predictor. The AdaSS estimator is obtained by the optimization of an objective function with two spatially adaptive penalties, base...
Data taking value on a Riemannian manifold and observed over a complex spatial domain are becoming more frequent in applications, e.g. in environmental sciences and in geoscience. The analysis of these data needs to rely on local models to account for the non stationarity of the generating random process, the nonlinearity of the manifold and the co...
This work offers a novel methodological framework to address the problem of generating data-driven earthquake shaking fields at different vibration periods, which are key to support decision making and civil protection planning. We propose to analyse the entire profiles of spectral accelerations and project their information content to unsampled lo...
In this work, we present a novel downscaling procedure for compositional quantities based on the Aitchison geometry. The method is able to naturally consider compositional constraints, i.e. unit-sum and positivity. We show that the method can be used in a block sequential Gaussian simulation framework in order to assess the variability of downscale...
We illustrate a few recent ideas of Object Oriented Spatial Statistics (O2S2), focusing on the problem of kriging prediction in situations where a global second order stationarity assumption for the random field generating the data is not justifiable or the space domain of the field is complex. By localizing the analysis through the Random Domain De...
The modern development of data acquisition technologies in many industrial processes is facilitating the collection of quality characteristics that are apt to be modelled as functions, which are usually referred to as profiles. At the same time, measurements of concurrent variables, which are related to the quality characteristic profiles, are ofte...
In functional data analysis some region(s) of the domain of the functions can be of more interest than others due to the quality of measurement, relative scale of the domain, or simply due to some external reason (e.g., interest of stakeholders). Weighting the domain is of interest particularly with probability density functions (PDFs), as derived...
To respond to the compelling air pollution programs, shipping companies are nowadays setting‐up on their fleets modern multisensor systems that stream massive amounts of observational data, which can be considered as varying over a continuous domain. Motivated by this context, a novel procedure is proposed, which extends classical multivariate tech...
This paper proposes a novel nonparametric approach to model and reveal differences in the geochemical properties of the soil, when these are described by space–time measurements collected in a spatial region naturally divided into two parts. The investigation is motivated by a real study on a space–time geochemical data set, consisting of measureme...
Probability density functions (PDFs) can be understood as continuous compositions by the theory of Bayes spaces. The origin of a Bayes space is determined by a given reference measure. This can be easily changed through the well-known chain rule which has an impact on the geometry of the Bayes space. This work provides a mathematical framework for...
Applied sciences have witnessed an explosion of georeferenced data. Object oriented spatial statistics (O2S2) is a recent system of ideas that provides a solid framework where the new challenges posed by the GeoData revolution can be faced, by grounding the analysis on a powerful geometrical and topological approach. We shall present a perspective...
Data taking value on a Riemannian manifold and observed over a complex spatial domain are becoming more frequent in applications, e.g. in environmental sciences and in geoscience. The analysis of these data needs to rely on local models to account for the non stationarity of the generating random process, the non linearity of the manifold and the c...
We propose a new methodology for the analysis of spatial fields of object data distributed over complex domains. Our approach makes it possible to jointly handle both data and domain complexities, through a divide et impera approach. As a key element of innovation, we propose to use a random domain decomposition, whose realizations define sets of homogeneous...
Bayes spaces are mathematical spaces whose elements are σ‐finite positive measures, identified with their densities w.r.t. a reference measure, when these exist. In Bayes spaces, statistical analysis of multivariate and functional compositional data (e.g., discrete or continuous probability density functions) can be performed. The geometrical struc...
The single diffusion tensor model for mapping the brain white matter microstructure has long been criticized as providing sensitive yet non-specific clinical biomarkers for neurodegenerative diseases because (i) voxels in diffusion images actually contain more than one homogeneous tissue population and (ii) diffusion in a single homogeneous tissue...
In this paper we propose Universal trace co-kriging, a novel methodology for interpolation of multivariate Hilbert space valued functional data. Such data commonly arises in multi-fidelity numerical modeling of the subsurface and it is a part of many modern uncertainty quantification studies. Besides theoretical developments we also present methodo...
The advance of sensor and information technologies is leading to data-rich industrial environments, where large amounts of data are potentially available. This study focuses on industrial applications where image data are used more and more for quality inspection and statistical process monitoring. In many cases of interest, acquired images consist...
The problem of performing functional linear regression when the response variable is represented as a probability density function (PDF) is addressed. PDFs are interpreted as functional compositions, which are objects carrying primarily relative information. In this context, the unit integral constraint allows to single out one of the possible repr...
The original version of this article unfortunately contained a mistake. In “Acknowledgment” the Grant Agreement No. was incorrect. The correct number is No. 636811.
We review recent advances in Object Oriented Spatial Statistics, a system of ideas, algorithms and methods that allows the analysis of high dimensional and complex data when their spatial dependence is an important issue. At the intersection of different disciplines – including mathematics, statistics, computer science and engineering – Object Orie...
We address the problem of stochastic simulation of soil particle-size curves (PSCs) in heterogeneous aquifer systems. Unlike traditional approaches that focus solely on a few selected features of PSCs (e.g., selected quantiles), our approach considers the entire particle-size curves and can optionally include conditioning on available data. We rely...
The quality characteristics in manufacturing processes are often represented in terms of spatially- or time-ordered data, called “profiles”, which are characterized by amplitude and phase variability. In this context, curve registration plays a key role, as it allows separating the two kinds of between-profiles variability and reducing any undesired...
In this paper we investigate the practical and methodological use of Universal Kriging of functional data to predict unconventional shale gas production in undrilled locations from known production data. In Universal Kriging of functional data, two approaches are considered: (1) estimation by means of Cokriging of functional components (Universal C...
We focus on the geostatistical characterization of the spatial distribution of soil particle-size curves (PSCs) within an alluvial aquifer. We consider as data object the entire PSC with a Compositional Data Analysis (CoDa) approach. Data are viewed as a point in the infinite-dimensional Hilbert space of functional compositions. The latter is endow...
The statistical analysis of data belonging to Riemannian manifolds is becoming increasingly important in many applications, such as shape analysis, diffusion tensor imaging and the analysis of covariance matrices. In many cases, data are spatially distributed but it is not trivial to take into account spatial dependence in the analysis because of t...
This work addresses the problem of characterizing the spatial field of soil particle-size distributions within a heterogeneous aquifer system. The medium is conceptualized as a composite system, characterized by spatially varying soil textural properties associated with diverse geomaterials. The heterogeneity of the system is modeled through an ori...
The world, and in particular the USA, has seen an explosion in the development of unconventional shale resources. In these reservoirs, drilling and production occur at development times orders of magnitude shorter than in conventional resources. As a result, decisions about where to drill and how to complete wells (hydro fracturing) need to be made in al...
We develop a comprehensive framework for linear spatial prediction in Hilbert spaces. We explore the problem of Best Linear Unbiased (BLU) prediction in Hilbert spaces through an original point of view, based on a new Operatorial definition of Kriging. We ground our developments on the theory of Gaussian processes in function spaces and on the asso...
BarCamp is a quite new type of event for the scientific and technological community. In full generality, it is an “unconference”, a meeting where everyone can contribute, presenting a topic and generating a discussion. In this paper, we propose the BarCamp as an innovative way of producing and communicating statistical knowledge, and we describe th...
We consider the problem of predicting the spatial field of particle-size curves (PSCs) from a sample observed at a finite set of locations within an alluvial aquifer near the city of Tübingen, Germany. We interpret PSCs as cumulative distribution functions and their derivatives as probability density functions. We thus (a) embed the available data...
Probability density functions are frequently used to characterize the distributional properties of large-scale database systems. As functional compositions, densities carry primarily relative information. As such, standard methods of functional data analysis (FDA) are not appropriate for their statistical processing. The specific features of de...
In this work we address the problem of performing uncertainty and sensitivity analysis of complex physical systems where classical Monte-Carlo methods are too expensive to be applied due to the high computational complexity. We consider the Polynomial Chaos Expansion (PCE) as an efficient way of computing a response surface for a model of gas injec...
We address the problem of predicting spatially dependent functional data belonging to a Hilbert space, with a Functional Data Analysis approach. Having defined new global measures of spatial variability for functional random processes, we derive a Universal Kriging predictor for functional data. Consistently with the new established theoretical res...