Antonio Punzo

University of Catania | UNICT · Department of Economics and Business

PhD in Methodological and Applied Statistics

About

122
Publications
9,349
Reads
1,529
Citations
Introduction
Current research interests: Mixture Models, Hidden Markov Models, Model-Based Clustering and Classification, Heavy-tailed Distributions
Additional affiliations
February 2013 - July 2013
University of Guelph
Position
  • Visiting professor
January 2012 - present

Publications (122)
Article
Full-text available
This paper develops a quantile hidden semi-Markov regression to jointly estimate multiple quantiles for the analysis of multivariate time series. The approach is based upon the Multivariate Asymmetric Laplace (MAL) distribution, which allows the quantiles of all univariate conditional distributions of a multivariate response to be modeled simultaneously...
Article
Full-text available
Hidden Markov models (HMMs) have been extensively used in the univariate and multivariate literature. However, interest in the analysis of matrix-variate data has increased in recent years. In this manuscript, we introduce HMMs for matrix-variate balanced longitudinal data, by assuming a matrix normal distribution in each hidden...
Article
We propose the family of dimension-wise scaled normal mixtures (DSNMs) to model the joint distribution of a d-variate random variable with real-valued components. Each member of the family generalizes the multivariate normal (MN) distribution in two directions. Firstly, the DSNM has a more general type of symmetry with respect to the elliptical sym...
Article
Two families of parsimonious mixture models are introduced for model-based clustering. They are based on two multivariate distributions, the shifted exponential normal and the tail-inflated normal, recently introduced in the literature as heavy-tailed generalizations of the multivariate normal. Parsimony is attained by the eigen-decomposition of the...
Article
Full-text available
We propose a general approach to detect measurement non-invariance in latent Markov models for longitudinal data. We define different notions of differential item functioning in the context of panel data. We then present a model selection approach based on the Bayesian information criterion (BIC) to choose both the number of latent states and the m...
Article
In the original publication of the article, the line after equation (5) was published incorrectly.
Preprint
Cluster-weighted models (CWMs) extend finite mixtures of regressions (FMRs) in order to allow the distribution of covariates to contribute to the clustering process. In a matrix-variate framework, the matrix-variate normal CWM has been recently introduced. However, problems may be encountered when data exhibit skewness or other deviations from norm...
Article
Much work has been done in the area of the cluster weighted model (CWM), which extends the finite mixture of regression model to include modelling of the covariates. Although many types of distributions have been considered for both the response(s) and covariates, to our knowledge skewed distributions have not yet been considered in this paradigm....
Article
Analysis of matrix-variate data is becoming ever more prevalent in the literature, especially in the area of clustering and classification. Real data, including real matrix-variate data, are often contaminated by potential outlying observations. Their detection, as well as the development of models insensitive to their presence, is particularly imp...
Preprint
Full-text available
Hidden Markov models (HMMs) have been extensively used in the univariate and multivariate literature. However, interest in the analysis of matrix-variate data has increased in recent years. In this manuscript, we introduce HMMs for matrix-variate longitudinal data, by assuming a matrix normal distribution in each hidden state. Su...
Article
Full-text available
Finite mixtures of regressions with fixed covariates are a commonly used model-based clustering methodology to deal with regression data. However, they assume assignment independence, i.e., the allocation of data points to the clusters is made independently of the distribution of the covariates. To take into account the latter aspect, finite mixtur...
Article
Full-text available
Despite recent methodological advances in hidden Markov regression models and a rapid increase in their application in a wide range of empirical settings, complex clustering-based research questions that include the contribution of the covariates set to the classification and the presence of atypical observations are often addressed ignoring the po...
Article
Full-text available
Many statistical problems involve the estimation of a d×d orthogonal matrix Q. Such an estimation is often challenging due to the orthonormality constraints on Q. To cope with this problem, we use the well-known PLU decomposition, which factorizes any invertible d×d matrix as the product of a d×d permutation matrix P, a d×d unit lower triangular ma...
Preprint
Full-text available
Finite mixtures of regressions with fixed covariates are a commonly used model-based clustering methodology to deal with regression data. However, they assume assignment independence, i.e. the allocation of data points to the clusters is made independently of the distribution of the covariates. In order to take into account the latter aspect, finit...
Preprint
Much work has been done in the area of the cluster weighted model (CWM), which extends the finite mixture of regression model to include modelling of the covariates. Although many types of distributions have been considered for both the response and covariates, to our knowledge skewed distributions have not yet been considered in this paradigm. Her...
Chapter
One of the challenges in cluster analysis is the evaluation of the obtained clustering results without using auxiliary information. To this end, a common approach is to use internal validity criteria. For mixtures of linear regressions whose parameters are estimated via the maximum likelihood approach, we propose a three-term decomposition of the t...
Article
The search for appropriate models for describing the currency return distribution is one of the main interests not only in finance, but also in the more recent trans-disciplinary econophysics research field. This search has recently focused on cryptocurrencies, due to their proliferation. Although there is no agreement on what theoretical models a...
Article
In allometric studies, the joint distribution of the log-transformed morphometric variables is typically symmetric and with heavy tails. Moreover, in the bivariate case, it is customary to explain the morphometric variation of these variables by fitting a convenient line, as for example the first principal component (PC). To account for all these p...
Article
The expectation–maximization (EM) algorithm is a familiar tool for computing the maximum likelihood estimate of the parameters in hidden Markov and semi‐Markov models. This paper carries out a detailed study on the influence that the initial values of the parameters impose on the results produced by the algorithm. We compare random starts and parti...
Article
The research objective of this paper is to handle situations where the empirical distribution of multivariate real-valued data is elliptical and with heavy tails. Many statistical models already exist that accommodate these peculiarities. This paper enriches this branch of literature by introducing the multivariate tail-inflated normal (MTIN) distr...
Article
Correct modeling of the insurance loss distribution is crucial in the insurance industry. This distribution is generally highly positively skewed, unimodal hump-shaped, and with a heavy right tail. Compound models are a profitable way to accommodate situations in which some of the probability masses are shifted to the tails of the distribut...
Article
Two matrix-variate distributions, both elliptical heavy-tailed generalizations of the matrix-variate normal distribution, are introduced. They belong to the normal scale mixture family, and are respectively obtained by choosing a convenient shifted exponential or uniform as mixing distribution. Moreover, they have a closed-form for the probability d...
Preprint
This paper introduces the multivariate tail-inflated normal (MTIN) distribution, an elliptical heavy-tailed generalization of the multivariate normal (MN). The MTIN belongs to the family of MN scale mixtures by choosing a convenient continuous uniform as mixing distribution. Moreover, it has a closed-form for the probability density function charact...
Preprint
Analysis of three-way data is becoming ever more prevalent in the literature, especially in the area of clustering and classification. Real data, including real three-way data, are often contaminated by potential outlying observations. Their detection, as well as the development of robust models insensitive to their presence, is particularly import...
Article
This article shows how multivariate elliptically contoured (EC) distributions, parameterized according to the mean vector and covariance matrix, can be built from univariate standard symmetric distributions. The obtained distributions are referred to as moment-parameterized EC (MEC) herein. As a further novelty, the article shows how to polynomiall...
Article
In allometric studies, the joint distribution of the log‐transformed morphometric variables is typically elliptical and with heavy tails. To account for these peculiarities, we introduce the multivariate shifted exponential normal (MSEN) distribution, an elliptical heavy‐tailed generalization of the multivariate normal (MN). The MSEN belongs to th...
Article
Full-text available
Mixtures of regression models (MRMs) are widely used to investigate the relationship between variables coming from several unknown latent homogeneous groups. Usually, the conditional distribution of the response in each mixture component is assumed to be (multivariate) normal (MN-MRM). To robustify the approach with respect to possible elliptical h...
Article
The multivariate contaminated normal (MCN) distribution represents a simple heavy-tailed generalization of the multivariate normal (MN) distribution to model elliptical contoured scatters in the presence of mild outliers (also referred to as ‘bad’ points herein) and automatically detect bad points. The price of these advantages is two additional pa...
Article
We propose a model-based clustering procedure where each component can take into account cluster-specific mild outliers through a flexible distributional assumption, and a proportion of observations is additionally trimmed. We propose a penalized likelihood approach for estimation and selection of the proportions of mild and gross outliers. A theor...
Chapter
The Mincer human capital earnings function is a regression model that relates an individual’s earnings to schooling and experience. It has been used to explain individual behavior with respect to educational choices and to indicate productivity in a large number of countries and across many different demographic groups. However, recent empirical studi...
Article
The contaminated Gaussian distribution represents a simple heavy-tailed elliptical generalization of the Gaussian distribution; unlike the often-considered t-distribution, it also allows for automatic detection of mild outlying or “bad” points in the same way that observations are typically assigned to the groups in the finite mixture model context...
Chapter
In many countries, income inequality has reached its highest level over the past half century. In the labor market, the technological progress has widened the earnings gap between high- and low-skilled workers. Changes in the structure of households, with a growing percentage of single-headed households, and in family formation, with an increased e...
Article
Full-text available
While latent class (LC) models with distal outcomes are becoming popular in the literature as a consequence of the increasing use of stepwise estimators, these models still suffer from severe shortcomings. Namely, using the currently available stepwise estimators, the direct effects between the distal outcome and the indicators of the LC membership cann...
Article
One of the challenges in cluster analysis is the evaluation of the obtained clustering results without using auxiliary information. To this end, a common approach is to use internal validity criteria. For mixtures of linear regressions whose parameters are estimated by maximum likelihood, we propose a three-term decomposition of the total sum of sq...
Preprint
Many statistical problems involve the estimation of a d×d orthogonal matrix Q. Such an estimation is often challenging due to the orthonormality constraints on Q. To cope with this problem, we propose a very simple decomposition for orthogonal matrices which we abbreviate as PLR decomposition. It produces...
Article
The empirical distribution of the loss given default (LGD) has support [0,1], contains an excess of 0s and 1s, and is often multimodal on (0,1). Though some parametric models have been used in the credit risk literature to model the LGD distribution, these peculiarities call for more flexible approaches. Thus, we introduce a zero‐and‐one inflated m...
Article
We introduce multivariate models for the analysis of stock market returns. Our models are developed under hidden Markov and semi-Markov settings to describe the temporal evolution of returns, whereas the marginal distribution of returns is described by a mixture of multivariate leptokurtic-normal (LN) distributions. Compared to the normal distribut...
Article
Mixtures of multivariate contaminated shifted asymmetric Laplace distributions are developed for handling asymmetric clusters in the presence of outliers (also referred to as bad points herein). In addition to the parameters of the related non-contaminated mixture, for each (asymmetric) cluster, our model has one parameter controlling the proportio...
Article
Insurance and economic data are often positive, and we need to take this peculiarity into account when choosing a statistical model for their distribution. An example is the inverse Gaussian (IG), which is one of the best-known and most widely considered distributions with positive support. With the aim of increasing the use of the IG distribution on insurance...
Preprint
The multivariate contaminated normal (MCN) distribution represents a simple heavy-tailed generalization of the multivariate normal (MN) distribution to model elliptical contoured scatters in the presence of mild outliers, referred to as "bad" points. The MCN can also automatically detect bad points. The price of these advantages is two additional p...
Article
Full-text available
Cluster-weighted models (CWMs) are mixtures of regression models with random covariates. However, despite having recently become rather popular in statistics and data mining, CWMs still lack support within the most popular statistical suites. In this paper, we introduce flexCWM, an R package specifically conceived for fitting CWMs...
Article
Full-text available
We introduce the R package ContaminatedMixt, conceived to disseminate the use of mixtures of multivariate contaminated normal distributions as a tool for robust clustering and classification under the common assumption of elliptically contoured groups. Thirteen variants of the model are also implemented to introduce parsimony. The expectation-condit...
Preprint
We explore the possibility of discovering extreme voting patterns in the U.S. Congressional voting records by drawing ideas from the mixture of contaminated normal distributions. A mixture of latent trait models via contaminated normal distributions is proposed. We assume that the low dimensional continuous latent variable comes from a contaminated...
Article
A time‐varying latent variable model is proposed to jointly analyze multivariate mixed‐support longitudinal data. The proposal can be viewed as an extension of hidden Markov regression models with fixed covariates (HMRMFCs), which are the state of the art for modelling longitudinal data, with a special focus on the underlying clustering structure. H...
Article
The Gaussian cluster-weighted model (CWM) is a mixture of regression models with random covariates that allows for flexible clustering of a random vector composed of a response variable and some covariates. In each mixture component, a Gaussian distribution is adopted for both the covariates and the response given the covariates. To make the approa...
Article
Insurance and economic data are frequently characterized by positivity, skewness, leptokurtosis, and multi-modality; although many parametric models have been used in the literature, often these peculiarities call for more flexible approaches. Here, we propose a finite mixture of contaminated gamma distributions that provides a better characterizat...
Article
Full-text available
Usually in Latent Class Analysis (LCA), external predictors are taken to be cluster conditional probability predictors (LC models with covariates), and/or score conditional probability predictors (LC regression models). In such cases, their distribution is not of interest. Class specific distribution is of interest in the distal outcome model, when...
Article
The distribution of insurance losses has a positive support and is often unimodal hump-shaped, right-skewed and with heavy tails. In this work, we introduce a 3-parameter compound model to account for all these peculiarities. As conditional distribution, we consider a 2-parameter unimodal hump-shaped distribution with positive support, parameterize...
Article
The inverse Gaussian (IG) is one of the best-known and most widely considered distributions with positive support. We propose a convenient mode-based parameterization yielding the reparametrized IG (rIG) distribution; it allows/simplifies the use of the IG distribution in various statistical fields, and we give some examples in nonparametric statistics, robus...
Article
Surveys are used to infer the level of social integration of immigrants. Item response theory helps to describe the relationship among responses to test items and latent traits of interest. However, in the presence of nonignorable missing data, which are omitted responses depending on the latent traits to be measured, estimates of the model paramet...
Article
Full-text available
The Ljung–Box test is typically used to test serial independence even if, by construction, it is generally powerful only in the presence of pairwise linear dependence between lagged variables. To overcome this problem, Bagnato et al. recently proposed a simple statistic defining a serial independence test which, unlike the Ljung–Box test, is...
Article
This article proposes the elliptical multivariate leptokurtic‐normal (MLN) distribution to fit data with excess kurtosis. The MLN distribution is a multivariate Gram–Charlier expansion of the multivariate normal (MN) distribution and has a closed‐form representation characterized by one additional parameter denoting the excess kurtosis. It is ob...
Article
Portmanteau tests are typically used to test serial independence even if, by construction, they are generally powerful only in the presence of pairwise dependence between lagged variables. In this paper we present a simple statistic defining a new serial independence test which is able to detect more general forms of dependence. In particular, differen...
Article
In recent years, increasing attention has been directed toward problems inherent to quality control in healthcare services. In particular, it is necessary to measure effectiveness with respect to improving healthcare outcomes of diagnostic procedures or specific treatment episodes. The performance of hospitals is usually evaluated by multilevel mod...
Article
Full-text available
The modelling of animal movement is an important ecological and environmental issue. It is well-known that animals change their movement patterns over time, according to observable and unobservable factors. To trace the dynamics of behaviors, to identify factors influencing these dynamics and unobserved characteristics driving intra-subjects correl...
Article
Full-text available
Background: In broad terms, and apart from ethnic discriminatory rules enforced in some places and at some times, residential segregation may be ascribed both to economic inhomogeneities in the urban space (e.g., in the cost of rents, or in occupation opportunities) and to spatial attraction among individuals sharing the same group identity and cult...
Article
We introduce the R package ContaminatedMixt, conceived to disseminate the use of mixtures of multivariate contaminated normal distributions as a tool for robust clustering and classification under the common assumption of elliptically contoured groups. Thirteen variants of the model are also implemented to introduce parsimony. The expectation-condi...
Article
A class of multivariate linear models under the longitudinal setting, in which unobserved heterogeneity may evolve over time, is introduced. A latent structure is considered to model heterogeneity, having a discrete support and following a first-order Markov chain. Heavy-tailed multivariate distributions are introduced to deal with outliers. Maximu...
Article
The autodependogram is a graphical device recently proposed in the literature to analyze autodependencies. It is defined by computing the classical Pearson χ²-statistics of independence at various lags in order to point out the presence of lag-dependencies. This paper proposes an improvement of this diagram obtained by substituting the χ²-statistics with an e...
Article
Gaussian mixture models with eigen-decomposed covariance structures, i.e. the Gaussian parsimonious clustering models (GPCM), make up the most popular family of mixture models for clustering and classification. Although the GPCM family has been used for almost 20 years, selecting the best member of the family in a given situation remains a troubles...
Article
The analysis of the decision boundaries plays an important role in understanding the characteristics of a classifier in the framework of model-based clustering and discriminant analysis. The wider the family of decision boundaries generated by a classifier, the greater its flexibility for classification purposes. In this paper, we present rigor...
Article
The Gaussian hidden Markov model (HMM) is widely considered for the analysis of heterogeneous continuous multivariate longitudinal data. To robustify this approach with respect to possible elliptical heavy-tailed departures from normality, due to the presence of outliers, spurious points, or noise (collectively referred to as bad points herein), th...
Article
The cluster-weighted model (CWM) is a mixture model with random covariates that allows for flexible clustering/classification and distribution estimation of a random vector composed of a response variable and a set of covariates. Within this class of models, the generalized linear exponential CWM is here introduced especially for modeling bivariate...
Article
Full-text available
Cluster-weighted models (CWMs) are a flexible family of mixture models for fitting the joint distribution of a random vector composed of a response variable and a set of covariates. CWMs act as a convex combination of the products of the marginal distribution of the covariates and the conditional distribution of the response given the covariates. I...
Article
Cluster-weighted models represent a convenient approach for model-based clustering, especially when the covariates contribute to defining the cluster-structure of the data. However, applicability may be limited when the number of covariates is high and performance may be affected by noise and outliers. To overcome these problems, common/uncommon \(...
Article
Full-text available
Detecting and measuring lag-dependencies is very important in time-series analysis. This study is commonly carried out by focusing on the linear lag-dependencies via the well-known autocorrelogram. However, in practice, there are many situations in which the autocorrelogram fails because of the nonlinear structure of the serial dependence. To cope...
Article
Full-text available
A family of parsimonious Gaussian cluster-weighted models (CWMs) is presented. This family concerns a multivariate extension to cluster-weighted modelling that can account for correlations between multivariate responses. Parsimony is attained by constraining parts of an eigen-decomposition imposed on the component covariance matrices. A sufficient c...
Article
Full-text available
The Gaussian cluster-weighted model (CWM) is a mixture of regression models with random covariates that allows for flexible clustering of a random vector composed of response variables and covariates. In each mixture component, it adopts a Gaussian distribution for both the covariates and the responses given the covariates. To robustify the approac...
Article
Full-text available
The contaminated Gaussian distribution represents a simple robust elliptical generalization of the Gaussian distribution; unlike the often-considered t-distribution, it also allows for automatic detection of outliers, spurious points, or noise (collectively referred to as bad points herein). Starting from this distribution, we propose t...
Article
Full-text available
This article reviews some nonparametric serial independence tests based on measures of divergence between densities. Among others, the well-known Kullback–Leibler, Hellinger, Tsallis, and Rosenblatt divergences are analyzed. Moreover, their copula-based version is taken into account. Via a wide simulation study, the performances of the considered s...
Article
Various parametric/nonparametric techniques have been proposed in the literature to graduate mortality data as a function of age. Nonparametric approaches, such as kernel smoothing regression, are often preferred because they do not assume any particular mortality law. Among the existing kernel smoothing approaches, the recently proposed (univar...
Article
The dissimilarity index of Duncan and Duncan is widely used in a broad range of contexts to assess the overall extent of segregation in the allocation of two groups in two or more units. Its sensitivity to random allocation implies an upward bias with respect to the unknown amount of systematic segregation. In this article, following a multinomial...
Article
Full-text available
Item response theory (IRT) models are a class of statistical models used to describe the response behaviors of individuals to a set of items having a certain number of options. They are adopted by researchers in social science, particularly in the analysis of performance or attitudinal data, in psychology, education, medicine, marketing and other ...
Article
In the context of mixture models with random covariates, this article presents the polynomial Gaussian cluster-weighted model (CWM). It extends the linear Gaussian CWM, for bivariate data, in a twofold way. First, it allows for possible nonlinear dependencies in the mixture components by considering a polynomial regression. Second, it is not restri...
Article
Full-text available
Gaussian mixture models with eigen-decomposed covariance structures make up the most popular family of mixture models for clustering and classification, i.e., the Gaussian parsimonious clustering models (GPCM). Although the GPCM family has been used for almost 20 years, selecting the best member of the family in a given situation remains a troubles...
Article
Full-text available
We introduce the R package DBKGrad, conceived to facilitate the use of kernel smoothing in graduating mortality rates. The package implements univariate and bivariate adaptive discrete beta kernel estimators. Discrete kernels have been preferred because, in this context, variables such as age, calendar year and duration, are pragmatically considere...
Article
Full-text available
A mixture of contaminated Gaussian distributions is developed for robust mixture model-based clustering. In addition to the usual parameters, each component of our contaminated mixture has a parameter controlling the proportion of outliers, spurious points, or noise (collectively referred to as bad points herein) and one specifying the degree of co...
Article
A novel family of twelve mixture models with random covariates, nested in the linear t cluster-weighted model (CWM), is introduced for model-based clustering. The linear t CWM was recently presented as a robust alternative to the better known linear Gaussian CWM. The proposed family of models provides a unified framework that also includes the...
Article
Full-text available
Contaminated mixture models are developed for model-based clustering of data with asymmetric clusters as well as spurious points, outliers, and/or noise. Specifically, we introduce a contaminated mixture of contaminated shifted asymmetric Laplace distributions and a contaminated mixture of contaminated skew-normal distributions. In each case, mixtu...
Article
It is well known that nonignorable item nonresponse may occur when the cause of the nonresponse is the value of the latent variable of interest. In these cases, a refusal by a respondent to answer specific questions in a survey should sometimes be treated as a nonignorable item nonresponse. The Rasch-Rasch model (RRM) is a new two-dimensional...
Article
Full-text available
This paper enlarges the covariance configurations, on which the classical linear discriminant analysis is based, by considering the four models arising from the spectral decomposition when eigenvalue and/or eigenvector matrices are allowed to vary or not between groups. Similarly to the classical approach, the assessment of these configurations i...