Vito Muggeo, University of Palermo | UNIPA · Dipartimento di Scienze Economiche, Aziendali e Statistiche (SEAS)
Vito Muggeo
PhD
Professor of Statistics, University of Palermo, ITALY
Publications: 101
Reads: 133,988
Citations: 5,947
Publications (101)
Background/Objectives: Stroke is a leading cause of mortality and disability worldwide, ranking as the second most common cause of death and the third in disability-adjusted life-years lost. Ischaemic stroke, which constitutes the majority of cases, poses significant public health and economic challenges. This study evaluates trends in ischaemic st...
Background: Serum albumin is crucial for critically ill patients. To date, several reports have focused on the influence of lower albumin levels on poorer prognosis and disease outcome in different subsets of critical clinical conditions varying from sepsis, to cirrhosis, renal failure, and cancer. In the last few years, investigators reported the...
We present a unified framework able to fit the entire quantile process, namely to estimate simultaneously multiple non-crossing quantile curves. The framework relies on assuming each regression parameter varies smoothly across the percentile direction according to B-splines whose coefficients obey proper restrictions. Multiple linear and penalized...
Maternal-Fetal Attachment (MFA) delineates the emotional, cognitive, and behavioral aspects that mothers develop toward the unborn baby during pregnancy. The literature indicates that optimal attachment in pregnancy represents a protective factor for the mother-child attachment bond after birth and child development outcomes. To date, there are few...
In high-dimensional regression modelling, the number of candidate covariates to be included in the predictor is quite large, and variable selection is crucial. In this work, we propose a new penalty able to guarantee both sparse variable selection, i.e. exactly zero regression coefficient estimates, and quasi-unbiasedness for the coefficients of 's...
Childhood acute lymphoblastic leukemia (ALL) survivors who underwent chemotherapy with anthracyclines have an increased cardiovascular risk. The aim of the study was to evaluate left and right cardiac chamber performances and vascular endothelial function in childhood ALL survivors. Fifty-four ALL survivors and 37 healthy controls were enrolled. Al...
We propose a new non-convex penalty in linear regression models. The new penalty function can be considered a competitor of the LASSO, SCAD or MCP penalties, as it guarantees sparse variable selection while reducing bias for the non-null estimates. We introduce the methodology and present some comparisons among different approaches.
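For orientation, the three competitor penalties named above can be written down directly. The following Python sketch uses the standard textbook forms of the LASSO, SCAD and MCP penalties. The function names and the default tuning constants (a = 3.7, gamma = 3.0) are illustrative assumptions. The abstract does not give the form of the newly proposed penalty, so it is not reproduced here.

```python
def lasso_penalty(beta, lam):
    """LASSO penalty: grows linearly in |beta|, hence biased for large effects."""
    return lam * abs(beta)


def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty (requires a > 2): linear near zero, then flattens to a constant."""
    b = abs(beta)
    if b <= lam:
        return lam * b
    if b <= a * lam:
        return (2 * a * lam * b - b * b - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2


def mcp_penalty(beta, lam, gamma=3.0):
    """MCP penalty (requires gamma > 1): concave up to gamma * lam, constant afterwards."""
    b = abs(beta)
    if b <= gamma * lam:
        return lam * b - b * b / (2 * gamma)
    return gamma * lam * lam / 2


if __name__ == "__main__":
    # Compare the penalties on a small and a large coefficient.
    for beta in (0.5, 10.0):
        print(beta, lasso_penalty(beta, 1.0), scad_penalty(beta, 1.0), mcp_penalty(beta, 1.0))
```

All three penalties behave like lam * |beta| close to zero, which is what produces exact zeros. SCAD and MCP stop growing for large |beta|, which is the mechanism that reduces bias for the non-null estimates.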
We propose a new adaptive penalty for smoothing via penalized splines. The new form of adaptive penalization is based on penalizing the differences of the coefficients of adjacent bases using penalties based on the L1 norm. This makes it possible to estimate curves with varying amounts of smoothness. Comparisons with respect to some competitors are pr...
Purpose
Childhood acute lymphoblastic leukemia (ALL) survivors who underwent chemotherapy with anthracyclines have an increased cardiovascular risk. Few data are available on right ventricular (RV) function in these patients. The aim of the study was to evaluate left and right cardiac chambers and vascular endothelial function in ALL survivors.
Methods
We...
We discuss a statistical framework to monitor and predict the COVID-19 epidemic outbreak. More specifically, we use segmented regression to quantify the deceleration of epidemic spreading likely due to the effectiveness of lockdown policy and parametric nonlinear regression to predict the number of confirmed cases in the short-to-medium term.
Computing the power level or the sample size for segmented regression analysis with unknown breakpoint
This paper deals with power analysis in segmented regression, namely estimation of sample size or power level when the study data being collected focus on a covariate expected to affect the mean response via a piecewise relationship with unknown breakpoint. The approach relies on the recently proposed pseudo Score statistic, of which the sampling d...
Background: Evidence on the impact of COVID-19-related stress exposure on prenatal attachment in pregnant women is lacking. In this study we sought to assess the effect of psychological distress and risk perception of COVID-19 on prenatal attachment in an Italian sample of pregnant women.
Methods: 1179 pregnant women completed an anonymous o...
The prevalence of ideal cardiovascular health (CVH) among adults in the United States is low, and decreases with age. Our objective was to identify specific age windows when the loss of CVH accelerates, to ascertain preventive opportunities for intervention. This study pools data from five longitudinal cohorts (Project Heartbeat!, Cardiovascular Ri...
Background:
Asthma patterns are not well established in epidemiological studies.
Aim:
To assess asthma patterns and risk factors in an adult general population sample.
Methods:
In total, 452 individuals reporting asthma symptoms/diagnosis in previous surveys participated in the AGAVE survey (2011-2014). Latent transition analysis (LTA) was per...
When performing Principal Components Analysis, one is confronted with a sort of dilemma, namely whether the covariance or the correlation matrix should be used to extract eigenvalues and eigenvectors. This paper discusses some issues, known but rarely stressed, in using such conventional choices and also proposes a new simple alternative to cov...
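The dilemma can be made concrete with two variables recorded on very different scales. The Python sketch below is an illustration only: the data, variable names and helper functions are made up, and it covers only the two conventional choices, not the alternative proposed in the paper. It extracts the leading principal component from the covariance matrix and from the correlation matrix using the closed-form eigen-decomposition of a symmetric 2x2 matrix.

```python
import math


def covariance_terms(x, y):
    """Return (var_x, cov_xy, var_y) using the unbiased n - 1 denominator."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / (n - 1)
    syy = sum((b - my) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    return sxx, sxy, syy


def leading_eigen(sxx, sxy, syy):
    """Largest eigenvalue and unit eigenvector of [[sxx, sxy], [sxy, syy]]."""
    lam = (sxx + syy) / 2 + math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    vx, vy = sxy, lam - sxx
    if vx == 0 and vy == 0:  # diagonal matrix whose first entry is the largest
        vx, vy = 1.0, 0.0
    norm = math.hypot(vx, vy)
    return lam, (vx / norm, vy / norm)


# Hypothetical data: height in metres, weight in grams.
height = [1.60, 1.65, 1.70, 1.75, 1.80, 1.85]
weight = [55000.0, 62000.0, 60000.0, 72000.0, 70000.0, 80000.0]

sxx, sxy, syy = covariance_terms(height, weight)
r = sxy / math.sqrt(sxx * syy)

lam_cov, vec_cov = leading_eigen(sxx, sxy, syy)  # PCA on the covariance matrix
lam_cor, vec_cor = leading_eigen(1.0, r, 1.0)    # PCA on the correlation matrix

print("covariance loadings:", vec_cov, "variance share:", lam_cov / (sxx + syy))
print("correlation loadings:", vec_cor, "variance share:", lam_cor / 2)
```

With the covariance matrix the first component is essentially the variable with the larger variance (here weight, simply because it is measured in grams), whereas with the correlation matrix both variables receive equal loadings.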
Selecting number of breakpoints in segmented regression: implementation in the R package segmented
R code to select the number of breakpoints in segmented regression. The R package segmented is required.
A model-based approach is developed to establish clinically relevant cut-offs for bronchodilator response in children. A dynamic nomogram is proposed for predicting the probability of asthma in childhood clinical practice.
We propose an iterative algorithm to select the smoothing parameters in additive quantile regression, wherein the functional forms of the covariate effects are unspecified and expressed via B-spline bases with difference penalties on the spline coefficients. The proposed algorithm relies on viewing the penalized coefficients as random effects from...
A few R functions for COVID-19 modelling and monitoring. Based on the R package segmented. A technical report is available (on RG too) to illustrate such functions.
We discuss a statistical framework to monitor the COVID-19 epidemic outbreak. More specifically, we use segmented regression to quantify the deceleration of epidemic spreading likely due to the effectiveness of lockdown policy. We present R code to analyze daily time series of COVID-19 total cases in some countries.
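The idea of reading a slope change as a deceleration can be sketched with a one-breakpoint model, y = b0 + b1*x + d*max(x - psi, 0), where d is the change in slope after the breakpoint psi. The Python sketch below is an assumption-laden illustration: it estimates psi by a plain grid search over candidate values with least squares at each candidate, which is not the iterative algorithm implemented in the R package segmented, and the data and function names are made up.

```python
def solve_linear_system(a, b):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_ols(rows, y):
    """Ordinary least squares via the normal equations; returns (coefficients, RSS)."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    rss = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(rows, y))
    return beta, rss


def fit_one_breakpoint(x, y, candidates):
    """Grid search: keep the candidate breakpoint with the smallest RSS."""
    best = None
    for psi in candidates:
        rows = [[1.0, xi, max(xi - psi, 0.0)] for xi in x]
        beta, rss = fit_ols(rows, y)
        if best is None or rss < best[2]:
            best = (psi, beta, rss)
    return best


# Made-up, noise-free series (think of log cumulative counts by day):
# the slope is 0.5 up to day 12 and 0.1 afterwards.
days = [float(t) for t in range(30)]
log_cases = [1.0 + 0.5 * t - 0.4 * max(t - 12.0, 0.0) for t in days]

psi_hat, beta_hat, rss_hat = fit_one_breakpoint(days, log_cases, [float(c) for c in range(3, 27)])
print("breakpoint:", psi_hat)
print("slope before:", beta_hat[1], "slope after:", beta_hat[1] + beta_hat[2])
```

The candidate grid is kept away from the ends of the series so that observations exist on both sides of every candidate. A negative slope difference (beta_hat[2]) is what would be interpreted as a deceleration.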
Background:
Few population-based studies on the effects of environmental exposure variation exist.
Aim:
Assessing respiratory symptom/disease incidence related to risk factor exposure changes.
Methods:
A longitudinal general population sample from two surveys (PISA2:1991-1993; PISA3:2009-2011; no. = 970), aged ≥20 years at baseline, completed...
In this short note we present and briefly discuss the R package islasso dealing with regression models having a large number of covariates. Estimation is carried out by penalizing the coefficients via a quasi-lasso penalty, wherein the nonsmooth lasso penalty is replaced by its smooth counterpart determined iteratively by data according to the indu...
Background
Vitamin D (25-OHD) has a role in bone health after treatment for cancer. 25-OHD deficiency has been associated with risk factors for cardiovascular disease, but no data focusing on this topic in childhood cancer survivors have been published. We investigated the 25-OHD status in children treated for acute lymphoblastic leukemia (ALL), an...
This paper focuses on hypothesis testing in lasso regression, when one is interested in judging statistical significance for the regression coefficients in a regression equation involving many covariates. To get reliable p-values, we propose a new lasso-type estimator relying on the idea of induced smoothing which allows one to obtain appropriate...
The scientific literature has considered the role of perioral forces in orthodontic treatment planning in children as well as in adults. Although contradictory results have been reported, several authors have demonstrated the relationship between lip and tongue pressure related to Angle classification, occlusion characteristics and oral habits. The...
Hereditary angioedema (HAE) is a rare autosomal dominant disorder characterized by a deficiency of C1 esterase inhibitor which causes episodic swellings of subcutaneous tissues, bowel walls and upper airways that are disabling and potentially life-threatening. We evaluated n = 17 patients with confirmed HAE diagnosis during attack and remission sta...
We provide some comments about regression with log Normal or log Normal-type data. R code with some examples is presented to illustrate fitting of linear and segmented linear regression models. Details, including references, are intentionally skipped.
An R package to perform simple multiple linear regression with log Normal errors and additive regression equation. The .rar file includes .tar.gz (source) and .zip (windows binary) files.
Change-point detection in abrupt change models is a very challenging research topic in many fields of both methodological and applied Statistics. Due to strong irregularities, discontinuity and non-smoothness, likelihood-based procedures are awkward; for instance, usual optimization methods do not work, and grid search algorithms represent the mos...
Importance
Given that hypertension remains a leading risk factor for chronic disease globally, there are substantial ongoing efforts to define the optimal range of blood pressure (BP).
Objective
To identify a common threshold level above which BP rise tends to accelerate in progression toward hypertension.
Design, Setting, and Participants
This l...
We provide some examples and comments on fitting nonparametric quantile regression via the R package quantregGrowth. This is a short note meant to be a quick reference for the practitioner. Neither theory nor references are reported.
This paper is concerned with interval estimation for the breakpoint parameter in segmented regression. We present score-type confidence intervals derived from the score statistic itself and from the recently proposed gradient statistic. Due to lack of regularity conditions of the score, non-smoothness and non-monotonicity, naive application of the...
Introduction:
Bias may occur in randomized clinical trials in favor of the new experimental treatment because of unblinded assessment of subjective endpoints or wish bias. Using results from published trials, we analyzed and compared the treatment effect of hepatitis C antiviral interferon therapies experimental or control.
Methods:
Meta-regress...
We present simple R code to carry out score inference on the regression coefficients of logit regression estimated via the Firth penalized likelihood. An example is presented to show the relevant R function.
Growth performance of rhizomes has become one of the most widely used descriptors for monitoring Posidonia oceanica seagrass dynamics and population status. However, the ability to detect any change of growth in space or in time is often confounded by natural age-induced decline. To overcome this problem, we have produced reference growth charts, which in othe...
This paper focuses on inferential tools in the logistic regression model fitted by the Firth penalized likelihood. In this context, the Likelihood Ratio statistic is often reported to be the preferred choice as compared to the ‘traditional’ Wald statistic. In this work, we consider and discuss a wider range of test statistics, including the robust...
We introduce a score-type statistic to test for a non-zero regression coefficient when the relevant term involves a nuisance parameter present only under the alternative. Despite the non-regularity and complexity of the problem and unlike the previous approaches, the proposed test statistic does not require the nuisance to be estimated. It is simpl...
R code (source .R) and data (.Rdata) to fit segmented mixed models in R. For details see: i) Muggeo et al., Statist Modell 2014; ii) https://www.researchgate.net/publication/292629179
Hereditary angioedema (HAE) is a rare autosomal dominant disorder, due to C1-inhibitor deficiency, which causes episodic swellings of subcutaneous tissues, bowel walls and upper airways which are disabling and potentially life-threatening. We evaluated n = 17 patients with confirmed HAE diagnosis in basal and crisis state and n = 19 healthy subject...
The presentations of the well-known likelihood ratio, Wald and score test statistics in textbooks appear to lack a unified graphical and geometrical interpretation. We present two simple graphical representations on a common scale for these three test statistics, and also the recently proposed gradient test statistic. These unified graphical displa...
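As a numerical companion to the four statistics mentioned above, the following Python sketch computes the likelihood ratio, Wald, score and gradient statistics for a simple one-parameter problem: n independent Poisson counts with the null hypothesis stated on the log-mean. The model, the function name and the example numbers are illustrative assumptions and are not taken from the paper. The log-mean parameterization is chosen only because the four statistics then take four different values.

```python
import math


def poisson_test_statistics(total, n, lambda0):
    """LR, Wald, score and gradient statistics for H0: mean = lambda0.

    `total` is the sum of n independent Poisson counts (must be positive).
    The parameter is beta = log(mean), so the log-likelihood is
    total * beta - n * exp(beta), up to an additive constant.
    """
    beta0 = math.log(lambda0)
    beta_hat = math.log(total / n)   # maximum likelihood estimate of beta
    diff = beta_hat - beta0
    score_fn = total - n * lambda0   # derivative of the log-likelihood at beta0
    info_null = n * lambda0          # Fisher information at beta0
    info_hat = total                 # Fisher information at beta_hat: n * exp(beta_hat)
    return {
        "lr": 2 * (total * diff - (total - n * lambda0)),
        "wald": diff ** 2 * info_hat,
        "score": score_fn ** 2 / info_null,
        "gradient": score_fn * diff,
    }


if __name__ == "__main__":
    # Ten counts summing to 30 (sample mean 3), tested against a mean of 2.
    for name, value in poisson_test_statistics(30, 10, 2.0).items():
        print(name, round(value, 4))
```

Each statistic measures the distance between the estimate and the null value differently: the LR on the log-likelihood scale, the Wald on the parameter scale, the score on the slope of the log-likelihood at the null value, and the gradient as the product of that slope and the parameter difference.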
We present a simple and effective iterative procedure to estimate segmented mixed models in a likelihood based framework. Random effects and covariates are allowed for each model parameter, including the changepoint. The method is practical and avoids the computational burdens related to estimation of nonlinear mixed effects models. A conventional...
Hereditary angioedema (HAE) is a rare autosomal dominant disorder characterized by a C1 esterase inhibitor (C1INH) deficiency which causes episodic swellings of subcutaneous tissues (extremities, genitals, face, trunk, or elsewhere), bowel walls (with intestinal swellings associated with abdominal pain, nausea, vomiting or diarrhea) and upper airways...
We discuss a practical and effective framework to estimate reference growth charts via regression quantiles. Inequality constraints are used to ensure both monotonicity and non-crossing of the estimated quantile curves and penalized splines are employed to model the nonlinear growth patterns with respect to age. A companion R package is presented a...
A novel statistical method for estimating the stages of maturity in male sharks and skates based on a segmented regression (SRM) is proposed. We hypothesize that this method is able to find the transition points in the three-phase relationship between total length (TL) and clasper length (CL). We applied an SRM to TL–CL data of nine species, from l...
We present an estimating framework for quantile regression where the usual L1-norm objective function is replaced by its smooth parametric approximation. An exact path-following algorithm is derived, leading to the well-known ‘basic’ solutions interpolating exactly a number of observations equal to the number of parameters being estimated. We disc...
Background:
The identification of temporal thresholds or shifts in animal movement informs ecologists of changes in an animal's behaviour, which contributes to an understanding of species' responses in different environments. In African savannas, rainfall, temperature and primary productivity influence the movements of large herbivores and drive c...
Table 1 Results for speed, local rainfall and regional rainfall breakpoints from all collars, obtained using multi-year piecewise regression models.
(DOC)
Table 1 Average speed, local rainfall and regional rainfall breakpoints for all collars, obtained using multiyear piecewise regression models.
(DOC)
Evaluation of a service on the basis of consumer opinion is a widespread practice in many fields. The assessment of perceived quality [7] of a service is generally carried out through administration of a questionnaire, composed of several items with responses posed on an ordinal scale, whereby each item represents an important feature of the evalua...
The transcription factor Yin Yang 1 (YY1) can favor several aspects of tumorigenesis. In turn, Raf-1 Kinase Inhibitor Protein (RKIP) inhibits the oncogenic activities of MAPK and NF-κB pathways and promotes drug-induced apoptosis. Mutual influences between YY1 and RKIP may exist, and there are already separate lines of evidence that relevant increases in Y...
A core task in analyzing randomized clinical trials based on longitudinal data is to find the best way to describe the change over time for each treatment arm. We review the implementation and estimation of a flexible piecewise Hierarchical Linear Model (HLM) to model change over time. The flexible piecewise HLM consists of two phases with differi...
Knowing the exact locations of multiple change points in genomic sequences serves several biological needs, for instance when data represent aCGH profiles and it is of interest to identify possibly damaged genes involved in cancer and other diseases. Only a few of the currently available methods deal explicitly with estimation of the number and loc...
We propose a simple and flexible framework for the crossing hazards problem. The method is not confined to two-sample problems, but may also work with continuous exposure variables whose effect changes its sign at some time-point of the observed follow-up time. Penalized partial likelihood estimation relies upon the assumption of a smooth hazard ra...
We discuss some issues relevant to the paper of Clegg and co-authors published in Statistics in Medicine; 28, 3670-3682. Emphasis is on computation of the variance of the sum of products of two estimates, slopes and breakpoints.
Here we present and discuss the R package modTempEff, which includes a set of functions aimed at modelling temperature effects on mortality with time series data. The functions fit a particular log-linear model which captures the two main features of mortality-temperature relationships: nonlinearity and distributed lag effect. Penalized splines...
In this article we propose a parsimonious parameterisation to model the so-called erosion of the covariate effect in the Cox model, namely a covariate effect approaching zero as the follow-up time increases. The proposed parameterisation is based on the segmented relationship where proper constraints are set to accommodate the erosion. Releva...
Exposure to ambient temperature can affect mortality levels for days or weeks following exposure, making modelling such effects in regression analysis of daily time-series data complex.
We propose a new approach involving a multi-lag segmented approximation to account for the non-linear effect of temperature and the use of two different penalized s...
We propose a segmented discrete-time model for the analysis of event history data in demographic research. Through a unified regression framework, the model provides estimates of the effects of explanatory variables and jointly accommodates flexibly non-proportional differences via segmented relationships. The main appeal relies on ready availabili...
We present a model for estimation of temperature effects on mortality that is able to capture jointly the typical features of every temperature-death relationship, that is, nonlinearity and delayed effect of cold and heat over a few days. Using a segmented approximation along with a doubly penalized spline-based distributed lag parameterization, es...
In a regression context, the dichotomization of a continuous outcome variable is often motivated by the need to express results in terms of the odds ratio, as a measure of association between the response and one or more risk factors. Starting from the recent work of Moser and Coombs (Stat Med 23:1843–1860, 2004) in this article we explore in a mix...
Generalized linear models (GLMs) outline a wide class of regression models where the effect of the explanatory variables on the mean of the response variable is modelled through the link function. The choice of the link function is typically overlooked in applications and the canonical link is commonly used. The estimation of GLMs with unspecifi...
This paper introduces Bivariate Distributed Lags Models (BDLMs) to investigate the synergic effect of temperature and airborne particles on mortality. These models seem particularly attractive since they make it possible to model interactions between such environmental variables while accounting for possible delayed effects. A B-spline framework is used to approximate t...
The temperature-mortality relationship follows a well-known J-V shaped pattern with mortality excesses recorded at cold and hot temperatures, and minimum at some optimal value, referred to as Minimum Mortality Temperature (MMT). As the MMT, which is used to measure the population heat-tolerance, is higher for people living in warmer places, it has bee...
A simple algorithm to fit the mean shift model is illustrated in this paper. The method is particularly efficient with any number of breaks and it also works when explanatory variables with fixed coefficients have to be considered. A few simulations have shown that the resulting estimator is approximately unbiased with variance decreasing as sa...
Hemorrhoidectomy is usually associated with significant pain during the postoperative period. The spasm of the internal sphincter seems to play an important role in the origin of pain. This study was designed to evaluate the effectiveness of intrasphincter injection of botulinum toxin after hemorrhoidectomy in reducing the maximum resting pressure...