
Xavier Puig
- Ph.D. in Statistics
- Professor at Polytechnic University of Catalonia
Abracadabra: let the data speak! I always work with real data at center stage. I love Bayesian data analysis.
About
59 Publications
5,896 Reads
1,758 Citations
Introduction
Abracadabra: let the data speak! I love the Bayesian approach and working with real data and problems. I analyze data on topics such as epidemiology, ecology, stylometry, electoral data, statistical process control, and marketing.
Publications (59)
The analysis of time series studies linking daily counts of a health indicator with environmental variables (e.g., mortality or hospital admissions with air pollution concentrations or temperature; or motor vehicle crashes with temperature) is usually conducted with Poisson regression models controlling for long-term and seasonal trends using tempo...
We use a Bayesian spatio-temporal model, first to smooth small-area initial life expectancy estimates in Barcelona for 2020, and second to predict what small-area life expectancy would have been in 2020 in the absence of COVID-19 using mortality data from 2007 to 2019. This allows us to estimate and map the small-area life expectancy loss, which can be...
Objective:
To assess whether alcohol intake is associated with the onset of migraine attacks up to 2 days after consumption in individuals with episodic migraine (EM).
Background:
Although alcohol has long been suspected to be a common migraine trigger, studies have been inconclusive in proving this association.
Methods:
This was an observatio...
Remotely monitoring industrial printers for an unexpected increase in warning and error messages reduces equipment downtime and increases customer satisfaction. Directly tracking raw error message rates during a given observation period poses some issues. Firstly, when a printer has not been used much during the observation period, its actual prin...
Modelling customer behaviour to predict their future purchase frequency and value is crucial when selecting customers for marketing activities. The profitability of a customer and their risk of inactivity are two important factors in this selection process. These indicators can be obtained using the well‐known Pareto/NBD model. Here we cluster cust...
When mapping life expectancy, and investigating its local variation in time, there is a conflict between using large areas and/or mortality data from long periods of time to have low variance life expectancy estimates, and using small areas and single‐year mortality data to explore the space–time variation of life expectancy in detail, without bias...
Background
The banning of mass-gathering indoor events to prevent SARS-CoV-2 spread has had an important effect on local economies. Despite growing evidence on the suitability of antigen-detecting rapid diagnostic tests (Ag-RDT) for mass screening at the event entry, this strategy has not been assessed under controlled conditions. We aimed to asses...
The banning of mass-gathering indoor events to prevent SARS-CoV-2 spread has had an important impact on local economies. We designed a randomized-controlled open-label trial to assess the effectiveness of a comprehensive preventive intervention for a mass-gathering indoor event (a live concert) based on systematic screening of attendees with antigen-...
In 2012 Catalan politics came to a standstill, and a push for independence was triggered by a huge pro-independence rally. That push led, in 2017, to the Spanish government taking over the Catalan government, and top Catalan officials either going into exile or being jailed, tried, and convicted for sedition in 2019. This article investigates how C...
Standard statistical tests for Hardy–Weinberg equilibrium assume the equality of allele frequencies in the sexes, whereas tests for the equality of allele frequencies in the sexes assume Hardy–Weinberg equilibrium. This produces a circularity in the testing of genetic variants, which has recently been resolved with new frequentist likelihood and ex...
The detection of outlying rows in a contingency table is tackled from a Bayesian perspective, by adapting the framework adopted by Box and Tiao for normal models to multinomial models with random effects. The solution assumes a 2–component mixture model of 2 multinomial continuous mixtures for them, one for the nonoutlier rows and the second one fo...
The X chromosome is a relatively large chromosome, harboring a lot of genetic information. Much of the statistical analysis of X-chromosomal information is complicated by the fact that males only have one copy. Recently, frequentist statistical tests for Hardy–Weinberg equilibrium have been proposed specifically for dealing with markers on the X ch...
We propose a statistical analysis of the heterogeneity of literary style in a set of texts that simultaneously uses different stylometric characteristics, such as word length and the frequency of function words. The data set consists of several tables with the same number of rows, with the i-th row of all tables corresponding to the i-th text. The ana...
In authorship attribution one assigns texts from an unknown author to either one of two or more candidate authors by comparing the disputed texts with texts known to have been written by the candidate authors. In authorship verification one decides whether a text or a set of texts could have been written by a given author. These two problems are us...
The statistical analysis of the heterogeneity of the style of a text often leads to the analysis of contingency tables of ordered rows. When multiple authorship is suspected, one can explore that heterogeneity through either a change-point analysis of these rows, consistent with sudden changes of author, or a cluster analysis of them, consistent wi...
When in geography one reconstructs individual behavior starting from aggregated data through ecological inference, a crucial aspect is the spatial variation of individual behavior. Basic ecological inference methods treat areas as if they were all exchangeable, which in geographical applications is questionable due to the existence of contextual ef...
To help settle the debate triggered the day after any election around the origin and destination of the vote of winners and losers, a Bayesian analysis of the results in a pair of consecutive elections is proposed. It is based on a model that simultaneously carries out a cluster analysis of the areas into which the results are broken down and links t...
A Bayesian cluster analysis for the results of an election based on multinomial mixture models is proposed. The number of clusters is chosen based on the careful comparison of the results with predictive simulations from the models, and by checking whether models capture most of the spatial dependence in the results. By implementing the analysis on...
This paper presents the results obtained using two strategies to eliminate fluctuations in the value of a critical dimension of a critical component of a car's braking system. The quality levels required in this product demand great precision in some dimensions and, due to the peculiarities of its manufacturing process, this can only be achieved by maki...
Objectives
With a system of qualitative indicators, to analyse the pharmaceutical prescription of general practitioners (GPs), and to evaluate the relationship of these indicators to the overall pharmaceutical prescription expenditure per inhabitant.
Design
Retrospective descriptive study.
Setting
Primary care.
Measurements and main results
The...
Mantle cell lymphoma (MCL) is a heterogeneous disease with most patients following an aggressive clinical course, whereas others have an indolent behavior. We conducted an integrative and multidisciplinary analysis of 177 MCL to determine whether the immunogenetic features of the clonotypic B-cell receptors (BcR) may identify different subsets of...
The aim of this article is to estimate the disability prevalence for the activities of daily living (ADL), the socioeconomic and demographic characteristics and the use of health services, distinguishing between the population receiving assistance for ADL and the population not receiving it. Cross-sectional study (Encuesta de Salud de Cataluña [ESCA] 2006). We have analyzed 17...
The zero truncated inverse Gaussian–Poisson model, obtained by first mixing the Poisson model assuming its expected value has an inverse Gaussian distribution and then truncating the model at zero, is very useful when modelling frequency count data. A Bayesian analysis based on this statistical model is implemented on the word frequency counts of v...
The analysis of word frequency count data can be very useful in authorship attribution problems. Zero-truncated generalized inverse Gaussian-Poisson mixture models are very helpful in the analysis of these kinds of data because their model-mixing density estimates can be used as estimates of the density of the word frequencies of the vocabulary. It...
Modelling word or species frequency count data through zero truncated Poisson mixture models allows one to interpret the model mixing distribution as the distribution of the word or species frequencies of the vocabulary or population. As a consequence, estimates of their mixing density can be used as a fingerprint of the style of the author in his...
The contribution of microRNAs (miR) to the pathogenesis of mantle cell lymphoma (MCL) is not well known. We investigated the expression of 86 mature miRs mapped to frequently altered genomic regions in MCL in CD5(+)/CD5(-) normal B cells, reactive lymph nodes, and purified tumor cells of 17 leukemic MCL, 12 nodal MCL, and 8 MCL cell lines. Genomic...
The inverse Gaussian-Poisson mixture model is very useful when modelling highly skewed non-negative integer data in fields as diverse as linguistics, ecology, market research, bibliometry, engineering and insurance. When using this statistical model on word or species frequency data, one typically truncates its sample space at zero...
To evaluate the impact of avoidable mortality on the changes in life expectancy at birth in Spain.
Standard life table techniques and the Arriaga method were used to calculate and to decompose life expectancy (LE) changes by age, effects and groups of causes of avoidable mortality among three periods (1987-91, 1992-6 and 1997-2001). A list of cause...
Differences in the geographical distribution of causes of mortality are information of great interest for combating them. The first hypotheses about the causes of many diseases were established by identifying a higher frequency of occurrence in geographical areas where certain factors are present or absent, ...
Objective
To analyze time trends in Catalonia (1986-2002) and Spain (1986-2001) in suicide mortality and its geographical variation by health areas in Catalonia.
Standard annual mortality rates were calculated by the direct method for Catalonia (1986-2002) and Spain (1986-2001) (standard population of Catalonia 1991). The adjusted annual percent change was a...
Background: To plan health service needs, it is essential to know how morbidity from psychological disorders is distributed across the territory, as well as the factors that determine it. The objective is to identify the factors that may explain the geographic variability of these disorders in Catalonia. Methods: The data...
Knowing the geographic distribution of the prevalence of psychological distress is important for mental health services planning. This study aims to identify the individual factors and those related to the area of residence that may explain the geographic variability of psychological distress (by healthcare districts) in Catalonia.
The data...
The aims of this study are to describe the time trends and the changes in the spatial distribution of stomach cancer mortality by gender, in Catalonia, Spain, in the period 1986-2000.
The mortality data comes from the Mortality Register for Catalonia at the Health Department and the population data from the Institute of Statistics for Catalonia. To...
Background and objective
The aims of this study are to describe the time trends and the changes in the spatial distribution of stomach cancer mortality by gender, in Catalonia, Spain, in the period 1986-2000.
Material and method
The mortality data comes from the Mortality Register for Catalonia at the Health Department and the population data from...
Aurora-A and hMPS1 are kinases involved in spindle checkpoint and centrosome duplication regulation and whose alterations have been associated with cell transformation and chromosome instability in different tumor models. In this study, we have examined the possible alterations of these genes in 58 mantle cell lymphomas (MCLs) and 4 MCL-related cel...
In this work it is shown how generalized linear models allow one to describe different patterns of temporal evolution of mortality data while also allowing for easy interpretation. As a practical application, the evolution of female breast cancer mortality in Catalonia from 1986 to 2000 is analyzed. Remarkably, the mortality from...
Gene-expression profiling has identified 3 major subgroups of diffuse large B-cell lymphoma (DLBCL): germinal center B-cell-like (GCB), activated B-cell-like (ABC), and primary mediastinal DLBCL (PMBCL). Using comparative genomic hybridization (CGH), we investigated the genetic alterations of 224 cases of untreated DLBCL (87 GCB-DLBCL, 77 ABC-DLBCL...
Objectives
To analyze time trends and geographical variation in avoidable mortality by health areas in Catalonia.
Avoidable mortality was analyzed according to the classification used by the Health Department of the Regional Government of Catalonia from 1986-2001 for health areas and causes were grouped as treatable and preventable. Standardized mortality rat...
To determine the clinicopathologic significance and prognostic value of chromosomal imbalances in diffuse large B-cell lymphomas (DLBCL).
We have examined 64 tumors at diagnosis using comparative genomic hybridization and real-time quantitative polymerase chain reaction (PCR), single-stranded conformational polymorphism, and DNA sequencing for the...
Background and objective: An operative health measure must include aspects such as life duration and its quality. The main purpose of this paper is to analyze the evolution of DFLE and HALE between 1994 and 2000. We also assess their potential applications to the evaluation of the year 2000 objectives of the Health Plan for Catalonia. Subjects and method: Mo...
The CHK2 gene encodes a serine/threonine kinase that plays a central role in DNA damage response pathways. To determine the potential role of CHK2 alterations in the pathogenesis of lymphoid neoplasms we have examined the gene status, protein, and mRNA expression in a series of tumors and nonneoplastic lymphoid samples. A heterozygous Ile157Th...
Chromosomal imbalances were examined by comparative genomic hybridization in 30 cases of B-cell chronic lymphocytic leukemia (CLL) at diagnosis, in sequential samples from 17 of these patients, and in 6 large B-cell lymphomas transformed from CLL [Richter's syndrome (RS)] with no available previous sample. The most common imbalances in CLL at diagn...
The aims of this study were to describe the trends of mortality from dementias according to gender and age in Catalonia (Spain) and to estimate their evolution from 1979 to 2003.
The dementia death data (ICD-9: 290-290.9, 298.9, 294.9, 331.0, and 331.2) between 1979 and 1998 come from the Catalonian Mortality Register of the Department of Health as...
We have analysed the influence of patient and hospital characteristics and region on the use of breast-conserving surgery (BCS) in Catalonia (Spain). Data for this study were obtained from the Catalan Hospital Discharge Data Base. The study period was 1995-1998. The Mantel-Haenszel test was used to examine overall trends in the use of BCS. A regr...
The BMI-1 gene is a putative oncogene belonging to the Polycomb group family that cooperates with c-myc in the generation of mouse lymphomas and seems to participate in cell cycle regulation and senescence by acting as a transcriptional repressor of the INK4a/ARF locus. The BMI-1 gene has been located on chromosome 10p13, a region involved in chrom...
With a system of qualitative indicators, to analyse the pharmaceutical prescription of general practitioners (GPs), and to evaluate the relationship of these indicators to the overall pharmaceutical prescription expenditure per inhabitant.
Retrospective descriptive study.
Primary care.
The drug prescription of 285 GPs from 32 primary care teams wa...