Javier E. Contreras-Reyes
University of the Americas | UDLA · Facultad de Ingeniería
Ph.D. in Statistics
Faculty (Academic & Researcher)
About
112
Publications
35,202
Reads
1,363
Citations
Introduction
My research interests include the study of time series models, statistical information theory, skew-elliptical distributions, Bayesian analysis, multivariate analysis, non-linear regression, and computational statistics. I have developed statistical models applied to the interpretation of economic and social time series, oceanographic and marine biological phenomena, and information tools for non-linear dynamics.
Additional affiliations
March 2021 - July 2024
April 2019 - February 2021
September 2018 - January 2019
Education
March 2015 - December 2017
University of Valparaíso
Field of study
- Statistics
January 2008 - December 2009
January 2003 - December 2006
Publications (112)
In practice, several time series exhibit long-range dependence or persistence
in their observations, leading to the development of a number of estimation and
prediction methodologies to account for the slowly decaying autocorrelations.
The autoregressive fractionally integrated moving average (ARFIMA) process is
one of the best-known classes of lon...
The entropy and mutual information index are important concepts developed by Shannon in the context of information theory. They have been widely studied in the case of the multivariate normal distribution. We first extend these tools to the full symmetric class of multivariate elliptical distributions and then to the more flexible families of multi...
The aim of this work is to provide the tools to compute the well-known
Kullback–Leibler divergence measure for the flexible family of multivariate skew-normal
distributions. In particular, we use the Jeffreys divergence measure to compare the
multivariate normal distribution with the skew-multivariate normal distribution, showing that
this is equiva...
The Self-Exciting Threshold Autoregressive model (SETAR) is non-linear and considers threshold values to model time series affected by regimes. It is extended through the Multivariate SETAR (MSETAR) model, where the threshold variable can also be a multivariate process. The stationary marginal density (smd) of an MSETAR process of order one corresp...
Rényi entropy based on characteristic function has been used as an information measure contained in wide-sense and real stationary vector autoregressive and moving average (VARMA) processes. These classes of processes have been extended by fractionally integrated VARMA (VARFIMA) ones, composed of a VARMA process, a vector of fractional differencing...
In this paper, we propose the Weighted Lindley (NWLi) model for the analysis of extreme historical insurance claims. It extends the classical Lindley distribution by incorporating a weight parameter, enabling more flexibility in modeling insurance claim severity. We provide a comprehensive theoretical overview of the new model and explore two pract...
In this article, a new extension of the standard Laplace distribution is introduced for house price modeling. Certain important properties of the new distribution are derived throughout this study. We used the new extension of the Laplace model to conduct a thorough economic risk assessment utilizing several metrics, including the value-at-risk (V...
The Jensen-variance (JV) information based on Jensen’s inequality and variance has been previously proposed to measure the distance between two random variables. Based on the relationship between JV distance and autocorrelation function of two weakly stationary process, the Jensen-autocovariance and Jensen-autocorrelation functions are proposed in...
Variance has an important role in statistics and information theory fields, by forming the basis for many well-known information measures. Based on Jensen's inequality and variance, the Jensen-variance information has been previously proposed to measure the distance between two random variables. Jensen-variance distance is based on the convexity p...
The purpose of this paper is twofold. In the first part, we introduce a novel information measure known as the mixture Fisher-Shannon information measure, motivated by de Bruijn's identity. We also propose and study a specific case of this measure called the difference information measure along with its Jensen version. Subsequently, the paper delve...
In information theory, mutual information reduces the uncertainty of a random variable due to the given knowledge of another random variable. The mutual information matrix method is developed to analyze the non-linear interactions in case of the high-dimensional time series. The mutual information matrix is based on the entropy and mutual informati...
The Pareto-Feller distribution has been widely used across various disciplines to model "heavy-tailed" phenomena, where extreme events such as high incomes or large losses are of interest. In this paper, a new bivariate distribution is presented, based on the Appell hypergeometric function, with Pareto-Feller marginal distributions obtained from two inde...
Fisher information is a measure to quantify information and has important inferential, scaling and uncertainty properties. Kharazmi and Asadi (Braz. J. Prob. Stat. 32, 795-814, 2018) presented the time-dependent Fisher information of any density function. Specifically, they considered a nonnegative continuous random (life-time) variable X and defi...
A new class of distributions known as the Extended Odd Log-Logistic (GOLL-G) family of distributions is proposed in this paper. The model parameters are estimated through the Maximum likelihood estimation technique. Asymptotic distributions and order statistics of GOLL-G family are also presented. Additionally, two particular distributions called t...
Stock market indices are important tools to measure and compare stock market performance. The Selective Stock Price (SSP) index reflects fluctuations in a set value of financial instruments of Santiago de Chile's stock exchange. Stock indices also reflect volatility linked to high uncertainty or potential investment risk. However, economic shocks a...
We introduce a new inaccuracy measure in terms of Fisher information. The proposed information measure is referred to as Fisher-based inaccuracy information measure. Next, we examine some properties of this information measure and specifically examine it for escort and equilibrium distributions. Further, we propose Bayes Fisher-based inaccuracy inf...
In this paper, the generalized mixture of standard logistic and skew logistic is introduced as a new class of distribution. Some important mathematical properties of this novel distribution are discussed along with a graphical presentation of the density function. These properties include moment generating function, m th order moment, mean deviatio...
This work introduces belief inaccuracy information (BII) and Jensen-belief inaccuracy information (JBII), grounded in the foundational concept of basic probability assignment. The significance of these novel methodologies lies in their potential to advance the field by providing valuable insights into belief accuracy assessment. The research not on...
Threshold risk analysis based on extreme stress data assesses the probability of events that exceed a certain threshold in a dataset characterized by extreme values or stresses. This type of analysis is often used in finance, insurance, environmental science, and engineering, where understanding and managing extreme events are crucial. This article...
This article introduces a novel class of skew-logistic distribution, providing flexibility to fit data with up to three modes. Various important properties of this new distribution are thoroughly examined, including its moment generating function, moments, entropy, and characterizations. Considering the location and scale parameters, a model extens...
The Jensen-variance (JV) distance measure is introduced and some properties are developed. The JV distance measure can be expressed using two interesting representations: the first is based on mixture covariances, and the second is in terms of the scaled variance of the absolute difference of two random variables. The connections between th...
The purpose of this work is to introduce fractional cumulative residual inaccuracy (FCRI) information, Jensen-cumulative residual inaccuracy (JCRI), and Jensen-fractional cumulative residual inaccuracy (JFCRI) information measure. Further, we study the FCRI information for some well-known models used in reliability, economics and survival analysis....
Gaussian processes (GPs) are a powerful machine learning tool to reveal hidden patterns in data. GP hyperparameters are estimated from data, providing a framework for regression and classification tasks. We capitalize on the power of GPs to drive insights about the biophysical mechanisms underpinning metastable brain oscillations from observable...
In this paper, we measured the uncertainty synchrony level of Chilean business economic perception and consumer economic perception, both affected by common external factors reflected in the Global Economy Perception Index (GEPI), unemployment, inflation, interest rate, Monthly Economic Activity (MEAI) and the Economic Policy Uncertainty (EPUI) ind...
Long-term dependence is an essential feature for the predictability of time series. Estimating the parameter that describes long memory is essential to describing the behavior of time series models. However, most long memory estimation methods assume that this parameter has a constant value throughout the time series, and do not consider that the p...
This paper introduces the belief Fisher-Shannon (BFS) information plane based on the basic probability assignment concept. Moreover, upper and lower bounds of BFS information are also presented. In addition, the BFS information plane is extended to belief Fisher-Rényi and fractional belief Fisher-Shannon ones, which are based on Generalized Rényi and fract...
Several information and divergence measures existing in the literature assist in measuring the knowledge contained in sources of information. Studying an information source from both positive and negative aspects will result in more accurate and comprehensive information. In many cases, extracting information through the positive approach could be...
Missing or unavailable data (NA) in multivariate data analysis is often treated with imputation methods and, in some cases, records containing NA are eliminated, leading to loss of information. This paper addresses the problem of NA in Multiple Factor Analysis (MFA) without resorting to eliminating records or using imputation techniques. For this p...
The purpose of this work is to introduce Deng-Fisher information (DFI), Deng-Fisher information distance (DFID) and Jensen-Deng-Fisher (JDF) information distance measures based on the basic probability assignment concept. We also present results associated with these proposed information measures and examine DFI and DFID measures for escort of basi...
In this work, we consider the Fisher information and some of its well-known extended versions and then establish some results based on infinite mixture density functions for the proposed information measures. Specifically we introduce the Jensen-type of Fisher information (parametric-type and density-based) and generalized Fisher information measur...
In this study, we analyzed how urban, housing, and socioeconomic variables are related to COVID-19 incidence. As such, we have analyzed these variables along with demographic, education, employment, and COVID-19 data from 32 communes in Santiago de Chile between March and August of 2020, before the release of the vaccines. The results of our Princi...
Models with time-varying parameters have become more popular for time series analysis. Among these models, Generalized Autoregressive Score (GAS) models are based on the specification of the mechanism through which past observations of the variable of interest affect the current value of the time-varying parameters. GAS models allow capturing the d...
Among several models proposed in the time series literature, the Self-Exciting Threshold Autoregressive (SETAR) model is non-linear and considers threshold values to model time series affected by regimes. Beyond the linear models, the computation of information and dependence metrics in non-linear time series is of great interest to compare process...
The Expectation-Maximization (EM) algorithm is often used for estimation in semiparametric models with non-normal observations, as the M-step involves only complete data for the maximum likelihood estimation and because it is computationally feasible. However, the EM algorithm's main disadvantage is its slow (linear) convergence rate. In this paper, we pr...
The purpose of this work is twofold. The first part is to introduce Jensen-relative information generating function and examine its connection to Jensen-Shannon entropy measure. Then, the α-mixture distribution is shown to be an optimal solution to three optimization problems based on relative information generating (RIG) function. We further study...
In this paper, an Autoregressive Moving Average (ARMA) model with Threshold Generalized Autoregressive Conditional Heteroscedasticity (TGARCH) innovations is considered to model Chilean economic uncertainty time series. Uncertainty is measured through the Business Confidence Index (BCI) and Consumer Perception Index (CPI). BCI time series provide u...
Deriving loss distribution from insurance data is a challenging task, as loss distribution is strongly skewed with heavy tails with some levels of outliers. This paper extends the weighted exponential (WE) family to the contaminated WE (CWE) family, which offers many flexible features, including bimodality and a wide range of skewness and kurtosis....
In this paper, a new class of continuous distributions is established by compounding the arctangent function with the generalized log-logistic class of distributions. Some structural properties of the suggested model such as distribution function, hazard function, quantile function, asymptotics and a useful expansion for the new class are given in...
Shang et al. (Commun. Nonlinear Sci. 94, 105556, 2022) proposed an efficient and robust synchronization estimation between two not necessarily stationary time series, namely the refined cross-sample entropy (RCSE). This method considered the empirical cumulative distribution function of distances using histogram estimator. In contrast to classical...
Quantifying information dynamics in a nonlinear system is crucial in complex dynamics. The Mutual Information Matrix (MIM) method was developed to study nonlinear interactions in high-dimensional time series. In this paper, MIM analysis is extended from a Shannon entropy approach to a Rényi entropy one. Specifically, this paper presents the MIM for...
In this paper, we provide a new bivariate distribution obtained from a Kibble-type bivariate gamma distribution. The stochastic representation was obtained by the sum of a Kibble-type bivariate random vector and a bivariate random vector built from two independent gamma random variables. In addition, the resulting bivariate density considers an inf...
Purpose: This paper combines the objective information of six mixed-frequency partial-activity indicators with assumptions or beliefs (called priors) regarding the distribution of the parameters that approximate the state of the construction activity cycle. Thus, this paper uses Bayesian inference with Gibbs simulations and the Kalman filter to esti...
In several applications, the assumption of normality is often violated in data with some level of skewness, so skewness affects the mean’s estimation. The class of skew–normal distributions is considered, given their flexibility for modeling data with asymmetry parameter. In this paper, we considered two location parameter (μ) estimation methods in...
Conway’s Game of Life (GoL) is a biologically inspired computational model which can approach the behavior of complex natural phenomena such as the evolution of ecological communities and populations. The GoL frequency distribution of events on log-log scale has been proved to satisfy the power-law scaling. In this work, GoL is connected to the e...
We show statistical evidence that pension fund withdrawals and the Emergency Family Income (EFI) increased the likelihood that a laid off construction worker would reject a proposal for a formal employment contract. This favors the hypothesis that pension fund withdrawals and government subsidies related to the health crisis have, to some extent, c...
Fisher information is a measure to quantify information and estimate system-defining parameters. The scaling and uncertainty properties of this measure, linked with Shannon entropy, are useful to characterize signals through the Fisher-Shannon plane. In addition, several non-gaussian distributions have been exemplified, given that assuming gaussian...
Conway's Game of Life (GoL) is a biologically inspired computational model which can approach the behavior of complex natural phenomena such as the evolution of ecological communities and populations. The GoL frequency distribution of events on log-log scale has been proved to satisfy the power-law scaling. In this work, GoL is connected to the ent...
In this paper, we approached the concept of real estate bubble, analyzing the risk its bursting could generate for the Chilean financial market. Specifically, we analyzed the relationship between real housing prices, the economic activity index, and mortgage interest rates denominated in inflation-linked units from 1994 to 2020. The analysis was ba...
Real index of housing prices variables in Santiago, Chile (see "Definition of data" sheet for details).
The copper price is a leading indicator of real estate activity. Price increases are statistically related to increasing numbers of applications for residential building permits. However, this reciprocity is not instantaneous as permit numbers lag price rises by 9 to 10 months. This dynamic is implicit in various transmission channels: from the fir...
In this paper, we study further properties of the modified skew-normal-Cauchy (MSNC) distribution. The MSNC distribution corresponds to a reformulation of the skew-normal-Cauchy distribution that yields a non-singular Fisher Information Matrix when the skewness parameter equals zero. We suggest a hierarchical representation which allows alternative deri...
Cross-sample entropy (CSE) makes it possible to analyze the association level between two time series that are not necessarily stationary. The current criteria to estimate the CSE are based on the normality assumption, but this condition is not necessarily satisfied in reality. Also, CSE calculation is based on a tolerance and an embedding dimension parameter...
Zhao et al. (Nonlin. Dyn. 88, 477-487, 2017) presented the mutual information matrix (MIM) analysis for the study of nonlinear interactions in multivariate time series as an extension of Random Matrix Theory analysis. They considered the histogram estimation of mutual information based on Shannon entropy for discrete distributions. This paper is mo...
Growth in fishes is usually modelled by a function encapsulating a common growth mechanism across ages. However, several theoretical works suggest growth may comprise two distinct mechanistic phases arising from changes in reproductive investment, diet or habitat. These models are termed two-state or biphasic, where acceleration in growth typica...
Yilmaz et al. (Fluct. Noise Lett. 17, 1830002, 2018) investigated the stochastic phenomenological bifurcations of a generalized Chua circuit driven by Skew-Gaussian distributed noise. They proved it is possible to decrease the number of scrolls by properly choosing the stochastic excitation, manipulating the skewness and noise intensity parameters....
Shannon and Rényi entropies are two important measures of uncertainty for data analysis. These entropies have been studied for multivariate Student-t and skew-normal distributions. In this paper, we extend the Rényi entropy to multivariate skew-t and finite mixture of multivariate skew-t (FMST) distributions. This class of flexible distributions al...
In this paper we addressed the problem of backcasting and forecasting non-stationary time series. Given that traditional backcasting is based on the classical cross-correlation function assuming two stationary time series, assumptions based on two non-stationary time series need a new approach. Thus, Detrended Cross-Correlation Analysis (DCCA) pro...
The southern Humboldt Current ecosystem is an important topic among researchers working on the drivers of pelagic species’ biological indicators. While sea surface temperature is believed to be a major driver in anchovies’ (Engraulis ringens) reproductive and body condition indicators, this paper shows that regional drivers such as Pacific decadal...
Section 3.3 of "Contreras-Reyes, J.E.; Cortés, D.D. Bounds on Rényi and Shannon Entropies for Finite Mixtures of Multivariate Skew-Normal Distributions: Application to Swordfish (Xiphias gladius Linnaeus). Entropy 2016, 18, 382" contains errors. Therefore, this section is retracted. However, these changes do not influence the conclusions and the ot...
Monthly Chilean cement production (1991–2015), available at the Instituto Nacional de Estadística (INE, Santiago, Chile). See the article: http://dx.doi.org/10.1007/s00181-018-1506-8, for a description of the time series.
Detecting bimodality of a frequency distribution is of considerable interest in several fields. Classical inferential methods for detecting bimodality focused on third and fourth moments through the kurtosis measure. Nonparametric approach-based asymptotic tests (DIPtest) for comparing the empirical distribution function with a unimodal one are also...
There remains a lack of holistic approaches for analyzing how different density-independent and density-dependent (endogenous) mechanisms interact to drive the dynamics of the small pelagic fish populations of the southern Humboldt Current ecosystem. In this study, we analyzed the drivers of the small pelagic fishes (SPF) off the coast of Chile fro...
The univariate gamma (chi-squared) superstatistics has been used in several applications by assuming independence between systems. However, in some cases it seems more reasonable to consider a dependence structure. This fact motivates the introduction of a family of bivariate superstatistics based on an extension of the gamma distribution, defined...
The sea urchin (Loxechinus albus) is one of the most economically important species in the northeast of Chilean Patagonia, forming part of the highly diverse benthic community. This resource is being harvested under selective fishing pressure, which suppresses growth rates. In response, the National Standards Institute established Regulation 44 as a qu...
The aim of this work is to extend the results of Perez et al. (Physica A (2006), 365 (2), 1 282-288) to the two dimensional fractional Brownian field. In particular, we define the Shannon entropy using the wavelet spectrum from which the Hurst exponent is estimated by the regression of the logarithm of the square coefficients over the levels of res...
Vector Auto-regressive (VAR) models are commonly used for modelling multivariate time series and the typical distributional form is to assume a multivariate normal. However, the assumption of Gaussian white noise in multivariate time series is often not reasonable in applications where there are extreme and/or skewed observations. In this setting,...
Cement is a non-storable input in the medium- and long-term. The evidence in
Chile shows that cement supply and demand are in relative equilibrium, so the
demand or supply of this input can measure the activity in the structural
construction or work. The aim of this paper is to backcast the series of cement
production since January 2009, using...
Changes in phenology are an ecological response to climate changes, in addition to other factors that may stress or dampen phenological responses. For this study, we investigated how climate variability can affect the phenology of the reproductive process of the anchovy Engraulis ringens. We used 29 years of sea surface temperature (SST) data and b...
Dynamic linear models are typically developed assuming that both the observational and system distributions are normal. In this work, we relax this assumption by considering a skew-normal distribution for the observational random errors, providing thus an extension of the standard normal dynamic linear model. Full Bayesian inference is carried out...
The Skew-Reflected-Gompertz (SRG) distribution, introduced by Hosseinzadeh et al. (J. Comput. Appl. Math. (2019) 349, 132-141), produces two-piece asymmetric behavior of the Gompertz (GZ) distribution, extending its positive domain to the whole real line by an extra parameter. The SRG distribution also permits a better fit than its well-known classical competitors, n...
In this paper, we examine the finite mixture (FM) model with a flexible class of two-piece distributions based on the scale mixtures of normal (TP-SMN) family components. This family allows developing a robust estimation of FM models. The TP-SMN is a rich class of distributions that covers symmetric/asymmetric and light/heavy tailed distributions....
In this work, we have defined a new family of skew distribution: the Skew-Reflected-Gompertz. We have also derived some of its probabilistic and inferential properties. The maximum likelihood estimates of the proposed distribution parameters are obtained via an EM-algorithm, and performances of the proposed model and its estimates are shown via sim...
This study addresses the problem of age determination of the southern king crab (Lithodes santolla). Given that recapture is difficult for this species and thus age cannot be determined directly with the help of annual marks on the shell, the von Bertalanffy growth function (vBGF) cannot be used to model length-frequency data (LFD) directly. To det...
The von Bertalanffy growth function (VBGF) with random effects has been widely used to estimate growth parameters incorporating individual variability of length-at-age. Trajectories of individual growth can be inferred using either mark-recapture or back-calculation of length-at-age from growth marks in hard body parts such as otoliths. Modern stat...
In the last century, the growing evidence that global fisheries are depleting natural resources
much faster than they can recover has led to negative processes, like overfishing, being addressed
with increasingly complex models and thus mitigating or regulating actions that aim to protect
stocks. Said negative processes contain two components: (i)...
This note analyzes the effects forest fires in Chile have on vegetation and
subsequent ecological restoration. We analyze why forest fires have been a
main factor that affects the environment and causes the ecosystem to
deteriorate, leading to loss of native forests, species extinction, damage to
the urban population, and others. The data exami...
Simplex distribution has been proved useful for modelling double-bounded variables in
data directly. Yet, it is not sufficient for multimodal distributions. This article addresses the problem of
estimating a density when data is restricted to the (0, 1) interval and contains several modes. Particularly,
we propose a simplex mixture model approach t...