Ioannis Ntzoufras

Athens University of Economics and Business | AUEB · Department of Statistics

PhD in Statistics

About

127
Publications
60,918
Reads
3,979
Citations
Additional affiliations
December 2015 - January 2017
Athens University of Economics and Business
Position
  • Professor
January 2007 - present


Publications (127)
Article
The reliability of the results of network meta‐analysis (NMA) lies in the plausibility of the key assumption of transitivity. This assumption implies that the effect modifiers' distribution is similar across treatment comparisons. Transitivity is statistically manifested through the consistency assumption which suggests that direct and indirect evi...
Article
Full-text available
The goal of this paper is to build and compare methods for the prediction of the final outcomes of basketball games. In this study, we analyzed data from four different European tournaments: Euroleague, Eurocup, Greek Basket League and Spanish Liga ACB. The data-set consists of information collected from box scores of 5214 games for the period of 2...
Chapter
In this paper, we analyze a sport (in)activity case study using a zero inflated bivariate Poisson model. We use the “(in)activity” term in order to embrace both active and passive sport participation (practicing or watching a sport, respectively). The paper investigates the determinants of sport (in)activity: the frequency and the probability of sp...
Article
Full-text available
Competitive balance is of much interest in the sports analytics literature and beyond. We develop a statistical network model based on an extension of the stochastic block model to assess the balance between teams in a league. We represent the outcome of all matches in a football season as a dense network with nodes identified by teams and categori...
Preprint
Full-text available
We consider a flexible Bayesian evidence synthesis approach to model the age-specific transmission dynamics of COVID-19 based on daily age-stratified mortality counts. The temporal evolution of transmission rates in populations containing multiple types of individual are reconstructed via an appropriate dimension-reduction formulation driven by ind...
Preprint
Full-text available
The reliability of the results of network meta-analysis (NMA) lies in the plausibility of the key assumption of transitivity. This assumption implies that the effect modifiers' distribution is similar across treatment comparisons. Transitivity is statistically manifested through the consistency assumption which suggests that direct and indirect evidenc...
Article
Full-text available
This paper is concerned with a contemporary Bayesian approach to the effect of temperature on developmental rates. We develop statistical methods using recent computational tools to model four commonly used ecological non-linear mathematical curves that describe arthropods’ developmental rates. Such models address the effect of temperature fluctuat...
Article
The Power–Expected–Posterior (PEP) prior framework provides us with a convenient and objective method to deal with variable selection problems, under the Bayesian perspective, in regression models. The PEP prior inherits all of the advantages of the Expected–Posterior–Prior. Furthermore, it avoids the need to select imaginary data and mitigates their...
Article
Full-text available
A well known identifiability issue in factor analytic models is the invariance with respect to orthogonal transformations. This problem burdens the inference under a Bayesian setup, where Markov chain Monte Carlo (MCMC) methods are used to generate samples from the posterior distribution. We introduce a post-processing scheme in order to deal with...
Article
Full-text available
The Ising model is one of the most widely analyzed graphical models in network psychometrics. However, popular approaches to parameter estimation and structure selection for the Ising model cannot naturally express uncertainty about the estimated parameters or selected structures. To address this issue, this paper offers an objective Bayesian appro...
Article
The focus of this work is to estimate the number of team possessions in Euroleague basketball for seasons 2017–18, 2018–19 and 2019–20. To achieve this goal, we implemented the approaches proposed by Kubatko et al. (2007). The statistical analysis on Euroleague data suggests a model similar to the one which is currently used in NBA with one, minor...
Article
Sports have always been a fertile area of application of mathematics with a wide range of potential problems, ranging from simple examples which can be effectively used to teach students fundamental ideas about mathematics, probability and statistics (see, e.g. Kvam & Sokol, 2004), up to complicated models that can be utilized to study space creati...
Preprint
Full-text available
Competitive balance is a desirable feature in any professional sports league and encapsulates the notion that there is unpredictability in the outcome of games as opposed to an imbalanced league in which the outcomes of some games are more predictable than others, for example, when an apparently strong team plays against a weak team. In this paper, we...
Preprint
Full-text available
A simple and efficient adaptive Markov Chain Monte Carlo (MCMC) method, called the Metropolized Adaptive Subspace (MAdaSub) algorithm, is proposed for sampling from high-dimensional posterior model distributions in Bayesian variable selection. The MAdaSub algorithm is based on an independent Metropolis-Hastings sampler, where the individual proposa...
Preprint
Full-text available
This paper is concerned with a contemporary Bayesian approach to the effect of temperature on developmental rates. We develop statistical methods using recent computational tools to model four commonly used ecological non-linear mathematical curves that describe arthropods' developmental rates. Such models address the effect of temperature fluctuat...
Article
We study and develop Bayesian models for the analysis of volleyball match outcomes as recorded by the set-difference. Due to the peculiarity of the outcome variable (set-difference) which takes discrete values from $-3$ to $3$, we cannot consider standard models based on the usual Poisson or binomial assumptions used for other sports such as footba...
Preprint
Full-text available
Bayes Factors, the Bayesian tool for hypothesis testing, are receiving increasing attention in the literature. Compared to their frequentist rivals ($p$-values or test statistics), Bayes Factors have the conceptual advantage of providing evidence both for and against a null hypothesis and they can be calibrated so that they do not depend so heavily...
Article
A stochastic search method, the so-called Adaptive Subspace (AdaSub) method, is proposed for variable selection in high-dimensional linear regression models. The method aims at finding the best model with respect to a certain model selection criterion and is based on the idea of adaptively solving low-dimensional sub-problems in order to provide a...
Preprint
The Ising model is one of the most widely analyzed graphical models in network psychometrics. Unfortunately, popular approaches to parameter estimation and structure selection for the Ising model cannot naturally express uncertainty about the estimated parameters or selected structures. To address this issue, this paper offers an objective Bayesian...
Preprint
The aim of this paper is to study and develop Bayesian models for the analysis of the volleyball game outcomes as recorded by the difference of the winning sets. Due to the peculiarity of the outcome variable (set difference) which takes discrete values from −3 to 3, we cannot consider standard models based on the usual Poisson or binomial assumpti...
Article
Volleyball is a team sport with unique and specific characteristics. We introduce a new two-level hierarchical Bayesian model which accounts for these volleyball-specific characteristics. In the first level, we model the set outcome with a simple logistic regression model. Conditionally on the winner of the set, in the second level, we use a trunca...
Article
Full-text available
This paper focuses on the Bayesian model average (BMA) using the power–expected–posterior prior in objective Bayesian variable selection under normal linear models. We derive a BMA point estimate of a predicted value, and present computation and evaluation strategies of the prediction accuracy. We compare the performance of our method with that of...
Preprint
Full-text available
A well known identifiability issue in factor analytic models is the invariance with respect to orthogonal transformations. This problem burdens the inference under a Bayesian setup, where Markov chain Monte Carlo (MCMC) methods are used to generate samples from the posterior distribution. We introduce a post-processing scheme in order to deal with...
Preprint
One of the main approaches used to construct prior distributions for objective Bayes methods is the concept of random imaginary observations. Under this setup, the expected-posterior prior (EPP) offers several advantages, among which it has a nice and simple interpretation and provides an effective way to establish compatibility of priors among mod...
Preprint
Full-text available
Unlike what happens for other popular sports such as football, basketball and baseball, modelling the final outcomes of volleyball has not been thoroughly addressed by the statistical and the data science community. This is mainly due to the complexity of the game itself since the game is played in two levels of outcomes: the sets and the points (w...
Preprint
A stochastic search method, the so-called Adaptive Subspace (AdaSub) method, is proposed for variable selection in high-dimensional linear regression models. The method aims at finding the best model with respect to a certain model selection criterion and is based on the idea of adaptively solving low-dimensional sub-problems in order to provide a...
Preprint
Full-text available
Bayesian methods for graphical log-linear marginal models have not been developed to the same extent as traditional frequentist approaches. In this work, we introduce a novel Bayesian approach for quantitative learning for such models. These models belong to curved exponential families that are difficult to handle from a Bayesian perspective. Furth...
Article
Full-text available
In volleyball, due to the sequential structure of the game, each outcome results from events that follow consistent consecutive patterns: pass–set–attack–outcome, serve–outcome and block–dig–set–counter attack–outcome. There are three possible outcomes: point won, point lost, and rally continuation. With the aim of quantifying the importance of vol...
Article
The hyper‐g prior is a default choice for Bayesian variable selection in normal linear regression models. In this article we provide an overview of the Bayesian variable selection framework and explain in detail the specification for the hyper‐g prior setup. The practical implementation of the methods under consideration is demonstrated through the...
Article
Full-text available
In this work, the problem of transformation and simultaneous variable selection is thoroughly treated via objective Bayesian approaches by the use of default Bayes factor variants. Four uniparametric families of transformations (Box–Cox, Modulus, Yeo-Johnson and Dual), denoted by T, are evaluated and compared. The subjective prior elicitation for t...
Article
Full-text available
We provide a review of prior distributions for objective Bayesian analysis. We start by examining some foundational issues and then organize our exposition into priors for: i) estimation or prediction; ii) model selection; iii) high-dimensional models. With regard to i), we present some basic notions, and then move to more recent contributions on di...
Article
Full-text available
The power-expected-posterior (PEP) prior provides an objective, automatic, consistent and parsimonious model selection procedure. At the same time it resolves the conceptual and computational problems due to the use of imaginary data. Namely, (i) it dispenses with the need to select and average across all possible minimal imaginary samples, and (ii...
Article
Full-text available
Thermodynamics have been shown to have direct applications in Bayesian model evaluation. Within a tempered transitions scheme, the Boltzmann–Gibbs distribution pertaining to different Hamiltonians is implemented to create a path which links the distributions of interest at the endpoints. As illustrated here, an optimal temperature exists along the...
Article
Full-text available
Epidemic data often possess certain characteristics, such as the presence of many zeros, the spatial nature of the disease spread mechanism, environmental noise, serial correlation and dependence on time varying factors. This paper addresses these issues via suitable Bayesian modelling. In doing so we utilise a general class of stochastic regressio...
Article
Full-text available
Power-expected-posterior (PEP) priors have been recently introduced as generalized versions of the expected-posterior-priors (EPPs) for variable selection in Gaussian linear models. They are minimally-informative priors that reduce the effect of training samples under the EPP approach, by combining ideas from the power-prior and unit-information-pr...
Article
Full-text available
The power-expected-posterior (PEP) prior is an objective prior for Gaussian linear models, which leads to consistent model selection inference and tends to favor parsimonious models. Recently, two new forms of the PEP prior were proposed which generalize its applicability to a wider range of models. We examine the properties of these two PEP varia...
Poster
Full-text available
In this work we develop novel hypothesis tests for association models for two way contingency tables. We focus on conjugate analysis for the uniform, row and column effect model which can be considered as Poisson log-linear or Multinomial logit models. For the row-column model we will develop an MCMC based approach which will try to explore conditi...
Conference Paper
Full-text available
In this work we present Bayesian hypothesis tests for the independence between two categorical variables in contingency tables. Initially we implement conjugate analysis based on the Multinomial-Dirichlet setup. We compute the Bayes factor and assess the sensitivity of the results to the prior distribution. Then we focus on log-linear models. We co...
Chapter
Full-text available
The stochastic search variable selection (SSVS), introduced by George and McCulloch[1], is one of the prominent Bayesian variable selection approaches for regression problems. Some of the basic principles of modern Bayesian variable selection methods were first introduced via the SSVS algorithm such as the use of a vector of variable inclusion indi...
Article
Full-text available
The power-conditional-expected-posterior (PCEP) prior developed for variable selection in normal regression models combines ideas from the power-prior and expected-posterior prior, relying on the concept of random imaginary data, and provides a consistent variable selection method which leads to parsimonious inference. In this paper we discuss the...
Article
Full-text available
Although competitive balance is an important concept for professional team sports, its quantification still remains an issue. The main objective of this study is to identify the best or optimal index for the study of competitive balance in European football using a number of economic variables and data from eight domestic leagues from 1959 to 2008....
Conference Paper
Full-text available
Volleyball is a competitive team sport whose main objective is to score the most points by grounding the ball on the opponent's side of the court. The number of points a team scores is primarily based on the execution of the skills of the game. Due to the hierarchical structure of the game, events follow stable patterns: serve outcome, pass-set-atta...
Article
In this work we consider Cloninger's psychobiological model, which measures two dimensions of personality: character and temperament. Temperament refers to the biological basis of personality and its characteristics, while character refers to an individual's attitudes towards one's own self, towards humanity and as part of the universe. The Temperament a...
Article
The problem of transformation selection is thoroughly treated from a Bayesian perspective. Several families of transformations are considered with a view to achieving normality: the Box-Cox, the Modulus, the Yeo and Johnson and the Dual transformation. Markov Chain Monte Carlo algorithms have been constructed in order to sample from the posterior d...
Article
Full-text available
Epidemic data often possess certain characteristics, such as the presence of many zeros, the spatial nature of the disease spread mechanism or environmental noise. This paper addresses these issues via suitable Bayesian modelling. In doing so we utilise stochastic regression models appropriate for spatio-temporal count data with an excess number of...
Article
Full-text available
Competitive balance is a key issue for any professional sport league substantiated by its effect on demand for league games or other associated products. This work focuses on the measurement of between-seasons competitive balance, the longest time-wise dimension, which captures the relative quality of teams across seasons. The review of the existin...
Article
Full-text available
The problem of transformation selection is thoroughly treated from a Bayesian perspective. Several families of transformations are considered with a view to achieving normality: the Box-Cox, the Modulus, the Yeo & Johnson and the Dual transformation. Markov chain Monte Carlo algorithms have been constructed in order to sample from the posterior dis...
Article
Full-text available
We investigate the efficiency of a marginal likelihood estimator where the product of the marginal posterior distributions is used as an importance-sampling function. The approach is generally applicable to multi-block parameter vector settings, does not require additional Markov Chain Monte Carlo (MCMC) sampling and is not dependent on the type of...
Article
Full-text available
In latent variable models the parameter estimation can be implemented by using the joint or the marginal likelihood, based on independence or conditional independence assumptions. The same dilemma occurs within the Bayesian framework with respect to the estimation of the Bayesian marginal (or integrated) likelihood, which is the main tool for model...
Article
Full-text available
Within the path sampling framework, we show that probability distribution divergences, such as the Chernoff information, can be estimated via thermodynamic integration. The Boltzmann-Gibbs distribution pertaining to different Hamiltonians is implemented to derive tempered transitions along the path, linking the distributions of interest at the endpoint...
Article
Full-text available
In this paper we implement a Markov chain Monte Carlo algorithm based on the stochastic search variable selection method of George and McCulloch (1993) for identifying promising subsets of manifest variables (items) for factor analysis models. The suggested algorithm is constructed by embedding in the usual factor analysis model a normal mixture pr...
Article
Full-text available
Expected-posterior priors (EPPs) have proved to be extremely useful for testing hypotheses on the regression coefficients of normal linear models. One of the advantages of using EPPs is that impropriety of baseline priors causes no indeterminacy. However, in regression problems, they are based on one or more training samples, that could in...
Article
Full-text available
In the context of the expected-posterior prior (EPP) approach to Bayesian variable selection in linear models, we combine ideas from power-prior and unit-information-prior methodologies to simultaneously produce a minimally-informative prior and diminish the effect of training samples. The result is that in practice our power-expected-posterior (PE...
Article
Full-text available
Zellner's g-prior and its recent hierarchical extensions are the most popular default prior choices in the Bayesian variable selection context. These prior set-ups can be expressed as power-priors with a fixed set of imaginary data. In this paper, we borrow ideas from the power-expected-posterior (PEP) priors in order to introduce, under the g-prior...
Conference Paper
Epidemic data often exhibit certain characteristics, such as the presence of many zeros, the spatial nature of the disease spread mechanism or environmental noise. This presentation addresses these issues via suitable Bayesian modelling. In doing so we utilize stochastic regression models appropriate for spatio-temporal count data with an...
Article
Full-text available
In this paper, we focus on the variable selection problem in normal regression models using the expected-posterior prior methodology. We provide a straightforward MCMC scheme for the derivation of the posterior distribution, as well as Monte Carlo estimates for the computation of the marginal likelihood and posterior model probabilities. Additional...
Article
Full-text available
The marginal likelihood can be notoriously difficult to compute, and particularly so in high-dimensional problems. Chib and Jeliazkov employed the local reversibility of the Metropolis–Hastings algorithm to construct an estimator in models where full conditional densities are not available analytically. The estimator is free of distributional assum...
Article
Full-text available
We propose a Bayesian implementation of the lasso regression that accomplishes both shrinkage and variable selection. We focus on the appropriate specification for the shrinkage parameter λ through Bayes factors that evaluate the inclusion of each covariate in the model formulation. We associate this parameter with the values of Pearson and partial...
Article
Full-text available
The common competitive balance indices have not been designed to fully account for the complex structure of European football leagues. Domestic championships are multi-prize tournaments since, in addition to the competition for the championship, the best teams also compete to qualify for the lucrative European tournaments, whereas the worst teams s...
Article
Full-text available
Mystery shopping is a well known marketing technique used by companies and marketing analysts to measure quality of service, and gather information about products and services. In this article, we analyse data from mystery shopping surveys via Bayesian Networks in order to examine and evaluate the quality of service offered by the loan departments...
Article
Full-text available
We consider the specification of prior distributions for Bayesian model comparison, focusing on regression-type models. We propose a particular joint specification of the prior distribution across models so that sensitivity of posterior model probabilities to the dispersion of prior distributions for the parameters of individual models (Lindley's p...
Article
Full-text available
Introduction: Bayesian modeling in the 21st century; Definition of statistical models; Bayes theorem; Model-based Bayesian inference; Inference using conjugate prior distributions; Nonconjugate analysis; Problems
Article
Full-text available
We propose a conjugate and conditional conjugate Bayesian analysis of models of marginal independence with a bi-directed graph representation. We work with Markov equivalent directed acyclic graphs (DAGs) obtained using the same vertex set with the addition of some latent vertices when required. The DAG equivalent model is characterised by a minima...
Article
Full-text available
This paper deals with the Bayesian analysis of graphical models of marginal independence for three way contingency tables. Each marginal independence model corresponds to a particular factorization of the cell probabilities and a conjugate analysis based on Dirichlet prior can be performed. We illustrate a comprehensive Bayesian analysis of such mo...
Article
Full-text available
Competitive balance is an important concept in professional team sports; its measurement is, therefore, a critical issue. One of the most widely used indices, which was introduced for the estimation of seasonal competitive balance is the Concentration Ratio, which is a relatively simple index and measures the extent to which a league is dominated b...
Article
The reinvention of Markov chain Monte Carlo (MCMC) methods and their implementation within the Bayesian framework in the early 1990s has established the Bayesian approach as one of the standard methods within the applied quantitative sciences. Their extensive use in complex real life problems has led to the increased demand for a friendly and easi...
Conference Paper
Full-text available
Football is one of the most popular professional team sports in the world and a very profitable business, as professional leagues (especially in Europe) show considerable growth in annual turnover figures. Despite its substantial growth, there are important issues that the industry has to address in order to ensure its long-term success. One of the...
Article
Full-text available
Existing methods for the prediction of the final scores in football games focus on modelling the numbers of goals scored by the two competitors with parameter estimation of the assumed model usually based on the maximum likelihood approach. Although this approach allows for sufficiently accurate prediction of the final score, it does not account fo...
Article
Full-text available
Crime is disproportionally concentrated in few areas. Though long established, there remains uncertainty about the reasons for variation in the concentration of similar crime (repeats) or different crime (multiples). Wholly neglected have been composite crimes, when more than one crime type coincides as part of a single event. The research reported...
Article
Full-text available
This paper provides a survey on studies that analyze the macroeconomic effects of intellectual property rights (IPR). The first part of this paper introduces different patent policy instruments and reviews their effects on R&D and economic growth. This part also discusses the distortionary effects and distributional consequences of IPR protection a...
Article
Full-text available
In the field of quality of health care measurement, one approach to assessing patient sickness at admission involves a logistic regression of mortality within 30 days of admission on a fairly large number of sickness indicators (on the order of 100) to construct a sickness scale, employing classical variable selection methods to find an “optimal”...
Article
Full-text available
The measurement and improvement of the quality of health care are important areas of current research and development. A judgement of appropriateness of medical outcomes in hospital quality-of-care studies must depend on an assessment of patient sickness at admission to hospital. Indicators of patient sickness often must be abstracted from medical...
Article
Full-text available
The primary aim of the current article was the evaluation of the factorial composition of the Aggression Questionnaire (AQ(29)) in the Greek population. The translated questionnaire was administered to the following three heterogeneous adult samples: a general population sample from Athens, a sample of young male conscripts and a sample of individu...