Accounting for Excess Zeros and Sample Selection in Poisson and Negative Binomial Regression Models

Source: RePEc


We present several modifications of the Poisson and negative binomial models for count data to accommodate cases in which the number of zeros in the data exceeds what would typically be predicted by either model. The excess zeros can masquerade as overdispersion. We present a new test procedure for distinguishing between zero inflation and overdispersion. We also develop a model for sample selection which is analogous to the Heckman-style specification for continuous choice models. An application is presented to a data set on consumer loan behavior in which both of these phenomena are clearly present.
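The claim that excess zeros can masquerade as overdispersion can be illustrated with a minimal sketch of a zero-inflated Poisson (ZIP) distribution. The function names and the parameters `lam` (Poisson mean) and `p` (probability of a structural zero) are illustrative assumptions, not notation taken from the paper, and the sketch omits the covariates and the test procedure the paper develops.

```python
import math


def poisson_pmf(y, lam):
    """Probability of count y under a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam**y / math.factorial(y)


def zip_pmf(y, lam, p):
    """Zero-inflated Poisson: with probability p the outcome is a
    structural zero; otherwise it is drawn from Poisson(lam).
    (p and lam are hypothetical names chosen for this sketch.)"""
    base = (1.0 - p) * poisson_pmf(y, lam)
    return p + base if y == 0 else base


def zip_mean_var(lam, p):
    """Closed-form mean and variance of the ZIP distribution."""
    mean = (1.0 - p) * lam
    var = (1.0 - p) * lam * (1.0 + p * lam)
    return mean, var


if __name__ == "__main__":
    mean, var = zip_mean_var(lam=2.0, p=0.3)
    # For any p > 0 the variance exceeds the mean, even though the
    # non-zero-inflated component is a plain Poisson.
    print(mean, var)
```

Because the variance equals the mean times `1 + p * lam`, a pure Poisson fit to such data would look overdispersed whenever `p > 0`, which is why a test that separates the two explanations is useful.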



Available from: William H Greene
  • Source
    • "Zero Inflated Models combine two sources of zero outcomes which are called "true zeros" and "excess zeros". Greene (1994) has investigated Zero Inflated Models as modifications of the Poisson and the Negative Binomial models. He also presents the test procedure to separate the zero inflation and overdispersion."

    Article · Dec 2015
    • "The zero-inflated Poisson (ZIP) distribution was introduced by Lambert [17] as a solution to modelling such data. In addition, Greene [18] introduced the zero-inflated negative binomial (ZINB) distribution for a similar setting in which extra-Poisson variation is also present (see also [19] [20] [21] [22]). These distributions belong to a family of mixed Poisson distributions in which ω is a binary variable, taking 1 with probability p for zeros and 0 with probability 1 − p for all other counts. "
    ABSTRACT: Mixed Poisson distributions are widely used in various applications of count data mainly when extra variation is present. This paper introduces an extension in terms of a mixed strategy to jointly deal with extra-Poisson variation and zero-inflated counts. In particular, we propose the Poisson log-skew-normal distribution which utilizes the log-skew-normal as a mixing prior and present its main properties. This is directly done through additional hierarchy level to the lognormal prior and includes the Poisson lognormal distribution as its special case. Two numerical methods are developed for the evaluation of associated likelihoods based on the Gauss–Hermite quadrature and the Lambert's W function. By conducting simulation studies, we show that the proposed distribution performs better than several commonly used distributions that allow for over-dispersion or zero inflation. The usefulness of the proposed distribution in empirical work is highlighted by the analysis of a real data set taken from health economics contexts.
    Article · Nov 2015 · Journal of Statistical Computation and Simulation
    • "We present a model fitting the distribution of the noise, using an extension of the zero-inflated negative binomial model [4] with the aim of creating dynamic thresholds removing noise reads and preserving the informative information for later use in e.g. DNA mixture analysis. "
    ABSTRACT: We present a model fitting the distribution of non-systematic errors in STR second generation sequencing, SGS, analysis. The model fits the distribution of non-systematic errors, i.e. the noise, using a one-inflated, zero-truncated, negative binomial model. The model is a two component model. The first component models the excess of singleton reads, while the second component models the remainder of the errors according to a truncated negative binomial distribution.
    Article · Oct 2015 · Forensic Science International Genetics Supplement Series
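The two-component structure described in the last abstract (an excess of singletons plus a truncated negative binomial for the remaining counts) can be sketched as follows. The abstract does not give a parameterization, so the names `r` (size), `q` (success probability), and `pi` (extra mass on the value 1), as well as the function names, are assumptions made for illustration only.

```python
import math


def nb_pmf(y, r, q):
    """Negative binomial pmf with size r > 0 and success probability q,
    computed on the log scale for numerical stability."""
    log_p = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
             + r * math.log(q) + y * math.log(1.0 - q))
    return math.exp(log_p)


def ztnb_pmf(y, r, q):
    """Zero-truncated negative binomial: the NB pmf renormalized over y >= 1."""
    if y < 1:
        return 0.0
    return nb_pmf(y, r, q) / (1.0 - nb_pmf(0, r, q))


def oiztnb_pmf(y, r, q, pi):
    """One-inflated, zero-truncated NB: with probability pi the count is a
    singleton (y = 1); otherwise it follows the zero-truncated NB.
    (pi is a hypothetical name for the inflation weight.)"""
    base = (1.0 - pi) * ztnb_pmf(y, r, q)
    return pi + base if y == 1 else base


if __name__ == "__main__":
    # Probability of a singleton with and without one-inflation.
    print(ztnb_pmf(1, 1.5, 0.4), oiztnb_pmf(1, 1.5, 0.4, 0.2))
```

The construction mirrors zero inflation: the point mass simply sits at 1 instead of 0, and the base distribution is truncated so that zero counts carry no probability.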