Comparing Population Distributions from Bin-aggregated Sample Data: An Application to Historical Height Data from France
We develop a methodology to estimate underlying (continuous) population distributions from bin-aggregated sample data through the estimation of the parameters of mixtures of distributions that allow for maximal parametric flexibility. The statistical approach we develop enables comparisons of the full distributions of height data from potential army conscripts across France's 88 departments for most of the nineteenth century. These comparisons are made by testing for differences-of-means stochastic dominance. Corrections for possible measurement errors are also devised by taking advantage of the richness of the data sets. Our methodology is of interest to researchers working on bin-aggregated or histogram-type data, something that is still widely done since much of the information that is publicly available is in that form, often due to restrictions based on confidentiality concerns.