Smoothed Histograms for Frequency Data on Irregular Intervals

Article in The American Statistician 62(August):256-261 · February 2008
Impact Factor: 0.92 · DOI: 10.1198/000313008X335581 · Source: RePEc

    Abstract

    Frequency tables are often constructed on intervals of irregular width. When plotted as bar charts, the underlying true density information may be quite distorted. The majority of introductory statistics texts recommend tabulating data into intervals of equal width, but seldom caution about the consequences of failing to do so. An occasional introductory text correctly emphasizes that area rather than frequency should be plotted. Nevertheless, the correctly scaled density figure is often visually less informative than one might expect, with wide bins at constant height. In many cases, the rightmost bin interval has no well-defined end-point, making its depiction somewhat arbitrary. In this note, we introduce a regular histogram approximation that matches the frequencies and also minimizes a roughness criterion for visual and exploratory appeal. The resulting estimate can reveal the density structure much more clearly. We also formulate an alternative criterion that explicitly takes account of the uncertainty in the bin frequencies.
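The two ideas in the abstract, plotting area rather than frequency and fitting a regular histogram that reproduces the irregular bin frequencies while penalizing roughness, can be illustrated with a minimal Python sketch. It is not the authors' formulation: the abstract does not state the roughness criterion, so the sum of squared second differences of the fine-bin heights is assumed here. The function names `density_heights` and `smoothed_histogram`, the `fine_width` parameter, and the example data are hypothetical. The sketch also assumes that every irregular bin edge is a multiple of `fine_width`, that the rightmost bin has a finite end-point, and it does not enforce non-negative heights.

```python
import numpy as np

def density_heights(edges, counts):
    """Bar heights scaled so that bar AREA equals the bin's relative frequency."""
    edges = np.asarray(edges, dtype=float)
    counts = np.asarray(counts, dtype=float)
    return counts / (counts.sum() * np.diff(edges))

def smoothed_histogram(edges, counts, fine_width):
    """Equal-width histogram that matches the irregular bin frequencies exactly
    and minimizes an ASSUMED roughness measure (squared second differences)."""
    edges = np.asarray(edges, dtype=float)
    counts = np.asarray(counts, dtype=float)
    p = counts / counts.sum()                      # relative frequency per irregular bin
    k = len(counts)
    m = int(round((edges[-1] - edges[0]) / fine_width))
    mids = edges[0] + fine_width * (np.arange(m) + 0.5)

    # Aggregation matrix: A @ h gives the mass of each irregular bin.
    owner = np.searchsorted(edges, mids, side="right") - 1
    A = np.zeros((k, m))
    A[owner, np.arange(m)] = fine_width

    # Second-difference operator; rows look like [1, -2, 1].
    D = np.diff(np.eye(m), n=2, axis=0)

    # Equality-constrained least squares via the KKT system:
    #   minimize ||D h||^2  subject to  A h = p
    kkt = np.block([[2.0 * D.T @ D, A.T],
                    [A, np.zeros((k, k))]])
    rhs = np.concatenate([np.zeros(m), p])
    h = np.linalg.solve(kkt, rhs)[:m]
    return mids, h

# Hypothetical frequency table on irregular intervals.
edges = [0, 1, 2, 5, 10]
counts = [10, 20, 30, 40]
print(density_heights(edges, counts))              # heights 0.1, 0.2, 0.1, 0.08
mids, h = smoothed_histogram(edges, counts, fine_width=0.5)
print(np.round(h, 3))
```

Because the frequency-matching condition is imposed as a hard constraint, the fine-bin heights within each original interval always integrate to that interval's observed proportion; only the shape inside each interval is determined by the smoothness penalty.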