A general purpose univariate probability model for environmental data analysis.

Office of Research and Development U.S. Environmental Protection Agency, Washington, D.C. 20460, U.S.A.
Computers & OR 01/1976; 3:209-216. DOI: 10.1016/0305-0548(76)90029-0
Source: DBLP

ABSTRACT Analysis of environmental quality data for decision making purposes (evaluation of compliance with standards, examination of environmental trends, determination of confidence intervals) generally requires a suitable univariate probability model. When many probability models are available, it is sometimes difficult to select the most appropriate one for a given data set. The underlying physical laws that generate pollutant concentrations, namely diffusion processes, offer insight into which model may be most appropriate for a variety of situations. Treating the diffusion equation as a stochastic differential equation, the time series of pollutant concentration data from diffusion phenomena is shown to have a distribution that is best approximated by the censored, 3-parameter lognormal probability model (LN3C). The model is applied to 10 air quality data sets (SO2, O3, CO, particulate, hydrocarbons, and NO2 from the United States, France, West Germany, and Denmark) and 9 water quality data sets (BOD, coliform, chloride, and sulfate from the Ohio River). The authors conclude that the LN3C probability model offers data analysts a superior, general purpose model suitable for a large variety of environmental phenomena.
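
As a rough illustration of what a censored, 3-parameter lognormal model involves, the sketch below writes out the density, the distribution function, and a censored log-likelihood using only the Python standard library. The abstract does not state the paper's parameterization or censoring rule, so everything here is an assumption: `theta` is a hypothetical location (shift) parameter with ln(x - theta) normally distributed with mean `mu` and standard deviation `sigma`, and observations at or below a hypothetical censoring point `c` are treated as left-censored. The function names are illustrative and do not come from the paper.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ln3_pdf(x, theta, mu, sigma):
    """Density of a 3-parameter lognormal: ln(x - theta) ~ Normal(mu, sigma)."""
    if x <= theta:
        return 0.0
    y = x - theta
    z = (math.log(y) - mu) / sigma
    return math.exp(-0.5 * z * z) / (y * sigma * math.sqrt(2.0 * math.pi))


def ln3_cdf(x, theta, mu, sigma):
    """Distribution function of the same 3-parameter lognormal."""
    if x <= theta:
        return 0.0
    return norm_cdf((math.log(x - theta) - mu) / sigma)


def ln3c_loglik(data, theta, mu, sigma, c):
    """Log-likelihood with left-censoring at c (an assumed censoring rule).

    Values at or below c contribute P(X <= c); values above c contribute
    the density at the observed value.
    """
    total = 0.0
    for x in data:
        if x <= c:
            p = ln3_cdf(c, theta, mu, sigma)
        else:
            p = ln3_pdf(x, theta, mu, sigma)
        if p <= 0.0:
            return float("-inf")  # parameters cannot explain this observation
        total += math.log(p)
    return total


# Hypothetical concentrations; 0.05 stands in for a value recorded at the limit c.
sample = [0.05, 0.8, 1.3, 2.1, 4.7]
print(ln3c_loglik(sample, theta=-0.1, mu=0.3, sigma=0.9, c=0.05))
```

In practice the three parameters would be estimated by maximizing such a likelihood numerically; the sketch only evaluates it for given values.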

  • ABSTRACT: The study was carried out to predict size-separated particulate matter below 10 µm (SSPM10) from vehicular exhausts at traffic intersections using a modified general finite line source model (GFLSM). Two air quality control regions (AQCRs) in Mumbai City were selected for this study. One was an industrial area (AQCR1) containing the busy Marol link road intersection, with a heavy inflow of two- and three-wheelers. The other was a busy commercial district (AQCR2) containing the busy Dadar circle intersection, with heavy traffic flow, especially cars. The model was applied at both traffic intersections. The data for the modelling study were collected over three winter months in 1995 using a cascade impactor with nine size ranges. The prediction results revealed that the modified GFLSM underpredicted the SSPM10 concentrations for all size ranges. However, it showed considerable correlation between observed and predicted values for the size range below 4.7 µm at both intersections. The relatively high concentrations observed in the coarser range of 10-4.7 µm are attributed to the resuspension of roadside particulate matter. Hence, the underprediction was greater for this range, because the model does not account for resuspension of roadside particulate matter caused by traffic movements. The model was also applied to predict the total particulate matter at downwind distances from the road intersection. A statistical evaluation indicated that the model's performance was good for the finer range of particles (below 4.7 µm), with r-squared values of 0.49 and 0.57 at the intersections in AQCR1 and AQCR2, respectively. However, model uncertainty is likely to exist owing to data input errors and stochastic fluctuations, irrespective of the model's accuracy. The statistical distribution model was therefore identified using the Kolmogorov-Smirnov test. At both intersections, the SSPM10 concentration data were found to be lognormally distributed.
    Environmental Monitoring and Assessment 12/2004; 98(1-3):23-40. · 1.59 Impact Factor
  • ABSTRACT: Air pollutant concentrations are essentially random variables and can be well described by statistical distribution models. Statistical distribution models are, therefore, useful tools for predicting the distribution of air pollutant concentrations. The distributional form that fits the concentration data depends on several factors, i.e. source types, pollutant types, emission patterns, meteorological conditions, and averaging times [Taylor, J.A., Jakeman, A.J., Simpson, R.W., 1986. Modeling distributions of air pollutant concentrations - I: identification of statistical models. Atmospheric Environment 20 (9), 1781-1789]. The statistical characteristics of the dispersion of air pollutants in the atmosphere are represented by a successive random dilution process [Ott, W.R., 1995. Environmental Statistics and Data Analysis. Lewis Publishers]. This process may, however, differ depending on the location of pollutant dispersion, i.e. near roadways, at intersections, or in street canyons, and the distributional form may differ accordingly. Several investigators in the past presumed a lognormal distribution (LND) for air quality data, while a few found other distributional forms when they analysed the actual data. The present paper develops a statistical distribution model fitted to carbon monoxide (CO) concentrations for the heterogeneous traffic pattern at urban hotspots in Delhi, India. Three years of 1-h average CO concentration data (from 1997 to 1999), at a traffic intersection and near a roadway, are examined using goodness-of-fit tests to find a suitable distributional form. The results showed that the log-logistic distribution model (LLD) best fit the CO concentration data at both the intersection and the roadway. It can therefore be deduced that 'heterogeneity in traffic' and 'emission patterns' may be affecting the statistical distributional form significantly.
    Environmental Modelling and Software 01/2007; 22:526-535.
  • Journal of Exposure Science and Environmental Epidemiology 10/2007; 17(6):499-500. · 3.19 Impact Factor
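
The first citing abstract above identifies a lognormal form with the Kolmogorov-Smirnov test. As a minimal sketch of that kind of check, and not the cited authors' procedure, the code below computes the one-sample Kolmogorov-Smirnov statistic D for a 2-parameter lognormal whose parameters are estimated from the data itself. The function name `ks_lognormal` and the sample values are hypothetical.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ks_lognormal(data):
    """Kolmogorov-Smirnov statistic D for a lognormal fit to positive data.

    The data are log-transformed, the mean and sample standard deviation of
    the logs are estimated, and the empirical distribution function is
    compared with the fitted normal distribution function.
    """
    logs = sorted(math.log(x) for x in data)
    n = len(logs)
    mu = sum(logs) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / (n - 1))
    d = 0.0
    for i, v in enumerate(logs):
        f = norm_cdf((v - mu) / sigma)
        # The empirical CDF jumps from i/n to (i+1)/n at this point.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


# Hypothetical concentration values, for illustration only.
concentrations = [12.0, 18.5, 22.1, 30.4, 41.0, 55.3, 90.2]
print(ks_lognormal(concentrations))
```

A small D indicates close agreement with the fitted lognormal. Because the parameters are estimated from the same sample, the standard Kolmogorov-Smirnov critical values are too lenient, and a Lilliefors-type correction or simulation would be needed for a formal test.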

