Modeling and Forecasting Particulate Matter (PM10) Concentrations in the Caribbean Area

Abstract

The aim of this study is to model and predict particulate matter (PM10) concentrations in the Caribbean area using a coupled SARIMA-GARCH model. The model describes PM10 concentrations in Guadeloupe (GPE) and Puerto Rico (PR) from 2006 to 2010 based on the seasonality of African dust and on extreme events. The SARIMA process represents the main PM10 sources, while the GARCH process accounts for the heteroskedasticity of its residual errors. First, the issue of missing data is addressed using algorithms that we have proposed. Then, the coupled SARIMA-GARCH model is developed and compared with empirical PM10 data. Akaike's information criterion (AIC) helped us to choose the best model. Forecast evaluation indexes such as MASE and Theil's U statistic provided significant results. To sum up, this coupled model is an effective tool for predicting the behavior of PM10 in the Caribbean area [1].
Esdra Alexis1,2,★, Thomas Plocoste3,4 and Silvere Paul Nuiro1
1. LAMIA (UR1_1), Université des Antilles (UA), Fouillole - BP 250 97157 Pointe-à-Pitre, France
2. Université d’État d’Haïti (UEH), HT6110 Port-au-Prince, Haïti
3. Department of Research in Geoscience, KaruSphère, 97139 Abymes, Guadeloupe (F.W.I.)
4. LaRGE (UR2_1), Université des Antilles (UA), Fouillole - BP 250 97157 Pointe-à-Pitre, France
★ Correspondence: Esdra.Alexis@etu.univ-antilles.fr; esdraalexis@yahoo.fr
Keywords: PM10 concentration; SARIMA-GARCH model; heteroskedasticity; forecast; Caribbean area.
Introduction
In the literature, many stochastic models have been developed, and some of them have been applied to air pollution modeling. The Box-Jenkins ARIMA (Autoregressive Integrated Moving Average) models were introduced first; the GARCH model was then used to model the time-varying variability of such processes. In this framework, the PM10 data from GPE (respectively PR) are considered as realizations of a process $X_t$ (respectively $Y_t$), decomposed into the sum of the background atmosphere $B_t$ (anthropogenic activities + marine aerosol), the African dust seasonality $S_t$ (mineral dust) [2], and the extreme-event process $C_t$ (e.g., fire, volcanic eruption, ...). For all non-negative integers $d$, $D$, $p$, $P$, $q$, $Q$, $\alpha$ and $\beta$, we have:
$$\forall t \in \mathbb{Z}, \quad X_t = B_t + S_t + C_t \sim \mathrm{SARIMA}(p,d,q)(P,D,Q)_{[s]} + \mathrm{GARCH}(\alpha, \beta). \tag{1}$$
In relationship (1), the $B_t$, $S_t$ and $C_t$ terms are mainly captured by the SARIMA process, where $s$ denotes the seasonal period. The GARCH model is used to analyze the heteroskedastic behavior of the SARIMA residuals. The SARIMA and GARCH processes have been applied in various fields such as mobile communication networks, climatology, tourism, and economics, to mention a few. No study has yet investigated the behavior of PM10 concentrations using this coupled model, which combines the SARIMA and GARCH processes.
Materials
The PM10 data used come from GPE (16.242°N, -61.541°E) and PR (18.431°N, -66.142°E), whose air quality networks are Gwad’Air and AirNow, respectively. Several statistical tools were necessary to carry out the research. In particular, two algorithms, (2) and (3), were implemented to fill in missing values, depending on whether the gap falls in the low (October to April) or high (May to September) African dust season.
Missing value correction algorithms (low dust season)
$$PM10_i = \frac{1}{2}\left(PM10_{\mathrm{inf}} + PM10_{\mathrm{sup}}\right), \tag{2}$$
where $PM10_{\mathrm{inf}}$ is the last observed value before the missing value, and $PM10_{\mathrm{sup}}$ is the first observed value after it.
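As an illustration of rule (2), the following minimal Python sketch fills each gap with the mean of its nearest observed neighbours. The function name `fill_low_season`, the use of `None` to mark missing values, and the choice to leave gaps at the very start or end of the series unfilled are assumptions made for this example, not details taken from the poster.

```python
from typing import List, Optional


def fill_low_season(values: List[Optional[float]]) -> List[Optional[float]]:
    """Fill each missing value (None) with the mean of the last observed
    value before it and the first observed value after it, as in Eq. (2)."""
    n = len(values)
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        # Last observed value before position i (PM10_inf).
        inf = next((values[j] for j in range(i - 1, -1, -1)
                    if values[j] is not None), None)
        # First observed value after position i (PM10_sup).
        sup = next((values[j] for j in range(i + 1, n)
                    if values[j] is not None), None)
        # Assumption: gaps with no neighbour on one side are left as None.
        if inf is not None and sup is not None:
            filled[i] = 0.5 * (inf + sup)
    return filled


# Hypothetical daily PM10 values with gaps.
series = [20.0, None, 30.0, None, None, 40.0, None]
print(fill_low_season(series))
```

With this reading of (2), every value inside a run of consecutive gaps receives the same estimate, because the neighbours are always taken from the observed data rather than from previously filled values.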
Missing value correction algorithms (high dust season)
For a fixed time $i$, let $X_i$ (resp. $Y_i$) denote the missing value in the PM10 concentration for GPE (resp. PR). It can be estimated with the stochastic regression equation (3) from the value present in the PM10 data for PR (resp. GPE) at time $i+\delta_i$ (resp. $i-\delta_i$):
$$\begin{cases} Y_i = \exp\left(\gamma + \delta \cdot \ln X_{i-\delta_i} + u_i\right), & \text{(PM10 PR data)} \\ X_i = \exp\left(\gamma + \delta \cdot \ln Y_{i+\delta_i} + v_i\right), & \text{(PM10 GPE data)} \end{cases} \tag{3}$$
where $\delta_i \in \{1, 2\}$ denotes the travel time of a particle between the two sites, and $u_i$ and $v_i$ are random error terms. Figure 1 illustrates the temporal fluctuations of PM10 concentration at one of the two study locations, GPE.
Figure 1: Chronogram of PM10 in GPE.
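The regression (3) can be sketched in Python as follows, for the case where a missing PR value is estimated from the GPE value observed a fixed number of time steps earlier. This is only an illustrative sketch under several assumptions that the poster does not state: the coefficients are estimated by ordinary least squares on the log-transformed pairs, the error term is replaced by its assumed zero mean when predicting, and the lag is held constant. The names `fit_log_log` and `impute_pr_from_gpe` are hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple

Series = Sequence[Optional[float]]


def fit_log_log(source: Series, target: Series, lag: int) -> Tuple[float, float]:
    """Least-squares fit of ln(target[i]) = gamma + delta * ln(source[i - lag])
    over all time steps where both values are observed (same-length series)."""
    pairs = [
        (math.log(source[i - lag]), math.log(target[i]))
        for i in range(lag, len(target))
        if source[i - lag] is not None and target[i] is not None
    ]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    delta = sxy / sxx
    gamma = mean_y - delta * mean_x
    return gamma, delta


def impute_pr_from_gpe(gpe: Series, pr: Series, lag: int) -> List[Optional[float]]:
    """Fill missing PR values with exp(gamma + delta * ln(GPE value `lag` steps earlier)).
    Assumption: the error term u_i is set to zero in the prediction."""
    gamma, delta = fit_log_log(gpe, pr, lag)
    filled = list(pr)
    for i in range(lag, len(pr)):
        if pr[i] is None and gpe[i - lag] is not None:
            filled[i] = math.exp(gamma + delta * math.log(gpe[i - lag]))
    return filled


# Hypothetical data in which PR equals 2 * sqrt(GPE one step earlier).
gpe = [16.0, 25.0, 36.0, 49.0, 64.0]
pr = [9.0, 8.0, 10.0, None, 14.0]
print(impute_pr_from_gpe(gpe, pr, lag=1))
```

The symmetric case (a missing GPE value estimated from the PR value observed `lag` steps later) follows the same pattern with the roles of the two series and the sign of the lag exchanged.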
Result 1 : The SARIMA model selected
The autocorrelation functions illustrated in Figures 2 and 3 helped us to select the SARIMA model orders for the PM10 concentrations in GPE. For example, Figure 2 reveals significant seasonality in the data. Table 1 reports the goodness of fit of the candidate models; for each site, the SARIMA process with the smallest AIC is selected to analyze the behavior of PM10 concentrations. Table 2 presents the parameter estimates of the selected PM10 models for GPE and PR.
Table 1: Checking of the PM10 model information criteria in GPE and PR.
| Model (PM10 GPE) | AIC | Model (PM10 PR) | AIC |
|---|---|---|---|
| SARIMA(3,0,1)(0,1,0)[365] | -2964.79 | SARIMA(1,0,1)(0,1,0)[365] | -3112.59 |
| SARIMA(2,0,1)(0,1,0)[365] | -2961.21 | SARIMA(2,0,1)(0,1,0)[365] | -3108.94 |
| SARIMA(4,0,2)(0,1,0)[365] | -2961.61 | SARIMA(0,0,1)(0,1,0)[365] | -3005.81 |
| SARIMA(2,0,0)(0,1,0)[365] | -2960.96 | SARIMA(1,0,2)(0,1,0)[365] | -3109.79 |
| SARIMA(1,0,3)(0,1,0)[365] | -2965.30 | SARIMA(0,0,5)(0,1,0)[365] | -3114.69 |
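The AIC-based selection step can be sketched as follows, using the standard definition AIC = 2k - 2 ln L, where k is the number of estimated parameters and L the maximized likelihood. The candidate names, log-likelihood values and parameter counts below are hypothetical placeholders, not results from the study.

```python
from typing import Dict, Tuple


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike's information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def select_by_aic(candidates: Dict[str, Tuple[float, int]]) -> Tuple[str, Dict[str, float]]:
    """Return the name of the candidate with the smallest AIC and all scores.

    `candidates` maps a model name to (maximized log-likelihood, number of parameters).
    """
    scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical candidates (values invented for illustration only).
candidates = {
    "candidate_A": (-520.3, 3),
    "candidate_B": (-515.1, 5),
}
best, scores = select_by_aic(candidates)
print(best, scores)
```

Here `candidate_B` is retained: its higher log-likelihood outweighs the penalty for its two additional parameters.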
Table 2: SARIMA model parameters on PM10 in (a) GPE and (b) PR.
| Model | Coef. | Estimate | Std. Error | t-test | p-value |
|---|---|---|---|---|---|
| (a) SARIMA(1,0,3)(0,1,0)[365] | AR1 | 0.8275 | 0.0933 | 8.87 | 0.000000 |
| | MA1 | -0.2186 | 0.0989 | -2.21 | 0.027081 |
| | MA2 | -0.2338 | 0.0698 | -3.35 | 0.000811 |
| | MA3 | -0.0931 | 0.0463 | -2.01 | 0.044123 |
| (b) SARIMA(0,0,5)(0,1,0)[365] | MA1 | 0.5947 | 0.0261 | 22.75 | 0.000000 |
| | MA2 | 0.2878 | 0.0304 | 9.46 | 0.000000 |
| | MA3 | 0.1457 | 0.0313 | 4.66 | 0.000003 |
| | MA4 | 0.1112 | 0.0295 | 3.77 | 0.000161 |
| | MA5 | 0.0959 | 0.0259 | 3.70 | 0.000216 |
Figure 2: ACF of PM10 GPE concentrations.
Figure 3: PACF of PM10 GPE concentrations.
$$(1 - 0.8275B)(1 - B^{365})\,X_t = (1 - 0.2186B - 0.2338B^2 - 0.0931B^3)\,\zeta_t$$
$$(1 - B^{365})\,Y_t = (1 + 0.5947B + 0.2878B^2 + 0.1457B^3 + 0.1112B^4 + 0.0959B^5)\,\zeta'_t,$$
where $B$ is the backward shift operator, and $\zeta_t$ and $\zeta'_t$ are the residual errors of the SARIMA models for GPE and PR, respectively.
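To make the fitted GPE equation concrete, the sketch below writes it as a recursion on the seasonally differenced series $W_t = X_t - X_{t-365}$, namely $W_t = 0.8275\,W_{t-1} + \zeta_t - 0.2186\,\zeta_{t-1} - 0.2338\,\zeta_{t-2} - 0.0931\,\zeta_{t-3}$, and then restores the levels. The function names are hypothetical, pre-sample values are assumed to be zero, and the innovations passed in are illustrative inputs rather than residuals from the study.

```python
from typing import List, Sequence

PHI_GPE = 0.8275                           # AR(1) coefficient from Table 2(a)
THETA_GPE = (-0.2186, -0.2338, -0.0931)    # MA(1..3) coefficients from Table 2(a)


def arma_recursion(innovations: Sequence[float],
                   phi: float = PHI_GPE,
                   theta: Sequence[float] = THETA_GPE) -> List[float]:
    """Compute W_t = phi * W_{t-1} + e_t + sum_k theta_k * e_{t-k}.
    Assumption: W and e are zero before the start of the sample."""
    w: List[float] = []
    for t, e in enumerate(innovations):
        ar_part = phi * w[t - 1] if t >= 1 else 0.0
        ma_part = sum(th * innovations[t - k]
                      for k, th in enumerate(theta, start=1) if t - k >= 0)
        w.append(ar_part + e + ma_part)
    return w


def undo_seasonal_difference(w: Sequence[float],
                             history: Sequence[float],
                             season: int) -> List[float]:
    """Rebuild levels from X_t = W_t + X_{t-season}, given the last `season` levels."""
    levels = list(history[-season:])
    for value in w:
        levels.append(value + levels[-season])
    return levels[season:]


# Response of the differenced series to a single unit shock.
print(arma_recursion([1.0, 0.0, 0.0, 0.0, 0.0]))
# Toy example with a seasonal period of 2 instead of 365.
print(undo_seasonal_difference([0.5, 0.5, 0.5], history=[1.0, 2.0], season=2))
```

The recursion operates on whatever scale the model was fitted on; if the series was transformed beforehand, the inverse transformation still has to be applied to the restored levels.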
Result 2 : The GARCH model parameterized for the residuals
Figure 4: Clustering of variability and heteroskedasticity of PM10 model residuals for PR.
The heteroskedasticity, or leptokurticity, of the residual errors from the SARIMA model is illustrated in Figure 4. Table 3 gives the parameters of the GARCH model fitted to the residuals of the SARIMA process, together with their statistical significance. System (4) describes their conditional variance, which depends on time $t$.
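$$\begin{cases}
\zeta_t = \sigma_t Z_t; \quad \zeta'_t = \sigma'_t Z_t; \quad Z_t \sim WN(0,1) \\
\sigma_t^2 = V(\zeta_t \mid \mathcal{F}_{t-1}) = 8.652 \times 10^{-7} + 0.1622\,\zeta_{t-1}^2 + 0.8368\,\sigma_{t-1}^2 \\
{\sigma'_t}^{2} = V(\zeta'_t \mid \mathcal{F}_{t-1}) = 8.567 \times 10^{-8} + 0.1951\,{\zeta'_{t-1}}^{2} + 0.8039\,{\sigma'_{t-1}}^{2}
\end{cases} \tag{4}$$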
The residuals $\zeta_t$ and $\zeta'_t$ satisfy the properties of (G)ARCH processes:
- martingale difference;
- time-dependent conditional variance $\sigma_t^2 = V(\zeta_t \mid \mathcal{F}_{t-1})$;
- zero conditional autocovariance;
- leptokurtic residual distribution.
$\mathcal{F}_{t-1}$ denotes the history of the processes $\zeta_t$ and $\zeta'_t$ up to, but excluding, date $t$.
Table 3: GARCH model parameters on SARIMA model residuals.
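| Parameter | Estimate (GPE) | Std. Error (GPE) | t value (GPE) | Pr(>\|t\|) (GPE) | Estimate (PR) | Std. Error (PR) | t value (PR) | Pr(>\|t\|) (PR) |
|---|---|---|---|---|---|---|---|---|
| $\omega$ | 8.652 × 10^-7 | 0.00 × 10^00 | 2.18 | 0.02 | 8.567 × 10^-8 | 1.00 × 10^-6 | 0.06 | 0.94 |
| $a_1$ | 1.622 × 10^-1 | 9.48 × 10^-3 | 17.10 | 0.00 | 1.951 × 10^-1 | 9.91 × 10^-3 | 19.68 | 0.00 |
| $b_1$ | 8.368 × 10^-1 | 8.30 × 10^-3 | 100.81 | 0.00 | 8.039 × 10^-1 | 8.67 × 10^-3 | 92.71 | 0.00 |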
The parameters $\omega$, $a_1$ and $b_1$ are interpreted as follows:
- $\omega$: lower bound of the conditional variance;
- $a_1$: effect of an extreme event;
- $b_1$: persistence of the variability.
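The roles of the three parameters can be illustrated with a short Python sketch of the GARCH(1,1) variance recursion $\sigma_t^2 = \omega + a_1 \zeta_{t-1}^2 + b_1 \sigma_{t-1}^2$, using the GPE estimates from Table 3. The function name, the residual values, and the choice to initialise the recursion at the unconditional variance $\omega / (1 - a_1 - b_1)$ are assumptions made for this example.

```python
from typing import List, Sequence


def garch11_variance(residuals: Sequence[float],
                     omega: float, a1: float, b1: float) -> List[float]:
    """Conditional variances of a GARCH(1,1) process.
    Assumption: the first variance is set to omega / (1 - a1 - b1)."""
    persistence = a1 + b1
    if not persistence < 1.0:
        raise ValueError("a1 + b1 must be below 1 for a finite unconditional variance")
    sigma2 = [omega / (1.0 - persistence)]
    for t in range(1, len(residuals)):
        sigma2.append(omega + a1 * residuals[t - 1] ** 2 + b1 * sigma2[t - 1])
    return sigma2


# GPE estimates from Table 3.
GPE = {"omega": 8.652e-7, "a1": 0.1622, "b1": 0.8368}

# Hypothetical residuals: one large shock at t = 1, quiet otherwise.
residuals = [0.0, 0.10, 0.0, 0.0, 0.0]
for t, s2 in enumerate(garch11_variance(residuals, **GPE)):
    print(t, s2)
```

In this toy run the shock at t = 1 raises the conditional variance at t = 2 through $a_1$, after which the variance decays slowly at rate $b_1$; with $a_1 + b_1 \approx 0.999$ for both sites, such episodes of high variability are long-lived.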
Result 3 : The forecast of the coupled SARIMA-GARCH model derived
Figure 5 illustrates the 1-year-horizon forecasts computed from the coupled SARIMA-GARCH model, based on the series after transformation (Box-Cox and differencing). Table 4 presents the forecast accuracy indexes.
Figure 5: Forecast from the SARIMA-GARCH model in GPE.
The coupled SARIMA-GARCH model performs better than the SARIMA process alone when the residuals of the latter do not follow a Gaussian distribution or exhibit heteroskedastic behavior despite the transformations applied (e.g., the Box-Cox transformation). Extreme events related to dust outbreaks or volcanic eruptions may explain this behavior. The values predicted by the coupled model are obtained by summing the predictions of both models. All the forecast evaluation indexes in Table 4 confirm that the coupled model is a suitable approach to predict PM10 behavior in the Caribbean area [1,3].
Table 4: Forecast accuracy of PM10 models in GPE and PR.
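| Model | n (GPE) | MAPE % (GPE) | MASE (GPE) | U1 (GPE) | U2 (GPE) | n (PR) | MAPE % (PR) | MASE (PR) | U1 (PR) | U2 (PR) |
|---|---|---|---|---|---|---|---|---|---|---|
| SARIMA | 350 | 3.74 | 0.03 | 0.08 | 0.17 | 350 | 2.31 | 0.01 | 0.03 | 0.07 |
| GARCH | 365 | 134 | 0.78 | 0.56 | 0.88 | 365 | 142 | 0.77 | 0.55 | 0.88 |
| SARIMA-GARCH | 350 | 15 | 0.06 | 0.04 | 0.09 | 337 | 2.39 | 0.01 | 0.03 | 0.06 |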
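For reference, the accuracy indexes of Table 4 can be computed along the following lines. The poster does not give the formulas it used, so the definitions below are assumptions: MAPE and MASE follow Hyndman and Koehler [3] (MASE scaled by the in-sample one-step naive forecast), and U1 and U2 follow one common convention for Theil's coefficients. The data are hypothetical.

```python
import math
from typing import Sequence


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def mase(actual: Sequence[float], forecast: Sequence[float],
         train: Sequence[float]) -> float:
    """Mean absolute scaled error: forecast MAE divided by the in-sample MAE
    of the one-step naive forecast."""
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    naive_mae = sum(abs(train[t] - train[t - 1])
                    for t in range(1, len(train))) / (len(train) - 1)
    return mae / naive_mae


def theil_u1(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Assumed definition: RMSE / (RMS of forecasts + RMS of observations)."""
    n = len(actual)
    rmse = math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)
    rms_f = math.sqrt(sum(f * f for f in forecast) / n)
    rms_a = math.sqrt(sum(a * a for a in actual) / n)
    return rmse / (rms_f + rms_a)


def theil_u2(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Assumed definition: sqrt(sum of squared errors) / sqrt(sum of squared observations)."""
    sse = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    return math.sqrt(sse) / math.sqrt(sum(a * a for a in actual))


# Hypothetical training data, observations and forecasts.
train = [8.0, 9.0, 11.0, 10.0]
actual = [10.0, 12.0, 14.0]
forecast = [11.0, 12.0, 13.0]
print(mape(actual, forecast), mase(actual, forecast, train),
      theil_u1(actual, forecast), theil_u2(actual, forecast))
```

All four indexes equal zero for a perfect forecast, and a MASE below 1 means the forecast beats the in-sample naive benchmark on average.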
Conclusion
The coupled SARIMA-GARCH model highlights the special features of PM10 concentrations for the Caribbean area. Thus, the SARIMA-GARCH
combination is a good tool to forecast PM10 behavior in this region [1]. The modeling results could be extended to the nearby islands of GPE and
PR to better understand the seasonal impact of dust outbreaks on the environment and human health. The main difficulty encountered during
the modeling process concerns the choice of the model. Although our model provides significant results, it is based on an approach with fixed
seasonality. A future application of this coupled model could be to investigate the impact of PM10 on human health and the environment in Haiti.
Acknowledgements
This work is supported by the Bank of the Republic of Haiti (BRH) and the
French Embassy in Haiti. We also thank Gwad’Air (http://www.gwadair.fr/)
and AirNow (https://www.airnow.gov) air quality networks for providing
PM10 data.
References
[1] E. Alexis, T. Plocoste, and S. P. Nuiro, “Analysis of Particulate Matter (PM10) Behavior in the Caribbean Area Using a Coupled SARIMA-GARCH Model,” Atmosphere, vol. 13, no. 6, p. 862, 2022.
[2] T. Plocoste, “Multiscale analysis of the dynamic relationship between particulate matter (PM10) and meteorological parameters using CEEMDAN: A focus on ‘Godzilla’ African dust event,” Atmospheric Pollution Research, vol. 13, no. 1, p. 101252, 2022.
[3] R. J. Hyndman and A. B. Koehler, “Another look at measures of forecast accuracy,” International Journal of Forecasting, vol. 22, no. 4, pp. 679–688, 2006.