All content in this area was uploaded by Thomas Plocoste on Jul 02, 2023
Modeling and Forecasting
Particulate Matter (PM10) Concentrations in the Caribbean Area
Esdra Alexis1,2,★, Thomas Plocoste3,4 and Silvere Paul Nuiro1
1. LAMIA (UR1_1), Université des Antilles (UA), Fouillole - BP 250 97157 Pointe-à-Pitre, France
2. Université d’État d’Haïti (UEH), HT6110 Port-au-Prince, Haïti
3. Department of Research in Geoscience, KaruSphère, 97139 Abymes, Guadeloupe (F.W.I.)
4. LaRGE (UR2_1), Université des Antilles (UA), Fouillole - BP 250 97157 Pointe-à-Pitre, France
★Correspondence : Esdra.Alexis@etu.univ-antilles.fr; esdraalexis@yahoo.fr
Abstract
The aim of this study is to model and forecast particulate matter (PM10) concentrations in the Caribbean area using a coupled SARIMA-GARCH model. The model describes PM10 concentrations in Guadeloupe (GPE) and Puerto Rico (PR) from 2006 to 2010, based on the seasonality of African dust and on extreme events. The SARIMA process represents the main PM10 sources, while the GARCH process accounts for the heteroskedasticity of its residual errors. First, missing data are handled using algorithms that we propose. Then, the coupled SARIMA-GARCH model is developed and compared with empirical PM10 data. Akaike's information criterion (AIC) is used to select the best model. Forecast evaluation indexes such as MASE and Theil's U statistic indicate good forecast accuracy. In summary, this coupled model is an effective tool for predicting the behavior of PM10 in the Caribbean area [1].
Keywords: PM10 concentration; SARIMA-GARCH model; heteroskedasticity; forecast; Caribbean area.
Introduction
In the literature, many stochastic models have been developed, and some of them have been applied to air pollution modeling. The ARIMA (Autoregressive Integrated Moving Average) models of Box and Jenkins are introduced first. The GARCH model is then used to model the time-varying variability of these processes. In this framework, the PM10 data from GPE (respectively PR) are considered as realizations of a process $X_t$ (respectively $Y_t$), decomposed into the sum of the background atmosphere $B_t$ (anthropogenic activities + marine aerosol), the African dust seasonality $S_t$ (mineral dust) [2], and the extreme event process $C_t$ (e.g., fire, volcanic eruption). For example, for non-negative integers $d, D, p, P, q, Q, \alpha$ and $\beta$, we have:
$$\forall t \in \mathbb{Z}, \quad X_t = B_t + S_t + C_t \sim \mathrm{SARIMA}(p,d,q)(P,D,Q)_{[s]} + \mathrm{GARCH}(\alpha, \beta). \tag{1}$$
In relationship (1), the $B_t$, $S_t$ and $C_t$ terms are mainly captured by the SARIMA process, and $s$ denotes the seasonal period. The GARCH model is used to analyze the heteroskedastic behavior of the SARIMA residuals. SARIMA and GARCH processes have been applied in various fields such as mobile communication networks, climatology, tourism, and economics, to mention a few. To our knowledge, no study has yet investigated the behavior of PM10 concentrations using this coupled model, which combines the SARIMA and GARCH processes.
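For readers who want relationship (1) spelled out, the block below writes a coupled SARIMA-GARCH model in its usual operator form. This is the textbook formulation, not a formula taken from the poster: the polynomial symbols $\phi, \Phi, \theta, \Theta$ are assumed names, and reading $\alpha$ as the number of ARCH terms $a_i$ and $\beta$ as the number of GARCH terms $b_j$ is an assumption about the authors' convention.

```latex
% Generic SARIMA(p,d,q)(P,D,Q)[s] mean equation with backshift operator B
\phi(B)\,\Phi(B^{s})\,(1-B)^{d}\,(1-B^{s})^{D}\,X_t \;=\; \theta(B)\,\Theta(B^{s})\,\zeta_t
% GARCH(\alpha,\beta) equation for the SARIMA residuals \zeta_t
\zeta_t = \sigma_t Z_t, \qquad
\sigma_t^{2} \;=\; \omega + \sum_{i=1}^{\alpha} a_i\,\zeta_{t-i}^{2} + \sum_{j=1}^{\beta} b_j\,\sigma_{t-j}^{2}
```

Here $\phi$ and $\theta$ have degrees $p$ and $q$, while $\Phi$ and $\Theta$ have degrees $P$ and $Q$ in $B^{s}$.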
Materials
The PM10 data come from GPE (16.242°N, -61.541°E) and PR (18.431°N, -66.142°E), whose air quality networks are Gwad’Air and AirNow, respectively. Several statistical tools were needed to carry out the research. For example, two algorithms, (2) and (3), were implemented to fill in missing values, depending on whether the gap falls in the low (October to April) or high (May to September) African dust season.
• Missing value correction algorithm (low dust season)
$$PM10_i = \frac{1}{2}\left(PM10_{\inf} + PM10_{\sup}\right), \tag{2}$$
where $PM10_{\inf}$ is the last observed value before the missing value, and $PM10_{\sup}$ is the first observed value after it.
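The low-season rule (2) can be sketched in a few lines of Python. This is a minimal illustration on made-up numbers, not the authors' implementation: the function name `fill_low_season_gaps` is hypothetical, missing values are represented by `None`, and leaving gaps at the ends of the series unfilled is an assumption, since the source does not say how such cases are treated.

```python
from typing import List, Optional


def fill_low_season_gaps(series: List[Optional[float]]) -> List[Optional[float]]:
    """Fill each missing value (None) with the mean of its observed neighbours.

    PM10_inf is the last observed value before the gap and PM10_sup the first
    observed value after it, as in equation (2). Gaps touching either end of
    the series have only one neighbour and are left as None (an assumption).
    """
    filled = list(series)
    n = len(series)
    for i, value in enumerate(series):
        if value is not None:
            continue
        # Search the ORIGINAL series so that filled values are never reused.
        inf = next((series[j] for j in range(i - 1, -1, -1) if series[j] is not None), None)
        sup = next((series[j] for j in range(i + 1, n) if series[j] is not None), None)
        if inf is not None and sup is not None:
            filled[i] = 0.5 * (inf + sup)
    return filled


# Synthetic daily values, for illustration only.
example = [20.0, None, 30.0, None, None, 18.0, None]
result = fill_low_season_gaps(example)
```

With this reading of (2), every value inside a multi-day gap receives the same estimate, because the two bracketing observations are the same for the whole gap.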
• Missing value correction algorithm (high dust season)
For a fixed time $i$, let $X_i$ (resp. $Y_i$) denote the missing value in the PM10 concentration for GPE (resp. PR). It can be estimated with the stochastic regression system (3) from the value observed in the PM10 data for PR (resp. GPE) at time $i+\delta_i$ (resp. $i-\delta_i$):
$$\begin{cases} Y_i = \exp\left(\gamma + \delta \cdot \ln X_{i-\delta_i} + u_i\right), & \text{(PM10 PR data)} \\ X_i = \exp\left(\gamma' + \delta' \cdot \ln Y_{i+\delta_i} + v_i\right), & \text{(PM10 GPE data)} \end{cases} \tag{3}$$
where $\delta_i \in \{1, 2\}$ denotes the travel time of a particle between the two sites; $u_i$ and $v_i$ are random error terms. Figure 1 illustrates the temporal fluctuations of PM10 concentration at one of the two study locations, GPE.
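The high-season rule (3) amounts to a log-log regression between the two sites with a time lag. The sketch below shows one possible reading of its first line (PR estimated from GPE). Several things here are assumptions rather than details from the source: the coefficients are estimated by ordinary least squares, the error term $u_i$ is set to zero when predicting, the lag is passed as a single fixed value, and the names `fit_loglog` and `impute_pr_from_gpe` are hypothetical. The data are synthetic.

```python
import math
from typing import List, Optional, Tuple


def fit_loglog(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Least-squares fit of ln(y) = gamma + delta * ln(x); returns (gamma, delta)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    sxx = sum((a - mx) ** 2 for a in lx)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    delta = sxy / sxx
    return my - delta * mx, delta


def impute_pr_from_gpe(
    gpe: List[Optional[float]], pr: List[Optional[float]], lag: int
) -> List[Optional[float]]:
    """Fill missing PR values Y_i from GPE values X_{i-lag}, first line of (3)."""
    pairs = [
        (gpe[i - lag], pr[i])
        for i in range(lag, len(pr))
        if pr[i] is not None and gpe[i - lag] is not None
    ]
    gamma, delta = fit_loglog([p[0] for p in pairs], [p[1] for p in pairs])
    filled = list(pr)
    for i in range(lag, len(pr)):
        if pr[i] is None and gpe[i - lag] is not None:
            # u_i is set to zero, so this is a point prediction only.
            filled[i] = math.exp(gamma + delta * math.log(gpe[i - lag]))
    return filled


# Synthetic example built so that pr[i] = 2 * sqrt(gpe[i-1]) for i >= 1.
gpe_demo = [16.0, 25.0, 36.0, 49.0, 64.0, 81.0]
pr_demo = [9.0, 8.0, 10.0, None, 14.0, 16.0]
pr_filled = impute_pr_from_gpe(gpe_demo, pr_demo, lag=1)
```

The second line of (3), GPE estimated from PR at time $i+\delta_i$, would follow the same pattern with the roles of the two series and the sign of the lag swapped.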
Figure 1: Chronogram of PM10 in GPE.
Result 1 : The SARIMA model selected
The autocorrelation functions shown in Figures 2 and 3 helped us to select the SARIMA model orders for the PM10 concentrations in GPE. For example, Figure 2 reveals significant seasonality in the data. Table 1 reports the goodness of fit of the candidate models; for each site, the SARIMA process with the smallest AIC is selected to analyze the behavior of PM10 concentrations. Table 2 presents the estimated parameters of the selected PM10 models for GPE and PR.
Table 1: Checking of the PM10 model information criteria in GPE and PR.
PM10 GPE                                 PM10 PR
Model                        AIC         Model                        AIC
SARIMA(3,0,1)(0,1,0)[365]    −2964.79    SARIMA(1,0,1)(0,1,0)[365]    −3112.59
SARIMA(2,0,1)(0,1,0)[365]    −2961.21    SARIMA(2,0,1)(0,1,0)[365]    −3108.94
SARIMA(4,0,2)(0,1,0)[365]    −2961.61    SARIMA(0,0,1)(0,1,0)[365]    −3005.81
SARIMA(2,0,0)(0,1,0)[365]    −2960.96    SARIMA(1,0,2)(0,1,0)[365]    −3109.79
SARIMA(1,0,3)(0,1,0)[365]    −2965.30    SARIMA(0,0,5)(0,1,0)[365]    −3114.69
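The selection rule behind Table 1 is simply "keep the candidate with the smallest AIC". The short Python check below applies that rule to the values copied from the table; the dictionary and function names are illustrative only.

```python
# AIC values copied from Table 1.
aic_gpe = {
    "SARIMA(3,0,1)(0,1,0)[365]": -2964.79,
    "SARIMA(2,0,1)(0,1,0)[365]": -2961.21,
    "SARIMA(4,0,2)(0,1,0)[365]": -2961.61,
    "SARIMA(2,0,0)(0,1,0)[365]": -2960.96,
    "SARIMA(1,0,3)(0,1,0)[365]": -2965.30,
}
aic_pr = {
    "SARIMA(1,0,1)(0,1,0)[365]": -3112.59,
    "SARIMA(2,0,1)(0,1,0)[365]": -3108.94,
    "SARIMA(0,0,1)(0,1,0)[365]": -3005.81,
    "SARIMA(1,0,2)(0,1,0)[365]": -3109.79,
    "SARIMA(0,0,5)(0,1,0)[365]": -3114.69,
}


def best_by_aic(candidates: dict) -> str:
    """Return the model label with the smallest AIC."""
    return min(candidates, key=candidates.get)


best_gpe = best_by_aic(aic_gpe)
best_pr = best_by_aic(aic_pr)
```

The two selected labels match the models parameterized in Table 2.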
Table 2: SARIMA model parameters on PM10 in (a) GPE and (b) PR.
Model                              Coef.   Estimate   Std. Error   t-test   p-value
(a) : SARIMA(1,0,3)(0,1,0)[365]    AR1      0.8275    0.0933        8.87    0.000000
                                   MA1     −0.2186    0.0989       −2.21    0.027081
                                   MA2     −0.2338    0.0698       −3.35    0.000811
                                   MA3     −0.0931    0.0463       −2.01    0.044123
(b) : SARIMA(0,0,5)(0,1,0)[365]    MA1      0.5947    0.0261       22.75    0.000000
                                   MA2      0.2878    0.0304        9.46    0.000000
                                   MA3      0.1457    0.0313        4.66    0.000003
                                   MA4      0.1112    0.0295        3.77    0.000161
                                   MA5      0.0959    0.0259        3.70    0.000216
$$(1 - 0.8275B)(1 - B^{365})\,X_t = (1 - 0.2186B - 0.2338B^2 - 0.0931B^3)\,\zeta_t$$
$$(1 - B^{365})\,Y_t = (1 + 0.5947B + 0.2878B^2 + 0.1457B^3 + 0.1112B^4 + 0.0959B^5)\,\zeta'_t,$$
Figure 2: ACF of PM10GPE concentrations.
Figure 3: PACF of PM10GPE concentrations.
where $B$ is the backshift operator, and $\zeta_t$ and $\zeta'_t$ are the residual errors of the SARIMA models for GPE and PR, respectively.
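To make the GPE equation concrete, the sketch below expands it into a recursion on the seasonally differenced series $W_t = X_t - X_{t-365}$, which gives $W_t = 0.8275\,W_{t-1} + \zeta_t - 0.2186\,\zeta_{t-1} - 0.2338\,\zeta_{t-2} - 0.0931\,\zeta_{t-3}$. It is an illustration under stated assumptions, not the authors' code: pre-sample values are set to zero, the input is taken to be the already transformed series, and the function names and toy numbers are hypothetical.

```python
from typing import List

AR1 = 0.8275                        # AR1 estimate, Table 2(a)
MA = [-0.2186, -0.2338, -0.0931]    # MA1..MA3 estimates, Table 2(a)


def seasonal_difference(x: List[float], s: int = 365) -> List[float]:
    """Apply (1 - B^s): w_t = x_t - x_{t-s}."""
    return [x[t] - x[t - s] for t in range(s, len(x))]


def ma_part(zeta: List[float], t: int) -> float:
    """MA contribution at time t from the innovations known before t."""
    return sum(MA[k] * zeta[t - 1 - k] for k in range(len(MA)) if t - 1 - k >= 0)


def innovations(w: List[float]) -> List[float]:
    """Recover zeta_t recursively, with pre-sample values set to zero."""
    zeta: List[float] = []
    for t in range(len(w)):
        prev_w = w[t - 1] if t >= 1 else 0.0
        zeta.append(w[t] - (AR1 * prev_w + ma_part(zeta, t)))
    return zeta


def forecast_next(w: List[float]) -> float:
    """One-step-ahead prediction of the differenced series."""
    zeta = innovations(w)
    return AR1 * w[-1] + ma_part(zeta, len(w))


# Toy inputs (period 2 instead of 365) to keep the example short.
demo_diff = seasonal_difference([1.0, 2.0, 4.0, 7.0], s=2)
demo_zeta = innovations([1.0, 0.0])
demo_forecast = forecast_next([1.0, 0.0])
```

A forecast of $X_{t+1}$ itself is then obtained by adding back the value observed one season earlier, $X_{t+1-365}$.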
Result 2 : The GARCH model parameterized for the residuals
Figure 4: Clustering of variability and heteroskedasticity
of PM10 model residuals for PR.
The heteroskedasticity and leptokurtic behavior of the residual errors from the SARIMA model are illustrated in Figure 4. Table 3 gives the parameters of the GARCH model fitted to the residuals of the SARIMA process, together with their statistical significance. System (4) describes their conditional variance, which depends on time $t$.
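$$\begin{cases} \zeta_t = \sigma_t Z_t; \quad \zeta'_t = \sigma'_t Z_t; \quad Z_t \sim WN(0,1) \\ \sigma_t^2 = V(\zeta_t \mid \mathcal{F}_{t-1}) = 8.652 \times 10^{-7} + 0.1622\,\zeta_{t-1}^2 + 0.8368\,\sigma_{t-1}^2 \\ \sigma_t'^2 = V(\zeta'_t \mid \mathcal{F}_{t-1}) = 8.567 \times 10^{-8} + 0.1951\,\zeta_{t-1}'^2 + 0.8039\,\sigma_{t-1}'^2 \end{cases} \tag{4}$$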
The $\zeta_t$ and $\zeta'_t$ residuals satisfy the properties of (G)ARCH processes:
• martingale difference;
• time-dependent conditional variance $\sigma_t^2 = V(\zeta_t \mid \mathcal{F}_{t-1})$;
• zero conditional autocovariance;
• leptokurtic residual distribution.
$\mathcal{F}_{t-1}$ denotes the history of the processes $\zeta_t$ and $\zeta'_t$ up to, but excluding, time $t$.
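The conditional variance recursion of system (4) can be traced numerically. The sketch below uses the GPE estimates and two made-up residual sequences to show how a single large residual raises the conditional variance and how that effect decays. Starting the recursion at $\omega/(1 - a_1 - b_1)$ is an assumption, and the function name `garch11_variance` is hypothetical.

```python
from typing import List

OMEGA, A1, B1 = 8.652e-7, 0.1622, 0.8368   # GPE estimates from system (4)


def garch11_variance(residuals: List[float], omega: float, a1: float, b1: float) -> List[float]:
    """Return sigma2 with sigma2[t+1] = omega + a1 * residuals[t]**2 + b1 * sigma2[t].

    The output has len(residuals) + 1 entries; the last one is the
    one-step-ahead conditional variance. The starting value is the
    long-run variance omega / (1 - a1 - b1) (an assumption).
    """
    sigma2 = [omega / (1.0 - a1 - b1)]
    for r in residuals:
        sigma2.append(omega + a1 * r * r + b1 * sigma2[-1])
    return sigma2


# Synthetic residuals: a quiet path and the same path with one shock of 0.1.
calm = garch11_variance([0.0, 0.0, 0.0], OMEGA, A1, B1)
shock = garch11_variance([0.0, 0.1, 0.0], OMEGA, A1, B1)
```

One step after the shock the variance exceeds the quiet path by $a_1 \times 0.1^2$; one step later the excess is multiplied by $b_1$, which is the clustering of variability shown in Figure 4.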
Table 3: GARCH model parameters on SARIMA model residuals.
       PM10 GPE                                           PM10 PR
       Estimate       Std. Error    t value   Pr(>|t|)    Estimate       Std. Error    t value   Pr(>|t|)
ω      8.652×10^−7    0.00×10^0       2.18    0.02        8.567×10^−8    1.00×10^−6     0.06     0.94
a1     1.622×10^−1    9.48×10^−3     17.10    0.00        1.951×10^−1    9.91×10^−3    19.68     0.00
b1     8.368×10^−1    8.30×10^−3    100.81    0.00        8.039×10^−1    8.67×10^−3    92.71     0.00
The parameters ω, a1 and b1 are interpreted as follows:
ω: lower bound of the conditional variance
a1: effect of an extreme event
b1: persistence of the variability
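As a quick check of the persistence interpretation, the estimates in Table 3 can be combined as follows. The sum $a_1 + b_1$ and the implied long-run variance $\omega/(1 - a_1 - b_1)$ are standard GARCH(1,1) summaries rather than values reported in the source, so this is an illustrative calculation on the scale of the transformed series:

```latex
% Persistence a_1 + b_1
a_1 + b_1 = 0.1622 + 0.8368 = 0.9990 \ \text{(GPE)}, \qquad
a_1 + b_1 = 0.1951 + 0.8039 = 0.9990 \ \text{(PR)}
% Implied long-run variance \omega / (1 - a_1 - b_1)
\frac{8.652 \times 10^{-7}}{0.0010} = 8.652 \times 10^{-4} \ \text{(GPE)}, \qquad
\frac{8.567 \times 10^{-8}}{0.0010} = 8.567 \times 10^{-5} \ \text{(PR)}
```

Both sums are below 1 but close to it, which is consistent with a variance process that is stationary yet highly persistent.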
Result 3 : The forecast of the coupled SARIMA-GARCH model derived
Figure 5 illustrates the 1-year horizon forecasts computed from the coupled SARIMA-GARCH model, based on the series after transformation (Box-Cox and differencing). Table 4 presents the forecast accuracy indexes.
Figure 5: Forecast from the SARIMA-GARCH model in GPE.
The coupled SARIMA-GARCH model performs better than the SARIMA process taken alone when the residuals of the latter do not follow a Gaussian distribution or exhibit heteroskedastic behavior despite the transformations applied (e.g., the Box-Cox transformation). Extreme events related to dust outbreaks or volcanic eruptions may explain this behavior. The values predicted by the coupled model are obtained by summing the predictions of both models. All the forecast evaluation indexes in Table 4 confirm that the coupled model is a suitable approach to predict PM10 behavior in the Caribbean area [1,3].
Table 4: Forecast accuracy of PM10 models in GPE and PR.
                 PM10 GPE                               PM10 PR
Models           n     MAPE (%)   MASE   U1     U2      n     MAPE (%)   MASE   U1     U2
SARIMA           350     3.74     0.03   0.08   0.17    350     2.31     0.01   0.03   0.07
GARCH            365   134        0.78   0.56   0.88    365   142        0.77   0.55   0.88
SARIMA-GARCH     350    15        0.06   0.04   0.09    337     2.39     0.01   0.03   0.06
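The indexes in Table 4 can be computed along the following lines. The poster does not give their formulas, so the definitions below are assumptions: MAPE and MASE follow Hyndman and Koehler [3], with MASE scaled by the in-sample error of the naive one-step forecast, U1 is Theil's inequality coefficient bounded between 0 and 1, and U2 is taken in one common variant, the root sum of squared errors divided by the root sum of squared observations. All function names and numbers are illustrative.

```python
import math
from typing import List


def mape(actual: List[float], forecast: List[float]) -> float:
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def mase(actual: List[float], forecast: List[float], train: List[float]) -> float:
    """Forecast MAE scaled by the in-sample MAE of the naive forecast."""
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    scale = sum(abs(train[t] - train[t - 1]) for t in range(1, len(train))) / (len(train) - 1)
    return mae / scale


def theil_u1(actual: List[float], forecast: List[float]) -> float:
    """Theil's U1: RMSE divided by the sum of the two root mean squares."""
    n = len(actual)
    rmse = math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)
    rms_f = math.sqrt(sum(f * f for f in forecast) / n)
    rms_a = math.sqrt(sum(a * a for a in actual) / n)
    return rmse / (rms_f + rms_a)


def theil_u2(actual: List[float], forecast: List[float]) -> float:
    """Assumed U2 variant: root sum of squared errors over root sum of squares."""
    num = math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)))
    return num / math.sqrt(sum(a * a for a in actual))


# Synthetic numbers for illustration only.
train = [10.0, 12.0, 14.0, 16.0]
actual = [10.0, 20.0]
forecast = [11.0, 18.0]
scores = {
    "MAPE": mape(actual, forecast),
    "MASE": mase(actual, forecast, train),
    "U1": theil_u1(actual, forecast),
    "U2": theil_u2(actual, forecast),
}
```

For all four indexes, smaller values indicate a better forecast, and a MASE below 1 means the model beats the naive forecast on average.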
Conclusion
The coupled SARIMA-GARCH model highlights the special features of PM10 concentrations for the Caribbean area. Thus, the SARIMA-GARCH
combination is a good tool to forecast PM10 behavior in this region [1]. The modeling results could be extended to the nearby islands of GPE and
PR to better understand the seasonal impact of dust outbreaks on the environment and human health. The main difficulty encountered during
the modeling process concerns the choice of the model. Although our model provides significant results, it is based on an approach with fixed
seasonality. A future application of this coupled model could be to investigate the impact of PM10 on human health and the environment in Haiti.
Acknowledgements
This work is supported by the Bank of the Republic of Haiti (BRH) and the
French Embassy in Haiti. We also thank Gwad’Air (http://www.gwadair.fr/)
and AirNow (https://www.airnow.gov) air quality networks for providing
PM10 data.
References
[1] E. Alexis, T. Plocoste, and S. P. Nuiro, “Analysis of Particulate Matter (PM10) Behavior in the Caribbean Area Using a Coupled SARIMA-GARCH Model,” Atmosphere, vol. 13, no. 6, p. 862, 2022.
[2] T. Plocoste, “Multiscale analysis of the dynamic relationship between particulate matter (PM10) and meteorological parameters using CEEMDAN: A focus on “Godzilla” African dust event,” Atmospheric Pollution Research, vol. 13, no. 1, p. 101252, 2022.
[3] R. J. Hyndman and A. B. Koehler, “Another look at measures of forecast accuracy,” International Journal of Forecasting, vol. 22, no. 4, pp. 679–688, 2006.