Available via license: CC BY 4.0
Content may be subject to copyright.
Modeling Non-Homogenous Poisson Process and
Estimating the Intensity Function for Earthquake
Occurrences in Iraq using Simulation
Khwazbeen Saida Fatah ( Khwazbeen.fatah@su.edu.krd )
Salahaddin University-Erbil
Research Article
Keywords: Nonhomogeneous Poisson process, Intensity function, Earthquake Occurrence, Simulation
Posted Date: December 1st, 2023
DOI: https://doi.org/10.21203/rs.3.rs-3675569/v1
License: This work is licensed under a Creative Commons Attribution 4.0 International License.
Read Full License
Additional Declarations: No competing interests reported.
1
Modeling Non-Homogenous Poisson Process and Estimating the Intensity
Function for Earthquake Occurrences in Iraq using Simulation
Khwazbeen Saida Fatah
Department of Mathematics/ College of Science/
University of Salahaddin, Kurdistan Region, Iraq,
Khwazbeen.fatah@su.edu.krd
Abstract
The Non-homogeneous Poisson Process, with time dependent intensity functions, is commonly
used to model the scenarios of counting the number of events that appear to take place in a given
time interval. The identification of the process relies on the functional form for the intensity
function, which can be difficult to determine. In this paper, a Non-Homogenous Poisson Process
model is proposed to predict the intensity function for the number of earthquake occurrences in
Iraq; the constructed model allows anticipating the number of earthquakes occur at any time
interval with a specific time length. Then, to estimate the model parameters, the data obtained from
the annual reports of the Iraqi Meteorological Organization and Seismology (IMOS) from January,
1st 2018 to May, 1st 2023 is used. Moreover, a simulation study is conducted and a new algorithm
is introduced to show the performance and the applicability of the proposed model.
Keywords: Nonhomogeneous Poisson process, Intensity function, Earthquake Occurrence,
Simulation.
2
1. Introduction
Poisson process models have been widely used to describe the scenarios of counting the number
of events that occur at a certain rate in which the intensity function is formulated as a time-
dependent variable. The Model with a constant rate of occurrences, which is known as
Homogenous Poisson Process (HPP), does not allow for any changes in the intensity function of
occurrence over a time interval. While a Non-Homogeneous Poisson Process (NHPP) is a more
complex model; it is used in cases when events occur differently within time and the intensity
function is not constant anymore. In addition, for initial state, the probability of no event occur is
one and the probability of n event occur is zero.
Destructive Earthquake Occurrences (EO) are one of the biggest natural disasters that take place
in various countries in the universe, apart from their devastating impact on people’s lives,
earthquakes can have serious effect on economic collapse, and the cultural heritage of urban
regions. Thus, modeling the EO is always a major research object in seismic probabilistic
prediction and various modeling approaches have been developed from different viewpoints, for
example: Stochastic Time-Predictable Model for EO (Anagnos, and Kiremidjian, 1984); a Non-
parametric hazard model to characterize the spatio-temporal occurrence of large earthquakes
(Faenza, et. al, 2003); A statistical analysis and comparison of earthquake and tsunami in Japan
and Indonesia ( Parwanto and Oyama, 2014). furthermore, to model the casualty rate of
earthquakes and related covariates, linear regression, beta regression, and semi-parametric additive
regression are used, (Turkan and Ozel, 2014); testing probabilistic earthquake forecasting and
seismic hazard models (Marzocchi and Jordan, 2018; Masoud, et. al, 2022); Generalized Linear
Model for estimating the probability of EO (Noora, 2019) and many others. In addition, many
stochastic approaches have been used as powerful tools for modelling EO, e.g. Statistical models
for earthquake occurrences and residual analysis for point processes (Ogata, 1988) ; A new
approach, which is obtained by mixing Poisson distribution with quasi-xgamma distribution, to
model the counts of earthquakes (Altun, at el., 2021). Despite that, scientists have never predicted
a major earthquake but can only determine the probability that a significant earthquake will
occur in a specific area within a certain time interval. Therefore, to estimate the number of EO that
may occur in future to predict the consequences of such destructive events, proposing statistical
models to describe the counts of EO is very essential; it helps to take measures and actions based
on the results obtained from implementing the proposed models.
Although earthquakes in Iraq, which are classified as events with medium hazard, are likely to
happen from time to time and their occurrences may cause significant destructions, but there is
always a shortage of data and lack of research studies concerning the assessment of the number of
the earthquakes that may occur in future. Many authors investigated the seismicity, which is the
occurrence or frequency of earthquakes in different regions of Iraq (e.g. Al-Abbasi and Fahmi,
1985; Alheety, 2016; Khagoory and Al-Rahim, 2023; Kagan, 2010; Francizek, 2018) and many
others; unfortunately, none of these studies concern modeling the number of EO or exploring
3
methods for estimating parameters of the intensity function. The main objective of this work, with
no attention paid to the geophysical causes, is to construct a NHPP based approach to model the
EO and then to predict the number of earthquakes that will happen in future as the earthquake
occurrences in Iraq and the border regions from one month to the next are independent of each
other and the numbers are not the same; thus the behavior is characterized by a NHPP.
Before introducing this model, basic concepts and the main properties for Poisson Process and
NHPP are described in addition to the main assumptions (Ross, 2007).
1.1 Poisson Process
In order to introduce a stochastic process known as Poisson Process and then to derive the
model formula, the following are introduced:
1.1.1 Counting Process
A stochastic process , indicates the total number of occurrences that
take place by time t, is a counting process if the following are satisfied:
1.
2. is integer valued;
3. If ,
4. For , represents the total number of occurrences that take place
in
The Poisson process is considered as the most important types of counting process but the
weakness for this process is its assumption that events are just as likely to occur in all intervals of
equal size. A NHPP is a generalization, which reduces this assumption, and represents a non-
stationary process for the number of events that occur by time t with intensity function denoted by
This model allows predicting the number of event occurrences at any time interval with a
specific time length. These processes are defined below:
Definition 1
A counting process , is called a Poisson Process with rate of occurrence if the
following are satisfied:
1. = 0; no occurrence for the initial state;
2. has independent increments, or the number of events that appear to occur in
the disjoint time intervals are independent;
3. means that probability that one event will occur
in a small interval depends on the length of the time interval and does not depend on the
number of events that will occur outside the time interval (on the past history of the
process).
4
4. ; this indicates that Probabilities that more than one
event that occur in a small-time interval can be ignored.
From the above assumptions, the following theorem is derived:
Theorem 1:
If is a Poisson process with rate of occurrence λ>0, then for all
is a Poisson random variable with mean λt. That is, the number of
events in any interval of length t is a Poisson random variable with mean λt ( Ross, 2007).
1.1.2 Arrival and Interarrival Times:
For a Poisson process with the rate of occurrence , if is the time of the first event
that appear to happen and is the time between the (n − 1) and the nth event, then
is called the sequence of interarrival times and ; these are independent
and identically distributed exponential random variables having mean 1/λ.
1.2 Non-Homogenous Poisson Process:
In this section the NHPP, which is a generalization of the Poisson Process obtained by allowing
the occurrence rate or intensity function at time t to be a function of t, is introduced: it is defined
as follows:
Definition 2
The counting process is defined as NHPP having intensity function if
the following are satisfied:
1. = 0;
2. has independent increments;
3.
4. .
In addition, the expected value for the number of occurrences through time t, which is known as
the mean cumulative function denoted by , is defined as
. It is obtained
from the following theorem:
Theorem 2:
If a Poisson Process is non-homogenous with , then is a Poisson
random variable with mean
.
5
Thus, . As is a non-decreasing step function, then the must be non-
decreasing function too. When is differentiable,
can be expressed as the rate of
change in the expected number of occurrences (Ross, 2007).
1.2.1 The intensity function
Let denote the number of events occur in the interval , and that , ( event
occur in an interval ). The intensity function, which is a deterministic function for the number
of occurrences that occur during an interval of length t and denoted by is the probability of
occurrences in a small interval divided by the length of the interval; it is defined as:
(1)
Thus, can be interpreted as the rate of change in the expected number of occurrences; there
will be many events occur over intervals on which is large, and fewer occurrences over
intervals on which is small.
On the basis of the above assumptions and definitions, for a NHPP, the probability that n events
occur through the time interval is given by the following:
(2)
The probabilistic behavior of the NHPP is completely defined by the mean-value functions or the
cumulative intensity function, which is interpreted as the expected number of events by time t, is
defined as:
(3)
The variance
(4)
Hence, the probability of exactly n events occurring in an interval is given by the
following (Cinlar, 1975):
for (5)
Thus, ) is distributed as Poisson with an average value +− with:
the probability of no event in the initial state or (6)
When the time interval increases from then the average mean is defined as
, (7)
6
(8)
Therefore, based on the above assumptions, a NHPP model can be defined such that it provides
concepts that allow to build a probabilistic model concerning incidents or event occurrences over
time. When the NHPP model is constructed, it is essential to estimate both the unknown parameters
and the functional form of the intensity function; this function is usually determined based on the
initial description of the data under investigation.
In this paper, for modeling the data under study, a NHPP, which is often suggested as an
appropriate model when a system whose rate varies over time, is proposed. This model is
characterized by an intensity function to describe the process changes over time. To estimate this
function, the data under study is used: the data represents the number of earthquake occurrences
happened in Iraq and its boarder from January, 1st 2018 to May, 1st 2023 . The following is the
statistical analysis for the data under study with the model description for the data.
2. Statistical Analysis for the Data Source under Study
The following data, which is shown by Table 1, provides information relevant to the number of
earthquake occurrences happened in Iraq and its border from January, 1st 2018 to May, 1st 2023;
the data is reported by the Iraqi Meteorological Organization and Seismology (IMOS).
Table 1: The number of earthquakes in Iraq and its border from January, 1st 2018 to April, 1st
2023.
Table 1 shows the number of occurrences for the years 2018 to May, 2023; it is clear that
earthquake occurrence in Iraq, which is classified with medium hazards, are likely to take place
irregularly as the geographical location of Iraq is the reason that only the eastern and the northeast
parts of the country are directly affected by the seismic activity of the Tauros-Zagros structural
zone; therefore, the number of events occurring is limited and the event is not considered with high
risk. Furthermore, the data show that the total number of occurrences for 2022 is higher compared
to the other periods and in 2020 the least number of occurrences has been recorded.
year
Jan .
Feb.
Mar.
Apr.
May
Jun.
Jul.
Aug.
Sept.
Oct.
Nov.
Dec.
Total
2018
25
41
27
30
18
11
4
23
23
18
36
24
280
2019
55
4
3
61
11
11
9
16
11
6
5
2
194
2020
19
8
9
7
2
13
8
12
6
1
1
7
93
2021
31
29
19
49
19
37
1
25
20
32
22
13
297
2022
19
29
28
24
39
37
27
27
28
19
17
20
314
2023
18
18
24
19
79
7
In addition, the following figure, Figure 1, illustrates the frequency distribution for the data set
under study.
Figure 1: The frequency distribution for the number of earthquakes occurred in Iraq.
Moreover, based on the intensity function, the data set is explained by the following figure, Figure
2.
Figure 2: The number of earthquakes with the intensity function.
Therefore, based on the background of the problem described, this study applied a NHPP to count
the number of earthquakes occurring in Iraq and then predict their probability.
3. Model Description:
-1
0
1
2
3
0500 1000 1500 2000 2500
No. Of Earth-quake per Month
0
10
20
30
40
50
60
70
Jan-18
Apr-18
Jul-18
Oct-18
Jan-19
Apr-19
Jul-19
Oct-19
Jan-20
Apr-20
Jul-20
Oct-20
Jan-21
Apr-21
Jul-21
Oct-21
Jan-22
Apr-22
Jul-22
Oct-22
Jan-23
Apr-23
8
The aim is to propose a model with the smallest set of variables which provides an adequate
description of the data. In this paper, for modeling the dataset under study, a NHPP is proposed. It
is described below.
3.1 The Proposed NHPP Model
In this section, to study the number of earthquakes that occur in Iraq and its boarders over a time
interval say (0,T], a NHPP is proposed. As the intensity function for this process plays a crucial
role in characterizing the probabilistic behavior of the NHPP, the main concern is the identification
of the functional form of the function, which may take various forms, and then the estimation of
its parameters; the parameters are usually estimated based on the initial description of the data
under investigation.
If it is assumed that a counting process , which counts the total number of events that
have occurred up to time T, takes values on {0,1,2,…}, each value represents the number of
earthquakes occurrences in a time interval (0,T]. Due to the nature of these occurrences, pre-
assumption that it is a NHPP with some parameter, seems to be justified.
Therefore, a NHPP is proposed to model the number of earthquakes that occur over time interval
in Iraq and its boarders; it is characterized by a deterministic intensity function describing
how the occurrence rate changes up to time , where the time is measured on a daily scale.
Then, is defined as a NHPP if
with the expected number
of events of NHPP over time interval [0, T], . i.e:
where
Hence, , which defines the expected value of increment for the process,
can be determined from equation (7). These rules can be implemented empirically if the intensity
function is known. To define this function, the data from the annual reports of the IMOS
from January, 1st 2018 to May, 1st 2023 is used. The statistical analysis of the data, which is
described by Table 1, shows that the intensity function can be approximated by the simplest
deterministic mathematical function known as the simple linear regression defined as:
(9)
Hence, the constructed model allows predicting the number of event occurrences at any time
interval with a time length t when the parameter values are estimated. Numerous estimation
methods have been used to predict parameters; the least squares method is a standard approach,
which is used in this study, which is described below.
3.2 Parameter Estimation for the Model:
9
Consider a NHPP with intensity function , which is assumed to be nonnegative, defined by
equation (9) over a fixed interval , . The objective is
to estimate parameters for the function λ(t) using the least squares. The concept of the least squares,
which is a method for estimating the parameters of a statistical model, fitted to sample data by
minimizing an appropriate sum of squared estimation errors. The empirical intensity function is
approximated by the Simple Linear Regression Model described by equation (9) that satisfied the
following:
(10)
Then, the above equation is minimized to estimate each of (Singh and Masuku, 2013).
the least squares estimate of the intercept and of the true regression line is given by:
,
(11)
Hence, the predicted intensity function is defined as
(12)
Then, by using the annual data reported by the IMOS in Iraq, which is explained by table 1, then
by applying equation (11), the estimates for the two parameters are:
Thus, by using the data set given by Table 1 applied to equation (12), the following model is
resulted:
(13)
The following figure, Figure 3, shows the fitted model and the data representation for cumulative
intensity function:
Figure 3: The observed and expected fit for the interval time and cumulative Intensity.
0
5
10
15
20
25
30
35
40
45
0500 1000 1500 2000 2500
Obesravate
Expect
10
4. Simulation
In this section, to assess the model fitting, a simulation study is conducted. The cumulative number
of occurrences for the NHPP is generated by considering equally spaced intervals over the
time range ; the process of generating random variables is repeated for 30 times. Then, the
same procedure is applied to different forms of intensity function ; each with different values
of . Finally, based on the results obtained from implementing the simulation procedure
each of are estimated using equation (11), which is an estimate procedure for estimating
parameters for regression model that describes the behavior of the intensity function. The proposed
algorithm is different than the rest of algorithms written for simulating a NHPP (Lewis and
Shedler, 1979); equally spaces intervals are considered so there is no need to simulate time
intervals but it generates the number of occurrences that occur in each interval. For this algorithm,
Matlab code is used and then to estimate , SPSS (Statistical Package for Social Sciences) is
implemented. The algorithm describing the main steps for the simulating a NHPP is introduced
below.
4.1 Generating the number of occurrence for the NHPP
Assume that N(t) is the cumulative number of occurrence for a NHPP with intensity function
described by for then to simulate N(t) the following algorithm is defined:
Algorithm
Step 1: let j=1
Step 2: Set initial value for ; Let
Step 3: Generate random number
Step 4: Compute
, if then go to step 7;
Step 5 ; ;
Step 6: Set : go to step 3;
Step 7: Let go to step 2;
Step 8: Set average for each row of event array ;
Step 9: end
4.2 Simulation results and discussion
The following figures, Figure 4, 5, 6, show the implementation for the algorithm for simulating
the number of occurrences for a NHPP with intensity function , for different
values of T and .
11
Figure 4. Number of occurrence
N(t)
for
Figure 5. Number of occurrence
N(t)
for
Figure 6 . Number of occurrence
N(t)
for
12
Thus, all these figures show that the estimated intensity functions which are resulted from
simulation process for different values of and , can be approximated by a simple linear
regression model.
The following table, Table 2, displays the results for generating for the time interval:
, for different functional forms for (with different values for and ); from the
regression model, based on simulated data, each of are estimated. Finally, the mean
square error (MSE) for the observed and estimates model is obtained.
Table 2: Parameter estimation based on simulated data with Mean Square Error
MSE
7.359338
4.972138
3.925877
10.18737
4.3 Modeling the number of occurrences based on real data and predictions
Hence, the cumulative number for EO in Iraq and its boarder is determined using equation (12)
and then the linear intensity function for EO is obtained as:
(13)
When equation (3) is applied to the data set, the by time t, which is defined as , is
, with
Hence, (14)
Therefore, modeling the number of earthquakes that occur in Iraq can be considered as a NHPP
with parameters for ; and from equation (2):
which implies that for the data under study this becomes:
Then, from equation (5), the number of earthquakes at a time interval with the length of the interval
can be predicted as follows:
13
For example, the expected number of earthquakes that will occur from 1 to 7 November 2023, as
in equation (7), can be anticipated as follows:
For the data set under study, the time interval was reported as the last interval for
the data up to April 2023, then the time interval for 1 to 7 December 2023 will be
Then, the length of the interval is 6 at time .
Thus, by implementing the proposed model defined by a NHPP with to the data under study,
the expected number of earthquakes that will occur in any time interval can be predicted. The
above application is an example that in the first week of December 2023, it is expected that about
4 earthquakes will occur in Iraq and its border.
Conclusion
The NHPP is widely used to model the counts of event occurrences. The major feature of Non-
Homogeneous Poisson Process (NHPP) is that the occurrence of event is allowed to change with
time. Thus, the occurrence rate of events is a function of time λ(t). In this paper a NHPP model
has been used to model the number of earthquakes occur in Iraq by defining the functional form
for λ(t) and then estimating its parameters. To estimate model parameters, the data obtained from
the annual reports of the Iraqi Meteorological Organization and Seismology (IMOS) from January,
1st 2018 to April, 1st 2023 is used. Moreover, a simulation study is performed to show the
performance and the applicability of the proposed model. Finally, theoretical results are applied
to predict the expected number of earthquakes that may occur in Iraq and its boarder.
References:
Al-Abbasi, N. and Fahmi, K. (1985) “Estimating Maximum Magnitude Earthquake in Iraq using
Extreme Value Statistics”, Geophysical Journal of the Royal Astronomical Society, 5(2), 535-
548;
Al-Heety, E. (2016) “Spatial Analysis of Earthquakes in Iraq using Statistical and Data Mining
techniques”, Iraqi Geological Journal, 30-49(2), 1-15;
Anagnos, T. and Kiremidjian, A. (1984) “Stochastic Time-Predictable Model for Earthquake
occurrences”, Bulletin of the Seismological Society of America, 74(6), 2593-2611;
Altun, E., Bhati, D. and Khan, N. (2021), “A new approach to model the counts of earthquakes:
INARPQX(1) process”, SN Applied Sciences, 3(274), 1-17;
14
Cinlar, E. (1975) “Introduction to Stochastic Processes”, Prentice–Hall, Englewood Cliffs, NJ;
Faenza, L., Marzocchi, W. and Boschi, E. (2003) “A non-parametric hazard model to characterize
the spatio-temporal occurrence of large earthquakes; an application to the Italian catalogue”,
Geophysical journal International, 155(1), 521-53;
Franciszek, G. (2018) “Nonhomogeneous Compound Poisson Process Application to Modeling
of Random Processes related to Accidents in the Baltic Sea Waters and Ports”, Journal of Kones,
26(1), 11-24;
Kagan, Y. (2010) “Statistical distributions of earthquake numbers: consequence of branching
process”, Geophysical journal International, 180(1), 1313–1328;
Khagoory, F. and Al-Rahim, A. (2023) “Estimation of Coda Q Wave Attenuation Factors for
Northern Iraq”, Iraqi Geological Journal, 56 (1A), 149- 160;
Lewis, P. and Shedler, G. (1979) “Simulation of Nonhomogeneous Poisson Processes by
Thinning”, Naval Research Logistics, 26(3), 403–413;
Marzocchi, W. and Jordan, T. (2018), “Experimental concepts for testing probabilistic earthquake
forecasting and seismic hazard models”, Geophysical journal International, 215(1), 780-789;
Masoud, Y., Afshar, A., Mahmoudvan, R. and Nasir, P. (2022) “Predicting intensity function of
nonhomogeneous Poisson process”, Journal of Statistical Modelling: Theory and Applications,
3(2), 39-50;
Noora, S. (2019) “Estimating the Probability of Earthquake Occurrence and Return Period Using
Generalized Linear Models”, Journal of Geoscience and Environment Protection, 7(9), 11-24;
Ogata, Y (1988) “Statistical models for earthquake occurrences and residual analysis for point
processes”, Journal of the American association, 83(401), 9–27;
Parwanto, N. and Oyama, T. (2014) “A statistical analysis and comparison of historical earthquake
and tsunami disasters in Japan and Indonesia”, International Journal of Disaster Risk Reduction,
7(1), 122-141;
Ross, S. (2007) “Introduction to Probability Models”, California, Elsevier;
Turkan, S. and Ozel, G. (2014) “Modeling destructive earthquake casualties based on a
comparative study for Turkey”, Natural Hazards, 72(2), 1093–1110.