Iraqi Journal of Statistical Sciences, Vol. 20, No. 2, 2023, Pp (249-262)
Iraqi Journal of Statistical Sciences
www.stats.mosuljournals.com
The Comparison Between VAR and ARIMAX Time Series Models in
Forecasting
Esraa Awni Haydier, Nasradeen Haj Salih Albarwari, and Taha Hussein Ali
Salahaddin University, College of Administration and Economics, Department of Statistics and Informatics. Erbil,
Iraq
Duhok University, College of Administration and Economics, Department of Statistics and Informatics, Duhok,
Iraq
Article information
Article history:
Received October 11, 2023
Accepted November 20, 2023
Available online December 1, 2023
Abstract
The research presented a comparative study in time series analysis and forecasting
using VAR models, which depend on the existence of a significant relationship between
the studied variables, and ARIMAX models, which depend on the linear effect of the
independent variables (model input) on the dependent variable (model output). The
models were analyzed using time series data for the Iraqi general budget for the period
(2004-2020), which represents foreign reserves and government spending. Time series of
government expenditure was forecast for the years (2021-2024) and a comparison was
made between the efficiency of the models estimated through the mean square error
(MSE) criterion. The analysis was carried out using the MATLAB program, and the
results showed that the VAR model was more efficient than the ARIMAX model for
these data and that the increase in Iraq's foreign reserves and government spending
will continue during the coming period (2021-2024).
Keywords: Time Series, VAR, ARIMAX, Forecasting, Iraqi budget.
Correspondence:
Esraa Awni Haydier
esraa.haydier@su.edu.krd
DOI: 10.33899/iqioss.2023.181260, ©Authors, 2023, College of Computer and Mathematical Science, University of Mosul.
This is an open access article under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/).
1. Introduction
Time series analysis is a statistical technique for studying data points collected or
recorded over time. The observations are ordered chronologically, typically at uniform
intervals such as daily, monthly, or yearly. Time series data arise in many fields, including
economics, finance, environmental science, and engineering. Time series analysis supports
understanding and prediction from temporal data, allowing businesses and researchers to
make informed decisions and respond to changing trends and patterns.
VAR, or Vector Autoregression, is a statistical modelling technique commonly used in time series
analysis, especially in the context of multivariate time series data. It is an extension of the autoregressive
(AR) model, which focuses on the relationship between a single time series and its past values. In VAR
models, you analyze the relationships between multiple time series variables and their past values
simultaneously. VAR models are beneficial when you want to capture the dynamic interactions between
multiple variables over time. They are commonly used in macroeconomics, finance, and other fields where
understanding the joint behaviour of multiple time series is crucial for decision-making and policy analysis
[22].
ARIMAX, which stands for Autoregressive Integrated Moving Average with exogenous inputs, is a
time series analysis and forecasting model that extends the traditional ARIMA (Autoregressive Integrated
Moving Average) model by incorporating exogenous (external) predictor variables. It is used for
modelling and forecasting a series whose behaviour is influenced not only by its own past values
but also by external factors. When the exogenous variables are informative, accounting for them
can give more accurate predictions than the series' history alone [1].
Forecasting in time series analysis is the process of making predictions or estimates about future data
points based on historical observations. Time series forecasting is a critical component in various fields,
including finance, economics, meteorology, and operations management, among others. To perform
accurate forecasting, several methods and techniques are available, depending on the characteristics of the
time series data. The choice of forecasting method depends on the nature of the time series data, including
its stationarity, seasonality, and the presence of exogenous variables. It's often recommended to experiment
with different methods and evaluate their performance to select the most appropriate approach for a given
dataset. Additionally, time series forecasting often requires continuous monitoring and reevaluation as new
data becomes available, allowing for model updates and improvements [4].
The Iraqi budget, like the budget of any other country, is a financial plan that outlines the government's
expected revenues, expenditures, and allocations for a fiscal year. Here, the time series of foreign reserves
and government spending are analyzed with VAR and ARIMAX models, and the two models are then
compared to identify the better one for forecasting.
2. Theoretical Aspect
The theoretical aspect presented some basic concepts on the subject of research from the statistical side,
as shown in the following paragraphs.
2.1. Time Series
Time series forecasting is widely used across diverse fields, including statistics, inventory management,
and economics. Various forecasting models are available, ranging from simpler techniques like moving
averages and linear regression to more complex approaches like ARIMA and neural networks [21].
Time series analysis is a statistical method used to analyze and model data that are collected and ordered
over time. The definition and components below give a basic understanding of a time series [16].
Time Series Definition: A time series is an ordered sequence of observations. These observations are
usually taken at equally spaced intervals over time.
The objective of Time Series Modeling: The primary objective of time series modelling is to study and
analyze historical data from a time series to develop appropriate models that can be used to make
predictions or forecasts for future values of the series [17]. Time series analysis helps uncover patterns,
trends, and relationships within the data.
Components of Time Series: Time series data can be decomposed into four main components, which are
essential for understanding its underlying structure:
Trend (T): The trend component represents the long-term direction or movement in the data. It captures
gradual changes or trends in the data over time, such as an increasing or decreasing pattern.
Periodic (C): The periodic component accounts for regular, repeating patterns in the data, which may not
necessarily follow a linear trend. These patterns can have different time intervals [1].
Seasonal (S): Seasonal patterns are similar to periodic patterns but occur at fixed and known intervals, such
as daily, monthly, or yearly. Seasonal effects are often associated with calendar-related events, like
holidays or weather.
Irregular (I): The irregular component, also referred to as noise or residual, represents the unexplained or
random fluctuations in the data that cannot be attributed to the other components. It includes any random
variations, outliers, or noise in the time series.
2.2. Stationary Time Series
Stationarity holds a central position in time series, as it acts as a fundamental prerequisite for various
statistical and mathematical techniques. The core principles of stationarity entail maintaining consistent
statistical characteristics over time, which encompass an unchanging mean (indicating the absence of a
trend), a uniform variance, and a stable autocorrelation structure [19].
In practical terms, numerous time series datasets exhibit non-stationary behaviour, wherein their statistical
properties, such as means, variances, or trends, evolve over time [20]. This non-stationary nature can pose challenges
when applying traditional time series analysis methods, which often assume stationarity [18].
To address this challenge, non-stationary time series data can be converted into a stationary format through
two primary methods:
Differencing: This technique involves computing the differences between sequential data points to
eliminate any trend and achieve a constant mean.
Transformation: In this approach, mathematical operations, such as taking the natural logarithm or square
root, are used to stabilize the variance [2].
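The two conversions described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' MATLAB procedure: the helper names `difference` and `log_transform` are hypothetical, and the five values are the 2004-2008 government spending figures from the appendix.

```python
import math

# Government spending for 2004-2008, copied from the paper's appendix.
spending = [32117491, 26375175, 38806679, 39031232, 59403375]

def difference(series, d=1):
    """Apply first differencing d times: each pass maps y_t to y_t - y_{t-1}."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out

def log_transform(series):
    """Natural-log transform, used when the variance grows with the level."""
    return [math.log(v) for v in series]

first_diff = difference(spending)                # one value shorter than the input
log_diff = difference(log_transform(spending))   # approximate growth rates

print(first_diff)
print([round(v, 4) for v in log_diff])
```

Each pass of differencing shortens the series by one observation, which matters for a sample as short as the 17 annual values used here.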
The primary objective in ensuring stationarity is to simplify the analysis of time series data. Several
statistical models, including ARIMA (Autoregressive Integrated Moving Average), rely on the assumption
of stationarity. By transforming a time series into a stationary one, it becomes possible to obtain more
accurate forecasts and model estimates [13].
2.3. VAR Model
A VAR (Vector Autoregression) model is a type of statistical model used in time series analysis and
econometrics. It's a multivariate time series model that describes the relationship between multiple time
series variables [14]. In a VAR model, each variable is modelled as a linear combination of its past values
and the past values of all other variables in the system. For example, the system of equations for a VAR(1)
model with two time series (variables y1 and y2) is as follows:
$$y_{1,t} = c_1 + \phi_{11}\, y_{1,t-1} + \phi_{12}\, y_{2,t-1} + \varepsilon_{1,t} \tag{1}$$
$$y_{2,t} = c_2 + \phi_{21}\, y_{1,t-1} + \phi_{22}\, y_{2,t-1} + \varepsilon_{2,t} \tag{2}$$
where $c_1$ and $c_2$ are constants, the $\phi_{ij}$ are autoregressive coefficients, and $\varepsilon_{1,t}$, $\varepsilon_{2,t}$ are error terms.
A key feature of VAR models is that, unlike univariate time series models, which analyze a single
variable in isolation, they consider multiple variables simultaneously. This is particularly useful
when you want to capture and model the dynamic interactions and feedback loops among different
variables [22].
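To make the joint dynamics concrete, the following sketch iterates the conditional mean of a two-variable VAR(1). The intercepts and coefficients are hypothetical values chosen only for illustration, and `var1_step` is an illustrative helper, not a function from any library.

```python
def var1_step(c, phi, y_prev):
    """Conditional mean of a two-variable VAR(1): c + Phi * y_{t-1}."""
    return [
        c[0] + phi[0][0] * y_prev[0] + phi[0][1] * y_prev[1],
        c[1] + phi[1][0] * y_prev[0] + phi[1][1] * y_prev[1],
    ]

# Hypothetical intercepts and coefficients (not estimates from the paper).
c = [1.0, 2.0]
phi = [[0.5, 0.1],
       [0.2, 0.3]]

state = [10.0, 20.0]
path = [state]
for _ in range(3):              # iterate the recursion three steps ahead
    state = var1_step(c, phi, state)
    path.append(state)

print(path)
```

Each variable's next value depends on the previous values of both variables, which is the feedback that a univariate AR model cannot represent.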
A VAR model is typically specified with an order, denoted as VAR(p), where "p" represents the number of
lagged time points considered in the model. The choice of order is an important part of VAR modelling and
affects the complexity of the model. Just like with univariate time series models, stationarity is important
for VAR models. The lag order p is unrelated to the order of differencing: non-stationary variables are
usually differenced or detrended until they are stationary before a VAR(p) model is fitted.
Estimating the parameters of a VAR model is usually done through techniques like the method of least
squares or maximum likelihood estimation. VAR models are often used for forecasting and assessing the
impact of shocks on the variables within the system. You can calculate impulse response functions to see
how a shock to one variable affects the others over time [15].
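As a small illustration of impulse responses, consider the simplest case: for a VAR(1), the (non-orthogonalized) response h periods after a one-unit shock is the h-th power of the coefficient matrix. The sketch below assumes a hypothetical 2x2 coefficient matrix; the helper names are illustrative only.

```python
def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def impulse_responses(phi, horizons):
    """Return [I, Phi, Phi^2, ...]. Entry h holds the response of each variable
    (rows) to a one-unit shock in each variable (columns) h periods earlier."""
    responses = [[[1.0, 0.0], [0.0, 1.0]]]
    for _ in range(horizons):
        responses.append(matmul2(phi, responses[-1]))
    return responses

# Hypothetical VAR(1) coefficient matrix (not an estimate from the paper).
phi = [[0.5, 0.1],
       [0.2, 0.3]]

irf = impulse_responses(phi, 3)
for h, m in enumerate(irf):
    print(h, m)
```

For a stationary system the entries shrink towards zero as the horizon grows, meaning the effect of a shock dies out.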
VAR models can be extended to VARMA (Vector Autoregressive Moving Average) models to account for
moving average components, and they are also a component of the more comprehensive VARMAX
models, which incorporate exogenous variables. These models can be useful for various tasks, such as
economic forecasting, policy analysis, and risk assessment in financial markets.
2.4. ARIMAX Model
ARIMAX is a time series forecasting model that combines ARIMA principles with exogenous variables. It
extends ARIMA by including external predictors (denoted as X) to improve forecasting accuracy. This
model involves specifying AR, I, and MA components, along with the exogenous variables, estimating
model parameters, and making forecasts [3]. It's commonly used in economics, finance, and environmental
science to account for external influences on time series data. The ARMAX(p,q,r) model equation is
$$y_t = c + \sum_{i=1}^{r} \beta_i x_{i,t} + \sum_{j=1}^{p} \phi_j y_{t-j} + e_t + \sum_{k=1}^{q} \theta_k e_{t-k} \tag{3}$$
Here $c$ is a constant, $x_{1,t}, \ldots, x_{r,t}$ represent the exogenous variables, $\beta_1, \ldots, \beta_r$ are their coefficients [12], and r is the
number of exogenous variables. ARIMAX and ARMAX models are commonly used in time series analysis
and forecasting when exogenous variables are involved.
AR Terms: The coefficients ϕ1, ϕ2, …, ϕp multiply the autoregressive terms, which capture the relationship between the
dependent variable and its past values. MA Terms: The terms e(t), θ1e(t−1), θ2e(t−2), …, θqe(t−q) are moving
average terms, which account for the influence of past errors in the model.
Application of ARIMAX: ARIMAX models are applied in various fields, including economics, agriculture,
and engineering, to improve predictive performance compared to the basic ARIMA model. The d
parameter in the ARIMAX model indicates the number of times differencing is applied to the time series to
make it stationary. This is often necessary when dealing with non-stationary data [4].
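A one-step ARMAX prediction, followed by undoing a single round of differencing (d = 1), can be sketched as follows. The function names and all numeric values are hypothetical, and the sketch assumes the moving average terms enter with a plus sign.

```python
def armax_predict(c, beta, phi, theta, x_t, y_past, e_past):
    """One-step conditional mean of an ARMAX model.

    y_past and e_past list the most recent value first (lag 1, lag 2, ...).
    The current shock e_t has mean zero, so it drops out of the prediction.
    """
    exog = sum(b * x for b, x in zip(beta, x_t))      # exogenous part
    ar = sum(p * y for p, y in zip(phi, y_past))      # autoregressive part
    ma = sum(t * e for t, e in zip(theta, e_past))    # moving average part
    return c + exog + ar + ma

def undifference(last_level, predicted_change):
    """Undo one round of differencing: level_t = level_{t-1} + change_t."""
    return last_level + predicted_change

# Hypothetical parameters for a model with p = 2, q = 1, r = 1.
change = armax_predict(c=0.5, beta=[2.0], phi=[0.6, -0.2], theta=[0.4],
                       x_t=[3.0], y_past=[1.0, 2.0], e_past=[0.5])
level = undifference(100.0, change)
print(change, level)
```

When d = 1 the model is fitted to the differenced series, so the predicted change has to be added back to the last observed level to obtain a forecast on the original scale.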
2.5. Efficiency criteria for estimated models
AIC (Akaike Information Criterion) and BIC (Bayesian Information Criterion) are two commonly used
statistical measures for model selection and evaluation in regression models, including linear
regression and time series models such as ARIMA and VAR [9]. Both assess the trade-off between
goodness of fit and model complexity: lower values of AIC and BIC indicate better-fitting and more
parsimonious models [10].
The AIC value for a given model is calculated using the following formula [5]:
AIC = -2 log-likelihood + 2 k (4)
The components of the formula are:
1. -2 log-likelihood: This term measures how well the model fits the observed data. The log-likelihood
measures how probable the observed data are under the model [8]. Because of the negative sign, a
higher likelihood (a better fit) gives a lower AIC.
2. 2 k: This term is a penalty for model complexity, where k is the number of parameters in the model.
The penalty encourages the selection of simpler models, because each added parameter raises the penalty by 2 [11].
Model Selection: AIC is used to compare different models. When comparing models, you typically choose
the model with the lowest AIC value because it represents the best trade-off between goodness of fit and
model complexity [7].
Bayesian Information Criterion (BIC): It is a statistical criterion used for model selection among a finite set
of models. BIC balances the goodness of fit of the model with the number of parameters, penalizing models
with more parameters to avoid overfitting [6]:
BIC = -2 log-likelihood + k log (n) (5)
where n is the number of observations.
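Formulas (4) and (5) translate directly into code. In the sketch below the log-likelihood, parameter count, and sample size are hypothetical inputs; the sample size of 17 simply mirrors the number of annual observations in the appendix.

```python
import math

def aic(log_likelihood, k):
    """Akaike Information Criterion: -2 log-likelihood + 2 k."""
    return -2.0 * log_likelihood + 2.0 * k

def bic(log_likelihood, k, n):
    """Bayesian Information Criterion: -2 log-likelihood + k log(n)."""
    return -2.0 * log_likelihood + k * math.log(n)

# Hypothetical fit: log-likelihood -250 with 5 parameters on 17 observations.
example_aic = aic(-250.0, 5)
example_bic = bic(-250.0, 5, 17)
print(example_aic, round(example_bic, 3))
```

Because log(n) exceeds 2 once n is 8 or more, BIC penalizes each extra parameter more heavily than AIC for samples of that size.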
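To compare two different models, the mean square error (MSE) is used [23]:
$$\mathrm{MSE} = \frac{1}{n} \sum_{t=1}^{n} \left(y_t - \hat{y}_t\right)^2 \tag{6}$$
where $y_t$ is the observed value, $\hat{y}_t$ is the fitted value, and n is the number of observations.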
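The mean square error is the average of the squared differences between observed and fitted values. A minimal sketch follows; the function name `mse` and the toy numbers are illustrative, not taken from the paper.

```python
def mse(actual, fitted):
    """Mean square error between two equally long sequences."""
    if len(actual) != len(fitted):
        raise ValueError("actual and fitted must have the same length")
    return sum((a - f) ** 2 for a, f in zip(actual, fitted)) / len(actual)

# Toy values: errors are 1, 0 and -2, so the squared errors sum to 5.
example_mse = mse([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
print(example_mse)
```

MSE values are only comparable between models when they are computed on the same response variable, over the same observations, and in the same units.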
3. Application Aspect
The practical aspect dealt with the two correlated time series that represent two variables, namely foreign
reserves (x1) and government spending (x2) in the general budget of Iraq (in the appendix). There is a
general trend in the time series of foreign reserves and government spending, as shown in the following
Figure 1.
The linear correlation coefficient between foreign reserves and government spending is 0.914 (91.4%),
which is positive and significant because the p-value is approximately zero, below the significance
level (0.05). Figure 2 shows the cross-correlation between foreign reserves and government spending.
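The reported correlation can be checked against the appendix data with a short Pearson correlation routine. This is a sketch using only the standard library; the function name `pearson` is illustrative, and the two lists are the 2004-2020 values from the appendix.

```python
import math

# Appendix data, 2004-2020.
foreign_reserves = [13652193, 19901327, 27763676, 38217704, 58718278, 51872810,
                    59252271, 71410911, 82001306, 90648557, 77363120, 62810373,
                    52617915, 58364993, 76481186, 80383896, 78888824]
government_spending = [32117491, 26375175, 38806679, 39031232, 59403375, 52567025,
                       70134201, 78757666, 105139575, 119128000, 113473517, 70397515,
                       67067437, 75490115, 104158183, 111723523, 67082443]

def pearson(x, y):
    """Pearson linear correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

r = pearson(foreign_reserves, government_spending)
print(round(r, 3))
```

With these inputs the coefficient should come out close to the 91.4% stated in the text.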
Formal tests are used to check whether a time series is stationary. One of these
tests is the Augmented Dickey-Fuller (ADF) test, which indicates whether the series is stable around a
mean or linear trend, or is non-stationary because of a unit root. It tests the following hypotheses:
Null Hypothesis: The data contain a unit root
Alternative Hypothesis: The data do not contain a unit root
Figure 1. Time series for the foreign reserves and government spending
Figure 2. Cross-Correlation between Foreign reserves and Government spending
Table 1. Augmented Dickey-Fuller Test
Variable                      Test Statistic   p-value   Null Rejected
Foreign Reserves              0.1051           0.673     False
Foreign Reserves Detrend      -2.4182          0.019     True
Government Spending           -0.4427          0.474     False
Government Spending Detrend   -2.3499          0.022     True
Table 1 shows that the time series of Foreign Reserves and
Government Spending are stationary at the first difference (the Detrend rows), since the absolute values of the test statistics
(2.4182 and 2.3499) are greater than the absolute critical value (1.956) and the p-values (0.019 and 0.022) are less
than the significance level (0.05). Figures 3 and 4 show the sample autocorrelation and sample partial
autocorrelation functions for the variables x1 and x2, respectively.
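The sample autocorrelation function plotted in such figures can be computed directly. The sketch below uses the usual estimator (lagged cross-products of deviations from the mean, divided by the total sum of squares) and the common approximate 95% bound of 1.96 divided by the square root of n. The function names are illustrative, and the series is the foreign reserves column from the appendix.

```python
import math

# Foreign reserves (x1), 2004-2020, from the appendix.
foreign_reserves = [13652193, 19901327, 27763676, 38217704, 58718278, 51872810,
                    59252271, 71410911, 82001306, 90648557, 77363120, 62810373,
                    52617915, 58364993, 76481186, 80383896, 78888824]

def sample_acf(series, max_lag):
    """Sample autocorrelations for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
            for k in range(max_lag + 1)]

def approx_bound(n):
    """Approximate 95% bound for the autocorrelations of white noise."""
    return 1.96 / math.sqrt(n)

acf = sample_acf(foreign_reserves, 5)
bound = approx_bound(len(foreign_reserves))
for lag, value in enumerate(acf):
    flag = "outside" if lag > 0 and abs(value) > bound else "inside"
    print(lag, round(value, 3), flag)
```

Autocorrelations that stay outside the bound over several lags are a typical sign of a trending, non-stationary series.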
Figure 3. ACF, and PACF for the series Foreign Reserves
Figure 4. ACF, and PACF for the series Government Spending
3.1. VAR Models for time series data:
As mentioned in the theoretical aspect, VAR models extend the AR model to
several variables (here two), which makes it possible to exploit the cross-correlation between the
series. We now look for a suitable model to predict x1 and x2 jointly. The AR-stationary 2-dimensional
VAR(3) model with a linear time trend was chosen because it has the lowest values of
the AIC and BIC criteria among the three estimated models, as shown in Table 2:
Table 2. VAR models efficiency criteria
Model    AIC      BIC
VAR(1)   1137.2   1143.4
VAR(2)   1068.9   1077.4
VAR(3)   996.58   1006.8
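Table 2 shows that the VAR(3) model was the best because its AIC and BIC values are lower
than those of the other models, so the following third model was relied upon:
$$\left(I_2 - \Phi_1 L - \Phi_2 L^2 - \Phi_3 L^3\right) y_t = c + \delta t + \varepsilon_t$$
where $y_t = (y_{1t}, y_{2t})'$ holds foreign reserves and government spending, $L$ is the lag operator, the $\Phi_i$ are 2×2 coefficient matrices, $c$ is the constant vector, and $\delta$ is the vector of linear trend coefficients.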
Table 3 shows the estimation results for the VAR {3} model:
Table 3. VAR(3) Model Estimation Results
Parameter      Value            Standard Error   t Statistic   P-Value
Constant (1)   38263745.4077    7743689.4858     4.9413        7.7611e-07
Constant (2)   29188085.2295    17475124.0686    1.6703        0.094867
AR{1}(1,1)     0.36293          0.36353          0.99836       0.3181
AR{1}(2,1)     0.84245          0.82037          1.0269        0.30446
AR{1}(1,2)     0.1729           0.21943          0.78794       0.43073
AR{1}(2,2)     0.28585          0.49518          0.57727       0.56376
AR{2}(1,1)     0.066633         0.36065          0.18476       0.85342
AR{2}(2,1)     -0.45441         0.81387          -0.55833      0.57662
AR{2}(1,2)     -0.31267         0.19895          -1.5716       0.11603
AR{2}(2,2)     -0.56248         0.44896          -1.2528       0.21026
AR{3}(1,1)     0.79275          0.31034          2.5545        0.010634
AR{3}(2,1)     1.7824           0.70033          2.5451        0.010924
AR{3}(1,2)     -0.64906         0.2134           -3.0416       0.0023534
AR{3}(2,2)     -0.9322          0.48157          -1.9358       0.052896
Trend (1)      1551051.2794     641607.91        2.4174        0.01563
Trend (2)      1251583.3558     1447911.5998     0.86441       0.38736
Table 3 shows that several estimated parameters are statistically significant at the 0.05 level, namely
Constant (1), AR{3}(1,1), AR{3}(2,1), AR{3}(1,2), and Trend (1), which supports the strength of this model.
Therefore, the model is
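$$\left(I_2 - \begin{bmatrix} 0.36293 & 0.17290 \\ 0.84245 & 0.28585 \end{bmatrix} L - \begin{bmatrix} 0.06663 & -0.31267 \\ -0.45441 & -0.56248 \end{bmatrix} L^2 - \begin{bmatrix} 0.79275 & -0.64906 \\ 1.7824 & -0.93220 \end{bmatrix} L^3\right) \begin{bmatrix} y_{1t} \\ y_{2t} \end{bmatrix} = \begin{bmatrix} 38263745.4077 \\ 29188085.2295 \end{bmatrix} + \begin{bmatrix} 1551051.2794 \\ 1251583.3558 \end{bmatrix} t + \begin{bmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{bmatrix}$$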
Figure 5 shows the model fit for both variables. Figure 6 shows that the residuals fluctuate around the zero
line and that all autocorrelation coefficients fall within the confidence interval for both
variables. Furthermore, most of the residual values fall within the standard curve. Having passed these
diagnostic checks, the VAR(3) model is ready to forecast future values.
The VAR(3) model estimated above, with MSE equal to 114.5, was used to forecast the Foreign
Reserves and Government Spending of Iraq for the four years 2021-2024; the forecasts are summarized in Table 4:
Table 4. Forecasting the Foreign Reserves and Government Spending VAR(3)
Year   Foreign Reserves   Government Spending
2021   65210000           73450000
2022   74940000           90700000
2023   107890000          146650000
2024   111350000          147220000
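The forecasts in Table 4 follow from iterating the estimated VAR(3) equations forward, feeding each forecast back in as a lagged value. The sketch below uses the coefficients from Table 3 and the 2018-2020 observations from the appendix. It is not the authors' MATLAB code: the indexing of the time trend is an assumption (t = 1 in 2007, the first year for which three lags exist, so 2021 is t = 15), and the function name `forecast` is illustrative.

```python
# Estimated coefficients copied from Table 3.
# Component 0 is foreign reserves, component 1 is government spending.
C = [38263745.4077, 29188085.2295]
TREND = [1551051.2794, 1251583.3558]
PHI = [
    [[0.36293, 0.1729], [0.84245, 0.28585]],        # lag 1
    [[0.066633, -0.31267], [-0.45441, -0.56248]],   # lag 2
    [[0.79275, -0.64906], [1.7824, -0.9322]],       # lag 3
]

# (foreign reserves, government spending) for 2018, 2019, 2020 from the appendix.
history = [
    [76481186.0, 104158183.0],
    [80383896.0, 111723523.0],
    [78888824.0, 67082443.0],
]

def forecast(history, steps, t_next):
    """Iterate the VAR(3) conditional mean, reusing forecasts as lagged values."""
    values = [list(row) for row in history]
    out = []
    for h in range(steps):
        t = t_next + h
        new = []
        for i in range(2):
            total = C[i] + TREND[i] * t
            for lag, phi in enumerate(PHI, start=1):
                past = values[-lag]          # values[-1] is the most recent year
                total += phi[i][0] * past[0] + phi[i][1] * past[1]
            new.append(total)
        values.append(new)
        out.append(new)
    return out

# Assumption: the trend index equals 15 in 2021 (see the lead-in).
predictions = forecast(history, steps=4, t_next=15)
for year, (reserves, spending) in zip(range(2021, 2025), predictions):
    print(year, round(reserves), round(spending))
```

Under this trend indexing the 2021 values land close to the Table 4 entries; a different indexing convention would shift every forecast by a multiple of the trend coefficients.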
Figure 5. Models fit for the VAR {3}
Figure 6. Residual Sample Autocorrelation Function for VAR {3} model
Table 4 shows that there is an expected increase in the coming years in Foreign Reserves and Government
Spending, with the balance of Government Spending remaining higher than Foreign Reserves, which
constitutes the continuation of the general deficit in the budget of Iraq in the coming years, as shown in
Figure 7.
Figure 7. The time series data with forecasting
3.2. ARIMAX Models for time series data:
As mentioned in the theoretical aspect, ARIMAX models extend the ARIMA
model by including independent (predictive) exogenous variables. ARIMAX models are similar to
multiple regression models, except that they also exploit the autocorrelation that may be present in
the regression residuals. We now look for a suitable model to predict y in the presence of an external
variable x. Out of a total of 32 candidate models, the ARIMAX (1,1,0), (2,1,0), and (3,1,0) models were
shortlisted based on the significance of the estimated model parameters and the lowest values of the
AIC and BIC criteria, as shown in Table 5:
Table 5. ARIMAX models efficiency criteria
Model            AIC        BIC
ARIMAX (1,1,0)   556.6420   559.4742
ARIMAX (2,1,0)   522.1963   525.3915
ARIMAX (3,1,0)   516.0204   519.4101
Table 5 shows that the ARIMAX (3,1,0) model was the best because its AIC and BIC values
are lower than those of the other models, so the following third model was relied upon:
$$\left(1 - \phi_1 L - \phi_2 L^2 - \phi_3 L^3\right)(1 - L)\, y_t = c + \beta x_t + \varepsilon_t$$
$$\left(1 + 0.82468 L + 0.77908 L^2 + 0.39604 L^3\right)(1 - L)\, y_t = -182311541.8493 + 2.7722\, x_t + \varepsilon_t$$
Table 6 shows the estimation results for the classical ARIMAX (3,1,0) model:
Table 6. ARIMAX (3,1,0) Model Estimation Results
Parameter   Value              Standard Error   t Statistic               P-Value
Constant    -182311541.8493    5.1437e-09       -3.544342513864502e+16    0
AR{1}       -0.82468           0.27898          -2.956                    0.0031162
AR{2}       -0.77908           0.244            -3.193                    0.0014082
AR{3}       -0.39604           0.23394          -1.6929                   0.0404660
Beta (x)    2.7722             0.077775         35.6431                   3.0159e-278
Table 6 shows that the estimated autoregressive parameters and the regression parameter are statistically
significant at the 0.05 level. Figure 8 shows the model fit.
Figure 8. Model fit for the ARIMAX (3,1,0) model
Figure 9. Residual Sample Autocorrelation Function for ARIMAX (3,1,0) model
Figure 9 shows that the residuals fluctuate around the zero line and that all autocorrelation
coefficients fall within the confidence interval. Furthermore, most of the residual values
fall within the standard curve. Having passed these diagnostic checks, the ARIMAX (3,1,0) model, with
MSE equal to 223.38, is ready to forecast future values. The forecasts of government spending for Iraq for the four
years 2021-2024 are summarized in Table 7:
Table 7. Forecasting the Government Spending ARIMAX (3,1,0)
Year   Government Spending
2021   95662138.267428
2022   111978320.796619
2023   136363227.811626
2024   221399943.396227
Table 7 shows that there is an expected increase in the coming years in government spending, with the
balance of government spending remaining higher than foreign reserves, which constitutes the continuation
of the general deficit in the budget of Iraq in the coming years, as shown in Figure 10.
Figure 10. The time series of government spending with forecasting
3.3. Comparison between VAR and ARIMAX models:
To compare the estimated VAR and ARIMAX models, the MSE criterion was used. Preference was
given to the VAR model because its MSE (114.5) is lower than that of the ARIMAX model (223.38).
Therefore, the VAR model can be relied upon for future forecasting and planning of the
Iraqi budget.
4. Conclusions & Recommendations
The following main conclusions and recommendations were summarized:
4.1. Conclusions
1. The VAR model is better than the ARIMAX model for this data depending on the MSE criterion.
2. There is a positive and significant linear correlation of 91.4% between foreign reserves and
government spending.
3. The forecast for the period (2021-2024) shows an increase in government spending.
4. There is a general trend in the time series of foreign reserves and government spending indicating
a significant increase in the sustainable deficit of the general budget in Iraq.
4.2. Recommendations
1. Adopt the estimated VAR(3) model with detrending and its forecast values for the coming years
when drawing up plans.
2. Develop financial and economic stability in Iraq to achieve financial sustainability and reduce the
general budget deficit.
3. Conduct a further study based on another time series analysis using Wavelet Shrinkage for
foreign reserves and government spending data.
References:
1. Al Wadi S., M. T. Ismail, and S. A. Abdulkarim, “Forecasting financial time series database on wavelet
transforms and ARIMA model” Regional Conference on Applied and Engineering Mathematics, vol.4,
pp.448-453, 2010.
2. Ali, Taha Hussein & Awaz Shahab M. "Uses of Waveshrink in Detection and Treatment of Outlier
Values in Linear Regression Analysis and Comparison with Some Robust Methods", Journal of
Humanity Sciences 21.5 (2017): 38-61.
3. Ali, Taha Hussein & Mardin Samir Ali. "Analysis of Some Linear Dynamic Systems with Bivariate
Wavelets" Iraqi Journal of Statistical Sciences 16.3 (2019): 85-109.
4. Ali, Taha Hussein & Qais Mustafa. "Reducing the orders of mixed model (ARMA) before and after the
wavelet de-noising with application." Journal of Humanity Sciences 20.6 (2016): 433-442.
5. Ali, Taha Hussein and Jwana Rostam Qadir. "Using Wavelet Shrinkage in the Cox Proportional
Hazards Regression model (simulation study)", Iraqi Journal of Statistical Sciences, 19, 1, 2022, 17-29.
6. Ali, Taha Hussein, "Estimation of Multiple Logistic Model by Using Empirical Bayes Weights and
Comparing it with the Classical Method with Application" Iraqi Journal of Statistical Sciences 20
(2011): 348-331.
7. Ali, Taha Hussein, and Saleh, Dlshad Mahmood, "Proposed Hybrid Method for Wavelet Shrinkage
with Robust Multiple Linear Regression Model: With Simulation Study" QALAAI ZANIST
JOURNAL 7.1 (2022): 920-937.
8. Ali, Taha Hussein, and Saleh, Dlshad Mahmood,"COMPARISON BETWEEN WAVELET
BAYESIAN AND BAYESIAN ESTIMATORS TO REMEDY CONTAMINATION IN LINEAR
REGRESSION MODEL" PalArch's Journal of Archaeology of Egypt/Egyptology 18.10 (2021): 3388-
3409.
9. Ali, Taha Hussein, Avan Al-Saffar, and Sarbast Saeed Ismael. "Using Bayes weights to estimate
parameters of a Gamma Regression model." Iraqi Journal of Statistical Sciences 20.1 (2023): 43-54.
10. Ali, Taha Hussein, Heyam Abd Al-Majeed Hayawi, and Delshad Shaker Ismael Botani. "Estimation of
the bandwidth parameter in Nadaraya-Watson kernel non-parametric regression based on universal
threshold level." Communications in Statistics-Simulation and Computation 52.4 (2023): 1476-1489.
11. Ali, Taha Hussein, Nasradeen Haj Salih Albarwari, and Diyar Lazgeen Ramadhan. "Using the hybrid
proposed method for Quantile Regression and Multivariate Wavelet in estimating the linear model
parameters." Iraqi Journal of Statistical Sciences 20.1 (2023): 9-24.
12. Ali, Taha Hussein, Saman Hussein Mahmood, and Awat Sirdar Wahdi. "Using Proposed Hybrid
method for neural networks and wavelet to estimate time series model." Tikrit Journal of
Administration and Economics Sciences 18.57 part 3 (2022).
13. Ali, Taha Hussein. "Modification of the adaptive Nadaraya-Watson kernel method for nonparametric
regression (simulation study)." Communications in Statistics-Simulation and Computation 51.2 (2022):
391-403.
14. Ali, Taha Hussien, Nazeera Sedeek Kareem, and Awaz shahab mohammad, (2021), Data de-noise for
Discriminant Analysis by using Multivariate Wavelets (Simulation with practical application), Journal
of Arab Statisticians Union (JASU), 5.3: 78-87.
15. Kareem, Nazeera Sedeek, Taha Hussein Ali, and Awaz shahab M, "De-noise data by using Multivariate
Wavelets in the Path analysis with application", Kirkuk University Journal of Administrative and
Economic Sciences, 10.1 (2020): 268-294.
16. Kozłowski, B. (2005). “Time series denoising with wavelet transform”. Journal of Telecommunications
and Information Technology, (3), 91-95.
17. Mustafa, Qais, and Ali, Taha Hussein. "Comparing the Box Jenkins models before and after the
wavelet filtering in terms of reducing the orders with application." Journal of Concrete and Applicable
Mathematics 11 (2013): 190-198.
18. Omar, Cheman, Taha Hussien Ali, and Kameran Hassn, Using Bayes weights to remedy the
heterogeneity problem of random error variance in linear models, IRAQI JOURNAL OF
STATISTICAL SCIENCES, 17, 2, 2020, 58-67.
19. Percival, D. B., & Walden, A. T. (2000).” Wavelet methods for time series analysis and its statistical
applications”. (pp. 1–613). doi.org/10.1017/cbo9780511841040.
20. Raza, Mahdi Saber, Taha Hussein Ali, and Tara Ahmed Hassan. "Using Mixed Distribution for
Gamma and Exponential to Estimate of Survival Function (Brain Stroke)." Polytechnic Journal 8.1
(2018).
21. Shahla Hani Ali, Heyam A.A. Hayawi, Nazeera Sedeek K., and Taha Hussein Ali, (2023) "Predicting
the Consumer price index and inflation average for the Kurdistan Region of Iraq using a dynamic
model of neural networks with time series", The 7th International Conference of Union of Arab
Statistician-Cairo, Egypt 8-9/3/2023:137-147.
22. Zivot, E., Wang, J. (2003). Vector Autoregressive Models for Multivariate Time Series. In: Modeling
Financial Time Series with S-Plus®. Springer, New York, NY.
23. Ali, T. H., Sedeeq, B. S., Saleh, D. M., & Rahim, A. G. Robust multivariate quality control charts for
enhanced variability monitoring. Quality and Reliability Engineering International (2023).
https://doi.org/10.1002/qre.3472
Appendix
General budget in Iraq

Year    Foreign Reserves (x1)    Government Spending (x2)
2004    13652193                 32117491
2005    19901327                 26375175
2006    27763676                 38806679
2007    38217704                 39031232
2008    58718278                 59403375
2009    51872810                 52567025
2010    59252271                 70134201
2011    71410911                 78757666
2012    82001306                 105139575
2013    90648557                 119128000
2014    77363120                 113473517
2015    62810373                 70397515
2016    52617915                 67067437
2017    58364993                 75490115
2018    76481186                 104158183
2019    80383896                 111723523
2020    78888824                 67082443