Conference Paper

Predictive validity of multiple regression analysis: Preliminary results

Author: Kesten C. Green

Abstract

Problem: Multiple regression analysis (MRA) is commonly used to develop forecasting models that inform policy and decision making, but the technique does not appear to have been validated for that purpose.

Methods: The predictive validity of published least squares MRA models is tested against naive benchmarks, alternative methods that are either plausible or commonly used, and evidence-based forecasting methods. The out-of-sample errors of forecasts from the MRA models are compared with the errors of forecasts from models developed from the same data, on the basis of the cumulative relative absolute error (CumRAE) and the unscaled mean bounded relative absolute error (UMBRAE).

Findings: Tests using ten models for diverse problems found that, while the MRA models performed well against most of the alternatives tested for most problems, out-of-sample (n-1) forecasts from models estimated using least absolute deviation were mostly more accurate.

Originality: This paper presents the first stage of a project to comprehensively test the predictive validity of MRA relative to models derived using diverse alternative methods.

Usefulness: The findings of this research will be useful whether they turn out to support or reject the use of MRA models for important policy and decision-making tasks. Validation of MRA for forecasting would provide a stronger argument for the use of the method than is currently available, while the opposite finding would identify opportunities to improve forecast accuracy and hence decisions.


 !"##$%"!#
&'()!#'*
International Symposium on Forecasting
Evidence on MRA predictive validity
Conservative election forecasting

13 methods compared

Method           Description
MRA              Ordinary least squares multiple regression analysis
Big_data-8       Stepwise regression (p=0.05) with 8 additional variables
Big_data-64/42*  Stepwise regression (p=0.05) with 64/42 additional variables
Naïve            Mostly mean or median of in-sample dependent variable observations
TTB_correl       Single regression of strongest correlated variable with dependent
TTB_model        Single regression of model with largest beta in MRA model
MRA_damped       Average of MRA and Naïve forecasts
MRA_eq'zd        Average of MRA and MRA_equal forecasts
MRA_equal        MRA model with betas averaged (equal weights)
SRA-average      Average of forecasts from single regression analysis models
SLAD-average     Average of forecasts from single least absolute deviation models
MLAD             Multiple least absolute deviation analysis
MLAD_eq'zd       Average of MLAD and MRA_equal forecasts
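
Several of the methods listed above are plain averages of forecasts from two or more models. A minimal sketch of that combination step follows, assuming equal weights and forecasts held in ordinary Python lists; the function name `average_forecasts` and the numbers are hypothetical and are not taken from the slides.

```python
def average_forecasts(*forecast_lists):
    """Equal-weight average of several forecast lists, observation by observation."""
    return [sum(values) / len(values) for values in zip(*forecast_lists)]


# Hypothetical forecasts for three out-of-sample observations.
mra = [10.0, 14.0, 9.0]
naive = [12.0, 12.0, 12.0]  # e.g. the in-sample mean repeated for each observation

# MRA_damped as described in the table: the average of MRA and Naive forecasts.
mra_damped = average_forecasts(mra, naive)
print(mra_damped)  # [11.0, 13.0, 10.5]
```

The same helper would cover SRA-average or SLAD-average by passing one forecast list per single-variable model.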

Data series for “big data”, 1901-2000

10 MRA (econometric) models tested

Model                  Source                           Vars  Obs*/var  MdAPE
Presidential election  Lewis-Beck & Tien (2012)         2     8         3.4
TFR, Africa            Fox (2014), from U.N. data       2     19        6.7
Mortality rates        McDonald & Schwing (1973) §      2     29.5      3.1
Ice cream sales        Hildreth & Lu (1960) §           3     9.66r     5.2
TFR, Asia              Fox (2014), from U.N. data       3     11        16.1
Fish fertility         Gillet (2012) §                  3     131.33r   13.8
Wine price             Ray Fair (2002)                  4     7.75      18.6
Professor salary       Weisberg (2005) §                4     12.75     6.1
Credit scores          Hyndman & Athanasopoulos, Obook  4     124.75    17.0
Presidential election  Ray Fair (2002)                  7     3         3.4
§ Via Czerlinski et al. (1999) & personal communication

CumRAE vs UMBRAE
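
The two error measures named in the abstract can be sketched in a few lines. The code below assumes the usual definitions: CumRAE is the sum of absolute errors divided by the sum of absolute benchmark errors, and UMBRAE is MBRAE / (1 - MBRAE), where MBRAE is the mean of |e| / (|e| + |e_benchmark|). The slides' own formulas did not survive extraction, so treat these definitions, the function names, and the numbers as assumptions for illustration only.

```python
def cum_rae(errors, benchmark_errors):
    """Cumulative relative absolute error: sum|e| / sum|e_benchmark|."""
    return sum(abs(e) for e in errors) / sum(abs(b) for b in benchmark_errors)


def umbrae(errors, benchmark_errors):
    """Unscaled mean bounded relative absolute error (assumed definition)."""
    # Each bounded relative absolute error lies in [0, 1]; 0.5 means a tie.
    # Note: a pair with both errors equal to zero would divide by zero.
    brae = [abs(e) / (abs(e) + abs(b)) for e, b in zip(errors, benchmark_errors)]
    mbrae = sum(brae) / len(brae)
    return mbrae / (1.0 - mbrae)


# Hypothetical out-of-sample errors for one method and for the benchmark.
method_errors = [1.0, -2.0, 0.5, 1.5]
benchmark_errors = [2.0, -2.0, 1.0, 1.0]

print(cum_rae(method_errors, benchmark_errors))
print(umbrae(method_errors, benchmark_errors))
```

Under both measures a value below 1 means the method's forecasts were more accurate than the benchmark's, and a value of exactly 1 means no difference.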

Relative accuracy of forecasts from 13 methods applied to 10 models by 2 measures

Method           CumRAE     UMBRAE     Rank by GeoMean  Rank by GeoMean
                 (GeoMean)  (GeoMean)  of CumRAE        of UMBRAE
MRA              1.00       1.00       2                2
Big_data-8       1.12       1.12       5                5
Big_data-64/42*  1.19       1.14       6                6
Naïve            1.65       1.49       13               13
TTB_correl       1.25       1.18       9                7
TTB_model        1.27       1.23       10               10
MRA_damped       1.20       1.22       7                9
MRA_eq'zd        1.07       1.07       4                4
MRA_equal        1.24       1.20       8                8
SRA-average      1.34       1.36       12               12
SLAD-average     1.33       1.34       11               11
MLAD             0.98       0.95       1                1
MLAD_eq'zd       1.06       1.07       3                3
*Results for 7/3 models
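
Summarising a relative error measure across models with a geometric mean, and then ranking the methods, might look like the sketch below. The method names and per-model values are hypothetical, and ties are broken by dictionary order rather than shared ranks, which is an assumption made to keep the example short.

```python
import math


def geo_mean(values):
    """Geometric mean of positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def rank_methods(scores):
    """Rank 1 goes to the smallest score, i.e. the most accurate method."""
    ordered = sorted(scores, key=scores.get)
    return {method: position + 1 for position, method in enumerate(ordered)}


# Hypothetical CumRAE values, one per model, for three methods.
cum_rae_by_model = {
    "method_a": [1.0, 1.0, 1.0],
    "method_b": [0.8, 1.25, 0.9],
    "method_c": [1.5, 1.2, 1.1],
}

geo = {method: geo_mean(values) for method, values in cum_rae_by_model.items()}
ranks = rank_methods(geo)
print(ranks)
```

The geometric mean suits ratio measures such as CumRAE because a ratio of 0.5 on one model and 2.0 on another averages to 1.0, that is, no net advantage.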

Relative accuracy of forecasts from 13 methods applied to 10 models

Method           Mean rank  Mean rank  Provided smallest  Beat MRA by CumRAE,
                 by MdAPE   by 90%APE  MAE forecasts      UMBRAE, or MAPE
                                       (n/10 models)      (n/10 models)
MRA              3.4        5.2                           –
Big_data-8       4.3        5.8                           2
Big_data-64/42*                        1                  1
Naïve            10.7       10.3
TTB_correl       7.3        6.7        1                  3
TTB_model        7.6        6.1        1                  3
MRA_damped       7.3        5.8                           1
MRA_eq'zd        4.5        6.8        2                  2
MRA_equal        8          8.9                           2
SRA-average      9          8.7
SLAD-average     9.3        7.7                           1
MLAD             2.2        3.1        6                  8
MLAD_eq'zd       4.4        5.7                           7
*Results for 7/3 models
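
The two percentage error summaries used as column headings above can be sketched as follows. MdAPE is the median absolute percentage error. Reading "90%APE" as the 90th percentile of the absolute percentage errors is an assumption, as is the nearest-rank percentile rule; the function names and data are hypothetical.

```python
import math
import statistics


def ape(actuals, forecasts):
    """Absolute percentage errors; actual values must be non-zero."""
    return [100.0 * abs(a - f) / abs(a) for a, f in zip(actuals, forecasts)]


def md_ape(actuals, forecasts):
    """Median absolute percentage error."""
    return statistics.median(ape(actuals, forecasts))


def percentile_ape(actuals, forecasts, pct=90):
    """Nearest-rank percentile of the absolute percentage errors (assumed reading of 90%APE)."""
    ordered = sorted(ape(actuals, forecasts))
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical actual values and out-of-sample forecasts.
actuals = [100.0, 200.0, 50.0, 80.0, 120.0]
forecasts = [110.0, 190.0, 55.0, 60.0, 120.0]

print(md_ape(actuals, forecasts))
print(percentile_ape(actuals, forecasts))
```

The median describes typical accuracy, while a high percentile shows how bad the larger errors get, which matters when large misses are costly.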

Model             Obs/var  Methods that provided smallest MAE forecasts
Election, Fair    3        MRA_eq'zd, MRA_equal, MLAD_eq'zd
Wine price        7.75     MRA_eq'zd, MLAD_eq'zd, MRA_equal
Election, LBT     8        TTB_correl/TTB_model, MLAD
Ice cream sales   9.66r    MLAD, MRA, MRA_damped
TFR, Asia         11       MLAD, MLAD_eq'zd, MRA
Professor salary  12.75    Big_data-64/42, MRA, MLAD
TFR, Africa       19       MLAD, TTB_correl/TTB_model
Mortality rates   29.5     MLAD, MLAD_eq'zd, MRA/Big_data-8/Big_data-64/42
Fish fertility    131.33r  MLAD, Big_data-64/42, MRA
Credit scores     124.75   MLAD, MRA/Big_data-8

“+” = Forecasts from vertical method more accurate than those from horizontal method

Forecasts from method more accurate than those from n out of 12 others

Method               n
MLAD                 12
MRA                  11
MLAD_equalized       10
MRA_equalized        9
Big_data-8           8
Big_data-64/42*      7
MRA_damped           6
MRA_equal            5
TTB_correl           4
TTB_model            3
SLAD-average         2
SRA-average          1
Naïve (appropriate)  0
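
A count of this kind, how many other methods each method beat, can be sketched as below. The slides do not say which accuracy score decided each pairwise comparison, so the single summary score per method used here is an assumption, and the names and values are hypothetical.

```python
def wins_against_others(scores):
    """For each method, count the other methods with a strictly larger error score."""
    return {
        method: sum(1 for other, s in scores.items() if other != method and score < s)
        for method, score in scores.items()
    }


# Hypothetical summary error scores (smaller is more accurate).
scores = {"method_a": 0.9, "method_b": 1.0, "method_c": 1.3, "method_d": 1.3}

wins = wins_against_others(scores)
print(wins)
```

Tied methods earn no win from each other under the strict comparison, so counts need not run from 0 to n-1 without gaps.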
Lott and Varney (2022) estimated a model of the effect of immigrant population numbers, as a proportion of the total population, on the homicide victim rate for 31 European countries for which data were available for some or all years between 2010 and 2020, amounting to 315 observations in total. Their ordinary least squares (OLS) multiple regression analysis (MRA) model estimated that the homicide victimisation rate of those countries increased by 3.6 per cent for every one percentage point increase in the immigrant population, after allowing for year effects and for country-specific influences by using dummy variables. This Research Note asks, does Lott and Varney's MRA model have predictive validity? The answer to that question will contribute to my ongoing research on the predictive validity of multiple regression analysis compared to alternative, simpler and more conservative, methods for estimating forecasting models (Green and Armstrong, 2015; Green, Graefe, and Armstrong, 2018; Armstrong, Green, and Graefe, 2015; Graefe, Green, and Armstrong, 2019).

Method: I compared the out-of-sample forecast errors (the 315 errors from forecasting each observation using a model estimated from the other n-1, or 314, observations) from the Lott and Varney specified MRA model with the out-of-sample forecast errors from five alternative models. Three of the five were estimated using one of two alternative estimation methods, and one of those alternatives differed from the Lott and Varney model only in the estimation method.
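
The n-1 procedure described above, forecasting each observation from a model estimated on all the others, can be sketched in plain Python. This is an illustration under stated assumptions, not the author's code: the least squares fit uses the normal equations, the naive benchmark is the mean of the other observations, and every function name and data value is hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols_fit(xs, ys):
    """Least squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(x) for x in xs]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(xtx, xty)


def predict(coefs, x):
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], x))


def loo_errors(xs, ys, fit=ols_fit):
    """Forecast each observation from a model fitted to the other n-1 observations."""
    errors = []
    for i in range(len(ys)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        errors.append(ys[i] - predict(fit(train_x, train_y), xs[i]))
    return errors


def loo_naive_errors(ys):
    """Naive benchmark: forecast each observation as the mean of the others."""
    total, n = sum(ys), len(ys)
    return [y - (total - y) / (n - 1) for y in ys]


# Hypothetical data: two explanatory variables, six observations.
xs = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0], [6.0, 5.0]]
ys = [8.1, 6.9, 18.2, 16.8, 28.1, 27.0]

mra_mae = sum(abs(e) for e in loo_errors(xs, ys)) / len(ys)
naive_mae = sum(abs(e) for e in loo_naive_errors(ys)) / len(ys)
print(mra_mae, naive_mae)
```

An alternative estimator, such as a least absolute deviation fit, could be compared on the same footing by passing a different function as the `fit` argument.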