
Forecasting CPI Inflation Components with Hierarchical Recurrent Neural Network


Abstract

We present a hierarchical architecture based on recurrent neural networks for predicting disaggregated inflation components of the Consumer Price Index (CPI). While most existing research focuses on predicting headline inflation, many economic and financial institutions are interested in its disaggregated components. To this end, we developed the novel Hierarchical Recurrent Neural Network (HRNN) model, which utilizes information from higher levels in the CPI hierarchy to improve predictions at the more volatile lower levels. Our evaluations on a large dataset from the US CPI-U index indicate that the HRNN model significantly outperforms a vast array of well-known inflation prediction baselines. Our methodology and results provide policymakers and market makers with additional forecasting measures of sectoral and component-specific price changes.
Advanced Analytics: New Methods and Applications
for Macroeconomic Policy Conference
Bank of England, European Central Bank & King’s College London
July 21, 2022
Jonathan Benchimol
Research Department
Bank of Israel
Coauthors: Oren Barkan, Itamar Caspi, Eliya Cohen, Allon Hammer, and Noam Koenigstein
This presentation does not necessarily reflect the views of the Bank of Israel.
Outline
Consumer Price Index and Dataset Properties
Recurrent Neural Networks (RNNs)
Hierarchical Recurrent Neural Networks (HRNN)
Evaluation and Results
Policy Implications and Conclusion
CPI and Dataset Properties
Consumer Price Index
The Consumer Price Index (CPI) measures the average change over
time in the prices consumers pay for a basket of goods and services.
The CPI quantifies the average cost of living in a given country by
estimating the purchasing power of a single unit of currency.
The CPI is the key macroeconomic indicator for measuring inflation (or
deflation).
US Consumer Price Index
In the US, the CPI is calculated and reported by the Bureau of Labor
Statistics (BLS) on a monthly basis.
The BLS has classified all expenditure items into more than 200 categories,
arranged into eight major groups: (1) Housing, (2) Food and Beverages, (3)
Medical Care, (4) Apparel, (5) Transportation, (6) Energy, (7) Recreation,
and (8) Services.
The consumer goods and services are grouped in a hierarchy of
increasingly detailed categories (levels).
Hierarchical Data Structure
Level 0
Aggregated CPI across all components
Level 1
Aggregated components (e.g., Energy, Apparel)
Mid levels (2-5)
Fine grained components, expenditure classes, item strata (e.g., Insurance)
Lower levels (6-8)
Finer grained components (e.g., Bacon, Tomatoes)
Example
The White Bread entry is classified under the following eight-level hierarchy:
All Items
Food and Beverages
Food at Home
Cereals and Bakery Products
Cereals and Cereal Products
Bakery Products
Bread
White Bread
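As a sketch, the example path above can be encoded as a nested-dict tree and recovered by a depth-first search (the eight listed entries are nested linearly here, mirroring only this one example, not the full BLS hierarchy):

```python
# Sketch: the White Bread path as a nested-dict tree (illustrative only).
cpi_tree = {
    "All Items": {
        "Food and Beverages": {
            "Food at Home": {
                "Cereals and Bakery Products": {
                    "Cereals and Cereal Products": {
                        "Bakery Products": {
                            "Bread": {
                                "White Bread": {},
                            }
                        }
                    }
                }
            }
        }
    }
}

def find_path(tree, target, path=()):
    """Depth-first search returning the root-to-target path, or None."""
    for name, children in tree.items():
        new_path = path + (name,)
        if name == target:
            return new_path
        found = find_path(children, target, new_path)
        if found is not None:
            return found
    return None

path = find_path(cpi_tree, "White Bread")
print(" > ".join(path))
```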
Forecasting CPI
Central banks conduct monetary policy to achieve price stability (low and
stable inflation).
Investors in fixed income assets (such as government bonds) estimate
future inflation to foresee upcoming trends in discounted real returns.
Government and private debt management depend on the expected path
of inflation.
Policymakers and market makers monitor CPI component levels (e.g., core inflation, oil-related products).
Related Work
Most related work deals with predicting the headline CPI only.
Forecasts based on simple averages of past inflation are more accurate than
structural models [1].
ML models based on exogenous features: online prices, house prices,
exchange rates etc. [2].
Feed-forward NNs to predict inflation rates in 28 OECD countries:
in about 50% of the countries, NNs were superior to autoregressive models [3].
[1] Makridakis, Assimakopoulos, Spiliotis. Objectivity, reproducibility and replicability in forecasting research.
International Journal of Forecasting (2018).
[2] Medeiros, Vasconcelos, Veiga, Zilberman. Forecasting inflation in a data-rich environment: the benefits of machine learning methods.
Journal of Business & Economic Statistics (2021).
[3] Choudhary, Haider. Neural network models for inflation forecasting: an appraisal.
Applied Economics (2012).
Objective
Our goal: Forecast US monthly CPI inflation for all components,
without exogenous features.
Harness the hierarchical pattern of the data to improve prediction at
low levels.
Utilize the sequential pattern of the data employing Recurrent
Neural Networks.
Improve predictions of volatile and non-stationary time series at
lower-level components.
Dataset
CPI-U (Urban CPI) from 1994 to 2019 from the BLS.
Monthly prices of 424 components, structured
hierarchically.
Each component is a time series of inflation rates
belonging to a level between 0 and 8.
The train set comprises the earliest 70% of entries; the remaining 30% form the test set.
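A minimal sketch of this chronological split (the 70/30 ratio is from the slide; the series itself is synthetic):

```python
import numpy as np

def chrono_split(series, train_frac=0.7):
    """Chronological split: the earliest train_frac of entries form the
    train set, the rest the test set (no shuffling for time series)."""
    cut = int(len(series) * train_frac)
    return series[:cut], series[cut:]

# Illustrative series: 312 monthly observations (26 years, 1994-2019).
x = np.arange(312)
train, test = chrono_split(x)
```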
CPI_t = CPI-U at time t
Summary Statistics
Volatility at Different Levels
Volatility at Different Sectors
Recurrent Neural Networks
Artificial Neural Networks
A neural network is a group of
algorithms that endeavors to recognize
underlying relationships in a set of
data through a process that mimics the
way the human brain operates.
Recurrent Neural Networks
RNNs are neural networks that model sequences of data in which each
value is assumed to be dependent on previous values.
RNNs are feed-forward networks augmented by including a feedback loop.
RNNs introduce a notion of time to the standard feed-forward neural
networks and excel at modeling temporal dynamic behavior (Chung et al., 2014).
Some RNN units retain an internal memory state from previous time steps
representing an arbitrarily long context window.
Our paper covers the three most popular units: basic RNN, Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU).
Basic RNN
A basic RNN unit computes a linear combination of the current input x_t and the previous hidden state h_{t-1}:
h_t = tanh( W_x x_t + W_h h_{t-1} + b )
The linear combination is the argument of a hyperbolic tangent activation function, allowing the unit to model nonlinear relations between inputs and outputs.
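A minimal NumPy sketch of one basic-RNN step, assuming the standard formulation h_t = tanh(W_x x_t + W_h h_{t-1} + b); the weight names are illustrative:

```python
import numpy as np

def rnn_step(x_t, h_prev, W_x, W_h, b):
    """One basic-RNN step: a hyperbolic tangent applied to a linear
    combination of the current input and the previous hidden state."""
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

# Illustrative 1-dimensional example.
W_x = np.array([[1.0]])
W_h = np.array([[0.5]])
b = np.zeros(1)
h = rnn_step(np.array([0.3]), np.zeros(1), W_x, W_h, b)  # tanh(0.3)
```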
Long Short-Term Memory Networks
Basic RNNs suffer from the "short-term memory" problem:
They use recent history to forecast but, for long enough sequences, cannot carry relevant information from earlier to later periods (e.g., relevant patterns from the same month in previous years).
Long Short-Term Memory networks (LSTMs) address this problem by introducing gates that preserve relevant "long-term memory" and combine it with the most recent data.
The introduction of LSTMs paved the way for significant strides in various
fields such as NLP, speech recognition, robot control and more.
Long Short-Term Memory Networks
An LSTM unit has the ability to "memorize" or "forget" information through the use of a special memory cell state.
The cell state is carefully regulated by three gates that control the flow of information in the LSTM unit: the input gate (i), forget gate (f), and output gate (o).
The cell state C_t is updated by a combination of its previous state C_{t-1} and its current candidate C~_t:
i_t = sigmoid( W_i x_t + U_i h_{t-1} + b_i )
f_t = sigmoid( W_f x_t + U_f h_{t-1} + b_f )
o_t = sigmoid( W_o x_t + U_o h_{t-1} + b_o )
C~_t = tanh( W_c x_t + U_c h_{t-1} + b_c )
C_t = f_t * C_{t-1} + i_t * C~_t
h_t = o_t * tanh( C_t )
where W, U, and b are the learned parameters.
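A scalar NumPy sketch of one standard (textbook) LSTM step with the three gates and the regulated cell state; parameter names are illustrative, not the paper's code:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One scalar LSTM step: input (i), forget (f), and output (o) gates
    regulate the cell state c; the unit output is h."""
    i = sigmoid(W["i"] * x_t + U["i"] * h_prev + b["i"])
    f = sigmoid(W["f"] * x_t + U["f"] * h_prev + b["f"])
    o = sigmoid(W["o"] * x_t + U["o"] * h_prev + b["o"])
    c_tilde = np.tanh(W["c"] * x_t + U["c"] * h_prev + b["c"])
    c = f * c_prev + i * c_tilde   # mix previous state and candidate
    h = o * np.tanh(c)             # gated output
    return h, c

# With all parameters zero, every gate opens halfway (sigmoid(0) = 0.5).
zero = {k: 0.0 for k in "ifoc"}
h, c = lstm_step(1.0, 0.0, 2.0, zero, zero, zero)
```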
Gated Recurrent Unit
A Gated Recurrent Unit (GRU) is a newer refinement of the LSTM unit that drops the cell state in favor of a simplified unit with fewer learnable parameters.
GRUs are faster and more efficient, especially when training data is limited, as in the case of inflation predictions (and especially disaggregated inflation components).
Gated Recurrent Unit
The candidate activation v is a function of the input and the previous output.
The output s is a combination of the candidate activation v and the previous output, controlled by the update gate z (with reset gate r):
z_t = sigmoid( W_z x_t + U_z s_{t-1} + b_z )
r_t = sigmoid( W_r x_t + U_r s_{t-1} + b_r )
v_t = tanh( W_v x_t + U_v (r_t * s_{t-1}) + b_v )
s_t = (1 - z_t) * s_{t-1} + z_t * v_t
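A scalar NumPy sketch of one GRU step in the slide's notation (candidate v, output s, update gate z; the standard reset gate r is included, and parameter names are illustrative):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, s_prev, p):
    """One scalar GRU step: update gate z mixes the previous output s
    with the candidate activation v; r is the standard reset gate."""
    z = sigmoid(p["wz"] * x_t + p["uz"] * s_prev + p["bz"])
    r = sigmoid(p["wr"] * x_t + p["ur"] * s_prev + p["br"])
    v = np.tanh(p["wv"] * x_t + p["uv"] * (r * s_prev) + p["bv"])
    return (1.0 - z) * s_prev + z * v

# With all parameters zero: z = 0.5 and v = 0, so the output halves.
p = {k: 0.0 for k in ("wz", "uz", "bz", "wr", "ur", "br", "wv", "uv", "bv")}
s = gru_step(1.0, 0.8, p)
```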
Hierarchical Recurrent Neural
Networks
Hierarchical Recurrent Neural Networks
HRNN exhibits a network graph that follows the CPI hierarchy.
Each node is a RNN that models the inflation rate of a specific
component in the full CPI hierarchy.
HRNNs propagate information from RNN models in higher levels to lower levels via hierarchical priors over the RNNs' learned weights.
Expected result: Better predictions for lower-level components.
HRNN Formulation
Define a parametric function g representing an RNN node in the hierarchy.
g predicts a scalar value for the next input value of a series.
Assume a normal likelihood relation between g and the observed time series:
π_{t+1}^n ~ N( g(π_{1:t}^n ; θ_n), 1/τ_n )
where:
π_t^n = inflation rate at time t of node n
T_n = last time step of node n
τ_n = precision variable of node n
θ_n = RNN learned parameters of node n
HRNN Formulation
Define a hierarchical network of normal priors over the nodes' parameters:
θ_n ~ N( θ_{p(n)}, 1/τ'_n I )
This models the relationship between a node's parameters and those of its parent in the hierarchy.
The relationship grows stronger with the correlation between the two series.
It ensures that each node is kept close to its parent, in terms of squared Euclidean distance in the parameter space.
where:
θ_n = RNN learned parameters of node n
θ_{p(n)} = RNN learned parameters of node n's parent
ρ_n = Pearson correlation coefficient between the parent's and the child's time series
τ'_n = precision parameter induced by the Pearson correlation ρ_n and an additional hyperparameter
HRNN Formulation
By Bayes' rule, the posterior probability is:
p(Θ | Π) ∝ p(Π | Θ) p(Θ)
Maximum A-Posteriori (MAP) approach:
Θ* = argmax_Θ [ log p(Π | Θ) + log p(Θ) ]
where:
N = enumeration of all nodes from all levels
Π = aggregation of all series from all levels
Θ = aggregation of all learned parameters from all levels
τ = aggregation of all precision parameters from all levels
HRNN Based on GRUs
HRNNs implement g as a scalar GRU. Specifically, each node n is associated with a GRU of its own.
HRNN optimization proceeds with stochastic gradient ascent over the MAP objective.
HRNN Architecture
Each node is a scalar GRU
predicting the inflation in the
next time step for the given
component.
Constraints from the parent
node are propagated down to
the child node.
GRUs are trained from top to
bottom.
HRNN Inference
Equipped with trained weights θ_n for node n, predict the value at future time steps.
The prediction for horizon h is obtained iteratively from the predictions of previous horizons, each time using the previous predicted value as input to the GRU.
Evaluation and Results
Evaluation Metrics
RMSE: sqrt( (1/T) * Σ_t (π_t − π̂_t)² )
Pearson correlation between the actual and predicted inflation series.
Distance correlation between the actual and predicted inflation series.
where π_t and π̂_t are the actual and predicted inflation rates at month t, respectively.
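A NumPy sketch of the first two metrics (distance correlation is omitted here; it is available in third-party packages such as `dcor`):

```python
import numpy as np

def rmse(actual, pred):
    """Root mean squared error between actual and predicted series."""
    a, p = np.asarray(actual, float), np.asarray(pred, float)
    return float(np.sqrt(np.mean((a - p) ** 2)))

def pearson(actual, pred):
    """Pearson correlation between actual and predicted series."""
    return float(np.corrcoef(actual, pred)[0, 1])
```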
Baselines
Autoregressive (AR): Estimate next month's value based on the previous d months.
Phillips Curve (PC): Add the unemployment rate u, which should have an inverse relation with inflation.
Random Walk (RW): Simple average of the last d months.
Auto-Regression in Gap Form (AR-GAP): Detrend the time series using RW, use AR to predict the gap, then add the trend back to the final prediction.
Vector Autoregression (VAR): Learn the K most similar time series together.
Logistic Smooth Transition Autoregressive Model (LSTAR): an extension of AR that allows the model parameters to change according to a transition variable F (van Dijk et al., 2000).
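A least-squares sketch of the AR(d) baseline (an illustration of the idea, not the paper's estimation code; lag ordering is a choice made here):

```python
import numpy as np

def fit_ar(series, d):
    """Fit AR(d) by ordinary least squares: regress each value on an
    intercept plus the previous d values (ordered oldest to newest)."""
    series = np.asarray(series, float)
    y = series[d:]
    lags = np.column_stack([series[i:len(series) - d + i] for i in range(d)])
    X = np.column_stack([np.ones(len(y)), lags])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def predict_ar(coef, last_d):
    """Predict the next value from the last d observations
    (ordered oldest to newest)."""
    return float(coef[0] + coef[1:] @ np.asarray(last_d, float))

coef = fit_ar(np.arange(10.0), d=2)
nxt = predict_ar(coef, [8.0, 9.0])   # the linear trend continues
```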
Ablation Study
Single Scalar GRU: One scalar GRU shared by all components.
Assumes that the different components of the CPI hierarchy behave similarly.
HRNN without hierarchy: Set the prior precision to zero, removing the hierarchical priors. Equivalent to N independent GRU units.
Fully Connected Neural Network (FC): Similar to autoregression but with non-linearities.
Vectorial GRU based on K Nearest Neighbors: A different GRU for each node n, but each entry is a vector that includes the time series of its k nearest (most correlated) series.
No-hierarchy prediction: no advantage for the GRU compared to a simple AR model.
Results
Average Results of Best HRNN Model on
Disaggregated CPI Components by Hierarchy
HRNN shows best performance in the lower levels, where CPI components are more volatile.
Results
Average Results of Best HRNN Model on
Disaggregated CPI Components by Sector
HRNN shows the best performance in the Food and Beverages sector, which contains the most low-level CPI components.
Conclusion
Conclusion
The hierarchical nature of the model enables information
propagation from higher levels.
HRNNs are superior at predicting low-level inflation
components.
Policy implications: component-level forecasts give policymakers and market makers additional measures of sectoral and component-specific price changes.
Thanks
Thank you for your attention.
Paper: Open Access @ International Journal of Forecasting
Replication files: GitHub.com/AllonHammer/CPI_HRNN
Comments: jonathan@benchimol.name
Other papers: JonathanBenchimol.com/Research