Content uploaded by Adria Binte Habib
All content in this area was uploaded by Adria Binte Habib on Dec 02, 2022
A Detailed Explanation of the Workflow of the N-BEATS Architecture
Predicting the future has always helped people prepare for upcoming events. Such prediction, or
forecasting, requires time series data. In simple terms, time series data are observations ordered
by a date or time column. Many algorithms exist for time series analysis. For years, statistical
methods were considered the best-suited way to forecast time series. Gradually, machine learning
and deep learning algorithms entered the field. Time series analysis has now become so broad that
many dedicated forecasting algorithms have been developed. N-BEATS (Neural Basis Expansion
Analysis for Interpretable Time Series) is one of them, and it is explained here.
This algorithm focuses on forecasting univariate time series data using deep learning. It is a deep
neural architecture based on backward and forward residual links and a very deep stack of fully
connected layers.
Let’s get into the algorithm. The problem is stated as follows: “Given a series of past data of
length T, forecast a series of future data of length H.”
N-BEATS is used to solve this problem.
Figure 1: N-BEATS Architecture
In Figure 1, a lookback window enters Box 1. The lookback window is a slice of the given past
data. The size of the slice can vary depending on the problem being solved. The length of the
lookback window is t. In other words, the lookback window is the “model input”.
This model input goes to the box called Stack.
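The slicing step can be sketched as follows. This is a minimal illustration that assumes the series is held in a NumPy array; the function name `make_window` and the example sizes are hypothetical and not part of the article. It returns one training pair: a lookback window of length t and the H values that follow it.

```python
import numpy as np

def make_window(series, t, H):
    """Cut one (lookback, target) pair from the end of a 1-D series.

    series : past observations of length T, with T >= t + H
    t      : lookback length (the model input size)
    H      : forecast horizon (H >= 1)
    """
    series = np.asarray(series, dtype=float)
    if series.size < t + H:
        raise ValueError("series is shorter than t + H")
    x = series[-(t + H):-H]  # the t points just before the target
    y = series[-H:]          # the H points the model should forecast
    return x, y

# Toy series 0, 1, ..., 9 with a lookback of 6 and a horizon of 2.
x, y = make_window(np.arange(10), t=6, H=2)
```

At prediction time only the window is needed, which in this sketch would simply be the last t values of the series.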
Box 2 shows what happens inside a stack in detail. As Figure 1 shows, the stack input enters
Block 1. Box 3 shows what happens inside a block in detail.
In Box 3, the block input enters a four-layer fully connected (FC) stack. The equations of the
four-layer FC stack are given below.
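In the notation of the original N-BEATS paper, writing x_l for the input of block l and h_{l,k} for the output of its k-th layer, the four layers can be written as follows (a reconstruction under that notational assumption):

```latex
\begin{aligned}
h_{l,1} &= \mathrm{FC}_{l,1}(x_l),     & h_{l,2} &= \mathrm{FC}_{l,2}(h_{l,1}), \\
h_{l,3} &= \mathrm{FC}_{l,3}(h_{l,2}), & h_{l,4} &= \mathrm{FC}_{l,4}(h_{l,3}).
\end{aligned}
```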
This FC stack generates the forward predictor θ_l^f and the backward predictor θ_l^b. These are
expansion coefficients that are used to produce the forecast and the backcast, respectively. They
are generated by linear projections that take the output of layer 4 as input. The
equations are given below.
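Using the same assumed notation, with h_{l,4} denoting the output of the fourth FC layer, the two projections can be written as:

```latex
\theta_l^b = \mathrm{LINEAR}_l^b(h_{l,4}), \qquad
\theta_l^f = \mathrm{LINEAR}_l^f(h_{l,4})
```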
An FC layer is a fully connected layer with a ReLU nonlinearity. For example:
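For the first layer of block l, with weight matrix W_{l,1} and bias b_{l,1} (symbols assumed here, following the original N-BEATS paper), this reads:

```latex
h_{l,1} = \mathrm{RELU}\left(W_{l,1}\, x_l + b_{l,1}\right)
```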
The nonlinearity lets the FC stack model nonlinear relationships in its input; without it, the four
layers would collapse into a single linear map. Once the FC stack has produced the backward and
forward coefficients, they go to the second and last step of the block, which generates the new
backcast and forecast. This step is explained in detail next.
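With the help of the basis layer g_l^b we get x̂_l, the block's estimate of its input x_l (the
backcast). The backcast helps the downstream blocks by removing the components of the input that
are not helpful for forecasting. In the meantime, the partial forecast ŷ_l is produced along the
same path with the help of the basis layer g_l^f. g_l^b and g_l^f are the basis layers that map the
backward and forward expansion coefficients to the backcast and the forecast: each output is a
weighted sum of basis vectors, with the coefficients as weights. v^b and v^f are these basis
vectors. They can be learned, as in the generic configuration, or fixed to specific functional
forms (for example trend or seasonality) to build in a problem-specific inductive bias.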
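The two steps of a block can be sketched in NumPy as below. This is a minimal sketch that assumes the generic configuration, in which the basis vectors are stored as the rows of matrices; the helper names, layer width, coefficient dimension, and random initial weights are illustrative assumptions and not taken from the article.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

def init_block(t, H, hidden=16, theta_dim=8):
    """Random parameters for one block (sizes are illustrative)."""
    sizes = [t, hidden, hidden, hidden, hidden]
    return {
        # Four FC layers, each a (weights, bias) pair.
        "fc": [(rng.normal(0, 0.1, (m, n)), np.zeros(n))
               for m, n in zip(sizes[:-1], sizes[1:])],
        "W_b": rng.normal(0, 0.1, (hidden, theta_dim)),  # backward projection
        "W_f": rng.normal(0, 0.1, (hidden, theta_dim)),  # forward projection
        "V_b": rng.normal(0, 0.1, (theta_dim, t)),  # rows are basis vectors v_i^b
        "V_f": rng.normal(0, 0.1, (theta_dim, H)),  # rows are basis vectors v_i^f
    }

def block_forward(x, p):
    """Return (backcast, partial forecast) for one lookback window x."""
    h = x
    for W, b in p["fc"]:          # step 1: four-layer FC stack with ReLU
        h = relu(h @ W + b)
    theta_b = h @ p["W_b"]        # backward expansion coefficients
    theta_f = h @ p["W_f"]        # forward expansion coefficients
    x_hat = theta_b @ p["V_b"]    # step 2: sum_i theta_b[i] * v_i^b
    y_hat = theta_f @ p["V_f"]    #         sum_i theta_f[i] * v_i^f
    return x_hat, y_hat

params = init_block(t=6, H=2)
x_hat, y_hat = block_forward(np.linspace(0.0, 1.0, 6), params)
```

The backcast has the length of the lookback window and the partial forecast has the length of the horizon. In a trained model the parameters would be fitted by gradient descent rather than drawn at random.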
Now let’s get into Box 2. In Box 2, the blocks (detailed in Box 3) are stacked following the doubly
residual stacking principle. In this principle, the backcast produced by a block (x̂_l) is
subtracted from that block's input (x_l), and the difference becomes the input of the next block,
which improves the trainability of the architecture. The second residual branch concerns the
forecast (ŷ): each block produces a partial forecast (ŷ_l), and the sum of all the partial
forecasts gives the final forecast. The doubly residual formula is given below
and the steps are shown in Figure 2.
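In the notation of the original N-BEATS paper, where x̂_l and ŷ_l are the backcast and partial forecast of block l (symbols assumed here), the two residual branches can be written as:

```latex
x_l = x_{l-1} - \hat{x}_{l-1}, \qquad
\hat{y} = \sum_{l} \hat{y}_l
```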
Figure 2: Doubly Residual Stacking
Box 1 sums the forecasts of the stacks and gives the global forecast. The whole process of Boxes
1, 2 and 3 runs once for a single forecast.
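The full flow, from doubly residual stacking inside a stack to the summation over stacks, can be sketched as below. This is an illustration under stated assumptions: `toy_block` is a hypothetical stand-in that replaces a trained block with a fixed rule, and the names `run_stack` and `run_model` are not from the article. The residual left by one stack is passed on as the input of the next, as in the original N-BEATS paper.

```python
import numpy as np

def run_stack(x, blocks, H):
    """Doubly residual stacking over the blocks of one stack."""
    residual = np.asarray(x, dtype=float)
    stack_forecast = np.zeros(H)
    for block in blocks:
        x_hat, y_hat = block(residual)
        residual = residual - x_hat              # input of the next block
        stack_forecast = stack_forecast + y_hat  # add the partial forecast
    return residual, stack_forecast

def run_model(x, stacks, H):
    """Pass the residual from stack to stack and sum the stack forecasts."""
    residual = np.asarray(x, dtype=float)
    forecast = np.zeros(H)
    for blocks in stacks:
        residual, stack_forecast = run_stack(residual, blocks, H)
        forecast = forecast + stack_forecast
    return forecast

def toy_block(H):
    """Hypothetical block: backcast is half the input, forecast is its mean."""
    def block(x):
        return 0.5 * x, np.full(H, x.mean())
    return block

H = 2
stacks = [[toy_block(H), toy_block(H)], [toy_block(H)]]  # two stacks, three blocks
forecast = run_model(np.array([4.0, 4.0, 4.0, 4.0]), stacks, H)
```

With this toy rule each block halves the residual, so the three partial forecasts are 4, 2 and 1, and the global forecast is 7 at both horizon steps.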