
Finding an Accurate Early Forecasting Model from Small Dataset: A Case of 2019-nCoV Novel Coronavirus Outbreak

Authors:
  • JIS University


Abstract and Figures

An epidemic is the rapid and wide spread of an infectious disease that threatens many lives and causes economic damage. It is important to foretell the lifetime of an epidemic in order to decide on timely remedial actions. These measures include closing borders and schools and suspending community services and commuter transport. How long such restrictions remain in place depends on the momentum of the outbreak and its rate of decay. Accurately forecasting the fate of an epidemic is an extremely important but difficult task. Owing to limited knowledge of the novel disease, the high uncertainty involved, and the complex societal and political factors that influence the spread of the new virus, any forecast is far from reliable. Another factor is the insufficient amount of available data: samples are often scarce when an epidemic has just started. With only a few training samples on hand, finding a forecasting model that offers the best possible forecast is a big challenge in machine learning. In the past, three popular methods have been proposed: 1) augmenting the little data that exists, 2) using a panel selection to pick the best forecasting model from several models, and 3) fine-tuning the parameters of an individual forecasting model for the highest possible accuracy. In this paper, a methodology that embraces these three virtues of data mining from a small dataset is proposed. An experiment based on the recent coronavirus outbreak that originated in Wuhan is conducted by applying this methodology. It is shown that an optimized forecasting model constructed from a new algorithm, namely polynomial neural network with corrective feedback (PNN+cf), is able to make a forecast with the lowest prediction error among the models compared. The results show that the newly proposed methodology and PNN+cf are useful in generating acceptable forecasts at the critical time of a disease outbreak, when samples are far from abundant.
International Journal of Interactive Multimedia and Artificial Intelligence, Vol. 6, Nº 1
I. 
SINCE 


reported [1]      
         [2] 
     


     



        [3] 


uncertain [4], especially for young children and senior age groups.
        


   
notice. The purpose is to limit the chances of physical contacts among


         
 
          
         

  
* Corresponding author.
      



Simon James Fong2345
Received 5 February 2020 | Accepted 7 February 2020 | Published 7 February 2020
Keywords
F


Epidemic.






            
      


           
          





 DOI: 10.9781/ijimai.2020.02.002
Special Issue on Soft Computing
economic loss. Timing is very uncertain during this initial stage

 



        
  
of several candidate models. In data mining this is a challenging
  [5] [6]       

      
from a small dataset are reported in the literature [7] [8]

  [9] [10]    [11]     

      
use. The results from the rest of the candidates are discarded. The third
approach [12]


The default parameter values for such algorithms often do not provide
        
required to improve the accuracy level.
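The point about default parameter values can be illustrated with a minimal sketch. It assumes a hypothetical 14-point series and a plain polynomial trend model whose only tunable parameter is the degree; the data and the helper names (`holdout_rmse`, `tune_degree`) are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def holdout_rmse(series, degree, n_test=3):
    """Fit a polynomial trend on all but the last n_test points and
    return the RMSE of its forecast on those held-out points."""
    y = np.asarray(series, dtype=float)
    t = np.arange(len(y), dtype=float)
    coeffs = np.polyfit(t[:-n_test], y[:-n_test], degree)
    forecast = np.polyval(coeffs, t[-n_test:])
    return float(np.sqrt(np.mean((forecast - y[-n_test:]) ** 2)))

def tune_degree(series, degrees=(1, 2, 3), n_test=3):
    """Return (best_degree, scores), where scores maps degree -> holdout RMSE."""
    scores = {d: holdout_rmse(series, d, n_test) for d in degrees}
    best = min(scores, key=scores.get)
    return best, scores

# Hypothetical counts, illustrative only (exactly quadratic in t).
series = [5 * t * t + 3 * t + 10 for t in range(14)]
best_degree, scores = tune_degree(series)
```

On this synthetic series a degree of 1, taken here as a stand-in for an untuned default, forecasts the held-out days far worse than degree 2, which is the kind of gap that parameter tuning is meant to close.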










          
         
         
    

on this critical topic.
II. 

  
uncertainty. It is assumed that the virus is novel and human expert

         


  


        

          
   

         

A. Group of Optimized and Multi-source Selection
     
  

   

  
          
   
  
       
inference) that requires no setting of model parameter; parametric
         
        
           
         




       




[13].

          
          
        
complexity in the model designs [14].
Group 01: Forecasting using complex machine learning models

          

         
 
and feature selection [15]
        

         
model. This group of candidate models requires dual tuning of model
        

Group 02: Forecasting using complex machine learning models.
        


         
        


Group 03     
models
          
   

   

Group 04: Simple data analytics
        
          

Group 05
      
        

  




          
 
 
        


model under time constraints.
B. Polynomial Neural Network with Corrective Feedback (PNN+cf)
        

         
to sophisticated deep learning models that have many configuration

   

         
     

  
         

         
           
        
           




         
         
         
        


 
[16]    


the polynomial coefficients until it can model the time series as fit as



automatically through an iterative data sampling and controlled
  


       

(1)
Output_error() is a criterion of output error of the model
    
models Β
          
parametric equation. The equation = f(x1xn
xt=1xt=nxn
comprised as an input vector   
the final model .

      
         
          


(2)
      
         
any function in a general form y=f(  .
           
  
 

monitoring the error level. When there is no significant incremental

      
[17]
       

        [18]
         
        

reported in [19].
        
disasters [20]   [21]    
[22]
          

        





          
         


    
nonlinearity that exists in the time series.
            
  
         
        

          
        
  

fitting curve results from the polynomial equation for regressing a


         
         

III. 
An experiment is carried out to verify the efficacy of the proposed
        
   
of Chinese health authorities. The data has been updated daily since 21 Jan

              





  




forecast of future days. The fourteen instances represent a challenging
scenario of data mining over small data.
        
         
 




       
challenging – it has a sharp hump and trough near the end of the first
 
 



         
epidemic (see Fig. 4).

Three representative groups of forecasting algorithms are put
       



to the limited epidemic data at the early stage. This is to simulate a
        
 

time series forecasting is used as a criterion in panel selection.
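Panel selection by a forecasting error criterion can be sketched as below. The sketch assumes RMSE on the last few held-out days as the criterion and uses three deliberately simple candidates (persistence, drift, linear trend); the candidate set, the function names, and the synthetic series are illustrative assumptions, not the models or data used in the experiment.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((a - p) ** 2)))

def persistence(train, horizon):
    """Repeat the last observed value."""
    return np.full(horizon, train[-1], dtype=float)

def drift(train, horizon):
    """Extend the straight line through the first and last observations."""
    slope = (train[-1] - train[0]) / (len(train) - 1)
    return train[-1] + slope * np.arange(1, horizon + 1)

def linear_trend(train, horizon):
    """Extrapolate a least-squares straight line."""
    t = np.arange(len(train), dtype=float)
    slope, intercept = np.polyfit(t, train, 1)
    future = np.arange(len(train), len(train) + horizon, dtype=float)
    return slope * future + intercept

def panel_select(series, candidates, horizon=3):
    """Score every candidate on the held-out tail and keep the best one."""
    y = np.asarray(series, dtype=float)
    train, test = y[:-horizon], y[-horizon:]
    scores = {name: rmse(test, f(train, horizon)) for name, f in candidates.items()}
    winner = min(scores, key=scores.get)
    return winner, scores

candidates = {"persistence": persistence, "drift": drift, "linear_trend": linear_trend}
series = [2.0 * t + 1.0 for t in range(14)]  # hypothetical, illustrative only
winner, scores = panel_select(series, candidates)
```

Only the winner would be carried forward for forecasting; the scores of the remaining candidates serve the comparison and are then discarded.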
        
      
           
           
          
charts as in Fig. 5.
           
          



the fitting curve and the actual data at the hump near the initial stage.

a) ARIMA





         

           
promising than those classical algorithms for this particular case. An




      
          

a) ARIMA





       
          
         
           
         

a) Linear Regression
[23]
c) Fast decision tree learner [24]
[25]


         
         

   


           
        
           
       
     
 
         

         

 
   
        
        

infection or something else. The patient received medical treatment;

NN types  All  confirmed  critical  critical  cured  died

     200.044  190.599*  190.599*  237.261  214.097  193.520  276.885  290.079
     174.366  189.661   160.338*  165.127  172.627  167.544  241.052  222.997
MIA  153.709  155.684   138.042*  141.557  156.275  158.156  183.904  186.212
     151.789  154.128   136.547*  140.780  157.496  162.175  189.954  186.212

Special Issue on Soft Computing

methods for improving the performance of machine learning models.

[1]
[2]
[3]
Archived from the original on 31 January 2020. Retrieved 30 January 2020.
[4]
[5]
03/2019.
[6]
[7]
[8]
[9]
[10]
[11]
[12]
[13]
[14]
[15]
[16]
[17]
[18]
[19]
[20]
[21]
A. et al. (eds) Smart Trends in Information Technology and Computer Communications. SmartCom 2017. Communications in Computer and
[22]
[23]
[24]
[25]

Nilanjan Dey

Rubén González Crespo

Enrique Herrera-Viedma

Simon James Fong

Gloria Li
Macau.
Chapter
There are several techniques to support simulation of time series behavior. In this chapter, the approach will be based on the Composite Monte Carlo (CMC) simulation method. This method is able to model future outcomes of time series under analysis from the available data. The establishment of multiple correlations and causality between the data allows modeling the variables and probabilistic distributions and subsequently obtaining also probabilistic results for time series forecasting. To improve the predictor efficiency, computational intelligence techniques are proposed, including a fuzzy inference system and an Artificial Neural Network architecture. This type of model is suitable to be considered not only for the disease monitoring and compartmental classes, but also for managerial data such as clinical resources, medical and health team allocation, and bed management, which are data related to complex decision-making challenges.
... The effective estimation (or prediction, forecasting) of the number of COVID-19 cases will be of great help for each country to plan its own health policies (including vaccination, quarantine, isolation, lockdown, social distancing, etc.) and estimate the economic and social losses of the epidemic [1]. Scholars have been committed to solving the problems of COVID-19 incidence prediction and epidemiological modeling, and proposed epidemiological models (SIR [2], SEIR [3,4], SIRD [5], phenomenology [6], etc.), time series models (autoregressive models [7,8], exponential models [9], regression model [10,11], Prophet model [12], etc.), machine learning model (based on regression tree [13], LSTM [14], polynomial neural network [15], ANFIS [16], SVM [17], etc.) and other types of models [18]. ...
Article
The COVID-19 outbreak poses a huge challenge to international public health. Reliable forecast of the number of cases is of great significance to the planning of health resources and the investigation and evaluation of the epidemic situation. The data-driven machine learning models can adapt to complex changes in the epidemic situation without relying on correct physical dynamics modeling, which are sensitive and accurate in predicting the development of the epidemic. In this paper, an ensemble hybrid model based on Temporal Convolutional Networks (TCN), Gated Recurrent Unit (GRU), Deep Belief Networks (DBN), Q-learning, and Support Vector Machine (SVM) models, namely TCN-GRU-DBN-Q-SVM model, is proposed to achieve the forecasting of COVID-19 infections. Three widely-used predictors, TCN, GRU, and DBN are used as elements of the hybrid model ensembled by the weights provided by reinforcement learning method. Furthermore, an error predictor built by SVM, is trained with validation set, and the final prediction result could be obtained by combining the TCN-GRU-DBN-Q model with the SVM error predictor. In order to investigate the forecasting performance of the proposed hybrid model, several comparison models (TCN-GRU-DBN-Q, LSTM, N-BEATS, ANFIS, VMD-BP, WT-RVFL, and ARIMA models) are selected. 
The experimental results show that: (1) the prediction effect of the TCN-GRU-DBN-Q-SVM model on COVID-19 infection is satisfactory, which has been verified in three national infection data from the UK, India, and the US, and the proposed model has good generalization ability; (2) in the proposed hybrid model, SVM can efficiently predict the possible error of the predicted series given by TCN-GRU-DBN-Q components; (3) the integrated weights based on Q-learning can be adaptively adjusted according to the characteristics of the data in the forecasting tasks in different countries and multiple situations, which ensures the accuracy, robustness and generalization of the proposed model.
... It should also be mentioned that this approach is not widely applied concerning time-series forecasting tasks, while it offers continuous optimisation and adaptability, as the added-value of the RL model. (ii) Given that for "small" sized time-series datasets, finding an ML/DL forecasting model that offers qualitative forecasts is a major challenge in machine learning as stated in Fong et al. (2020), the presented approach proposes the utilization of surrogate data as an enhanced approach providing more accurate results. (iii) A custom meta-model (based on the statistical features of each time-series) was leveraged as an extra component in the downstream pipeline in order to introduce a hybrid model for further optimising the performance of the framework. ...
Article
Full-text available
In various application domains/sectors, data collected from the respective industries are complemented with open data providing added value to the overall analysis and decision making process. Open data refer to weather data, transportation information, stock/investment products prices, or even health-related data. One of the application domains that could harvest the added-value of analytics (including open-data) refers to the food industry and more specifically the decisions related to food recalls. The collected data can be analyzed in real-time through Artificial Intelligence techniques and obtain insights about potential unsafe goods and products. These insights are exploited to drive decision making, such as which goods are more probable to be harmful in the near future and subsequently optimize the food supply chain. The latter reflects the overall food recall process monitoring and is enhanced through a data-driven forecasting approach. This provides actionable insights regarding the enhancement of the food safety across the food supply chain given that goods and products can become unsafe for plenty of reasons, such as mislabeling allergens, contamination etc. To address this challenge, this paper introduces a deep learning approach leveraging Natural Language Processing and Time-series Forecasting techniques, to monitor and analyze the risk associated with each food product category and the corresponding potential recalls. Furthermore, we propose a technique that exploits reinforcement learning to utilize historical recall announcements of food products for predicting their future recalls, thus providing insights to food companies regarding upcoming trends in food recalls that can lead to timely recalls. We also evaluate and demonstrate the effectiveness and added-value of the proposed approaches through a real-world scenario that yields promising results. 
While several techniques/models have been analyzed and applied to address the challenge of food recall predictions, the usage of analogous/surrogate data has also been studied and evaluated towards more accurate outcomes.
... As a result, decision makers are benefited from a better fitted MC outputs complemented by min-max rules that foretell about the extreme ranges of future possibilities with respect to the epidemic. In another work [132] Fong et al. used traditional time series data analysis methods (such as ARIMA, Exponential, and Holt-Winters), ML methods (such as KR, SVM, and DT), and AI methods (such as PNN) to analyze and predict future outbreaks. ...
Article
The outbreak of novel corona virus 2019 (COVID-19) has been treated as a public health crisis of global concern by the World Health Organization (WHO). COVID-19 pandemic hugely affected countries worldwide raising the need to exploit novel, alternative and emerging technologies to respond to the emergency created by the weak health-care systems. In this context, Artificial Intelligence (AI) techniques can give a valid support to public health authorities, complementing traditional approaches with advanced tools. This study provides a comprehensive review of methods, algorithms, applications, and emerging AI technologies that can be utilized for forecasting and diagnosing COVID-19. The main objectives of this review are summarized as follows. (i) Understanding the importance of AI approaches such as machine learning and deep learning for COVID-19 pandemic; (ii) discussing the efficiency and impact of these methods for COVID-19 forecasting and diagnosing; (iii) providing an extensive background description of AI techniques to help non-expert to better catch the underlying concepts; (iv) for each work surveyed, give a detailed analysis of the rationale behind the approach, highlighting the method used, the type and size of data analyzed, the validation method, the target application and the results achieved; (v) focusing on some future challenges in COVID-19 forecasting and diagnosing.
... A hybridized deep learning with fuzzy rule method was used to forecast COVID-19 outbreak included limited data in early Composite Monte-Carlo (Fong et al. 2020), and the prediction of the COVID-19 peaks and size of the outbreak was performed using modified SEIR-LSTM model . The major issue in using AI in the COVID-19 outbreak globally is the limitation of available data and datasets, but as huge relevant data is available, the accuracy of AI-based models will greatly improve. ...
Article
We present a novel method for forecasting with limited information, that is for forecasting short time series. Our method is simple and intuitive; it relates to the most fundamental forecasting benchmark and is straightforward to implement. We present the technical details of the method and explain the nuances of how it works via two illustrative examples, with the use of employment‐related data. We find that our new method outperforms standard forecasting methods and thus offers considerable utility in applied management research. The implications of our findings suggest that forecasting short time series, of which one can find many examples in business and management, is viable and can be of considerable practical help for both research and practice – even when the information available to analysts and decision‐makers is limited.
Article
Full-text available
The Indonesian beef consumption increases sharply during Ramadan and made a difference between supply and demand. The research aimed to study the demand pattern of burger patties and determine a suitable forecasting method compared between quantitative and intervention forecasting methods. The actual demand was intervened by experts based on reasons such as supply shortage, holidays, promotion, and government projects. The daily sales of burger patties were collected for a year. Then, the data were divided into training and testing data. Later, time-series forecasting was performed by software. Then, the best forecasting method for daily data was selected between Individual forecasting and Top-Down forecasting. Similarly, for weekly data, the best forecasting method was compared between aggregate forecasting and Bottom-Up forecasting. Then, repeat the process for the intervened sales data. The result revealed that the mean absolute percentage error was improved after intervention by about 3.64%-58.83%. The combination of quantitative and qualitative approaches improved forecast accuracy. In addition, the aggregate level or weekly sales forecast had higher forecast accuracy than the disaggregated level. The Bottom-Up forecast performs better than the aggregate forecast. Hence, we recommended the company plans based on weekly data and implement Every Low Price to reduce the demand fluctuation.
Article
Full-text available
In this paper, a new application of ridge polynomial based neural network models in multivariate time series forecasting is presented. The existing ridge polynomial based neural network models can be grouped into two groups. Group A consists of models that use only autoregressive inputs, whereas Group B consists of models that use autoregressive and moving-average (i.e., error feedback) inputs. The well-known Box-Jenkins gas furnace multivariate time series was used in the forecasting comparison between the two groups. Simulation results show that the models in Group B achieve significant forecasting performance as compared to the models in Group A. Therefore, the Box-Jenkins gas furnace data can be modeled better using neural networks when error feedback is used.
Article
Full-text available
Depression is a burdensome psychiatric disease common in low and middle income countries causing disability, morbidity and mortality in late life. In this study, we demonstrate a novel approach for detection of depression using clinical data obtained from the on-going Mysore Studies of Natal effects on Ageing and Health (MYNAH), in South India where the members have undergone a comprehensive assessment for cognitive function, mental health and cardiometabolic disorders. The proposed model is developed using machine learning approach for classification of depression using Meta-Cognitive Neural Network (McNN) classifier with Projection-based learning (PBL) to address the self-regulating principles like how, what and when to learn. XGBoost is used for feature selection on the available data of assessments with improved confidence. To improve the efficiency of McNN-PBL classifier the best parameters are found using Particle Swarm Optimization (PSO) algorithm. The results indicate that the McNNPBL classifier selects appropriate records to learn and remove repetitive records which improve the generalization performance. The study helps the clinician to identify the best parameters to analyze the patient.
Article
Full-text available
Sales forecasting allows firms to plan their production outputs, which contributes to optimizing firms' inventory management via a cost reduction. However, not all firms have the same capacity to store all the necessary information through time. So, time-series with a short length are common within industries, and problems arise due to small time series does not fully capture sales' behavior. In this paper, we show the applicability of neural networks in a case where a company reports a short time-series given the changes in its warehouse structure. Given the neural networks independence form statistical assumptions, we use a multilayer-perceptron to get the sales forecasting of this enterprise. We find that learning rates variations do not significantly increase the computing time, and the validation fails with an error minor to five percent.
Article
The era of big data over the past few years has produced large and complex datasets that demand faster and better decision making. However, small-dataset problems still arise in certain areas, making analysis and decisions difficult. Building a prediction model requires a large training sample, and a small dataset is insufficient to produce an accurate prediction model. This paper reviews artificial data generation as one solution to the small-dataset problem.
Article
Motivation: Single-centre studies in the medical domain are often characterised by limited samples due to the complexity and high costs of patient data collection. Machine learning methods for regression modelling of small datasets (less than 10 observations per predictor variable) remain scarce. Our work bridges this gap by developing a novel framework for application of artificial neural networks (NNs) for regression tasks involving small medical datasets. Methods: In order to address the sporadic fluctuations and validation issues that appear in regression NNs trained on small datasets, the method of multiple runs and surrogate data analysis were proposed in this work. The approach was compared to the state-of-the-art ensemble NNs; the effect of dataset size on NN performance was also investigated. Results: The proposed framework was applied for the prediction of compressive strength (CS) of femoral trabecular bone in patients suffering from severe osteoarthritis. The NN model was able to estimate the CS of osteoarthritic trabecular bone from its structural and biological properties with a standard error of 0.85 MPa. When evaluated on independent test samples, the NN achieved an accuracy of 98.3%, outperforming an ensemble NN model by 11%. We reproduce this result on CS data of another porous solid (concrete) and demonstrate that the proposed framework allows for an NN modelled with as few as 56 samples to generalise on 300 independent test samples with 86.5% accuracy, which is comparable to the performance of an NN developed with an 18 times larger dataset (1030 samples). Conclusion: The significance of this work is two-fold: the practical application allows for non-destructive prediction of bone fracture risk, while the novel methodology extends beyond the task considered in this study and provides a general framework for application of regression NNs to medical problems characterised by limited dataset sizes.
Article
Forecasting commodity prices, especially those of agricultural commodities, is very difficult because they are governed not only by demand and supply but also by many other factors that are beyond control, such as weather vagaries, storage capacity and transportation. In this paper, the ARIMA (Autoregressive Integrated Moving Average) time series methodology given by Box and Jenkins has been used to forecast prices of groundnut oil in Mumbai. This approach has been compared with the ANN (Artificial Neural Network) methodology. The results showed that the ANN performed better than the ARIMA models in forecasting the prices.
Book
Modeling and forecasting of time series data are of fundamental importance in various practical domains. The aim of this book is to present a concise description of some popular time series forecasting models with their salient features. Three important classes of time series models, viz. stochastic, neural networks and support vector machines, are studied together with their inherent forecasting strengths and weaknesses. The book also discusses in detail several basic issues related to time series analysis, such as stationarity, parsimony and overfitting. Our study is enriched by presenting empirical forecasting results obtained on six real-world time series datasets. Five performance measures are used to evaluate the forecasting accuracies of the different models as well as to compare the models. For each of the six time series datasets, we further show the obtained forecast diagram, which graphically depicts the closeness between the original and predicted observations.
Article
Neural network modeling for small datasets can be justified from a theoretical point of view according to some of Bartlett’s results showing that the generalization performance of a multilayer perceptron (MLP) depends more on the L1 norm of the weights between the hidden layer and the output layer than on the total number of weights. In this article we investigate some geometrical properties of MLPs and, drawing on linear projection theory, propose an equivalent number of degrees of freedom to be used in neural model selection criteria like the Akaike information criterion and the Bayes information criterion and in the unbiased estimation of the error variance. This measure proves to be much smaller than the total number of parameters of the network usually adopted, and it does not depend on the number of input variables. Moreover, this concept is compatible with Bartlett’s results and with similar ideas long associated with projection-based models and kernel models. Some numerical studies involving both real and simulated datasets are presented and discussed.
Article
Artificial neural networks (ANNs) are usually considered tools that help to analyze cause-effect relationships in complex systems within a big-data framework. Health sciences, on the other hand, face more complexity than any other scientific discipline, and large datasets are seldom available in this field. In this situation, I show how a particular neural network tool, which is able to handle small datasets of experimental or observational data, can help in identifying the main causal factors leading to changes in a variable that summarizes the behaviour of a complex system, for instance the onset of a disease. A detailed description of the neural network tool is given and its application to a specific case study is shown. Recommendations for the correct use of this tool are also supplied.
Article
This paper presents simple criteria which can be used to select which of the three members of the Johnson System of distributions should be used for fitting a set of data. The paper also presents elementary formulas for estimating the parameters for each of the members of the family. Thus, many obstacles to the use of the Johnson System are resolved.