Article

Predicting Time Series with Support Vector Machines

... In SVR, the input vectors are mapped into a high-dimensional feature space by a nonlinear mapping Φ, where linear regression is performed to find a relationship between input and output vectors [46]. The SVR algorithm fits a linear hyperplane, called a decision boundary, to estimate the relation between input and output data, which is then used to predict future values, as represented in Eq. (10) [47], ...
... R_reg(f) results from model complexity and training errors and should be kept as low as possible. Vapnik's ε-insensitive loss function is defined by Eq. (12) [46]. ...
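The equations cited in this excerpt (Eqs. (10) and (12)) are not reproduced on this page. For orientation only, the textbook form of the quantities it mentions can be written as follows; the symbols w, b, C and the feature map Φ are assumed notation and may differ from the citing paper.

```latex
% Regression function in feature space (assumed notation)
f(x) = \langle w, \Phi(x) \rangle + b

% Vapnik's epsilon-insensitive loss: deviations smaller than epsilon cost nothing
L_\varepsilon\bigl(y, f(x)\bigr) = \max\bigl(0,\; |y - f(x)| - \varepsilon\bigr)

% Regularized risk: training error term plus model complexity term
R_{\mathrm{reg}}[f] = C \sum_{i=1}^{N} L_\varepsilon\bigl(y_i, f(x_i)\bigr) + \frac{1}{2}\,\lVert w \rVert^{2}
```

The constant C sets the trade-off between the two terms, which is why the excerpt describes the regularized risk as arising from both training errors and model complexity.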
Article
Full-text available
Increasing photovoltaic (PV) installations could affect the stability of the electrical grid because PV produces weather-dependent electricity. However, predicting the power output of PV panels or the incoming radiation could help to tackle this problem. It has been concluded within the European Actions “Weather Intelligence for Renewable Energies” framework that more research is needed on short-term energy forecasting using different models, locations and data for a complete overview of all possible scenarios around the world representing all possible meteorological conditions. On the other hand, for the Mediterranean region, there is a need for studies that cover a larger spectrum of forecasting algorithms. This study focuses on forecasting short-term global horizontal irradiance (GHI) for Kalkanli, Northern Cyprus, while aiming to contribute to ongoing research on developing prediction models by testing different hybrid forecasting algorithms. Three different hybrid models are proposed using convolutional neural network (CNN), long short-term memory (LSTM) and support vector regression (SVR), and the proposed hybrid models are compared with the stand-alone models, i.e. CNN, LSTM and SVR, for short-term GHI estimation. We present our results with several evaluation metrics and statistical analysis. This is the first time such a study has been conducted for GHI prediction.
... The advantage of MICE is that the results are calculated after a few iterations, and in most cases, five iterations are sufficient [49], [51]. The MICE algorithm procedure for filling multivariate missing data is summarised as follows: Fig. 7. ...
... The MICE algorithm procedure for filling multivariate missing data is summarised as follows: Fig. 7. The procedure of the MICE algorithm is suggested by [49], [51]. ...
Article
Full-text available
Missing values are common in hydrological research, and there is a growing interest in recovering missing streamflow data because accurate information is required for various purposes. Due to missing data limitations, this study aims to evaluate the performance of the RNN-based method compared to the non-RNN based imputation methods to predict recurrence in a streamflow dataset. In this study, daily streamflow datasets from Malaysia's Langat River Basins were used. Following that, the datasets were fed into the Multiple Linear Regression (MLR) model. The validation of the best estimation methods was performed based on the estimation error, using measures such as the Nash-Sutcliffe Efficiency Coefficient (CE), Mean Absolute Percentage Error (MAPE), and Root Mean Squared Error (RMSE). The findings revealed that the RNN-based method coupled with MLR (BRNN-MLR) outperformed all the approaches examined for filling missing values in streamflow datasets, with the highest CE value and the lowest MAPE and RMSE values regardless of the missing data conditions.
... The BPNN algorithm, however, requires a large amount of data, and its prediction results for small samples are not ideal. For small sample predictions, the SVR [13], [14] algorithm performs better. Therefore, in recent years, many scholars have used SVR algorithms to predict time series data and have achieved some results. ...
... The prediction performance is measured by MAPE, WAPE [21] and NMAE [22], which are shown as Eqs. (12), (13) and (14). ...
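The equations referenced in this excerpt are not shown on this page. As a rough illustration, the sketch below implements the usual textbook definitions of MAPE and WAPE; the function names are hypothetical, and NMAE is left out because its normalizing term (mean, range, or capacity) varies between papers and cannot be recovered from the excerpt.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned as a fraction (0.05 means 5%)."""
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def wape(actual, predicted):
    """Weighted absolute percentage error: total absolute error over total absolute actuals."""
    total = sum(abs(a) for a in actual)
    if total == 0:
        raise ValueError("WAPE is undefined when all actual values are zero")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / total


if __name__ == "__main__":
    # Made-up numbers, purely for illustration.
    actual = [100, 200, 400]
    predicted = [110, 180, 400]
    print(mape(actual, predicted), wape(actual, predicted))
```

WAPE weights each error by the size of the actual value, so it is less sensitive than MAPE to large relative errors on small observations.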
... Other approaches to time series prediction include support vector regression. Müller et al. [68] used support vector regression (SVR) for time series forecasting on benchmark problems. Lau et al. [69] implemented SVR for Sunspot time series forecasting with better results than the radial basis function network in relatively long-term prediction. ...
Article
Full-text available
The gross domestic product (GDP) is the most widely used indicator in macroeconomics and the main tool for measuring a country’s economic output. Due to the diversity and complexity of the world economy, a wide range of models have been used, but there are challenges in making decadal GDP forecasts given unexpected changes such as pandemics and wars. Deep learning models are well suited for modelling temporal sequences and time series forecasting. In this paper, we develop a deep learning framework to forecast the GDP growth rate of the world economy over a decade. We use the Penn World Table as the data source featuring 13 countries prior to the COVID-19 pandemic, such as Australia, China, India, and the United States. We present a recursive deep learning framework to predict the GDP growth rate in the next ten years. We test prominent deep learning models and compare their results with traditional econometric models. We predict that most countries will experience economic growth slowdown, stagnation or even recession within five years. We predict that only China, France and India will experience stable, or increasing, GDP growth.
... Numerous studies have explored the use of Support Vector Machines (SVM) to predict air quality and pollutant levels. The work of Muller et al. (1997) indicates that SVR outperforms Artificial Neural Networks (ANN) in performance. Noteworthy contributions in the application of SVM models to air quality prediction include the research by Cao (2003), Wang (2005), and Sotomayor (2013). ...
Article
Full-text available
In recent years, global warming and air pollution have been two important environmental issues that threaten both developed and developing countries. Global warming leads to the gradual melting of polar ice caps, resulting in elevated sea levels that, in turn, trigger floods. This phenomenon also exerts adverse effects on ecosystems and causes significant harm to agriculture and fishing industries. The accurate prediction of air pollutant levels significantly contributes to effective air quality management and the protection of the population from the adverse impacts of pollution. This paper proposes a hybrid regression model based on Regression Tree (RT) and Support Vector Regression (SVR) to improve the prediction accuracy. The proposed hybrid model also overcomes a major disadvantage of RT and has the ability to handle fluctuations and changes in both spatial and temporal dimensions. This paper mainly focuses on the identification, estimation and accurate prediction of pollutant levels and the temperature of Delhi. The results of the hybrid model can be used as a guide to predict the temperature. The SVM time series model is employed to predict levels of significant air pollutants. The results show that SO2, NO2, PM2.5, O3, PM10, and benzene levels are increasing in the upcoming days, which can cause a variety of adverse health outcomes. Further, temperature and CO show a decreasing trend.
... SVRMs attempt to define a hyperplane that would fit the largest possible amount of data points [21]. SVRM has also been widely used for time series forecasting, showing that the algorithm can be adapted to a wide range of time series tasks, in part due to its flexibility of using different kernels which allows it to understand non-linear relationships by mapping the input data to higher dimensions [24], [25]. ...
Article
Full-text available
This analysis aims to provide an overview of potential machine learning algorithms that may aid the aviation industry in predicting future air passenger traffic flow, which can help increase stakeholder value as well as improve customer experiences. A review and discussion of the aviation industry’s past, current, and future challenges is provided, as well as an overview of machine learning algorithms, neural networks, and learning methods. Further, an overview and discussion of the architecture of the Long Short-Term Memory (LSTM) network, Support Vector Regression Machine (SVRM), and Random Forest (RF) algorithms is provided. The comparative analysis provides an overview and comparison of the performance of the LSTM, SVRM, and RF models based on Mean Squared Error (MSE) and Root Mean Squared Error (RMSE). The dataset used includes the hourly number of passengers from scheduled flights at Oslo Airport Gardermoen for the period of January 1, 2009, to December 31, 2019, including the datetime features such as Time (hour), day, month, and year, as well as the weather features of air temperature and mean wind speed, with a total of 96185 samples. The Long Short-Term Memory model exhibited the highest generalization ability, with a performance evaluation on the testing dataset of 0.00445/0.06667 MSE/RMSE. Additionally, the performance of the SVRM and RF models on the testing dataset is 0.00511/0.07147 and 0.00543/0.07368 MSE/RMSE respectively. In addition to the performance, each of the models’ complexity, stability, and ability to predict the hourly and daily fluctuations of passengers are discussed.
... However, since there are so many parameters to adjust and the user has no prior knowledge regarding the importance of the inputs in the investigated situation, it has an overfitting problem [4]. In addition, SVM model is one of the most useful algorithms for data classification [5], regression [6], and prediction [7], which avoids such limitations suffered from ANN models [8]. The success of using SVM to predict stock prices is supported by the strong theoretical underpinnings based on VC-theory [9]. ...
Article
Full-text available
One of the most significant operations in the finance sector is stock trading. The stock market is an essential part of a country’s economy, and rising or falling stock prices serve as indicators of its condition. Therefore, stock price prediction, the attempt to predict the future value of a corporation’s stock or any other financial instrument, will maximize investors’ gains, enhance market confidence, and help government policymakers make economic decisions. In order to forecast the price of a stock, a machine learning approach is constructed in this study. The suggested algorithm includes random forest, support vector machine (SVM), and least square support vector machine (LS-SVM). In particular, the random forest is employed to select the most important features from the technical indicators calculated for stock price prediction. The SVM and the LS-SVM models are employed to predict the daily stock prices. In addition, R-squared (R²), mean squared error (MSE) and mean absolute error (MAE) are used for model evaluation. According to the results, both the SVM and LS-SVM models can predict stock prices well, but neither algorithm is suitable for large datasets, and overfitting occurs. These results can guide further exploration of stock price prediction.
... Regarding our technical manipulations, it can be inferred that the SVR technique generates the smallest errors and the biggest R squared. This technique can be robust and effective for time series prediction, especially for financial and economic series [7,8]. ...
... Support Vector Machine (SVM) Model SVM is a well-known ML approach that is dependent on mathematical learning theory. It is useful for classifying large amounts of data, identifying features, and performing regression analyses [62]. From the datasets (x, y) provided, SVR sought to create functions where x is the input parameter and y is the output parameter of the "IWQIs". ...
Article
Full-text available
Agriculture has significantly aided in meeting the food needs of a growing population. In addition, it has boosted economic development in irrigated regions. In this study, an assessment of the groundwater (GW) quality for agricultural land was carried out in El Kharga Oasis, Western Desert of Egypt. Several irrigation water quality indices (IWQIs) and geographic information systems (GIS) were used for the modeling development. Two machine learning (ML) models (i.e., adaptive neuro-fuzzy inference system (ANFIS) and support vector machine (SVM)) were developed for the prediction of eight IWQIs, including the irrigation water quality index (IWQI), sodium adsorption ratio (SAR), soluble sodium percentage (SSP), potential salinity (PS), residual sodium carbonate index (RSC), and Kelley index (KI). The physicochemical parameters included T°, pH, EC, TDS, K⁺, Na⁺, Mg²⁺, Ca²⁺, Cl⁻, SO₄²⁻, HCO₃⁻, CO₃²⁻, and NO₃⁻, and they were measured in 140 GW wells. The hydrochemical facies of the GW resources were of Ca-Mg-SO₄, mixed Ca-Mg-Cl-SO₄, Na-Cl, Ca-Mg-HCO₃, and mixed Na-Ca-HCO₃ types, which revealed silicate weathering, dissolution of gypsum/calcite/dolomite/halite, rock-water interactions, and reverse ion exchange processes. The IWQI, SAR, KI, and PS showed that the majority of the GW samples were categorized for irrigation purposes into no restriction (67.85%), excellent (100%), good (57.85%), and excellent to good (65.71%), respectively. Moreover, the majority of the selected samples were categorized as excellent to good and safe for irrigation according to the SSP and RSC. The performance of the simulation models was evaluated based on several prediction skill criteria, which revealed that the ANFIS and SVM models were capable of simulating the IWQIs with reasonable accuracy, with coefficients of determination (R²) of 0.99 and 0.97 in training and 0.97 and 0.76 in testing, respectively. The presented models' promising accuracy illustrates their potential for use in IWQI prediction. The findings indicate the potential for ML methods, such as ANFIS and SVM, applied to geographically dispersed hydrogeochemical data, to be used for assessing the GW quality for irrigation. The proposed methodological approach offers a useful tool for identifying the crucial hydrogeochemical components for GW evolution assessment and mitigation measures related to GW management in arid and semi-arid environments. (Water 2023, 15, 694. https://doi.org/10.3390/w15040694)
... In order to evaluate the effectiveness of our model on the task of key positions prediction, we compared the MF-GCN-LSTM with SVM [35,36], the Static GCN [12] and the Time Series LSTM model [32]. Table 2 represents the feature extraction ability of different models for projects in smart grid. ...
(11) precision = tp/(tp + fp); recall = tp/(tp + fn); F1 = 2 * (precision * recall)/(precision + recall)
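The precision, recall and F1 definitions quoted in this excerpt translate directly into code. The sketch below is only an illustration: the function name is hypothetical, and returning 0.0 when a denominator is zero is an assumed convention that the excerpt does not specify.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from true positive, false positive and false negative counts."""
    # Assumed convention: a ratio with a zero denominator is reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    denom = precision + recall
    f1 = 2 * (precision * recall) / denom if denom > 0 else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Made-up counts, purely for illustration.
    print(precision_recall_f1(tp=8, fp=2, fn=8))
```

Because F1 is the harmonic mean of precision and recall, it stays low whenever either of the two is low, which makes it a stricter summary than their arithmetic mean.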
Article
Full-text available
In this article, we solve the key positions prediction problem of engineering projects in smart grid, which pays more attention to the spatial-temporal distribution of projects. Many studies show that the projects are affected by multi-dimensional features such as time, space, correlation etc. However, few work can accurately predict the key positions of projects based on multi-dimensional features. In order to solve this problem, we propose the idea of multi-feature extraction, and make use of the real-world records trace to conduct multi-dimensional modeling. Then we introduce a multi-dimensional features extraction model: Multi-Feature-based GCN-LSTM (MF-GCN-LSTM) to take the effect of time, space and correlation for predicting the key positions of projects. Experiments on different datasets with various project types have proved that our model can complete the key positions prediction task efficiently. Compared with the other traditional method and non-linear models, our model shows higher prediction accuracy and robustness. Moreover, we show that the whole prediction framework MF-GCN-LSTM can be split and deployed in a distributed manner to accelerate the inference of the model under the cloud edge system.
... Comprehensive training on SVM classifiers is provided by Burges (1998). Also in regression applications and time series prediction, very good performances were achieved quickly (Mattera et al., 1999;Müller et al., 1997;Stitson et al., 1999). SVM has now become an active research topic (Cherkassky and Moliere, 1998;Haykin, 1998;Hearst et al., 1998). ...
Chapter
One of the most useful research fields with many real-life applications, such as in water science, is data mining. Data mining (DM) is considered a process to extract valuable data from a wide range of information stored in various databases. The data is categorized into patterns, associations, changes, anomalies and significant structures. In water resources management and environmental engineering, predicting and modelling parameters play an integral role in decision making. Rivers, the most critical freshwater resource for millions of people worldwide, have a dynamic nature (floods/droughts) in terms of available freshwater quantity and quality. With various basin characteristics, river flow and sediment regime may be influenced by natural processes such as erosion and sediment transport as well as anthropogenic factors such as urban stormwater runoff and semi-treated sanitary/industrial sewage discharge. Therefore, artificial intelligence (AI) techniques are used to decrease model development costs and reduce prediction errors, achieving more efficient models. In this chapter, some well-known techniques and AI-based methods are introduced, and their applications are elaborated. The models comprise extreme learning machine (ELM), least square support vector machine (LSSVM), genetic programming (GP), adaptive neural-fuzzy inference system (ANFIS), and multivariate adaptive regression spline (MARS). Each technique is then illustrated with a brief literature review. After its basic concept is presented, each method is given a mathematical statement. In the last part, pseudocode for the methods is provided as a guideline for implementing them. This chapter is intended for graduate students, researchers, educators, and practitioners interested in engineering optimization.
... Comprehensive training on SVM classifiers is provided by Burges (1998). Also in regression applications and time series prediction, very good performances were achieved quickly (Drucker et al., 1997;Mattera et al., 1999;Müller et al., 1997;Stitson et al., 1999). SVM has now become an active research topic (Cherkassky and Moliere, 1998;Haykin, 1998;Hearst et al., 1998). ...
Chapter
Another supervised learning method is the support vector machine (SVM) method. This method, like the artificial neural network method, is one of the basic data methods that can classify or predict data after the training process. This theory was later used as a powerful tool for classifying data in various sciences especially in water and environmental science. In this chapter, before examining and studying the details of the SVM model, first the basics and basic concepts of classification and then the SVM model are stated. After that, the details and relations of the ruler and the types of functions used in it are discussed. At the end of this chapter, various software applications for this tool will be introduced, and how some of the work will be described.
... It was originally developed by Cortes and Vapnik (1995) for the object recognition tasks. Later it was expanded for regression applications for its excellent performance (Müller et al. 1997;Mattera and Haykin 1999). SVM uses street like boundaries that separate or regress data points. ...
Article
Full-text available
Permeability is the most important petrophysical attribute for analyzing fluid flow behavior. So far, no universal approach can provide an accurate and reliable estimation of permeability for an entire hydrocarbon reservoir. The present study utilizes five empirical, three statistical, and three connectionist methods to estimate the permeability of a heterogeneous oil reservoir. The empirical models include ‘Tixier’, ‘Morris and Biggs’, ‘Timur’, ‘Coates and Dumanoir’, and ‘Coates and Denoo’. The statistical methods incorporate ‘multiple variable regression (MVR)’, ‘gaussian process regression (GPR)’, and ‘bagged tree (BT)’. The connectionist techniques are ‘support vector machine (SVM)’, ‘convolutional neural network (CNN)’, and ‘feed-forward backpropagation artificial neural network (ANN) with training algorithms Levenberg–Marquardt (LM), Bayesian Regularization (BR), and Scaled Conjugate Gradient (SCG)’. The prediction efficiency of the studied methods is compared using six statistical indexes: regression coefficient, mean squared error, root mean squared error, average absolute error percentage, minimum absolute error percentage, and maximum absolute error percentage. Ranking of the log variables based on their importance in permeability modeling has been performed. To achieve the objectives, 439 data points comprising laboratory derived core permeability information and seven well log parameters, namely gamma ray (GR), bulk density (RHOB), sonic travel time (DT), true resistivity (LLD), neutron porosity (φN), NMR porosity, and bulk volume of irreducible fluid, are selected from a Jeanne d’Arc Basin reservoir. All these methods are tested on different data sets of study wells to confirm the reproducibility of the results. The results of the statistical indexes analysis imply that the empirical relationships are inappropriate for a heterogeneous reservoir as they provide a poor match with real data.
However, the ‘Coates and Dumanoir’ model provides a relatively better match with core permeability among all five empirical approaches. The MVR is less efficient among statistical models, BT is reasonably efficient, and GPR is highly efficient. Amidst soft computing techniques, SVM, ANN with LM, and ANN with BR show very high efficiency, whereas ANN with SCG is moderately acceptable, and CNN provides extremely poor efficiency. A comprehensive comparison among all studied models shows that the best predictor is ANN with BR as it provides an excellent match between predicted and real data, and it requires only 14.9 s to process the data. Both statistical and connectionist methods imply that GR and φN are the most vital log parameters in permeability modeling, whereas DT, RHOB, and LLD are the least important predictor variables. The outcomes of this study will help engineers and researchers to apply an accurate permeability prediction tool in petroleum industries during the exploration phase to obtain accurate reservoir permeability data, correct analysis of fluid flow behavior, accurate reservoir characterization, and reduced uncertainty associated with a reservoir evaluation. Article highlights: Empirical models are incapable of predicting permeability accurately. SVM, ANN with LM, and ANN with BR are the most efficient predictive methods. An accurate and cost-effective permeability prediction strategy is achieved.
... Support Vector Regression can be understood as a generalization of Support Vector Machines for regression tasks. The good performance of Support Vector Regression on predicting time series data, albeit mostly applied on micro financial time series (see for example Müller et al. (1997) and Crone, Hibon, and Nikolopoulos (2011)), makes it an interesting candidate for an application on macroeconomic time series data. Smola and Schölkopf (2004) give an excellent overview of the technical details related to the estimation of SVR algorithms. ...
Preprint
Full-text available
Macroeconomic forecasters often show a poor track record in producing reliable predictions. Especially times of economic crises have been consistently missed by traditional forecasting models, which is why the global financial crisis of 2007-2009 caught many people by surprise. Theory-based models and pure time series formulations, both traditionally used in macroeconomic forecasting, suffer the curse of dimensionality when being confronted with a high dimensional space of possible predictors. In times of increasing data availability, this opens the way to rethink macroeconomic forecasting. This paper analyzes the application of machine learning methods in forecasting real GDP growth. Machine learning poses a natural extension to the more traditional models because it is designed to extract information from high dimensional feature spaces. Moreover, machine learning methods are nonlinear by construction which allows to capture nonlinear relations typically encountered among macroeconomic variables in recessions. Given the failure of existing forecasting models and the advantages of machine learning, the goal of this paper is to assess to which extent machine learning algorithms may contribute to the challenge of macroeconomic forecasting.
... SVM models are closely related to neural networks. A detailed principles and algorithms of SVM can be found in Müller et al. (1997). The basic idea is to map the data x into a high dimensional feature space via a nonlinear mapping π and to do linear regression in this space (Boser et al. 1992;Vapnik 1995). ...
Article
Full-text available
Evaporation, which is one of the most important components of the hydrological cycle, is affected by many dynamic factors. Due to its complex structure, it is a difficult parameter to predict. In this study, evaporation was estimated using support vector machines. Different input combinations of meteorological data, including maximum (Max. Temp) and minimum (Min. Temp) air temperature, relative humidity (RH), wind speed (WS) and sunshine hours (SH), were used to estimate evaporation (Evap). Support vector regression models with different kernel functions were tried, and their performance was evaluated using Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Mean Square Error (MSE) and the coefficient of determination (R²). According to the performance criteria, the most successful model for evaporation estimation was Model ε-SVR M-1 with the radial basis kernel function (R² = 0.85).
... Hence, SVM algorithm abilities depend on suitable kernel functions. SVM is being widely used because of its robust performance in classification, (Pal and Mather 2005), and regression (Burges 1998) to obtain least regression error, particularly in applications like time-series predictions (Müller et al. 1997). By using Sentinel-2 image data, the robustness of RF, SVM, and kNN algorithms was compared in LULC classification, and reported that SVM algorithm produced higher overall accuracy with minimum sensitivity to the given training datasets, followed by RF, and kNN algorithms (Noi and Kappas 2018). ...
Chapter
Numerous technological advancements have assisted to secure the vital key for scientific predictions in various fields, and soil science is no exception. Soil has always been chosen as an indispensable component by scientists, environmentalists, and policy makers for shaping a sustainable present and a secure future. Evidently, a huge amount of soil spatial data is required in the process to attain the desired outcomes. However, the challenges posed by the paucity of time and resources pose the hurdle in the collection and analysis of soil information. State-of-art technologies like Machine Learning (ML) and Big Data come as saviors to address those challenges. ML is helping to quantify, predict, identify, and classify the soil resources. The advanced algorithms, and models helped to gain better insights into soil mapping along with widening the perspective for its better management. Digital Soil Mapping (DSM), ML integrated with spectroscopic soil studies is gaining momentum among the scientific communities. These innovative approaches have the capabilities to solve global issues like desertification, ecological stability, carbon pool management and climate mitigation in a holistic and integrated way by keeping the soil as one of the key parameters. In this chapter, an attempt has been made to present a comprehensive overview of ML algorithms, which have been adopted by many researchers across the globe in prediction of various soil properties and presented a case study on digital soil mapping by using ML algorithms.
... I Nyoman Setiawan et al. / Procedia Computer Science 179 (2021)[17][18][19][20][21][22][23][24] ...
Article
Full-text available
Support Vector Regression (SVR) is often used in forecasting. Adjustment of parameters in the SVR affects the results of forecasting. This study aims to analyze the SVR method that is optimized using Harris Hawks Optimization (HHO), hereinafter referred to as HHO-SVR. The HHO-SVR was evaluated using five benchmark datasets to determine the performance of this method. The HHO process is also compared based on the type of kernel and other metaheuristic algorithms. The results showed that the HHO-SVR has almost the same performance as other methods but is less efficient in terms of time. In addition, the type of kernel also affects the process and results.
... I Nyoman Setiawan et al. / Procedia Computer Science 179 (2021)[17][18][19][20][21][22][23][24] ...
Conference Paper
Full-text available
Support Vector Regression (SVR) is often used in forecasting. Adjustment of parameters in the SVR affects the results of forecasting. This study aims to analyze the SVR method that is optimized using Harris Hawks Optimization (HHO), hereinafter referred to as HHO-SVR. The HHO-SVR was evaluated using five benchmark datasets to determine the performance of this method. The HHO process is also compared based on the type of kernel and other metaheuristic algorithms. The results showed that the HHO-SVR has almost the same performance as other methods but is less efficient in terms of time. In addition, the type of kernel also affects the process and results.
... The defined ε-insensitive loss function allows data to remain within the margin by tolerating data with an error value in the range [ − ε, ε] in the regression model (Girma 2009). The ε-insensitive loss function is expressed mathematically as follows in Eq. 3 (Müller et al. 1997). ...
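The tolerance band described in this excerpt can be illustrated numerically. The sketch below uses the standard definition of the ε-insensitive loss, max(0, |y − f(x)| − ε); the function name and the default ε are hypothetical, and Eq. 3 of the citing paper is not reproduced here, so its notation may differ.

```python
def eps_insensitive_loss(y_true, y_pred, eps=0.1):
    """Return the epsilon-insensitive loss for one observation.

    Errors inside the band [-eps, eps] are tolerated and cost nothing;
    larger errors are penalized linearly by the amount they exceed eps.
    """
    return max(0.0, abs(y_true - y_pred) - eps)


if __name__ == "__main__":
    # Made-up values, purely for illustration.
    print(eps_insensitive_loss(1.0, 1.05))  # error of 0.05 lies inside the band
    print(eps_insensitive_loss(1.0, 1.5))   # error of 0.5 exceeds the band
```

Points whose error stays inside the band do not contribute to the training error, which is what allows SVR solutions to depend on only a subset of the training data.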
Article
Full-text available
Demand forecasts are used as input to planning activities and play an important role in the management of fundamental operations. Accurate demand forecasting is important for many organizations because it provides information for each stage of inventory management. In this study, multiple linear regression analysis, multiple nonlinear regression analysis, artificial neural networks and support vector regression were applied in a production facility that produces spare parts for construction machinery. The aim of the study is to forecast, as accurately as possible, the number of spare parts that customers will request in the future period. The input variables of the developed models are the past years' sales of the manifold product group, which is one of the important spare parts of construction machinery, the number of construction machines sold in the world, the USD exchange rate and the monthly impact rate. The inputs of the model are designed according to the construction machinery sector. In the model, the monthly impact rate enables a more robust model. In addition, systematic parameter design of the artificial intelligence methods gives the estimation results high accuracy. Nine years of data (from 2010 to 2018) were used in the application. Demand forecasts were made for 2018 and compared with the actual values. Artificial neural networks and support vector regression produced better forecasts than the regression methods. In addition, support vector regression produced better results than the artificial neural network.
... Typically, this is because of the principle of structural risk minimization used in SVMs, which has greater generalization potential and is superior to the principle of empirical risk minimization adopted by conventional neural networks. SVMs have been successfully applied to different time series prediction problems, such as production value forecasting in the machinery industry [7], engine reliability prediction [8], and financial time series prediction [9], [10]. The successful use of SVMs in time series forecasting motivates our research work. ...
Article
Full-text available
This paper presents a study of supply chain operation data covering 100 different store items from 10 stores, with a 5-year sales history taken from an open-source contest, to compare the performance of time-series forecasting models, mainly decomposition, Auto-Regressive Integrated Moving Average (ARIMA), Prophet, and the Box-Cox transformation. The data, collected from 2013 to 2018, come from real-time transactions at different stores. The models were first fitted on the 2013 to 2017 data and used to predict 2018, and the predictions were then cross-checked against the actual 2018 data. To improve the performance and evaluation of the supply chain management system, three metrics are examined to support model selection, with a focus on the accuracy of the machine learning models in forecasting future sales of the supply chain stores. Although the comparison indicates that no single method gives consistently superior results, the present study indicates that a hybrid Prophet and ARIMA model gives better results than the individual models.
... Recently, Support Vector Machines (SVM), a novel neural network algorithm developed by Vapnik and his colleagues, have become a focus of research worldwide [14,16]. The SVM method, first suggested by Vapnik, has recently been used in a range of applications such as data mining, classification, regression, and time series forecasting [10,7,13]. The SVM has become a topic of intensive study due to its successful application in classification tasks [17,3] and regression tasks [8,18], especially time series prediction [2]. ...
Article
Full-text available
Support Vector Machines (SVM) have been a novel research field in scientific research on forecasting. This study deals with the application of SVM to financial time series prediction. This paper proposes a model of stock market prediction based on SVMs with appropriate parameter values. A data set of daily closing prices of five selected companies of the Dhaka Stock Exchange (DSE), namely Alhaj Textiles Limited, Apex Tannery Limited, Jamuna Bank Limited, Padma Oil Company, and Square Pharmaceuticals Limited, from 01 January 2017 to 13 August 2019 was selected and used to train the model and check its predictive power. The results show that the closing stock prices of all the companies are non-stationary. In addition, the number of support vectors and the mean square error decrease as the kernel parameter increases. The original and predicted data are also found to be very close. The results show that in all cases the SVM model has some predictive power and can be used to forecast financial time series. Several methods, such as SVM, ARIMA, single exponential smoothing, and double exponential smoothing, were applied to predict Bangladesh's stock market. The outcome shows that the Support Vector Machine is the most efficient method because it has the lowest forecasting errors.
... For the implementation of an LfD policy based on SVR modelling, we used Vapnik's ε-insensitive time-series [48], where the aim is to identify a policy Π 0 by minimization of an ε-function. In our case, this function is defined as ...
Article
Research on robotic manipulation of fragile, compliant objects, such as food items, is gaining traction due to its game-changing potential within the food production and retailing sectors, currently characterized by manually-intensive and highly repetitive tasks. Food products exhibit high levels of frailness, biological variation, and complex 3D shapes and textures. For these reasons, introducing greater levels of robotic automation in the food and agricultural sectors remains an important challenge. This paper addresses this challenge by developing a human-centred, haptic-based, Learning from Demonstration (LfD) policy that enables pre-trained autonomous grasping of food items using an anthropomorphic robotic system. The policy combines data from teleoperation and direct human manipulation of objects, embodying human intent and interaction areas of significance. We evaluated the proposed solution against a recent state-of-the-art LfD policy as well as against two standard impedance controller techniques. Results show that the proposed policy performs significantly better than the other considered techniques, leading to high grasping success rates while guaranteeing the integrity of the food at hand.
... The SVM has shown competitive generalization ability over many existing machine learning models in a number of fields, e.g. optical character recognition (OCR), object recognition, time series prediction, etc. [13], [18], [19], [20], [21]. The Support Vector Regression (SVR) is a powerful regression approach and successfully applied in numerous applications [22], [23], [24], [25]. ...
... Support Vector Machines (SVM) are a new kind of neural network algorithm proposed by Vapnik (Cortes et al., 1995; Vapnik et al., 1999; Müller et al., 1997). Based on statistical learning theory, SVM has shown excellent prediction performance when applied to classification problems. ...
Article
Full-text available
The photovoltaic (PV) system is always operated at the maximum power point (MPP) condition irrespective of the fluctuations in PV voltage. The maximum power point tracking (MPPT) employed in PV systems is not effective in the presence of current ripple, as normal tracking becomes increasingly complex during fluctuations in solar irradiation or changes in the MPP condition. This paper proposes a high-efficiency power point tracking algorithm to minimize the current ripple and power oscillation around the maximum power point. The developed algorithm is based on the particle swarm optimization-support vector regression (PSO-SVR) technique. The proposed algorithm is implemented to select and tune the Support Vector Regression (SVR) parameters such as kernel parameters, variance, and the penalty factor for predicting the irradiation level as well as to determine the PV voltage corresponding to the maximum power point. The PSO method is used to accelerate the process of optimizing the SVR parameters under different conditions and to locate the corresponding global optimum. From the experimental results, the efficiency of maximum power point tracking is found to be 99.8%. The proposed PSO-SVR algorithm shows better performance than SVR alone. The stability and accuracy of MPPT have been validated during rapid fluctuation of solar irradiation in the range of 25% to 100%.
... I Nyoman Setiawan et al. / Procedia Computer Science 179 (2021) 17–24 ...
Conference Paper
Full-text available
Support Vector Regression (SVR) is often used in forecasting. Adjustment of parameters in the SVR affects the results of forecasting. This study aims to analyze the SVR method that is optimized using Harris Hawks Optimization (HHO), hereinafter referred to as HHO-SVR. The HHO-SVR was evaluated using five benchmark datasets to determine the performance of this method. The HHO process is also compared based on the type of kernel and other metaheuristic algorithms. The results showed that the HHO-SVR has almost the same performance as other methods but is less efficient in terms of time. In addition, the type of kernel also affects the process and results.
Article
Full-text available
Time series forecasting is crucial in various domains, ranging from finance and economics to weather prediction and supply chain management. Traditional statistical methods and machine learning models have been widely used for this task. However, they often face limitations in capturing complex temporal dependencies and handling multivariate time series data. In recent years, deep learning models have emerged as a promising solution for overcoming these limitations. This paper investigates how deep learning, specifically hybrid models, can enhance time series forecasting and address the shortcomings of traditional approaches. This dual capability handles intricate variable interdependencies and non-stationarities in multivariate forecasting. Our results show that the hybrid models achieved lower error rates and higher R² values, signifying their superior predictive performance and generalization capabilities. These architectures effectively extract spatial features and temporal dynamics in multivariate time series by combining convolutional and recurrent modules. This study evaluates deep learning models, specifically hybrid architectures, for multivariate time series forecasting. On two real-world datasets - Traffic Volume and Air Quality - the TCN-BiLSTM model achieved the best overall performance. For Traffic Volume, the TCN-BiLSTM model achieved an R² score of 0.976, and for Air Quality, it reached an R² score of 0.94. These results highlight the model’s effectiveness in leveraging the strengths of Temporal Convolutional Networks (TCNs) for capturing multi-scale temporal patterns and Bidirectional Long Short-Term Memory (BiLSTMs) for retaining contextual information, thereby enhancing the accuracy of time series forecasting.
Article
Full-text available
The recent global warming effect has brought into focus different solutions for combating climate change. The generation of climate-friendly renewable energy alternatives has been vastly improved and commercialized for power generation. As a result of this industrial revolution, solar photovoltaic (PV) systems have drawn much attention as a power generation source for varying applications, including the main utility-grid power supply. There has been tremendous growth in both on- and off-grid solar PV installations in the last few years. This trend is expected to continue over the next few years as government legislation and awareness campaigns increase to encourage a shift toward using renewable energy alternatives. Despite the numerous advantages of solar PV power generation, the highly variable nature of the sun’s irradiance in different seasons of various geopolitical areas/regions can significantly affect the expected energy yield. This variation directly impacts the profitability or economic viability of the system, and cannot be neglected. To overcome this challenge, various procedures have been applied to forecast the generated solar PV energy. This study provides a comprehensive and systematic review of recent advances in solar PV power forecasting techniques with a focus on data-driven procedures. It critically analyzes recent studies on solar PV power forecasting to highlight the strengths and weaknesses of the techniques or models implemented. The clarity provided will form a basis for higher accuracy in future models and applications.
Article
Participation of energy storage in wind power dispatch can improve the scheduling reliability of grid-accessed power, but its effectiveness depends on energy storage capacity and feasible energy management. First, a daily economic dispatch model is proposed that takes into account scheduling reliability and the working characteristics of energy storage. Second, the Time–Sequence Rolling Optimal Ultra-short-term scheduling algorithm for energy storage is developed based on dynamic deviation estimation updates. To deal with problems caused by unforeseen prediction errors of wind power or energy storage SOC (state-of-charge), a relaxation factor for the allowable grid-accessed power deviation range is also introduced in the scheduling algorithm to ensure the reliability of the established energy storage capacity. Finally, GA (Genetic Algorithm) and EMD (Empirical Mode Decomposition) are used to set the reference values for hybrid energy storage power distribution. The feasibility of the dispatch model and energy management strategy is verified on a wind/storage simulation platform.
Article
Most typical statistical and machine learning approaches to time series modeling optimize a single-step prediction error. In multiple-step simulation, the learned model is iteratively applied, feeding through the previous output as its new input. Any such predictor however, inevitably introduces errors, and these compounding errors change the input distribution for future prediction steps, breaking the train-test i.i.d assumption common in supervised learning. We present an approach that reuses training data to make a no-regret learner robust to errors made during multi-step prediction. Our insight is to formulate the problem as imitation learning; the training data serves as a "demonstrator" by providing corrections for the errors made during multi-step prediction. By this reduction of multi-step time series prediction to imitation learning, we establish theoretically a strong performance guarantee on the relation between training error and the multi-step prediction error. We present experimental results of our method, DaD, and show significant improvement over the traditional approach in two notably different domains, dynamic system modeling and video texture prediction.
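The iterated multi-step simulation described in this abstract, where a one-step model is applied repeatedly and its own output is fed back as input, can be sketched as follows. This shows only the traditional baseline procedure, not the DaD method itself; the function names, the window size, and the toy one-step model are assumptions made for illustration.

```python
def iterated_forecast(one_step_model, history, horizon, window):
    """Roll a one-step predictor forward, feeding each output back as input."""
    buffer = list(history[-window:])
    predictions = []
    for _ in range(horizon):
        next_value = one_step_model(buffer)
        predictions.append(next_value)
        # The prediction, not an observed value, becomes the newest input,
        # so any one-step error is carried into all later steps.
        buffer = buffer[1:] + [next_value]
    return predictions


# Hypothetical one-step model: the mean of the current input window.
def window_mean(window_values):
    return sum(window_values) / len(window_values)


forecast = iterated_forecast(window_mean, [1.0, 2.0, 3.0, 5.0], horizon=3, window=2)
```

After the first step the model sees inputs it generated itself, which is the shift in input distribution that the abstract identifies as breaking the train-test i.i.d. assumption.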
Article
Full-text available
Accurate prediction of interfacial friction factor is critical for calculation of pressure drop and investigation of flow mechanism of vertical annular two-phase flows. Theoretical models of interfacial friction factor based on physical insight have been developed; however, these are inconvenient in engineering practice as too many parameters need to be measured. Although many researchers have proposed various empirical correlations to improve computation efficiency, there is no generally accepted simple formula. In this study, an efficient prediction model based on support vector regression machine (SVR) is proposed. Through sensitivity analysis, five factors are determined as the input parameters to train the SVR model, relative liquid film thickness, liquid Reynolds number, gas Reynolds number, liquid Froude number and gas Froude number. The interfacial friction factor is chosen as the output parameter to check the overall performance of the model. With the help of particle swarm algorithm, the optimization process is accelerated considerably, and the optimal model is obtained through iterations. Compared with other correlations, the optimal model shows the lowest average absolute error (AAE of 0.0004), lowest maximum absolute error (MAE of 0.006), lowest root mean square error (RMSE of 0.00076) and highest correlation factor (r of 0.995). The analysis using various data in the literature demonstrates its accuracy and stability in interfacial friction prediction. In summary, the proposed machine learning model is effective and can be applied to a wider range of conditions for vertical annular two-phase flows.
Article
The attribution of climate change and human activities has been extensively discussed over the past few decades, particularly in urbanized areas. However, the relationships among different factors may not be well explained by traditional statistical methods. In this study, we took one of the highly urbanized regions, Central Taihu Basin, as an example. Linear regression (LR), random forest (RF) and support vector machine (SVM) were used for regression, and the attributions of climate change and human activities to water level alterations at different scales from 1961 to 2018 were quantified by residual analysis. The regression results indicated that SVM performed best. Water level at each scale showed an increasing trend, and human activities were the dominant influence. The altered period was further divided into three sub-periods, and human activities contributed the most in sub-period II (2000–2009). Finally, the importance of thirty-eight factors was quantified by RF based on daily data series from 2008 to 2018. The results showed that cumulative antecedent precipitation (CAP) of five days was one of the important climate factors and daily maximum discharge of sluices was the important human activity factor. The methods and results of this study can help to provide support in flood control.
Article
Full-text available
The precise prediction of the streamflow of reservoirs is of considerable importance for many activities relating to water resource management, such as reservoir operation and flood and drought control and protection. This study aimed to develop and evaluate the applicability of a hidden Markov model (HMM) and two hybrid models, i.e., the support vector machine-genetic algorithm (SVM-GA) and artificial neural fuzzy inference system-genetic algorithm (ANFIS-GA), for reservoir inflow forecasting at the King Fahd dam, Saudi Arabia. The results obtained by the HMM model were compared with those for the two hybrid models ANFIS-GA and SVM-GA, and with those for individual SVM and ANFIS models based on performance evaluation indicators and visual inspection. The results of the comparison revealed that the ANFIS-GA model and ANFIS model provided superior results for forecasting monthly inflow with satisfactory accuracy in both training (R² = 0.924, 0.857) and testing (R² = 0.842, 0.810) models. The performance evaluation results for the developed models showed that the GA-induced improvement in the ANFIS and SVR forecasts was matched by an approximately 25% decrease in RMSE and around a 13% increase in Nash–Sutcliffe efficiency. The promising accuracy of the proposed models demonstrates their potential for applications in monthly inflow forecasting in the present semiarid region.
Chapter
In recent years, Machine Learning (ML) algorithms have gained much attention and found profound importance in the processing, classification and analysis of multispectral and hyperspectral remotely sensed data. The core objectives of this chapter are, first, to provide a critical review of important advanced ML algorithms in remote sensing data classification and analysis and, second, to examine the performance of widely used supervised ML algorithms, namely Random Forest (RF), Support Vector Machine (SVM), and Classification and Regression Tree (CART), in satellite image classification and analysis on the Google Earth Engine (GEE) platform to derive distinct Land Use/Land Cover (LULC) classes. ML algorithms are being extensively used in optical remote sensing data analysis. They include image classification algorithms to precisely allocate objects to a distinct set of known classes, clustering algorithms to group objects into classes based on a given set of input variables, regression algorithms to forecast a response variable from a given set of covariates, and dimensionality reduction algorithms to build a small set of new variables that includes most of the information available in the input set of numerous variables. In the study, among the three tested supervised ML algorithms in LULC classification, the CART algorithm shows relatively better performance than the RF and SVM algorithms. The study concludes that advanced ML algorithms have immense potential in optical remote sensing data classification and analysis to attain higher classification accuracy.
Conference Paper
The unprecedented rise in the number of new coronavirus infections worldwide has prompted many researchers to use mathematical and machine-learning-based prediction models to predict future epidemic patterns that will help governments, health service providers, and society understand how to deal with this situation. Using different machine learning methodologies helps researchers to understand the trend curve clearly. These may lead to a better and more effective fight against the epidemic and reduce or end preventive measures, allowing people to return to their everyday lives. This study is based on an analysis of COVID-19 data from KSA. It also demonstrates the prediction of new confirmed cases and deaths from COVID-19 in the ten days following 8th July in KSA, which is the period of the Hajj in 2021. It uses machine learning models such as Support Vector Machine (SVM), Bayesian Ridge (BR), Linear Regression (LR), and Moving Average (MA). Each model provides two types of predictions: the number of newly infected cases and deaths over the next 10 days. The results indicate that SVM and MA forecasts have high accuracy, followed by LR, which performs well. BR performs poorly in forecasting new confirmed cases with the available data set. All models were accurate in predicting mortality, with the best performing model being SVM, followed by MA, LR, and BR. The SVM model scenario also predicts an increase in cumulative confirmed cases to 511,257 on 17th July from the actual 496,516 on 7th July. The cumulative number of deaths is predicted to rise to 8,113 on 17th July from the actual 7,921 on 7th July.
Article
Time series forecasting involves collecting and analyzing past observations to develop a model to extrapolate such observations into the future. Forecasting of future events is important in many fields to support decision making as it contributes to reducing future uncertainty. We propose the explainable boosted linear regression (EBLR) algorithm for time series forecasting, which is an iterative method that starts with a base model and explains the model's errors through regression trees. At each iteration, the path leading to the highest error is added as a new variable to the base model. In this regard, our approach can be considered an improvement over general time series models since it enables incorporating nonlinear features by residual explanation. More importantly, use of the single rule that contributes most to the error enables access to interpretable results. The proposed approach extends to probabilistic forecasting through generating prediction intervals based on the empirical error distribution. We conduct a detailed numerical study with EBLR and compare against various other approaches. We observe that EBLR substantially improves the base model performance through extracted features, and provides comparable performance to other well-established approaches. The interpretability of the model predictions and the high predictive accuracy of EBLR make it a promising method for time series forecasting.
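The idea of building prediction intervals from an empirical error distribution, mentioned in this abstract, can be illustrated with a minimal sketch. It is not the EBLR procedure: it simply shifts a point forecast by quantiles of past residuals, using a rounded-rank quantile rule that is one of several possible conventions. All names and sample values are hypothetical.

```python
def empirical_interval(point_forecast, residuals, coverage=0.8):
    """Shift a point forecast by empirical quantiles of past residuals."""
    ordered = sorted(residuals)
    n = len(ordered)
    alpha = (1.0 - coverage) / 2.0
    # Rounded-rank quantile positions, symmetric around the middle of the sample.
    lower_index = round(alpha * (n - 1))
    upper_index = (n - 1) - lower_index
    return point_forecast + ordered[lower_index], point_forecast + ordered[upper_index]


# Hypothetical residuals (actual minus predicted) from earlier forecasts.
past_residuals = [-2.0, -1.0, -0.5, 0.0, 0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0]
lower, upper = empirical_interval(10.0, past_residuals, coverage=0.8)
```

Because the residuals here are skewed toward positive errors, the resulting interval is asymmetric around the point forecast, which a Gaussian interval would not capture.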
Article
Full-text available
Machine learning models are poised to make a transformative impact on chemical sciences by dramatically accelerating computational algorithms and amplifying insights available from computational chemistry methods. However, achieving this requires a confluence and coaction of expertise in computer science and physical sciences. This Review is written for new and experienced researchers working at the intersection of both fields. We first provide concise tutorials of computational chemistry and machine learning methods, showing how insights involving both can be achieved. We follow with a critical review of noteworthy applications that demonstrate how computational chemistry and machine learning can be used together to provide insightful (and useful) predictions in molecular and materials modeling, retrosyntheses, catalysis, and drug design.
Article
Full-text available
Based on resource carrying capacity, this study used the revised theory of relative resource carrying capacity (RRCC) and introduced an innovative concept of relative fossil energy carrying capacity (RFECC), which evaluates the degree of fossil energy sustainability based on the relationship between economy, population, and environment. This study took China and the United States as the study objects, took the whole country as the reference area, and calculated the RFECC of population, economic, and environmental resources from 2000 to 2018. Therefore, based on the comparative analysis, the following conclusions were drawn: (i) there is a big difference in the RFECC between China and the United States, which is manifested in the inverted U-shaped trend in China and the U-shaped trend in the United States; (ii) the relative fossil energy carrying states in China and the United States are different, mainly reflected in the economy and environment; (iii) the gap in RFECC between China and the United States has gradually widened; in general, China’s economic RFECC is better than that of the United States, while environmental RFECC and population RFECC in the United States is better than that of China; and (iv) coal and oil should be used as a breakthrough point for the sustainable fossil energy and sustainable development for China and the United States, respectively.
Article
Transformer Vibration Technique is considered an effective method to monitor structural elements of transformers, in particular, to detect loose or deformed windings. As it is well known, vibrations vary with the sensor location on the transformer tank, which makes the number and the placement of sensors critical aspects for fault detection. In this paper, we investigate this issue by analyzing vibration spectra collected from various sensors installed on the tank of a typical oil filled power transformer operating under two limit cases, namely absence or presence of clamping looseness on windings. Support Vector Machines (SVM) are employed and an extensive analysis is performed to understand the informativeness of data corresponding to various sensors so as to figure out the appropriate number of sensors and their best location. This way fault detection is eventually achieved with a reduced and optimized number of sensors, resulting in a significant saving of time and costs.
Article
Government subsidies for energy storage and renewable generation have brought down the cost of energy storage in recent years. This has motivated people to deploy behind-the-meter energy storage units to reduce their monthly electricity bill. For optimal control of the battery to incorporate maximum photovoltaic energy generation as well as demand charge reduction, data-driven and advanced Battery Energy Storage System (BESS) control strategies are required. This paper explores different use cases where customers could deploy energy storage systems for demand charge reduction, as well as cases where customers could deploy energy storage systems for demand charge reduction while satisfying a utility-set objective. From historical load and PV data, different use cases are simulated using a Model Predictive Control (MPC) based BESS control model. MPC requires machine-learning (ML) based forecasts of photovoltaic (PV) generation as well as load as inputs. A sensitivity analysis of the effect of different energy forecasts on the performance of MPC is presented in the paper. A degradation analysis as a function of charge/discharge cycles is also presented to evaluate the trade-off between economic objectives and battery health.
Article
Full-text available
Solar energy constitutes an effective supplement to traditional energy sources. However, photovoltaic power generation (PVPG) is strongly weather-dependent, and thus highly intermittent. High-precision forecasting of PVPG forms the basis of the production, transmission, and distribution of electricity, ensuring the stability and reliability of power systems. In this work, we propose a deep learning based framework for accurate PVPG forecasting. In particular, taking advantage of the long short-term memory (LSTM) network in solving sequential-data based regression problems, this paper considers the specific domain knowledge of PV and proposes a physics-constrained LSTM (PC-LSTM) to forecast the hourly day-ahead PVPG. It aims to overcome the shortcoming of recent machine learning algorithms that are applied based only on massive data, and thus easily producing unreasonable forecasts. Real-life PV datasets are adopted to evaluate the feasibility and effectiveness of the models. Sensitivity analysis is conducted for the selection of input feature variables based on a two-stage hybrid method. The results indicate that the proposed PC-LSTM model possesses stronger forecasting capability than the standard LSTM model. It is more robust against PVPG forecasting, and more suitable for PVPG forecasting with sparse data in practice. The PC-LSTM model also demonstrates superior performance with higher accuracy of PVPG forecasting compared to conventional machine learning and statistical methods.
Article
Full-text available
Modeling the methane emission is challenging due to the heterogeneity of solid waste characteristics and different chemical and physical reactions leading to methane generation. This study focused on monitoring the methane generation from landfills and modeling methane emission using machine learning techniques. Hence, two pilot landfills were constructed with a total capacity of 9327 tons of municipal solid waste. The temperature, methane, and leachate generation from the pilot landfills were measured for 3 years. The effect of leachate recirculation system on methane emission from landfill was evaluated, and the results showed that the methane emission was 35% lower when leachate recirculation system was not utilized in the landfilling process. Three machine learning models, including artificial neural networks, adaptive neuro‐fuzzy inference system, and support vector machine, were used for the first time to predict methane generation. Results demonstrated that the support vector machine model was superior to both the adaptive neuro‐fuzzy inference system and artificial neural network models for predicting methane generation. The support vector machine model was able to capture 90% and 82% of the variation in methane emission from landfills with and without leachate recirculation, respectively. In general, machine learning models showed considerable potential for forecasting methane generation.
Article
Full-text available
In neuroimaging, the difference between chronological age and predicted brain age, also known as brain age delta, has been proposed as a pathology marker linked to a range of phenotypes. Brain age delta is estimated using regression, which involves a frequently observed bias due to a negative correlation between chronological age and brain age delta. In brain age prediction models, this correlation can manifest as an overprediction of the age of young brains and an underprediction for elderly ones. We show that this bias can be controlled for by adding correlation constraints to the model training procedure. We develop an analytical solution to this constrained optimization problem for Linear, Ridge, and Kernel Ridge regression. The solution is optimal in the least-squares sense i.e., there is no other model that satisfies the correlation constraints and has a better fit. Analyses on the PAC2019 competition data demonstrate that this approach produces optimal unbiased predictive models with a number of advantages over existing approaches. Finally, we introduce regression toolboxes for Python and MATLAB that implement our algorithm.
Preprint
Full-text available
Machine learning models are poised to make a transformative impact on chemical sciences by dramatically accelerating computational algorithms and amplifying insights available from computational chemistry methods. However, achieving this requires a confluence and coaction of expertise in computer science and physical sciences. This review is written for new and experienced researchers working at the intersection of both fields. We first provide concise tutorials of computational chemistry and machine learning methods, showing how insights involving both can be achieved. We then follow with a critical review of noteworthy applications that demonstrate how computational chemistry and machine learning can be used together to provide insightful (and useful) predictions in molecular and materials modeling, retrosyntheses, catalysis, and drug design.
Article
Regressing the vector field of a dynamical system from a finite number of observed states is a natural way to learn surrogate models for such systems. We present variants of cross-validation (Kernel Flows (Owhadi and Yoo, 2019) and its variants based on Maximum Mean Discrepancy and Lyapunov exponents) as simple approaches for learning the kernel used in these emulators.
Article
Meal delivery platforms like Uber Eats shape the landscape in cities around the world. This paper addresses forecasting demand on a grid into the short-term future, enabling, for example, predictive routing applications. We propose an approach incorporating both classical forecasting and machine learning methods and adapt model evaluation and selection to typical demand: intermittent with a double-seasonal pattern. An empirical study shows that an exponential smoothing based method trained on past demand data alone achieves optimal accuracy, if at least two months are on record. With a more limited demand history, machine learning is shown to yield more accurate prediction results than classical methods.