Article

Machine Learning Models and Bankruptcy Prediction


Abstract

There has been intensive research from academics and practitioners regarding models for predicting bankruptcy and default events, for credit risk management. Seminal academic research has evaluated bankruptcy using traditional statistical techniques (e.g., discriminant analysis and logistic regression) and early artificial intelligence models (e.g., artificial neural networks). In this study, we test machine learning models (support vector machines, bagging, boosting, and random forest) to predict bankruptcy one year prior to the event, and compare their performance with results from discriminant analysis, logistic regression, and neural networks. We use data from 1985 to 2013 on North American firms, integrating information from the Salomon Center database and Compustat, analysing more than 10,000 firm-year observations. The key insight of the study is a substantial improvement in prediction accuracy using machine learning techniques, especially when, in addition to the original Altman Z-score variables, we include six complementary financial indicators. Based on Carton and Hofer (2006), we use new variables, such as the operating margin, change in return-on-equity, change in price-to-book, and growth measures related to assets, sales, and number of employees, as predictive variables. Machine learning models show, on average, approximately 10% higher accuracy than traditional models. Comparing the best models, with all predictive variables, the machine learning technique based on random forest achieved 87% accuracy, whereas logistic regression and linear discriminant analysis achieved 69% and 50% accuracy, respectively, in the testing sample. We find that bagging, boosting, and random forest models outperform the other techniques, and that prediction accuracy in the testing sample improves for all models when the additional variables are included.
Our research adds to the continuing debate about the superiority of computational methods over statistical techniques, as in Tsai, Hsu, and Yen (2014) and Yeh, Chi, and Lin (2014). In particular, among machine learning mechanisms, we do not find SVM to lead to higher accuracy rates than the other models. This result contradicts outcomes from Danenas and Garsva (2015) and Cleofas-Sánchez, García, Marqués, and Sánchez (2016), but corroborates, for instance, Wang, Ma, and Yang (2014), Liang, Lu, Tsai, and Shih (2016), and Cano et al. (2017). Our study supports the applicability of expert systems by practitioners, as in Heo and Yang (2014), Kim, Kang, and Kim (2015), and Xiao, Xiao, and Wang (2016).
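As an illustrative sketch of the comparison the abstract describes, the snippet below fits a random forest, logistic regression, and linear discriminant analysis and reports testing-sample accuracy. The data are synthetic stand-ins generated with scikit-learn, not the study's Compustat/Salomon Center sample; the 11 features loosely mirror the five Altman Z-score ratios plus six complementary indicators.

```python
# Hypothetical illustration only: synthetic data approximate the study's setup.
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# 11 features stand in for the 5 Altman ratios + 6 complementary indicators;
# the 90/10 class weighting mimics the rarity of bankruptcy events.
X, y = make_classification(n_samples=10000, n_features=11, n_informative=6,
                           weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "logistic regression": LogisticRegression(max_iter=1000),
    "linear discriminant analysis": LinearDiscriminantAnalysis(),
}
scores = {}
for name, model in models.items():
    scores[name] = model.fit(X_tr, y_tr).score(X_te, y_te)  # test accuracy
    print(f"{name}: {scores[name]:.3f}")
```

On real data the paper reports a roughly 10-point accuracy gap in favour of the ensemble methods; exact numbers on synthetic data will differ.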

... Researchers have been trying to apply and compare other methodologies to improve the prediction of default risk, given the limitations of traditional models and the difficulties in applying and interpreting the most advanced models (e.g., Barboza et al. 2017; Jones et al. 2017; Zizi et al. 2021). ...
... For instance, Barboza et al. (2017) compare bagging, boosting, random forest, neural networks, SVMs with ADM, and logit. They concluded that machine learning models are more precise than traditional models. ...
... Furthermore, the "new age" classifiers, including generalized boosting, AdaBoost, and random forest, have a reasonably good level of interpretability. Jones et al. (2017), in line with Barboza et al. (2017), conclude that the use of new age classifiers presents a higher accuracy, and they are relatively easy to estimate, implement and interpret when compared with SVM and NN. They compare 16 classifiers, such as logit, probit, ADM, NN, SVM, and "new age" models including Generalised Boosting, AdaBoost, and random forests. ...
Article
Full-text available
This work analyses whether financial information quality is relevant to explaining firms’ probability of default. A financial default prediction model for SMEs (Small and Medium Enterprises) is presented, which includes not only traditional measures but also financial reporting quality (FRQ) measures. FRQ influences decision-making through its impact on financial information, which has repercussions on the informativeness of accounting ratios. A panel dataset of 1560 Portuguese SMEs in the construction sector, from 2012 to 2018, is analysed. First, firms are classified as default or compliant using an ex-ante criterion which allows us to identify signs of financial constraints in advance. Then, the stepwise method is employed to identify which variables are more relevant to explain the default probability. Results show that FRQ measures, namely accruals quality and timeliness, impact firms’ defaulting, supporting their relevance in predicting financial difficulties. Finally, using a logit approach, the accuracy of the model increased when FRQ variables were included. Results are confirmed using “new age” classifiers, namely the random forest methodology. This work is not only relevant to the extant financial distress literature but also has relevant implications for practice, since stakeholders can understand the impact of financial reporting quality to prevent additional risks.
... The 2008-2010 financial crisis (FC) exposed the vulnerability of firms, owing to the complicated relationships they have in business and the economy, which affected firms' financial health and in some cases led to bankruptcy (Boratyńska & Grzegorzewska, 2018; Marcinkevičius & Kanapickienė, 2014; Veganzones & Eric Séverin, 2018; Wang et al., 2017; Zoričák et al., 2020). Although many studies have been done in the corporate bankruptcy field, the FC increased the importance of credit risk (CR) (Barboza et al., 2017). Due to the global pandemic of Corona Virus Disease (COVID-19), economic activities have been disrupted in many companies (Abdullah & Achsani, 2020) and economic stability is at high risk. ...
... Most credit risk assessments (Barboza et al., 2017; García et al., 2019; Liu et al., 2019; Mai et al., 2019; Nedumparambil & Bhandari, 2020; Uthayakumar et al., 2017) are based on credit rating models, which is why they are called probability-of-default or bankruptcy models. The output of these models is a number that indicates how likely an entity, person, or company is to go bankrupt within a specific future period. ...
... Choosing a classification algorithm and the input variables are two of the major tasks of BP (Volkov et al., 2017; Chou et al., 2017). Although there has been much research in the BP field, many studies have used variables pre-selected in previous studies (Barboza et al., 2017; Son et al., 2019; Zięba et al., 2016), and only a few (Chou et al., 2017; Hosaka, 2019; Kim et al., 2016; Son et al., 2019; Tobback et al., 2017; Volkov et al., 2017) have used a method for feature selection. This means they neglected one significant part of BP model development. ...
Article
Full-text available
Banks and financial institutions strive to develop and improve their credit risk evaluation methods to reduce financial loss resulting from borrowers' financial default. Although in previous studies many variables obtained from financial statements - such as financial ratios - have been used as input to the bankruptcy prediction process, seldom has a machine learning method based on computational intelligence been applied to select the most critical of them. In this research, data from companies listed on Tehran's Stock Exchange and OTC market over the 26 years from 1992 to 2017 were investigated, with 218 companies selected as the study sample. The ant colony optimization algorithm with k-nearest neighbor has been used for feature selection and classification of the companies. In this study, the problem of the imbalanced dataset has been solved with the under-sampling technique. The results show that variables such as EBIT to total sales, equity ratio, current ratio, cash ratio, and debt ratio are the most effective factors in predicting the health status of companies. The accuracy of the final research model in bankruptcy prediction ranges between 75.5% and 78.7% for the training and testing samples. © 2021 University of Tehran, College of Farabi. All Rights Reserved.
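The imbalance-handling step this abstract mentions can be sketched as follows: randomly under-sample the healthy (majority) class down to the size of the distressed (minority) class before fitting a k-nearest-neighbour classifier. The data, feature count, and labels are synthetic placeholders, not the paper's Tehran Stock Exchange sample, and the ant colony feature-selection stage is omitted.

```python
# Hedged sketch of under-sampling + kNN; all data are illustrative assumptions.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))  # 5 ratios, e.g. EBIT/sales, equity ratio, ...
y = (X[:, 0] + rng.normal(size=1000) > 1.5).astype(int)  # rare "distressed" label

# Under-sample the majority class to match the minority count
minority = np.flatnonzero(y == 1)
majority = rng.choice(np.flatnonzero(y == 0), size=minority.size, replace=False)
keep = np.concatenate([minority, majority])
X_bal, y_bal = X[keep], y[keep]

knn = KNeighborsClassifier(n_neighbors=5).fit(X_bal, y_bal)
print("balanced class counts:", np.bincount(y_bal))
```

Under-sampling discards majority-class information, which is why it is usually paired with careful validation; over-sampling (as in SMOTE-style approaches) is the common alternative.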
... A vast number of algorithms have been proposed (Beaver 1966; Altman 1968; Ohlson 1980; Zmijewski 1984; Betz et al. 2014; Mselmi et al. 2017; Pisula 2017; Shrivastav and Ramudu 2020). Nowadays, given the recent growth of big data, the most popular approach is the implementation of machine learning techniques (Barboza et al. 2017; Le and Viviani 2018; Petropoulos et al. 2020; Jabeur et al. 2021; Pham and Ho 2021). In creating evaluation algorithms, it is also important to discover why one company is riskier than another and which variables have an impact on the final prediction results. ...
... Among supervised learning methods, a very popular trend in entity evaluation models is the use of the logistic regression (Barboza et al. 2017;Mselmi et al. 2017;Le and Viviani 2018;Zhou 2013;Zhao et al. 2009;Zizi et al. 2020) and support vector machine (Barboza et al. 2017;Geng et al. 2015;Harris 2015;Mselmi et al. 2017;Xia et al. 2018;Zhou 2013;Shrivastav and Ramudu 2020) models. Additionally, neural networks are also used. ...
Article
Full-text available
Corporate misconduct is a huge and widespread problem in the economy. Many companies make mistakes that result in them having to pay penalties or compensation to other businesses. Some of these cases are so serious that they take a toll on a company’s financial condition. The purpose of this paper was to create and evaluate an algorithm which can predict whether a company will have to pay a penalty and to discover what financial indicators may signal it. The author addresses these questions by applying several supervised machine learning methods. This algorithm may help financial institutions such as banks decide whether to lend money to companies which are not in good financial standing. The research is based on information contained in the financial statements of companies listed on the Warsaw Stock Exchange and NewConnect. Finally, different methods are compared, and methods which are based on gradient boosting are shown to have a higher accuracy than others. The conclusion is that the values of financial ratios can signal which companies are likely to pay a penalty next year.
... uncertainty. Researchers have used these tools to make macroeconomic predictions, and many studies, especially those focused on the prediction of economic crises, have proven that models based on machine learning perform better than linear models [5,8,9]. ...
... Specifically, machine learning uses a regularization method that minimizes the influence of redundant information caused by multiple variables, thereby alleviating the high uncertainty caused by complex and nonlinear macroeconomic relationships [4,5]. In particular, machine learning methods such as random forest [9], artificial neural networks [8], support vector machines [20], and adaptive boosting [21] have proven excellent at predicting economic crises. In addition, fuzzy logic, a method of analyzing complex nonlinear and nonstationary relationships based on expert assessments, has been widely used as one representative machine learning technique [22][23][24][25]. ...
Article
Full-text available
For sustainable economic growth, information about economic activities and prospects is critical to decision-makers such as governments, central banks, and financial markets. However, accurate predictions have been challenging due to the complexity and uncertainty of financial and economic systems amid repeated changes in economic environments. This study provides two approaches for better economic prediction and decision-making. We present a deep learning model based on the long short-term memory (LSTM) network architecture to predict economic growth rates and crises by capturing sequential dependencies within the economic cycle. In addition, we provide an interpretable machine learning model that derives economic patterns of growth and crisis through efficient use of the eXplainable AI (XAI) framework. For major G20 countries from 1990 to 2019, our LSTM model outperformed other traditional predictive models, especially in emerging countries. Moreover, in our model, private debt in developed economies and government debt in emerging economies emerged as major factors that limit future economic growth. Regarding the economic impact of COVID-19, we found that sharply reduced interest rates and expansion of government debt increased the probability of a crisis in some emerging economies in the future.
... Detection of false investment strategies using unsupervised learning methods, together with the problem of selection bias, was discussed in the work by Prado and Lewis (2018). Barboza et al. (2017) presented ML models for bankruptcy prediction. The authors also worked on credit risk management by predicting default events through ML algorithms, comparing them with traditional models such as discriminant analysis. ...
... The present study has practical/professional and academic consequences. It makes theoretical contributions to the future scientific literature, like other similar work (Barboza et al. 2017; Chakraborty and Joseph 2017; Gestel et al. 2006; Saura et al. 2019) done before. The theoretical implications will help us first to understand the ever-evolving needs of the financial sector amid the changing environment and evolving customers. ...
Article
Full-text available
In order to survive in this complex economic business environment, with fierce competition among the various players of the finance sector, the need is to understand the even more complex financial behaviour of customers. We apply the support vector machine classifier, a machine learning algorithm, to construct a nonlinear model which classifies customers into good and bad classes based on their respective positive and negative saving behaviour. With the help of a web-based survey, a sample of urban banking millennials was collected and preprocessed for the support vector machine classifier technique. Pattern recognition from data and prediction of financial behaviour are based on machine learning forecasts. Moreover, a comparative analysis of the weightage of three attributes, namely income level, financial literacy, and behavioural characteristics, is carried out and analysed with respect to the savings/wealth accumulation of the millennial generation, to understand financial distress among that generation in context.
... Nevertheless, the range of techniques is actually much wider. Other machine learning methods have been employed, including boosting, bagging, and random forest models (Barboza et al., 2017; Choi et al., 2018; Jabeur et al., 2020; Kim & Kang, 2010; Wang et al., 2014). In this frame, the paper of Zhou and Lai (2017) should be pointed out; they applied AdaBoost to corporate bankruptcy prediction with missing values, revealing that it performs better than other benchmark models. ...
... where β_i represents the discriminant weights, X_i the accounting ratios, N the number of features, and β_0 a constant. Discriminant analysis requires the distinction between failing and healthy firms to be linearly separable, treating the ratios as if they were independent (Barboza et al., 2017; du Jardin, 2016). ...
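The linear discriminant score this snippet describes, a constant β_0 plus a weighted sum of ratios β_i X_i, can be computed directly from a fitted scikit-learn model, whose `intercept_` and `coef_` attributes hold exactly those quantities. The two-class data below are synthetic stand-ins, not accounting ratios.

```python
# Illustrative sketch of the discriminant score D = b0 + sum_i(b_i * X_i).
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (200, 3)), rng.normal(1.5, 1, (200, 3))])
y = np.repeat([0, 1], 200)  # 0 = healthy, 1 = failing

lda = LinearDiscriminantAnalysis().fit(X, y)
b0, b = lda.intercept_[0], lda.coef_[0]  # the constant and the weights
scores = b0 + X @ b                      # discriminant score per firm

# The sign of the score reproduces the fitted class boundary
assert np.array_equal(scores > 0, lda.predict(X) == 1)
print("weights:", np.round(b, 2))
```

This makes the linear-separability assumption in the snippet concrete: the decision boundary is the hyperplane where the score equals zero.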
Article
Full-text available
The emergence of big data, information technology, and social media provides an enormous amount of information about firms’ current financial health. When facing this abundance of data, decision makers must identify the crucial information to build upon an effective and operative prediction model with a high quality of the estimated output. The feature selection technique can be used to select significant variables without lowering the quality of performance classification. In addition, one of the main goals of bankruptcy prediction is to identify the model specification with the strongest explanatory power. Building on this premise, an improved XGBoost algorithm based on feature importance selection (FS-XGBoost) is proposed. FS-XGBoost is compared with seven machine learning algorithms based on three well-known feature selection methods that are frequently used in bankruptcy prediction: stepwise discriminant analysis, stepwise logistic regression, and partial least squares discriminant analysis (PLS-DA). Our experimental results confirm that FS-XGBoost provides more accurate predictions, outperforming traditional feature selection methods.
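The importance-based selection idea behind FS-XGBoost can be sketched as below: rank features by a boosted ensemble's importances, keep the top ones, and refit on the reduced set. Scikit-learn's gradient boosting stands in for XGBoost to keep the example dependency-free, and the data, the cut-off of eight features, and the sizes are illustrative assumptions, not the paper's protocol.

```python
# Hedged sketch of importance-based feature selection in the spirit of FS-XGBoost.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=2000, n_features=20, n_informative=5,
                           random_state=0)
full = GradientBoostingClassifier(random_state=0).fit(X, y)

# Rank features by importance and keep the 8 most important ones
order = np.argsort(full.feature_importances_)[::-1]
top = order[:8]
reduced = GradientBoostingClassifier(random_state=0).fit(X[:, top], y)
print("kept features:", sorted(top.tolist()))
print("reduced-model training accuracy:", round(reduced.score(X[:, top], y), 3))
```

In practice the kept-feature count or importance threshold would be tuned on validation data rather than fixed in advance.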
... Our study makes three important contributions to the literature. First, we rely on machine-learning tools, which are gaining ground in economic and finance applications (see, for example, Barboza et al., 2017;De Moor et al., 2018;Athey et al., 2019;Akyildirim et al., 2021;Aziz et al., 2021;Knaus et al., 2021). We compare the performance of these models with the golden standard of default prediction studies -the discrete hazard (DH) model. ...
... The study shows that this method's out-of-sample performance is superior to that of other common models. Finally, Barboza et al. (2017) show that machine-learning models have, on average, approximately 10% higher accuracy than traditional models. ...
Article
Full-text available
In this study we investigate the ability of machine-learning techniques to predict firm failures, and we compare them against alternatives. Using data on business and financial risks on UK firms over 1994-2019, we document that machine-learning models are systematically more accurate than a discrete hazard benchmark. We conclude that the random forest model outperforms other models in failure prediction. In addition, we show that the improved predictive power of the random forest model relative to its counterparts, persists when we consider extreme economic events as well as firm and industry heterogeneity. Finally, we find that financial factors affect failure probabilities.
... In their view, bankruptcy prediction is a widely studied topic in accounting and finance research, one of distinguished importance with a great impact on the economy. In line with the above, practitioners and financial institutions, alongside researchers, continuously seek the best methods for assessing their clients' solvency (Barboza, Kimura & Altman, 2017). The importance of bankruptcy prediction is amplified in a recessionary economic environment, when financiers' risk aversion dominates (Nyitrai, 2014). ...
... Regarding the performance of the methods, I formulated the prior assumption that modern techniques from the machine learning family have higher predictive accuracy than traditional statistical methods. Comparative studies in the literature do not always yield clear winners; some reflect the powerful capabilities of machine learning methods (see Fan & Palaniswami, 2000; Barboza et al., 2017), while examples of the opposite can also be found (see Coats & Fant, 1993; Pompe & Feelders, 1997). To test and comprehensively analyse these assumptions, I applied the method of systematic literature review. ...
Article
Full-text available
Corporate insolvency, bankruptcy, and financial distress form an intensive research area with many different practical approaches. This study examines the international literature on corporate bankruptcy prediction using the method of systematic literature review. The research objective is twofold: first, to examine the best-performing methods of corporate bankruptcy prediction, and second, to reveal the most common associated factors, based on highly cited international bankruptcy studies. Using three scientific databases, 105 articles published between 1966 and 2017 were processed. The literature review enables the comparison of six method families. The results show that the decision-tree family outperforms SVM, neural network, and traditional statistical methods. Among the medium-accuracy methods, no clear ranking could be established when comparing instance-based methods and logistic regression. In examining bankruptcy factors, it emerged that market indicators used alongside traditional financial ratios do not, on average, lead to higher predictive accuracy than models containing financial ratios alone.
... overlapping classes). Another effort presented in [14] tries to evaluate ML models (SVMs, bagging, boosting and random forests) to predict bankruptcy one year prior to the event. The authors also compare the performance of the adopted algorithms with results retrieved by discriminant analysis, logistic regression and NNs. ...
Article
Full-text available
The banking sector is on the eve of a serious transformation and the thrust behind it is artificial intelligence (AI). Novel AI applications have been already proposed to deal with challenges in the areas of credit scoring, risk assessment, client experience and portfolio management. One of the most critical challenges in the aforementioned sector is fraud detection upon streams of transactions. Recently, deep learning models have been introduced to deal with the specific problem in terms of detecting and forecasting possible fraudulent events. The aim is to estimate the unknown distribution of normal/fraudulent transactions and then to identify deviations that may indicate a potential fraud. In this paper, we elaborate on a novel multistage deep learning model that targets to efficiently manage the incoming streams of transactions and detect the fraudulent ones. We propose the use of two autoencoders to perform feature selection and learn the latent data space representation based on a nonlinear optimization model. On the delivered significant features, we subsequently apply a deep convolutional neural network to detect frauds, thus combining two different processing blocks. The adopted combination has the goal of detecting frauds over the exposed latent data representation and not over the initial data.
... In addition to differences in financial ratios, several studies support the increased accuracy of bankruptcy models to classify successful and non-successful firms when the effects of market value data and other non-financial firm information are added, for example market value of equity/total liabilities, stock price, firm age, director characteristics and board structure (Huang and Yen 2019; Barboza et al. 2017;Altman et al. 2010). However, market price contains expectations for the future and is only available for listed companies. ...
Article
Full-text available
In this study, a new risk assessment model is developed and the evidence reasoning (ER) approach is applied to assess failure risk of knowledge-intensive services (KIS) corporates in the UK. General quantitative financial indicators alone (e.g., operational capability or profitability) cannot comprehensively evaluate the probability of company bankruptcy in the KIS sector. This new model combines quantitative financial indicators with macroeconomic variables, industrial factors and company non-financial criteria for robust and balanced risk analysis. It is based on the theory of enterprise risk management (ERM) and can be used to analyze company failure possibility as an important aspect of risk management. This study provides new insight into the selection of macro and industry factors based on statistical analysis. Another innovation is related to how marginal utility functions of variables are constructed and imperfect data can be handled in a distributed assessment framework. It is the first study to convert observed data into probability distributions using the likelihood analysis method instead of subjective judgement for data-driven risk analysis of company bankruptcy in the KIS sector within the ER framework, which makes the model more interpretable and informative. The model can be used to provide an early warning mechanism to assist stakeholders to make investment and other decisions.
... In recent years, there have been some important studies on going-concern decisions using machine learning or deep learning. Barboza et al. [20] utilize machine-learning methods (support vector machines, bagging, boosting, neural networks, and random forest) to predict bankruptcy and compare their performance with results from discriminant analysis and logistic regression. Their results show that machine-learning models are, on average, approximately 10% more accurate than traditional models. ...
Article
Full-text available
The going-concern opinions of certified public accountants (CPAs) and auditors are very critical, and due to misjudgments, the failure to discover the possibility of bankruptcy can cause great losses to financial statement users and corporate stakeholders. Traditional statistical models have disadvantages in giving going-concern opinions and are likely to cause misjudgments, which can have significant adverse effects on the sustainable survival and development of enterprises and investors’ judgments. In order to embrace the era of big data, artificial intelligence (AI) and machine learning technologies have been used in recent studies to judge going concern doubts and reduce judgment errors. The Big Four accounting firms (Deloitte, KPMG, PwC, and EY) are paying greater attention to auditing via big data and artificial intelligence (AI). Thus, this study integrates AI and machine learning technologies: in the first stage, important variables are selected by two decision tree algorithms, classification and regression trees (CART), and a chi-squared automatic interaction detector (CHAID); in the second stage, classification models are respectively constructed by extreme gradient boosting (XGB), artificial neural network (ANN), support vector machine (SVM), and C5.0 for comparison, and then, financial and non-financial variables are adopted to construct effective going-concern opinion decision models (which are more accurate in prediction). The subjects of this study are listed companies and OTC (over-the-counter) companies in Taiwan with and without going-concern doubts from 2000 to 2019. According to the empirical results, among the eight models constructed in this study, the prediction accuracy of the CHAID–C5.0 model is the highest (95.65%), followed by the CART–C5.0 model (92.77%).
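The two-stage design this abstract describes, tree-based variable selection followed by a separate classification model, can be sketched as follows. A plain decision tree stands in for CART/CHAID in stage one, and an SVM (one of the paper's stage-two candidates) is fitted on the selected variables; the data, importance threshold, and sizes are illustrative assumptions, not the paper's Taiwan sample.

```python
# Hedged sketch of a two-stage (tree selection -> classifier) pipeline.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1500, n_features=15, n_informative=4,
                           random_state=0)

# Stage 1: variable selection via CART-like tree importances
tree = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X, y)
selected = np.flatnonzero(tree.feature_importances_ > 0.05)

# Stage 2: classification model trained on the selected variables only
svm = SVC().fit(X[:, selected], y)
print("selected variables:", selected.tolist())
print("training accuracy:", round(svm.score(X[:, selected], y), 3))
```

The appeal of the split design is that the interpretable stage-one tree documents which variables drive the decision, while stage two is free to use a stronger but less interpretable classifier.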
... Zeng Rongxin predicts the credit risk of credit customers of local corporate banks in Guizhou Province and concludes that the model is more accurate in the early warning of credit risk. Li Li establishes a financial risk early-warning system for grass-roots central banks, which provides an effective basis for financial risk evaluation and prevention [25][26][27][28][29][30][31][32][33]. ...
Article
Full-text available
Against the background of big data, it is necessary to study the dynamic parameter optimization of neural networks for commercial bank risk models. Several customer information attribute groups that have an impact on loan customer rating are selected, and existing customer data are used to train a network model linking the attribute groups to the customer default rate, so that it can predict a customer's default rate from newly entered loan customer information and then predict whether the customer will default. Based on a neural network model, this article constructs a credit risk early-warning model for a science and technology bank, conducts an empirical test, and puts forward relevant countermeasures and suggestions to control the bank's credit risk. Taking the bank as an empirical sample, the constructed neural network model yields a small error and satisfactory early-warning results. The experimental results show that the proposed risk early-warning model can accurately predict the customer default rate, so as to warn of defaulting customers. Throughout the process there are few human intervention factors and a high degree of intelligence, which reduces operational risk.
... Barboza et al. (2017), in "Machine Learning Models and Bankruptcy Prediction" [1], predict the bankruptcy of American companies one year before the event using regression and neural network methods. Zarei and Zarei (2018) study the impact of business intelligence on the financial performance of Iranian banks [19]: the key to business success for many banks is the correct use of data to make better, faster, and flawless decisions, and to achieve this goal, banks need powerful and efficient tools such as business intelligence as a positive catalyst. ...
Article
Full-text available
Business intelligence, as one of the branches of information technology, is increasingly considered by managers in today’s business world. In order to make better decisions about the business process, most business organizations are very willing to use intelligent systems. Intelligence refers to the ability to pursue a goal in the human way; therefore, it can be said that the more human-like a system is, the more intelligent it is. Through learning and gaining experience or acquiring new knowledge, the intelligent system can increase its knowledge. One of the main goals of implementing business intelligence in any organization is to create reports using variety of management dashboards for effective and critical decisions based on the organization’s key indicators. The present study aims to provide an efficient model for optimizing the products sales system in a pharmaceutical company using clustering methods and based on machine learning indicators and algorithms. The studied model uses RFM (Recency Frequency Monetary)-LRFM (Length Recency Frequency Monetary)-NLRFM (Number Length Recency Frequency Monetary) indices to utilize customer clustering algorithms. Also, the association rules method has been used in this study in order to show the relationship between the sold products, to analyze the customers’ shopping cart, and to offer to the customers based on the obtained rules. Finally, the results are reviewed with K-mean, DBSCAN (Density-Based Spatial Clustering of Applications with Noise) and Optics algorithms. According to the obtained results, the proposed model will provide the best results using the K-means algorithm.
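The RFM-based clustering step this abstract describes can be sketched as below: recency, frequency, and monetary features are standardised and grouped with k-means (the algorithm the study found to give the best results). The transaction figures are synthetic placeholders, not the pharmaceutical company's sales data, and the choice of four clusters is an assumption.

```python
# Illustrative sketch of RFM customer clustering with k-means.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
rfm = np.column_stack([
    rng.integers(1, 365, 500),   # Recency: days since last purchase
    rng.integers(1, 50, 500),    # Frequency: number of purchases
    rng.gamma(2.0, 200.0, 500),  # Monetary: total spend
])

# Standardise so no single RFM dimension dominates the distance metric
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
labels = kmeans.fit_predict(StandardScaler().fit_transform(rfm))
print("cluster sizes:", np.bincount(labels))
```

Extending RFM to LRFM/NLRFM, as the study does, amounts to appending extra columns (relationship length, number of items) before the same scaling and clustering steps.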
... Studies on firms' financial failure have been conducted extensively in developed-country contexts such as the USA, UK, and Spain (Acosta-González, Fernández-Rodríguez, & Ganga, 2019; Barboza, Kimura, & Altman, 2017; Charitou, Neophytou, & Charalambous, 2004). However, there is a lack of studies in developing-country contexts, particularly in Algeria. ...
Conference Paper
Full-text available
Current research on the prediction of a firm's financial failure has taken into account many factors, mostly corresponding to financial ratios derived from firms' annual accounts. Nevertheless, the current crisis and the consequent exponential increase in insolvency rates have made it clear that the phenomenon of bankruptcy needs to be explained with respect to different variables; thus, this conceptual paper proposes to predict financial failure through corporate governance. The paper further recommends that future researchers apply an empirical study in Algeria, specifically in the oil and gas sector.
... In recent years, ensemble algorithms have become the trend for developing enterprise credit risk prediction models (Mselmi et al., 2017;Zhou et al., 2017). Among them, bagging, boosting and random forest (RF) ensemble models show better prediction results than other methods (Barboza et al., 2017). In addition, as one of the main solutions for solving the class imbalance problem (i.e., the number of instances of the minority class is far less than that of the majority class), scholars have achieved excellent prediction results by combining ensemble algorithms with data-level methods. ...
Article
Full-text available
The spread of enterprise credit risk in the supply chain may lead to large‐scale bankruptcy and credit crises, which are related to national economic and social stability and financial system security. Therefore, enterprise credit risk in the supply chain context is not only a concern for banking financial institutions, credit rating agencies and enterprise managers but also the focus of governments. This article develops a DTE‐DSA (decision tree [DT] ensemble model using the differential sampling rate, Synthetic Minority Oversampling Technique [SMOTE] and AdaBoost) prediction framework integrating supply chain information to predict enterprise credit risk. The empirical test shows that using supply chain information can significantly improve the prediction score. The DTE‐DSA model has the best prediction effect in dealing with class imbalance problems. Compared with single classifier models—such as logistic regression, k‐nearest neighbours, support vector machine, DT and DT using the SMOTE—as well as ensemble models—such as extremely randomized trees, random forest, rotation forest, extreme gradient boosting, gradient boosting DT and DT ensemble model using AdaBoost—the DTE‐DSA model not only has the best prediction score but also has a more stable performance. The comprehensive use of supply chain information and the DTE‐DSA model can result in the highest prediction score, with an area under the curve of 0.9016 and a Kolmogorov–Smirnov statistic of 0.7369. Further analysis of the variables of importance enhances the interpretability of the model and obtains relevant management insights.
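The SMOTE idea at the heart of the DTE-DSA framework is simple: synthesize new minority-class samples by interpolating between an existing minority sample and a nearby minority neighbour. A stripped-down sketch (nearest-neighbour only, pure Python; the authors' implementation uses the full k-neighbour variant with differential sampling rates):

```python
import random

def smote_like(minority, n_new, seed=0):
    """Synthesize minority-class points by linear interpolation between a
    randomly chosen sample and its nearest minority neighbour (SMOTE-style)."""
    rng = random.Random(seed)

    def dist2(a, b):  # squared Euclidean distance
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbour = min((p for p in minority if p is not base),
                        key=lambda p: dist2(base, p))
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + gap * (n - b)
                               for b, n in zip(base, neighbour)))
    return synthetic

# Three minority samples (e.g., defaulted firms in feature space).
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_points = smote_like(minority, 4)
```

Each synthetic point lies on a segment between two real minority samples, so the oversampled class occupies plausible regions of feature space rather than duplicating existing rows.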
... Studies on firms' financial failure have been done extensively in developed-country contexts such as the USA, UK, and Spain (Acosta-González, Fernández-Rodríguez, & Ganga, 2019; Barboza, Kimura, & Altman, 2017; Charitou, Neophytou, & Charalambous, 2004). However, there is a lack of studies in developing-country contexts, particularly in Algeria. ...
Conference Paper
Full-text available
In the past decade, studies on the estimation of a firm's financial failure have taken many factors into account, the majority of which relate to financial ratios extracted from annual financial statements. However, the economic recession and the resulting exponential growth in insolvency rates have shown that the phenomenon of bankruptcy must be explained in terms of a wider range of variables; consequently, this conceptual paper proposes financial ratios and corporate governance as predictors of firms' financial failure. In addition, the paper recommends that future researchers conduct an empirical study in Algeria, more precisely at the ENSP corporate complexes.
... Support vector machines, bagging, boosting, and random forest [12]. ...
Article
Full-text available
Coronavirus disease is an infectious respiratory tract disease caused by the SARS-CoV-2 virus. The spread of this pandemic had an unprecedented effect on human life and the world economy. With the ever-increasing number of infected cases, the latest technologies, such as machine learning (ML) and artificial intelligence (AI), are being employed to interpret and solve the COVID-19 crisis. In the present work, we compare the impact of the COVID-19 crisis on the economies of the USA and India using artificial intelligence and machine learning. We applied logistic regression to the collected dataset to answer various questions related to the impact of COVID-19 on the economies of both countries and predicted future trends. The role of AI in setting a benchmark for all future predictions and uses is also outlined.
... After that, the FL server calls each participant, passing the agent id, the latest global model and the local epoch count, to find an optimal local solution (lines 11-12). Each FL participant receives the shared global model, splits its local samples into batches, performs local training for the required number of local epochs, and shares the resulting local model with the FL server (lines 15-20). Whenever the FL server receives an updated local model from any of the participants, it aggregates the received models to generate an updated global model (lines 13-14) (Algorithm 1: FL Model for Customers' Financial Disaster Prediction). ...
Article
Full-text available
In recent years, as economic stability has been shaken and the unemployment rate has grown due to the COVID-19 effect, assigning credit scores by predicting consumers' financial conditions has become more crucial. Conventional machine learning (ML) and deep learning approaches need to share customers' sensitive information with an external credit bureau to generate a prediction model, which opens the door to privacy leakage. A recently invented privacy-preserving distributed ML scheme, referred to as federated learning (FL), enables generating a target model without sharing local information, through on-device model training on edge resources. In this paper, we propose an FL-based application to predict customers' financial issues by constructing a global learning model that evolves from the local models of the distributed agents. The local models are generated by the network agents using their on-device data and local resources. We use the FL concept because the learning strategy does not require sharing any data with the server or any other agent, which ensures the preservation of customers' sensitive data. To that end, we enable partial work from weak agents, which eliminates the issue of model convergence being retarded by straggler agents. We also leverage asynchronous FL to cut off the extra waiting time during global model generation. We simulated the performance of our FL model on a popular dataset, Give Me Some Credit (Freshcorn, 2017). We evaluated our proposed method with different numbers of stragglers and various computational settings (e.g., local epochs, batch size), and simulated the training loss and testing accuracy of the prediction model. Finally, we compared the F1-score of our proposed model with those of existing centralized and decentralized approaches.
Our results show that our proposed model achieves an almost identical F1-score to the centralized model, even at a skew level of more than 80%, and outperforms state-of-the-art FL models with an average of 5∼6% higher accuracy when there are resource-constrained agents in the learning environment.
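The global-aggregation step described in this abstract is typically a FedAvg-style weighted average of the agents' parameters, weighted by how many local samples each agent trained on. A minimal sketch with flat parameter lists and hypothetical numbers (the paper's asynchronous, partial-work variant is more involved):

```python
def fed_avg(local_models, sample_counts):
    """FedAvg-style aggregation: parameter-wise average of local models,
    weighted by each agent's number of training samples."""
    total = sum(sample_counts)
    n_params = len(local_models[0])
    return [sum(model[i] * count
                for model, count in zip(local_models, sample_counts)) / total
            for i in range(n_params)]

# Two agents: the second trained on three times as much data,
# so its parameters dominate the global model.
global_model = fed_avg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Because only parameters (not samples) travel to the server, the aggregation preserves the privacy property the abstract emphasizes.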
... Using statistical approaches, the algorithms are calibrated to generate classifications and predictions on the datasets, revealing crucial insights or hidden patterns in data-mining initiatives. These insights then drive the decision-making process [10]. Some recent studies suggest that ML can assist EI, in light of the success it has experienced in several other fields [11,12,13]. ...
... The financial literature increasingly uses machine learning techniques, which has led to excellent prediction outcomes and exciting research goals [41][42][43]. SVM models and their extensions (such as support vector regression, or SVR) have produced acceptable results in many financial applications, as shown by papers outlining current best practices for exchange rate forecasting [15,44,45]. These models have been tested on their ability to accurately forecast the spot nominal exchange rates of 10 currency pairs formed with the US dollar, Euro, British pound, and Japanese yen. ...
Article
Full-text available
This study aimed to forecast the exchange rate between the Vietnamese dong and the US dollar for the following month in the context of the COVID-19 pandemic. It used the Support Vector Regression (SVR) algorithm under the Uncovered Interest Rate Parity (UIRP) theoretical framework; the results are compared with an Ordinary Least Squares (OLS) regression model and a Random Walk (RW) model under the rolling-window method. The data included the VND/USD exchange rate, the 1-month bank interest rate, and the 1-month T-bill rate from January 01, 2020, to September 11, 2021. The research discovered a linear link between the two nations' exchange rates and interest rate differentials, with interest rate differentials serving as the input variables for the forecasts. Furthermore, the connection between the exchange rate and interest rate differentials during this era does not support the UIRP hypothesis; hence, the error of the OLS predictions remains large. The study provided a model to forecast future exchange rates by combining the UIRP theoretical framework and the SVR algorithm. The UIRP theoretical framework can anticipate exchange rate differentials using the input variable and the interest rates between two nations. Meanwhile, the SVR algorithm is a robust machine learning technique that enhances prediction accuracy. Doi: 10.28991/ESJ-2022-06-03-014
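The OLS benchmark under UIRP reduces to a one-variable regression of the exchange-rate change on the interest-rate differential, refit on a rolling window. A self-contained sketch with closed-form least squares (the data here are illustrative, not the study's series):

```python
def ols_fit(x, y):
    """Closed-form simple OLS for y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    return my - b * mx, b

def rolling_forecast(x, y, window):
    """At each step, refit OLS on the last `window` observations and
    predict the next target value (the rolling-window method)."""
    preds = []
    for t in range(window, len(x)):
        a, b = ols_fit(x[t - window:t], y[t - window:t])
        preds.append(a + b * x[t])
    return preds

# Toy series where y = 1 + 2x exactly, so the rolling forecasts are exact.
preds = rolling_forecast([1.0, 2.0, 3.0, 4.0, 5.0],
                         [3.0, 5.0, 7.0, 9.0, 11.0], window=3)
```

The SVR variant replaces `ols_fit` with a kernelized regressor inside the same rolling loop, which is where the reported accuracy gain comes from.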
... Previous studies have proposed diverse criteria as financial ratios for predicting corporate bankruptcy. Some studies show that the Z-Score model has strong practical application of financial status to the prediction of bankruptcy, as studied by Liang, Lu, Tsai and Shih [5], Barboza et al. [6], Chou et al. [7], Antunes et al. [8], Le et al. [9], Le et al. [10], Veganzones and Séverin [11], Mai et al. [12], Son et al. [13], and Chen et al. [14]. However, previous studies were mainly conducted in developed countries, and few applied data mining to bankruptcy prediction, especially in emerging securities markets such as Vietnam. ...
... Although the above combined models can improve the accuracy to a certain extent, they are limited by the number of single models. The integrated model that mixes multiple models is gradually becoming favored by scholars and has been applied to various fields [19]. Shuai Wang et al. proposed a probabilistic approach using stacked ensemble learning that integrates random forests, long short-term memory networks, linear regression, and Gaussian process regression, for predicting cloud resources required for CSS applications [20]. ...
Article
Full-text available
This work proposed an integrated model combining bagging and stacking, considering weight coefficients, for short-term traffic-flow prediction. It incorporates vacation and peak-time features, as well as occupancy and speed information, in order to improve prediction accuracy and accomplish deeper mining of traffic-flow data features. To address the limitations of a single prediction model in traffic forecasting, a stacking model with ridge regression as the meta-learner is first established; the stacking model is then optimized from the perspective of the learner using the bagging model, and lastly the optimized learner is embedded into the stacking model as the new base learner to obtain the Ba-stacking model. Finally, to address the Ba-stacking model's shortcomings in terms of low base-learner utilization, the information structure of the base learners is modified by weighting the error coefficients while taking into account the model's external features, resulting in a DW-Ba-stacking model that can change the weights of the base learners to adjust the feature distribution and thus improve utilization. Using 76,896 data points from the I5NB highway as the empirical study object, the DW-Ba-stacking model is compared and assessed against the traditional model in this paper. The empirical results show that the DW-Ba-stacking model has the highest prediction accuracy, demonstrating that the model is successful in predicting short-term traffic flows and can effectively help solve traffic-congestion problems.
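The weighting idea in the DW-Ba-stacking model can be illustrated independently of the traffic data: give each base learner a weight inversely proportional to its validation error, then blend the predictions. A hedged sketch with hypothetical errors and predictions (not the paper's exact coefficient scheme):

```python
def error_weighted_blend(preds_by_model, errors):
    """Blend base-learner predictions with weights proportional to 1/error,
    so that more accurate learners contribute more to the ensemble."""
    inv = [1.0 / e for e in errors]
    total = sum(inv)
    weights = [w / total for w in inv]
    n = len(preds_by_model[0])
    return [sum(w * preds[i] for w, preds in zip(weights, preds_by_model))
            for i in range(n)]

# Three base learners forecasting two time steps; the first learner has
# the lowest validation error (1.0), so it gets the largest weight (4/7).
blended = error_weighted_blend(
    [[100.0, 110.0], [120.0, 130.0], [90.0, 95.0]],
    [1.0, 2.0, 4.0],
)
```

A stacking meta-learner such as ridge regression generalizes this by learning the weights from data instead of fixing them at 1/error.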
... They produced convincing results in terms of forecasting without requiring any statistical restriction. Indeed, Barboza et al. (2017) tested five machine learning models and compared their bankruptcy prediction power against traditional statistics techniques (discriminant analysis and logistic regression) using North American firms' data from 1985 to 2013. Their study found substantial improvement in bankruptcy prediction accuracy using machine learning techniques compared to traditional methods. ...
Article
Full-text available
In this study, we apply several advanced machine learning techniques, including extreme gradient boosting (XGBoost), support vector machines (SVM), and a deep neural network, to predict bankruptcy using easily obtainable financial data on 3728 Belgian small and medium enterprises (SMEs) during the period 2002–2012. Using these machine learning techniques, we predict bankruptcies with a global accuracy of 82–83% using only three easily obtainable financial ratios: the return on assets, the current ratio, and the solvency ratio. While the prediction accuracy is similar to that of several previous models in the literature, our model is very simple to implement and represents an accurate and user-friendly tool for discriminating between bankrupt and non-bankrupt firms.
... For this reason, Altman and others have built a new version of the Z-score model [43], with a new data sample of 2,640,778 companies (2,602,563 non-bankrupt and 38,215 bankrupt) from the USA, China, Colombia and 31 European countries. New variables (country, industry, size and age) are also included and grouped into seven hypotheses to improve the performance of the model [48], as shown in Table 2. ...
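For reference, the original five-ratio Z-score that these revisions build on (the 1968 form for publicly traded manufacturers, not the new version described above) is a fixed linear combination of balance-sheet ratios:

```python
def altman_z(wc, re, ebit, mve, sales, ta, tl):
    """Original (1968) Altman Z-score for public manufacturing firms:
    Z = 1.2*X1 + 1.4*X2 + 3.3*X3 + 0.6*X4 + 1.0*X5."""
    x1 = wc / ta      # X1: working capital / total assets
    x2 = re / ta      # X2: retained earnings / total assets
    x3 = ebit / ta    # X3: EBIT / total assets
    x4 = mve / tl     # X4: market value of equity / total liabilities
    x5 = sales / ta   # X5: sales / total assets
    return 1.2 * x1 + 1.4 * x2 + 3.3 * x3 + 0.6 * x4 + 1.0 * x5

# Hypothetical firm (figures in $M); in the classic reading, Z above
# roughly 2.99 falls in the "safe" zone and below 1.81 in the "distress" zone.
z = altman_z(wc=10, re=20, ebit=15, mve=120, sales=200, ta=100, tl=60)
```

The revised models replace the fixed coefficients with ones re-estimated (e.g., by logistic regression) on the much larger international sample and add the country, industry, size and age variables as extra terms.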
Article
Full-text available
This paper fills a gap in the financial perspective on supply chain performance measurement, related to the lack of a bankruptcy-probability indicator, and proposes a predictor based on the eighth model of the Altman Z-score, a logistic regression. Furthermore, a bankruptcy-probability ranking is established for companies' supply chains according to the industry to which they belong, and cut-off values are set to establish three categories of companies according to the predictor. The probability of bankruptcy is analysed for the supply chains of different industries; the building industry is revealed to have the highest probability of bankruptcy.
... In addition, accounting-related literature applying machine learning tools to predict the quality of accounting numbers is increasing (Liu et al. 2021). Barboza et al. (2017) compared traditional statistical methods and machine learning using the financial data of North American firms: the performance of machine learning models (i.e., support vector machines, bagging, boosting, and random forest) was compared with that of discriminant analysis, logistic regression, and neural networks. ...
Article
Full-text available
The risk-based capital (RBC) ratio, part of an insurance company's financial soundness system, evaluates the capital adequacy needed to withstand unexpected losses. Continuous institutional improvement has therefore been made to monitor the financial solvency of companies and protect consumers' rights, and improvements to solvency systems have been researched. The primary purpose of this study is to find a set of important predictors for estimating the RBC ratio of life insurance companies from a large set of 1891 variables, which include crucial finance and management indices collected quarterly from all Korean insurers under regulations for transparent management information. This study employs a combination of machine learning techniques: random forest algorithms and the Bayesian Regulatory Neural Network (BRNN). This combination predicts the next period's RBC ratio better than the conventional statistical method, ordinary least-squares (OLS) regression. From the machine learning findings, a set of important predictors is identified within three categories: liabilities and expenses, other financial predictors, and predictors from business performance. The dataset of 23 companies with 1891 variables covers March 2008 to December 2018, with quarterly updates for each year.
Book
Full-text available
The book “Computación para el Desarrollo – XIV Congreso”, which collects the proceedings of the XIV Ibero-American Congress on Computing for Development (COMPDES2021), edited by Luis Bengochea, Daniel Meziat and Anayanci López, is published under a Creative Commons 3.0 attribution – non-commercial – share-alike licence. Copying, distribution and public communication of the work are permitted, provided that attribution is maintained and no commercial use is made of it. If it is transformed or a derivative work is generated, it may only be distributed under a licence identical to this one.
Conference Paper
Full-text available
Abstract: Predicting price movements in the stock market has been an important area of machine learning research in recent years, owing to its complex and dynamic nature. In addition, the typical volatility of the stock market makes the forecasting task harder. This article therefore proposes a methodology to predict the trend of the shares of three Brazilian companies traded on the Brazilian stock exchange. The aim is to make market movements easier to recognize and to support decision-making for investors with little experience. The work also compares predictions for these stocks using three powerful machine learning algorithms, known as Random Forest, Artificial Neural Networks and Support Vector Machines. In addition, the use of a new kernel function (kernel-SVM), originally employed for image recognition, is proposed to tackle the prediction problem. Technical indicators (such as the Relative Strength Index and the stochastic oscillator, among others) are used as inputs to train the models. Finally, the results are evaluated using statistical indicators such as accuracy, precision, recall and specificity.
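Technical indicators such as those used as model inputs are simple functions of the price series. For instance, a basic Relative Strength Index over the last `period` price changes (a common simplified form; production implementations usually apply exponential smoothing to the averages):

```python
def rsi(prices, period=14):
    """Relative Strength Index: RSI = 100 - 100 / (1 + avg_gain / avg_loss),
    computed over the last `period` price changes."""
    changes = [b - a for a, b in zip(prices, prices[1:])][-period:]
    gains = sum(c for c in changes if c > 0)
    losses = -sum(c for c in changes if c < 0)
    if losses == 0:        # no down moves: maximally overbought
        return 100.0
    rs = gains / losses    # the period divisors cancel in the ratio
    return 100 - 100 / (1 + rs)
```

Values above roughly 70 are conventionally read as overbought and below 30 as oversold; fed as a feature, the raw value lets the classifier learn its own thresholds.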
Chapter
Artificial Intelligence (AI) techniques will significantly impact the financial services industry, with various implications ranging from the redefinition of processes, products, and services to transforming the way we interact with customers. Despite the still tentative benefits of AI, given the impact already being seen in the financial markets and the potentially disruptive power over the entire banking industry, financial institutions worldwide are making large-scale investments in artificial intelligence. The banks that will reap the most benefits from AI innovations are those most prepared and willing to make changes and adapt their approaches to people, processes, and data. In the context of innovations in finance, this paper has two main dimensions. The first aims to present an overview of financial innovations and applications. The second seeks to illustrate the use of AI techniques, more particularly machine learning algorithms, in risk management.
Article
Taking advantage of granular data, we measure the change in bank capital requirements resulting from the implementation of AI techniques to predict corporate defaults. For each of the largest banks operating in France, we build by algorithm pseudo-internal models of credit risk management for a range of methodologies extensively used in AI (random forest, gradient boosting, ridge regression, deep learning). We compare these models to the traditional model usually in place, which basically relies on a combination of logistic regression and expert judgement. The comparison is made along two sets of criteria capturing (i) the ability to pass the compliance tests used by regulators during on-site model-validation missions, and (ii) the induced changes in capital requirements. The different models show noticeable differences in their ability to pass the regulatory tests and to lead to a reduction in capital requirements. While displaying an ability to pass compliance tests similar to that of the traditional model, neural networks provide the strongest incentive for banks to apply AI models to their internal models of corporate credit risk, as they lead in some cases to sizeable reductions in capital requirements.
Article
Full-text available
The analysis of business failure is important, considering that companies are the engine of a country's economy. This research studies the risk of failure of Ecuadorian companies in the sector manufacturing other non-metallic mineral products (ISIC C23). The data comprise, on average, 183 companies over the period 2009-2019. Starting from Ohlson's model, logit and probit econometric models are proposed to calculate the probability of failure of companies in the sector. In the logit model the probability of failure lies between 3.67% and 8.42%, while in the probit model it lies between 3.79% and 7.75%. Notably, as company size increases, the risk of failure decreases, and the year 2017 presents lower risk; in addition, the logit model has greater predictive capacity.
Article
Full-text available
The influence of Artificial Intelligence is growing, as is the need to make it as explainable as possible. Explainability is one of the main obstacles that AI faces today on the way to more practical implementation. In practice, companies need to use models that balance interpretability and accuracy in order to make more effective decisions, especially in the field of finance. The main advantages of the multi-criteria decision-making (MCDM) principle in financial decision-making are the ability to structure complex evaluation tasks that allow for well-founded financial decisions, the application of quantitative and qualitative criteria in the analysis process, the possibility of transparent evaluation, and the introduction of improved, universal and practical academic methods into the financial decision-making process. This article presents a review and classification of multi-criteria decision-making methods that support the goal of forthcoming research: to create artificial-intelligence-based methods that are explainable, transparent, and interpretable for most investment decision-makers.
Conference Paper
Full-text available
A lack of funds and difficulty in accessing formal loans from banks are problems faced by Vietnamese enterprises in general and Hanoi enterprises in particular. There have been many domestic and international studies on access to bank credit. While that research has focused on analysis from a corporate perspective, no specific study of enterprises has been carried out in Hanoi in the context of the COVID-19 pandemic. This study aims to identify the important factors affecting access to bank credit in Hanoi, through interviews with 200 customers, using a quantitative method (linear regression). The results show that five factors affect access to bank credit in Hanoi: (1) economic background, (2) secured assets, (3) business plan, (4) productivity and (5) the relationship between companies and banks.
Chapter
Worldwide, many cases go undiagnosed due to poor healthcare support in remote areas. In this context, a centralized system is needed for effective monitoring and analysis of medical records. A web-based patient diagnostic system is a central platform to store medical history and predict the possible disease based on the current symptoms experienced by a patient, to ensure faster and more accurate diagnosis. Early disease prediction can help users determine the severity of the disease and take quick action. The proposed web-based disease prediction system applies machine-learning-based classification techniques to a data set acquired from the National Centre of Disease Control (NCDC). K-nearest neighbour (K-NN), random forest and naive Bayes classification approaches are utilized, and an ensemble voting algorithm is also proposed in which each classifier is assigned a weight dynamically based on its prediction confidence. The proposed system is also equipped with a recommendation scheme that suggests the type of tests to run based on the patient's existing symptoms, so that necessary precautions can be taken. A centralized database ensures that the medical data are preserved and that there is transparency in the system. Tampering with the system is prevented by granting no update rights once a diagnosis has been created.
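The dynamically weighted voting described above can be sketched in a few lines: each classifier casts a vote for its predicted label, weighted by its own confidence, and the label with the largest total wins (hypothetical labels and confidences, not the NCDC system itself):

```python
def confidence_vote(predictions):
    """Ensemble decision: sum each classifier's confidence per label and
    return the label with the highest weighted total."""
    tally = {}
    for label, confidence in predictions:
        tally[label] = tally.get(label, 0.0) + confidence
    return max(tally, key=tally.get)

# K-NN, random forest and naive Bayes disagree; the weights decide.
diagnosis = confidence_vote([("flu", 0.9), ("cold", 0.4), ("flu", 0.3)])
```

Because the weights come from each classifier's per-prediction confidence rather than fixed validation scores, the same classifier can dominate one case and be outvoted on the next.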
Article
In the era of big data, investor sentiment has an impact on personal decision-making and asset pricing in the securities market. This paper uses the Easteconomy stock forum and the Sina stock forum as carriers of investor sentiment to measure a positive sentiment index based on stockholders' comments and to construct an evaluation index system for the public-opinion dimension. In addition, the evaluation index system is constructed from four further dimensions, namely operation, innovation, finance and financing, to evaluate the overall condition of listed companies from multiple perspectives. In this paper, the SBM model from the data envelopment analysis method is used to measure the efficiency values of each dimension of the multidimensional efficiency evaluation index system, and these efficiency values serve as the multidimensional efficiency indicators. Subsequently, two sets of input feature indicators for the SVM model were established: one set contains traditional financial indicators and multidimensional efficiency indicators, and the other set has only traditional financial indicators. The early-warning accuracy of the two sets of input feature indicators was empirically analyzed based on a support vector machine early-warning model. The results show that the early-warning model incorporating multidimensional efficiency indicators improves accuracy compared with the early-warning model based on traditional financial indicators alone. The model was then optimized by a particle swarm intelligent optimization algorithm, and the robustness of the results was tested. Moreover, six mainstream machine learning methods, including logistic regression, GBDT, CatBoost, AdaBoost, random forest and bagging, were compared with the early-warning effect of the DEA-SVM model, and the empirical results show that DEA-SVM has high early-warning accuracy, which proves the superiority of the proposed model. The findings of this study can help to further prevent and control the financial-crisis risk of Chinese listed companies and to promote their healthy growth.
Article
In 2016, India's Insolvency and Bankruptcy Board laid out the Insolvency and Bankruptcy Code (IBC) for Indian companies struggling financially and seeking solvency or resolution. Since then, around three hundred firms have filed for bankruptcy resolution in India under IBC 2016. This research studies financial distress in Indian companies listed on the Bombay Stock Exchange (BSE), using a balanced sample of companies. The extant research has employed various methodologies; we apply state-of-the-art machine learning techniques for the prediction task, such as logistic regression, lasso regression, decision trees, bagging, boosting, and support vector machines. We selected eighteen firm-level variables as explanatory variables, among which the ratio of market capitalization to debt came out as the most critical variable in all the models. This variable is suggested as a measure of leverage in Altman's Z-score model, and the finding on its significance is in line with the existing literature. Debt is expected to be higher for financially distressed firms than for financially healthy ones. Further, as investors might not be interested in investing in a distressed firm, market capitalization is likely to decrease further in financially distressed firms. On model performance, the random forest (bagging) model achieved the highest accuracy, recall, and area under the receiver operating characteristic (ROC) curve (AUC), while the boosting model achieved the highest precision.
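The AUC used to rank the models above has a convenient rank-based interpretation: the probability that a randomly chosen distressed firm receives a higher risk score than a randomly chosen healthy one. A direct (O(n²)) sketch with toy scores:

```python
def auc(scores, labels):
    """AUC via the Mann-Whitney formulation: the fraction of
    (positive, negative) pairs in which the positive instance is scored
    higher, with ties counting one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Perfectly separated scores give AUC = 1.0; random scoring hovers near 0.5.
perfect = auc([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
```

Unlike accuracy, this quantity is threshold-free, which is why it is the usual headline metric for imbalanced bankruptcy samples.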
Article
External stakeholders require accurate and explainable financial distress prediction (FDP) models. Complex machine learning algorithms offer high accuracy, but most of them lack explanatory power, resulting in external stakeholders being cautious in adopting them. Therefore, an explainable artificial intelligence approach including a whole process ensemble method and an explainable frame for FDP is here proposed. The ensemble algorithm from feature selection to predictor construction can achieve high accuracy according to the actual case, and the interpretation framework can meet the needs of external users by generating local explanations and global explanations. First, a two-stage scheme integrated with a filter and wrapper technique is designed for feature selection. Second, multiple ensemble models are explored and they are evaluated according to the actual case. Finally, Shapley additive explanations, counterfactual explanations and partial dependence plots are employed to enhance model interpretability. Taking financial data of Chinese listed companies from 2007 to 2020 as a dataset, the highest AUC is ensured by LightGBM with a value of 0.92. Local explanations help individual enterprises identify the key features which lead to their financial distress, and counterfactual explanations are produced to provide improvement strategies. By analyzing the features importance and the impact of feature interaction on the results, global explanations can improve the transparency and credibility of ‘black box’ models.
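Of the explanation tools listed, partial dependence plots are the easiest to sketch: hold one feature at a fixed grid value in every row, average the model's predictions, and repeat across the grid. A minimal version with a stand-in model (any callable works; this is not the paper's LightGBM pipeline):

```python
def partial_dependence(model, data, feature_idx, grid):
    """For each grid value, force feature `feature_idx` to that value in
    every row of `data` and average the model's predictions."""
    curve = []
    for value in grid:
        total = 0.0
        for row in data:
            modified = list(row)
            modified[feature_idx] = value
            total += model(modified)
        curve.append(total / len(data))
    return curve

# Stand-in "model": just sums its features, so the curve rises linearly
# as the forced feature value increases.
pd_curve = partial_dependence(sum, [[1.0, 2.0], [3.0, 4.0]], 0, [0.0, 10.0])
```

Plotting the curve against the grid shows the average marginal effect of that feature, one of the global explanations the framework produces alongside SHAP values and counterfactuals.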
Article
Corporate financial distress prediction is a pivotal aspect of economic development. The ability to foresee that a company is heading into financial distress is essential for decision-makers, shareholders, and policymakers in making the best decisions and policies for sustainable development. Prediction accuracy is of paramount importance in implementing distress-mitigation measures, a critical component in attracting investment, particularly for developing countries in Africa. The advent of the fourth industrial revolution saw Artificial Intelligence (AI) taking centre stage in financial risk modelling. This growth has, however, not precluded the role of traditional statistical methods in modelling financial risk, and there is a lack of consensus amongst academics and practitioners on the accuracy of these two groups of methodologies in distress prediction. Protagonists of the conventional school of thought still hold that statistical methods are more accurate, whilst the new-age proponents believe AI has brought higher levels of predictive strength and model accuracy. This study compares the accuracy of Logit and Artificial Neural Networks (ANN) in corporate distress prediction. The two modelling techniques were applied to an 8-year panel dataset from the Zimbabwe Stock Exchange. The Logit model outperformed the ANN, with an overall accuracy of 92.21% compared to 85.8% for the ANN. Heightened prediction accuracy is bound to improve the return to shareholders by enhancing financial risk management within emerging markets. This study also seeks to contribute to the ongoing debate on the superiority of AI techniques versus statistical techniques.
Article
The COVID-19 pandemic led to a great deal of financial uncertainty in the stock market. An initial drop in March 2020 was followed by unexpected rapid growth over 2021. Financial risk forecasting therefore continues to be a central issue in financial planning, dealing with new types of uncertainty. This paper presents a stock market forecasting model combining a multi-layer perceptron artificial neural network (MLP-ANN) with the traditional Altman Z-score model. The paper's contribution is a new hybrid enterprise crisis-warning model combining the Z-score and MLP-ANN models. The new hybrid default prediction model is demonstrated using Chinese data. The empirical analysis shows that the average correct classification rate of the hybrid neural network model (99.40%) is higher than that of the Altman Z-score model (86.54%) and of the pure neural network method (98.26%). The model can provide early warning signals of a company's deteriorating financial situation to managers, investors and creditors, government regulators, financial institutions, analysts and others, so that they can take timely measures to avoid losses.
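The Z-score component of hybrid models like the one above is simple to reproduce. Below is a minimal sketch of the classic Altman (1968) Z-score for publicly traded manufacturing firms, using the standard published coefficients and cut-offs; the firm figures are hypothetical, for illustration only.

```python
def altman_z_score(working_capital, retained_earnings, ebit,
                   market_value_equity, sales,
                   total_assets, total_liabilities):
    """Classic Altman (1968) Z-score for public manufacturing firms."""
    x1 = working_capital / total_assets
    x2 = retained_earnings / total_assets
    x3 = ebit / total_assets
    x4 = market_value_equity / total_liabilities
    x5 = sales / total_assets
    return 1.2 * x1 + 1.4 * x2 + 3.3 * x3 + 0.6 * x4 + 1.0 * x5

def z_zone(z):
    """Conventional cut-offs: distress below 1.81, safe above 2.99."""
    if z < 1.81:
        return "distress"
    if z > 2.99:
        return "safe"
    return "grey"

# Hypothetical firm figures, all in the same currency units
z = altman_z_score(working_capital=25, retained_earnings=40, ebit=30,
                   market_value_equity=120, sales=200,
                   total_assets=250, total_liabilities=100)
# z is about 2.26, placing this firm in the grey zone
```

A hybrid model of the kind described would feed this score (or its underlying ratios) alongside other indicators into the neural network.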
Article
The main aim and contribution of this study is to outline and demonstrate the usefulness of a machine learning approach to address prediction-based research problems in accounting research, and to contrast this approach with a more conventional explanation-based approach familiar to most accounting scholars. To illustrate the approach, the study applies machine learning to predict a firm’s industry sector using the firm’s publicly available financial statement data. The results show that an algorithm can predict an industry sector with just this data to a high degree of accuracy, especially if a non-linear classifier is used instead of a linear classifier. Additionally, the algorithms were able to carry out an industry-firm pairing exercise taken from introductory accounting text books and MBA cases, with predicted answers showing a high degree of accuracy in carrying out this exercise. The study shows how machine learning approaches and algorithms can be valuable to a range of accounting domains where prediction rather than explanation of the dependent variable is the main area of concern.
Article
This paper provides a new approach to developing a firm's distress and recovery prediction score. This score was designated the FL-Score and was structured from the interaction between financial and economic components. The tests were conducted from a sample of U.S. non-financial public firms from 2002 to 2019, using both more traditional statistical models, such as logit regressions, and machine learning techniques. The results show that the FL-Score is robust in predicting companies' distress and recovery, even for particular cases of distress, e.g., pure economic distress, pure financial distress, and mixed distress. Different profiles of failure risks were identified according to firm size, and an inverse relationship was also identified between the risks of distress, measured by the FL-Score, and the use of financial derivatives by the firm as a way to mitigate distress or accelerate recovery. The study also presents relevant considerations regarding using metrics related to the current and expected generation of economic value for the modeling of distress and recovery prediction scores.
Article
For sustainable economic growth, information about economic activities and prospects is critical to decision-makers such as governments, central banks, and financial markets. However, accurate predictions have been challenging due to the complexity and uncertainty of financial and economic systems amid repeated changes in economic environments. This study provides two approaches for better economic prediction and decision-making. We present a deep learning model based on the long short-term memory (LSTM) network architecture to predict economic growth rates and crises by capturing sequential dependencies within the economic cycle. In addition, we provide an interpretable machine learning model that derives economic patterns of growth and crisis through efficient use of the eXplainable AI (XAI) framework. For major G20 countries from 1990 to 2019, our LSTM model outperformed other traditional predictive models, especially in emerging countries. Moreover, in our model, private debt in developed economies and government debt in emerging economies emerged as major factors that limit future economic growth. Regarding the economic impact of COVID-19, we found that sharply reduced interest rates and expansion of government debt increased the probability of a crisis in some emerging economies in the future.
Article
Applying machine learning techniques to predict bankruptcy in a sample of French, Italian, Russian and Spanish firms, the study demonstrates that including an economic policy uncertainty (EPU) indicator in bankruptcy prediction models notably increases their accuracy. This effect is more pronounced when the novel Twitter-based version of the EPU index is used instead of the original news-based index. We further compare the prediction accuracy of machine learning techniques and conclude that the stacking ensemble method outperforms (though marginally) the machine learning methods more commonly used for bankruptcy prediction, such as single classifiers and bagging.
Article
Enterprise credit risk prediction in the supply chain context is an important step for decision making and early credit crisis warnings. Improving the prediction performance of this task is an academic and industrial focus. Feature selection and class imbalance can affect prediction performance: redundant and irrelevant features increase the learning difficulty of the prediction model, cause overfitting and reduce prediction performance, whereas class imbalance, with many fewer minority class instances than majority class instances, may cause model failure. Herein, a sequence backward feature selection algorithm based on ranking information (SBFS-RI) and a novel ensemble feature selection method integrating multiple ranking information (FS-MRI) are proposed. The FS-MRI method can realize the automatic threshold function while considering the model performance and then output the best and a more stable feature subset. In addition, an SVM ensemble model with an artificial imbalance rate (SVME-AIR) is proposed to solve the class imbalance problem and realize the effective combination of under-sampling technology and the AdaBoost ensemble method for the first time. Finally, FS-MRI and SVME-AIR are combined through a two-stage model design. The hybrid model can effectively solve the feature selection and class imbalance problems for enterprise credit risk prediction in the supply chain context. Supply chain data of Chinese listed enterprises shows that the FS-MRI method outperforms nine other feature selection methods and provides more robust and efficient feature subsets. The SVME-AIR model has higher AUC and KS values than other ensemble models and single classifiers. When combined, the two methods achieve the best prediction performance, with maximum AUC and KS values of 0.8772 and 0.6363, respectively.
Article
Given the large amount of customer data available to financial companies, the use of traditional statistical approaches (e.g., regressions) to predict customers’ credit scores may not provide the best predictive performance. Machine learning (ML) algorithms have been explored in the credit scoring literature to increase predictive power. In this paper, we predict commercial customers’ credit scores using hybrid ML algorithms that combine unsupervised and supervised ML methods. We implement different approaches and compare the performance of the hybrid models to that of individual supervised ML models. We find that hybrid models outperform their individual counterparts in predicting commercial customers’ credit scores. Further, while the existing literature ignores past credit scores, we find that the hybrid models’ predictive performance is higher when these features are included.
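One common way to hybridise unsupervised and supervised ML, not necessarily the exact pipeline of the study above, is to cluster the customers first and feed cluster membership to the supervised model as extra features. A minimal scikit-learn sketch on synthetic data standing in for customer records:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Synthetic stand-in for customer features; the label loosely tracks feature 0
X = rng.normal(size=(500, 4))
y = (X[:, 0] + 0.5 * rng.normal(size=500) > 0).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Stage 1 (unsupervised): cluster customers, one-hot encode membership
km = KMeans(n_clusters=5, n_init=10, random_state=0).fit(X_tr)
C_tr = np.eye(5)[km.predict(X_tr)]
C_te = np.eye(5)[km.predict(X_te)]

# Stage 2 (supervised): train on raw features plus cluster membership
clf = LogisticRegression().fit(np.hstack([X_tr, C_tr]), y_tr)
acc = clf.score(np.hstack([X_te, C_te]), y_te)
```

Past credit scores, which the study finds valuable, would simply be appended as additional columns of `X` in this scheme.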
Conference Paper
Full-text available
Since the COVID-19 pandemic, there has been a heightened focus on risk management in banks. Meanwhile, with their technological advancements, Artificial Intelligence (AI) and machine learning techniques have become more and more popular in risk management. These advanced technologies are at the core of banks' strategy and have great potential to revolutionize financial services. This paper aims to understand and examine how AI and machine learning have been implemented in Vietnamese commercial banks and have changed the way in which these banks manage their risks. The analysis is carried out through a review of the available literature and disclosures from Vietnamese banks to find current practices and significant benefits of AI and machine learning applications. In terms of risk perspectives, credit risk, market risk, operational risk, and RegTech have been explored. We conclude that AI and machine learning can help mitigate these risks in Vietnamese commercial banks. Many other areas of risk management can be improved and should be further investigated in the future. However, we noted some specific problems around staff-related barriers, data privacy and protection requirements, and transparency/explainability within banks.
Chapter
This paper focuses broadly on the application of various types of AI technology in the buy-side of financial services and more specifically on the application of AI to financial portfolio management. Current market volatility in response to the COVID-19 pandemic has given new urgency to the perennial challenge of achieving quality investment returns, and the ever-present trade-off between return and risk that all portfolio managers have to master. The complexity and volume of relevant information today, and the rate of change in the current environment, have only heightened the need for smarter financial choices. Various types of AI may be used to respectively achieve higher portfolio returns, increase operational efficiency, and enhance the customer experience. Successful AI usage will always involve an optimum mix of machine-provided and human-based services, where the AI enhances and accelerates human portfolio decision-making and saves labor costs.
Article
Full-text available
The purpose of this paper is to examine the capital structure decisions of restaurant firms. The paper hypothesizes that these decisions are based upon a financial “pecking order” as well as the position of the firm in the financial growth cycle. Using ratios from publicly traded restaurant firms in the U.S. and ordinary least squares regression models, the results tend to support the notion that both the pecking order and the financial growth cycle influence financing decisions. However, the results also indicate that there may be separate factors affecting long-term and short-term debt decisions made by restaurant managers.
Article
Full-text available
Technical and quantitative analyses in financial trading use mathematical and statistical tools to help investors decide on the optimum moment to initiate and close orders. While these traditional approaches have served their purpose to some extent, new techniques arising from the field of computational intelligence, such as machine learning and data mining, have emerged to analyse financial information. While mainstream financial engineering research has focused on complex computational models such as Neural Networks and Support Vector Machines, there are also simpler models that have demonstrated their usefulness in applications other than financial trading, and that are worth considering to determine their advantages and inherent limitations as trading analysis tools. This paper analyses the role of simple machine learning models in achieving profitable trading through a series of trading simulations in the FOREX market. It assesses the performance of the models and how particular setups produce systematic and consistent predictions for profitable trading. Given the inherent complexities of financial time series, the roles of attribute selection, periodic retraining, and training set size are discussed in order to obtain a combination of those parameters that is not only capable of generating positive cumulative returns for each of the machine learning models, but that also demonstrates how simple algorithms traditionally precluded from financial forecasting for trading applications present performance similar to their more complex counterparts. The paper discusses how a combination of attributes, in addition to the technical indicators used as inputs to the machine learning-based predictors (price-related features, seasonality features, and lagged values used in classical time series analysis), is used to enhance the classification capabilities that directly impact final profitability.
Article
Full-text available
Multi-layer perceptron (MLP) neural networks are widely used in automatic credit scoring systems with high accuracy and efficiency. This paper presents a higher-accuracy credit scoring model based on MLP neural networks trained with the back-propagation algorithm. Our work focuses on enhancing credit scoring models in three aspects: (i) optimising the data distribution in datasets using a new method called Average Random Choosing; (ii) comparing the effects of training–validation–test instance numbers; and (iii) finding the most suitable number of hidden units. We trained 34 models 20 times with different initial weights and training instances. Each model has 6 to 39 hidden units in one hidden layer. Using the well-known German credit dataset, we provide test results and a comparison between models, obtaining a model with a classification accuracy of 87%, which is 5% higher than the best result reported in the relevant literature of recent years. We also show that our optimisation of dataset structure can increase a model's accuracy significantly in comparison with traditional methods. Finally, we summarise the tendency of models' scoring accuracy as the number of hidden units increases. The results of this work can be applied not only to credit scoring but also to other MLP neural network applications, especially when the distribution of instances in a dataset is imbalanced.
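The hidden-unit sweep described above can be sketched with scikit-learn's `MLPClassifier`; synthetic data stands in for the German credit dataset here, and the paper's exact training scheme (Average Random Choosing, repeated initialisations) is not reproduced.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
# Synthetic stand-in for applicant features and a good/bad credit label
X = rng.normal(size=(600, 8))
y = ((X[:, 0] - X[:, 1] + 0.3 * rng.normal(size=600)) > 0).astype(int)

X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.25, random_state=1)

# Sweep the hidden-layer width, as the study does over 6-39 units,
# and record validation accuracy for each width
scores = {}
for units in (6, 12, 24, 39):
    mlp = MLPClassifier(hidden_layer_sizes=(units,), max_iter=2000,
                        random_state=1)
    mlp.fit(X_tr, y_tr)
    scores[units] = mlp.score(X_val, y_val)

best_units = max(scores, key=scores.get)
```

In practice the paper repeats each configuration with different initial weights; the single run per width above is purely illustrative.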
Article
Full-text available
Due to the economic significance of bankruptcy prediction for financial institutions, investors, and governments, many quantitative methods have been used to develop effective prediction models. The support vector machine (SVM), a powerful classification method, has been used for this task; however, the performance of SVM is sensitive to model form, parameter setting, and feature selection. In this study, a new approach based on direct search and feature-ranking technology is proposed to optimise feature selection and parameter setting for 1-norm and least-squares SVM models for bankruptcy prediction. This approach is also compared to SVM models whose parameters and features are optimised by the popular genetic algorithm technique. The experimental results on a dataset with 2010 instances show that the proposed models are good alternatives for bankruptcy prediction.
Data
Full-text available
This study aims to explore the relations between bank credit risk and macroeconomic factors. We employ a set of variables including the inflation rate, interest rate, the ISE-100 index, foreign exchange rate, growth rate, M2 money supply, unemployment rate, and credit risk, represented by the ratio of non-performing loans to total loans (NPL), for Turkey over the January 1998 to July 2012 period. The general-to-specific modelling methodology developed by Hendry (1980) was employed to analyze short-run dynamic inter-variable relationships, while the Engle-Granger (1987) and Gregory-Hansen (1996) methodologies were used to analyze long-run relationships. In both methods, the growth rate and the ISE index are the variables that reduce banks' credit risk in the long run, while money supply, foreign exchange rate, unemployment rate, inflation rate, and interest rate are the variables that increase it. The specific model demonstrated that the previous period's credit risk has a significant impact on the current period's credit risk.
Article
Full-text available
Banking systemic risk is a complex nonlinear phenomenon, and recent financial crises have shed light on the importance of safeguarding financial stability. Given the complex nonlinear characteristics of banking systemic risk, in this paper we apply the support vector machine (SVM) to the prediction of banking systemic risk in an attempt to suggest a new model with better explanatory power and stability. We conduct a case study of an SVM-based prediction model for Chinese banking systemic risk, and the experimental results show that the support vector machine is an efficient method in this case.
Article
Full-text available
Support vector machines (SVMs), with their roots in Statistical Learning Theory (SLT) and optimization methods, have become powerful tools for problem solution in machine learning. SVMs reduce most machine learning problems to optimization problems, and optimization lies at the heart of SVMs. Many SVM algorithms involve solving not only convex problems, such as linear programming, quadratic programming, second-order cone programming and semi-definite programming, but also non-convex and more general optimization problems, such as integer programming, semi-infinite programming and bi-level programming. The purpose of this paper is to understand SVM from the optimization point of view and to review several representative optimization models in SVMs and their applications in economics, in order to promote research interest in both optimization-based SVM theory and economic applications. The paper starts by summarizing and explaining the nature of SVMs. It then discusses optimization models for SVM along three major themes. First, least-squares SVM, twin SVM, AUC-maximizing SVM, and fuzzy SVM are discussed for standard problems. Second, support vector ordinal machines, semi-supervised SVM, Universum SVM, robust SVM, knowledge-based SVM and multi-instance SVM are presented for nonstandard problems. Third, we explore other important issues such as the lp-norm SVM for feature selection, LOOSVM based on minimizing the LOO error bound, probabilistic outputs for SVM, and rule extraction from SVM. Finally, several applications of SVMs to financial forecasting, bankruptcy prediction and credit risk analysis are introduced.
Article
Full-text available
Examines the capital structure decisions of restaurant firms. Hypothesizes that these decisions are based upon a financial “pecking-order” as well as the position of the firm in the financial growth cycle. Using ratios from publicly-traded restaurant firms in the USA and ordinary least squares regression models, the results tend to support the notion that both the pecking-order and the financial growth cycle influence financing decisions. However, the results also indicate that there may be separate factors affecting long-term and short-term debt decisions made by restaurant managers.
Article
Full-text available
Empirical accounting researchers often use Altman's (1968) and Ohlson's (1980) bankruptcy prediction models as indicators of financial distress. While these models performed relatively well when they were estimated, we show that they do not perform as well in more recent periods (in particular, the 1980s), even when the coefficients are re-estimated. When we compare the performance of Ohlson's original model to our re-estimated version of his model and to that of Altman's original and re-estimated models, we find that Ohlson's original model displays the strongest overall performance. Given that Ohlson's original model is frequently used in academic research as an indicator of financial distress, its strong performance in this study supports its use as a preferred model.
Article
Full-text available
In this paper, we investigate the performance of several systems based on ensembles of classifiers for bankruptcy prediction and credit scoring. The obtained results are very encouraging: they improve on the performance obtained using stand-alone classifiers. We show that the "Random Subspace" method outperforms the other ensemble methods tested in this paper. Moreover, the best stand-alone method is the multi-layer perceptron neural net, while the best method tested in this work is the Random Subspace of Levenberg–Marquardt neural nets. Three financial datasets are chosen for the experiments: Australian credit, German credit, and Japanese credit.
Article
Full-text available
We investigate the link between distress and idiosyncratic volatility. Specifically, we examine the twin puzzles of anomalously low returns for high idiosyncratic volatility stocks and high distress risk stocks, documented by Ang et al. (2006) and Campbell et al. (2008), respectively. We document that these puzzles are empirically connected, and can be explained by a simple, theoretical, single-beta CAPM model.
Article
Full-text available
Bankruptcy prediction has drawn a lot of research interest in the previous literature, and recent studies have shown that machine learning techniques achieve better performance than traditional statistical ones. This paper applies support vector machines (SVMs) to the bankruptcy prediction problem in an attempt to suggest a new model with better explanatory power and stability. To serve this purpose, we use a grid-search technique with 5-fold cross-validation to find the optimal parameter values of the SVM kernel function. In addition, to evaluate the prediction accuracy of SVM, we compare its performance with those of multiple discriminant analysis (MDA), logistic regression analysis (Logit), and three-layer fully connected back-propagation neural networks (BPNs). The experimental results show that SVM outperforms the other methods.
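The model-selection step described above, a grid search over the RBF-kernel hyper-parameters with 5-fold cross-validation, can be sketched in scikit-learn as follows; synthetic data with a non-linear class boundary stands in for the bankruptcy dataset.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.default_rng(2)
# Synthetic stand-in for financial ratios; the boundary is non-linear,
# the regime where a kernel SVM is worth its cost
X = rng.normal(size=(400, 6))
y = (np.sum(X[:, :2] ** 2, axis=1) > 2.0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=2)

# Grid search over the RBF kernel hyper-parameters with 5-fold CV,
# mirroring the study's model-selection step
grid = GridSearchCV(
    SVC(kernel="rbf"),
    param_grid={"C": [0.1, 1, 10, 100], "gamma": [0.01, 0.1, 1]},
    cv=5,
)
grid.fit(X_tr, y_tr)
test_acc = grid.score(X_te, y_te)
```

`grid.best_params_` holds the selected `(C, gamma)` pair; in the study, the MDA, Logit and BPN baselines would then be evaluated on the same held-out split.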
Article
Understanding if credit risk is driven mostly by idiosyncratic firm characteristics or by systematic factors is an important issue for the assessment of financial stability. By exploring the links between credit risk and macroeconomic developments, we observe that in periods of economic growth there may be some tendency towards excessive risk-taking. Using an extensive dataset with detailed information for more than 30,000 firms, we show that default probabilities are influenced by several firm-specific characteristics. When time-effect controls or macroeconomic variables are also taken into account, the results improve substantially. Hence, though the firms' financial situation has a central role in explaining default probabilities, macroeconomic conditions are also very important when assessing default probabilities over time.
Article
Identifying students’ learning styles has several benefits such as making students aware of their strengths and weaknesses when it comes to learning and the possibility to personalize their learning environment to their learning styles. While there exist learning style questionnaires for identifying a student's learning style, such questionnaires have several disadvantages and therefore, research has been conducted on automatically identifying learning styles from students’ behavior in a learning environment. Current approaches to automatically identify learning styles have an average precision between 66% and 77%, which shows the need for improvements in order to use such automatic approaches reliably in learning environments. In this paper, four computational intelligence algorithms (artificial neural network, genetic algorithm, ant colony system and particle swarm optimization) have been investigated with respect to their potential to improve the precision of automatic learning style identification. Each algorithm was evaluated with data from 75 students. The artificial neural network shows the most promising results with an average precision of 80.7%, followed by particle swarm optimization with an average precision of 79.1%. Improving the precision of automatic learning style identification allows more students to benefit from more accurate information about their learning styles as well as more accurate personalization towards accommodating their learning styles in a learning environment. Furthermore, teachers can have a better understanding of their students and be able to provide more appropriate interventions.
Article
The support-vector network is a new learning machine for two-group classification problems. The machine conceptually implements the following idea: input vectors are non-linearly mapped to a very high-dimension feature space. In this feature space a linear decision surface is constructed. Special properties of the decision surface ensure the high generalization ability of the learning machine. The idea behind the support-vector network was previously implemented for the restricted case where the training data can be separated without errors. We here extend this result to non-separable training data. High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated. We also compare the performance of the support-vector network to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.
Article
The optimal selection of chemical features (molecular descriptors) is an essential pre-processing step for the efficient application of computational intelligence techniques in virtual screening for the identification of bioactive molecules in drug discovery. The selection of molecular descriptors has a key influence on the accuracy of affinity prediction. In order to improve this prediction, we examined a Random Forest (RF)-based approach to automatically select molecular descriptors of training data for ligands of kinases, nuclear hormone receptors, and other enzymes. The reduction of features used during prediction dramatically reduces the computing time over existing approaches and consequently permits the exploration of much larger sets of experimental data. To test the validity of the method, we compared the results of our approach with those obtained using manual feature selection in our previous study (Perez-Sanchez, Cano, and Garcia-Rodriguez, 2014). The main novelty of this work in the field of drug discovery is the use of RF in two different ways: feature ranking and dimensionality reduction, and classification using the automatically selected feature subset. Our RF-based method outperforms classification results provided by Support Vector Machine (SVM) and Neural Network (NN) approaches.
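The dual use of Random Forests described above, first for feature ranking and dimensionality reduction, then for classification on the reduced subset, can be sketched in scikit-learn; the synthetic "descriptors" below are a stand-in in which only the first three features carry signal.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
# 20 synthetic "descriptors": only features 0, 1, 2 carry signal
X = rng.normal(size=(500, 20))
y = (X[:, 0] + X[:, 1] - X[:, 2] + 0.5 * rng.normal(size=500) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=3)

# Use 1: rank features by impurity-based importance and keep the top 5
ranker = RandomForestClassifier(n_estimators=200, random_state=3).fit(X_tr, y_tr)
top = np.argsort(ranker.feature_importances_)[::-1][:5]

# Use 2: retrain a classifier on the automatically reduced feature set
clf = RandomForestClassifier(n_estimators=200, random_state=3)
clf.fit(X_tr[:, top], y_tr)
acc = clf.score(X_te[:, top], y_te)
```

With far fewer descriptors at prediction time, scoring a large screening library becomes correspondingly cheaper, which is the computational payoff the abstract points to.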
Article
The task of classifying is natural to humans, but there are situations in which a person is not best suited to perform this function, which creates the need for automatic methods of classification. Traditional methods, such as logistic regression, are commonly used in this type of situation, but they lack robustness and accuracy: they do not work very well when there is noise in the data, a situation that is common in expert and intelligent systems. Due to the importance and the increasing complexity of problems of this type, there is a need for methods that provide greater accuracy and interpretability of results. Among these methods is Boosting, which operates sequentially by applying a classification algorithm to reweighted versions of the training data set. It was recently shown that Boosting may also be viewed as a method for functional estimation. The purpose of the present study was to compare logistic regression estimated by maximum likelihood (LRMML) with logistic regression estimated using the Boosting algorithm, specifically the Binomial Boosting algorithm (LRMBB), and to select the model with the better fit and discrimination capacity for predicting the presence (absence) of a given property, i.e. binary classification. As an illustration, the example used was to classify the presence (absence) of coronary heart disease (CHD) as a function of various biological variables collected from patients. The simulation results indicate that the LRMBB model is more appropriate than the LRMML model for fitting data sets with several covariables and noisy data: the LRMBB model shows lower values of the AIC and BIC information criteria, and the Hosmer-Lemeshow test exhibits no evidence of a bad fit for it. The LRMBB model also presented higher AUC, sensitivity, specificity, and accuracy, and lower false-positive and false-negative rates, making it a model with better discrimination power than the LRMML model. Based on these results, the logistic model adjusted via the Binomial Boosting algorithm (LRMBB model) is better suited to describing binary-response problems, because it provides more accurate information regarding the problem considered.
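Gradient boosting fitted with the logistic (binomial deviance) loss is a stagewise relative of the LRMBB idea: the model is estimated by boosting rather than by maximum likelihood. The scikit-learn sketch below uses tree base learners rather than the componentwise linear learners of true Binomial Boosting, so it is an analogue, not a reproduction; on synthetic noisy data with a non-linear signal it illustrates the regime where a plain maximum-likelihood logistic regression struggles.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(4)
# Non-linear (interaction) signal plus noise: a linear logit cannot
# represent the product term, while boosted trees can
X = rng.normal(size=(800, 5))
y = ((X[:, 0] * X[:, 1] + 0.5 * rng.normal(size=800)) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=4)

# Boosted model: stagewise minimisation of the binomial deviance
boost = GradientBoostingClassifier(random_state=4).fit(X_tr, y_tr)
auc_boost = roc_auc_score(y_te, boost.predict_proba(X_te)[:, 1])

# Maximum-likelihood logistic regression for comparison
logit = LogisticRegression().fit(X_tr, y_tr)
auc_logit = roc_auc_score(y_te, logit.predict_proba(X_te)[:, 1])
```

Comparing `auc_boost` and `auc_logit` mirrors the AUC comparison the study reports between its LRMBB and LRMML models.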
Article
We assess whether two popular accounting-based measures, Altman’s (1968) Z-Score and Ohlson’s (1980) O-Score, effectively summarize publicly-available information about the probability of bankruptcy. We compare the relative information content of these Scores to a market-based measure of the probability of bankruptcy that we develop based on the Black–Scholes–Merton option-pricing model, BSM-Prob. Our tests show that BSM-Prob provides significantly more information than either of the two accounting-based measures. This finding is robust to various modifications of Z-Score and O-Score, including updating the coefficients, making industry adjustments, and decomposing them into their lagged levels and changes. We recommend that researchers use BSM-Prob instead of Z-Score and O-Score in their studies and provide the SAS code to calculate BSM-Prob.
Article
This paper presents an alternative technique for financial distress prediction systems. The method is based on a type of neural network, which is called hybrid associative memory with translation. While many different neural network architectures have successfully been used to predict credit risk and corporate failure, the power of associative memories for financial decision-making has not been explored in any depth as yet. The performance of the hybrid associative memory with translation is compared to four traditional neural networks, a support vector machine and a logistic regression model in terms of their prediction capabilities. The experimental results over nine real-life data sets show that the associative memory here proposed constitutes an appropriate solution for bankruptcy and credit risk prediction, performing significantly better than the rest of the models under class imbalance and data overlapping conditions in terms of the true positive rate and the geometric mean of true positive and true negative rates.
Article
Ensemble techniques such as bagging or boosting, which are based on combinations of classifiers, make it possible to design models that are often more accurate than those that are made up of a unique prediction rule. However, the performance of an ensemble solely relies on the diversity of its different components and, ultimately, on the algorithm that is used to create this diversity. It means that such models, when they are designed to forecast corporate bankruptcy, do not incorporate or use any explicit knowledge about this phenomenon that might supplement or enrich the information they are likely to capture. This is the reason why we propose a method that is precisely based on some knowledge that governs bankruptcy, using the concept of “financial profiles”, and we show how the complementarity between this technique and ensemble techniques can improve forecasts.
Article
Credit scoring aims to assess the risk associated with lending to individual consumers. Recently, ensemble classification methodology has become popular in this field. However, most studies utilize random sampling to generate training subsets for constructing the base classifiers. Therefore, their diversity is not guaranteed, which may lead to a degradation of overall classification performance. In this paper, we propose an ensemble classification approach based on supervised clustering for credit scoring. In the proposed approach, supervised clustering is employed to partition the data samples of each class into a number of clusters. Clusters from different classes are then pairwise combined to form a number of training subsets. In each training subset, a specific base classifier is constructed. For a sample whose class label needs to be predicted, the outputs of these base classifiers are combined by weighted voting. The weight associated with a base classifier is determined by its classification performance in the neighborhood of the sample. In the experimental study, two benchmark credit data sets are adopted for performance evaluation, and an industrial case study is conducted. The results show that compared to other ensemble classification methods, the proposed approach is able to generate base classifiers with higher diversity and local accuracy, and improve the accuracy of credit scoring.
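The weighted-voting step described above can be sketched as follows; `weighted_vote` and its arguments are hypothetical names, with the weights standing in for each base classifier's accuracy in the neighborhood of the sample being scored:

```python
def weighted_vote(predictions, local_accuracies):
    """Combine base-classifier outputs by weighted voting.

    predictions: predicted class label from each base classifier.
    local_accuracies: weight for each classifier, e.g. its accuracy
    in the neighborhood of the sample being classified.
    """
    scores = {}
    for label, weight in zip(predictions, local_accuracies):
        # Each classifier contributes its weight to its predicted class.
        scores[label] = scores.get(label, 0.0) + weight
    # Return the class with the largest accumulated weight.
    return max(scores, key=scores.get)
```

With weights of 0.9, 0.4, 0.4, a single locally accurate classifier voting "good" outweighs two weaker classifiers voting "bad"; lower its weight to 0.5 and the majority wins instead.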
Focusing on credit risk modelling, this paper introduces a novel approach to ensemble modelling based on a normative linear pooling. Models are first classified as dominant or competitive, and the pooling is run using the competitive models only. Numerical experiments based on parametric (logit, Bayesian model averaging) and nonparametric (classification tree, random forest, bagging, boosting) model comparisons show that the proposed ensemble performs better than alternative approaches, in particular when different modelling cultures are mixed together (logit and classification tree).
Article
Effective bankruptcy prediction is critical for financial institutions to make appropriate lending decisions. In general, the input variables (or features), such as financial ratios, and the prediction techniques, such as statistical and machine learning techniques, are the two most important factors affecting prediction performance. While many related works have proposed novel prediction techniques, very few have analyzed the discriminatory power of the features related to bankruptcy prediction. In the literature, in addition to financial ratios (FRs), corporate governance indicators (CGIs) have been found to be another important type of input variable. However, the prediction performance obtained by combining CGIs and FRs has not been fully examined. Only some selected CGIs and FRs have been used in related studies, and the chosen features may differ from study to study. Therefore, the aim of this paper is to assess the prediction performance obtained by combining seven different categories of FRs and five different categories of CGIs. The experimental results, based on a real-world dataset from Taiwan, show that the FR categories of solvency and profitability and the CGI categories of board structure and ownership structure are the most important features in bankruptcy prediction. Specifically, the best model performance is obtained with a combination of these categories, in terms of prediction accuracy, Type I/II errors, the ROC curve, and misclassification cost. However, these findings may not be applicable in markets where the definition of distressed companies is unclear and the characteristics of corporate governance indicators are not obvious, such as the Chinese market.
Article
Business health prediction is critical and challenging in today's volatile environment, and thus demands going beyond classical business failure studies, which are underpinned by rigidities such as paired sampling, a-priori predictors, and rigid binary categorization, among others. In response, our paper proposes an investor-facing dynamic model for characterizing business health using a mixed set of techniques, combining both classical and "expert system" methods. Data for constructing the model were obtained from 198 multinational manufacturing and service firms spread over 26 industrial sectors, through the Wharton database. The novel 4-stage methodology developed combines a powerful stagewise regression for dynamic predictor selection, a linear regression for modelling expert ratings of firms' stock value, an SVM model developed from an unmatched sample of firms, and finally an SVM-probability model for continuous classification of business health. This hybrid methodology reports comparably higher classification and prediction accuracies (over 0.96 and ~90%, respectively) and predictor extraction rate (~96%). It can also objectively identify and constitute new, unsought variables to explain and predict the behaviour of business subjects. Among other results, such a volatile model built upon a stable methodology can influence business practitioners in a number of ways to monitor and improve financial health. Future research can concentrate on adding a time variable to the financial model along with more sector-specificity.
Article
There is great discussion but little consensus on the best measures of organizational performance. This book redresses this imbalance. Measuring Organizational Performance offers a framework with which to better understand the implications of selecting variables for use in both empirical studies and practice where organizational financial performance is the critical issue. © Robert B. Carton and Charles W. Hofer 2006. All rights reserved.
Article
In parallel to the increase in the number of credit card transactions, financial losses due to fraud have also increased. Thus, credit card fraud detection has become popular among both academicians and banks. Many supervised learning methods have been introduced in the credit card fraud literature, some of which rely on quite complex algorithms. Compared to complex algorithms, which tend to over-fit the dataset they are built on, simpler algorithms can be expected to show more robust performance across a range of datasets. Although linear discriminant functions are less complex classifiers and can work on high-dimensional problems like credit card fraud detection, they have not received considerable attention so far. This study investigates a linear discriminant, called the Fisher Discriminant Function, for the first time in the credit card fraud detection problem. In this and some other domains, the cost of false negatives is much higher than that of false positives and differs for each transaction. Thus, it is necessary to develop classification methods that are biased toward the most important instances. To cope with this, a Modified Fisher Discriminant Function is proposed in this study, which makes the traditional function more sensitive to the important instances. This way, the profit that can be obtained from a fraud/legitimate classifier is maximized. Experimental results confirm that the Modified Fisher Discriminant yields more profit.
Article
The aim of bankruptcy prediction in the areas of data mining and machine learning is to develop an effective model that provides high prediction accuracy. In the prior literature, various classification techniques have been developed and studied, among which classifier ensembles, which combine multiple classifiers, have outperformed many single classifiers. However, in terms of constructing classifier ensembles, there are three critical issues that can affect their performance: the classification technique actually adopted, the combination method used to combine multiple classifiers, and the number of classifiers to be combined. Since there are few relevant studies examining these issues, this paper conducts a comprehensive comparison of classifier ensembles built from three widely used classification techniques, namely multilayer perceptron (MLP) neural networks, support vector machines (SVM), and decision trees (DT), based on two well-known combination methods, bagging and boosting, and different numbers of combined classifiers. Our experimental results on three public datasets show that DT ensembles composed of 80-100 classifiers using the boosting method perform best. The Wilcoxon signed rank test also demonstrates that DT ensembles by boosting perform significantly differently from the other classifier ensembles. Moreover, a further study of a real-world case based on a Taiwan bankruptcy dataset was conducted, which also demonstrates the superiority of DT ensembles by boosting over the others.
Article
The restaurant industry has been facing tough challenges because of the recent economic turmoil. Although different industries face different levels of competition and therefore the likelihood of financial distress can differ for firms in different industries, scant attention has been paid to predicting restaurant financial distress. The primary objective of this paper is to examine the key financial distress factors for publicly traded U.S. restaurants for the period from 1988 to 2010 using decision trees (DT) and AdaBoosted decision trees. The AdaBoosted DT model for the entire dataset revealed that financially distressed restaurants relied more heavily on debt; and showed lower rates of increase of assets, lower net profit margins, and lower current ratios than non-distressed restaurants. A larger proportion of debt in the capital structure ruined restaurants' financial structure and the inability to pay their drastically increased debt exposed restaurants to financial distress. Additionally, a lack of capital efficiency increased the possibility of financial distress. We recommend the use of the AdaBoosted DT model as an early warning system for restaurant distress prediction because the AdaBoosted DT model demonstrated the best prediction performance with the smallest error in overall and type I error rates. The results of two subset models for full-service and limited-service restaurants indicated that the segments had slightly different financial risk factors.
Article
We develop a model of neural networks to study the bankruptcy of U.S. banks, taking into account the specific features of the recent financial crisis. We combine multilayer perceptrons and self-organizing maps to provide a tool that displays the probability of distress up to three years before bankruptcy occurs. Based on data from the Federal Deposit Insurance Corporation between 2002 and 2012, our results show that failed banks are more concentrated in real estate loans and have more provisions. Their situation is partially due to risky expansion, which results in less equity and interest income. After drawing the profile of distressed banks, we develop a model to detect failures and a tool to assess bank risk in the short, medium and long term using bankruptcies that occurred from May 2012 to December 2013 in U.S. banks. The model can detect 96.15% of the failures in this period and outperforms traditional models of bankruptcy prediction.
Article
In classification or prediction tasks, the data imbalance problem is frequently observed when most instances belong to one majority class. The data imbalance problem has received considerable attention in the machine learning community because it is one of the main causes of degraded performance in classifiers and predictors. In this paper, we propose a geometric mean based boosting algorithm (GMBoost) to resolve the data imbalance problem. GMBoost enables learning that considers both the majority and minority classes because it uses the geometric mean of both classes in its error rate and accuracy calculations. To evaluate the performance of GMBoost, we have applied it to a bankruptcy prediction task. The results and their comparative analysis with AdaBoost and cost-sensitive boosting indicate that GMBoost has the advantages of high prediction power and robust learning capability on imbalanced as well as balanced data distributions.
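A minimal sketch of the geometric-mean criterion that GMBoost builds on (the function name is illustrative): the measure is the geometric mean of the true positive and true negative rates, so a classifier that ignores the minority class scores zero regardless of its plain accuracy:

```python
from math import sqrt

def g_mean(tp, fn, tn, fp):
    """Geometric mean of the true positive and true negative rates,
    computed from confusion-matrix counts."""
    tpr = tp / (tp + fn)   # sensitivity on the (minority) positive class
    tnr = tn / (tn + fp)   # specificity on the (majority) negative class
    return sqrt(tpr * tnr)
```

For example, a classifier that labels all 1,100 samples as the majority class achieves over 90% plain accuracy on a 100/1,000 split, yet `g_mean(0, 100, 1000, 0)` is 0, which is why an imbalance-aware boosting criterion prefers the geometric mean.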
Article
Many bankruptcy forecasting models have been studied. Most of them use corporate financial data and are intended for general companies, so they may not be appropriate for forecasting the bankruptcy of construction companies, which have large liquidity requirements and a different capital structure; a model built to judge the financial risk of general companies can be difficult to apply to construction companies. Existing studies, such as the traditional Z-score and bankruptcy prediction using machine learning, focus on companies in nonspecific industries, and the characteristics of the companies are not considered at all. In this paper, we show that AdaBoost (adaptive boosting) is an appropriate model for judging the financial risk of Korean construction companies. We classified construction companies into three groups - large, middle, and small - based on the capital of the company, and analyzed the predictive ability of AdaBoost and other algorithms for each group. The experimental results showed that AdaBoost has more predictive power than the others, especially for the group of large companies with capital of more than 50 billion won.
Article
Corporate going-concern opinions are not only useful in predicting bankruptcy but also provide some explanatory power in predicting bankruptcy resolution. The prediction of a firm's ability to remain a going concern is an important and challenging issue that has served as the impetus for many academic studies over the last few decades. Although intellectual capital (IC) is generally acknowledged as the key factor contributing to a corporation's ability to remain a going concern, it has not been considered in early prediction models. The objective of this study is to increase the accuracy of going-concern prediction by using a hybrid random forest (RF) and rough set theory (RST) approach, while adopting IC as a predictive variable. The results show that this proposed hybrid approach has the best classification rate and the lowest occurrence of Types I and II errors, and that IC is indeed valuable for going-concern prediction.
Article
We investigated the performance of parametric and non-parametric methods concerning the in-sample pricing and out-of-sample prediction performance of index options. Comparisons were performed on KOSPI 200 Index options from January 2001 to December 2010. To verify the statistical differences between the compared methods, we tested the null hypothesis that two series of forecasting errors have the same mean-squared value. The experimental study reveals that non-parametric methods significantly outperform parametric methods in both in-sample pricing and out-of-sample prediction. The best-performing non-parametric method is statistically different from the other models, and significantly different from the parametric models. The Gaussian process model delivers the most outstanding performance in forecasting, and also provides the predictive distribution of option prices.
Article
With the recent financial crisis and European debt crisis, corporate bankruptcy prediction has become an increasingly important issue for financial institutions. Many statistical and intelligent methods have been proposed; however, no single method has emerged as the overall best for predicting corporate bankruptcy. Recent studies suggest that ensemble learning methods may have potential applicability in corporate bankruptcy prediction. In this paper, a new and improved Boosting method, FS-Boosting, is proposed to predict corporate bankruptcy. By injecting a feature selection strategy into Boosting, FS-Boosting achieves better performance because its base learners attain greater accuracy and diversity. For testing and illustration purposes, two real-world bankruptcy datasets were selected to demonstrate the effectiveness and feasibility of FS-Boosting. Experimental results reveal that FS-Boosting could be used as an alternative method for corporate bankruptcy prediction.
Article
Seasonality effects and empirical regularities in financial data have been well documented in the financial economics literature for over seven decades. This paper proposes an expert system that uses novel machine learning techniques to predict the price return over these seasonal events, and then uses these predictions to develop a profitable trading strategy. While simple approaches to trading these regularities can prove profitable, such trading leads to potentially large drawdowns (the peak-to-trough decline of an investment, measured as a percentage between the peak and the trough) in profit. In this paper, we introduce an automated trading system based on performance-weighted ensembles of random forests that improves the profitability and stability of trading seasonality events. An analysis of various regression techniques is performed, as well as an exploration of the merits of various techniques for expert weighting. The performance of the models is analysed using a large sample of stocks from the DAX. The results show that recency-weighted ensembles of random forests produce superior results in terms of both profitability and prediction accuracy compared with other ensemble techniques. It is also found that modelling seasonality effects explicitly produces superior results compared with not modelling them.
Article
This paper extends the macroeconomic frailty model to include sectoral frailty factors that capture default correlations among firms in a similar business. We estimate sectoral and macroeconomic frailty factors and their effects on default intensity using the data for Japanese firms from 1992 to 2010. We find strong evidence for the presence of sectoral frailty factors even after accounting for the effects of observable covariates and macroeconomic frailty on default intensity. The model with sectoral frailties performs better than that without. Results show that accounting for the sources of unobserved sectoral default risk covariations improves the accuracy of default probability estimation.
Article
Consumer credit scoring is often considered a classification task where clients receive either a good or a bad credit status. Default probabilities provide more detailed information about the creditworthiness of consumers, and they are usually estimated by logistic regression. Here, we present a general framework for estimating individual consumer credit risks by use of machine learning methods. Since a probability is an expected value, all nonparametric regression approaches which are consistent for the mean are consistent for the probability estimation problem. Among others, random forests (RF), k-nearest neighbors (kNN), and bagged k-nearest neighbors (bNN) belong to this class of consistent nonparametric regression approaches. We apply the machine learning methods and an optimized logistic regression to a large dataset of complete payment histories of short-term installment credits. We demonstrate probability estimation in Random Jungle, an RF package written in C++ with a generalized framework for fast tree growing, probability estimation, and classification. We also describe an algorithm for tuning the terminal node size for probability estimation. We demonstrate that regression RF outperforms the optimized logistic regression model, kNN, and bNN on the test data of the short-term installment credits.
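As a rough illustration of why consistent nonparametric regression estimates default probabilities (a probability is an expected value, so averaging 0/1 outcome labels estimates it), here is a toy k-nearest-neighbour estimator in plain Python; the function name and data layout are hypothetical, not the Random Jungle implementation:

```python
def knn_default_prob(x, training_data, k=5):
    """Estimate P(default | x) as the mean default label among the
    k nearest training examples (squared Euclidean distance).

    training_data: iterable of (feature_tuple, label) pairs,
    with label 1 for default and 0 for repayment.
    """
    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    # Sort training examples by distance to x and keep the k closest.
    nearest = sorted(training_data, key=lambda row: sq_dist(row[0], x))[:k]
    # The average of the 0/1 labels is the probability estimate.
    return sum(label for _, label in nearest) / k
```

A query point deep inside a cluster of repaying clients gets a probability near 0, one inside a cluster of defaulters a probability near 1; bagging many such estimators (bNN) or regression trees (RF) smooths these local averages further.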