Extracting Discriminative Features Using Non-negative Matrix Factorization in Financial Distress Data

DOI: 10.1007/978-3-642-04921-7_55
Source: DBLP

ABSTRACT In the recent financial crisis, the incidence of major bankruptcy cases led to growing interest in corporate bankruptcy
prediction models. Beyond building appropriate financial distress prediction models, it is also extremely important
to devise dimensionality reduction methods able to extract the most discriminative features. Here we show that Non-Negative
Matrix Factorization (NMF) is a powerful technique for extracting features in this financial setting. NMF
decomposes multivariate financial data into a few basis functions and encodings under non-negativity constraints.
We propose an approach that first initializes NMF from the original data using K-means clustering and
then builds a bankruptcy prediction model using the discriminative financial ratios extracted by the NMF decomposition.
Predictive accuracies, evaluated on a real database of French companies labeled as one of two classes (healthy or distressed),
illustrate the effectiveness of our approach.
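
The first step described in the abstract, seeding NMF from K-means, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not say how the cluster centroids are mapped onto the factors, so the choice here (centroids as the initial basis H, near one-hot cluster memberships as the initial encodings W) is an assumption, as are the function names `kmeans` and `nmf_kmeans_init`, the Lee-Seung multiplicative updates, and the synthetic data standing in for the financial ratios. The input is assumed to be already scaled to non-negative values.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means. Returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Squared Euclidean distance from every sample to every centroid.
        d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids, labels

def nmf_kmeans_init(V, k, n_iter=200, eps=1e-9, seed=0):
    """Approximate non-negative V (samples x features) as W @ H.

    H (k x features) holds the basis vectors, W (samples x k) the encodings.
    Assumption: H starts at the k-means centroids and W at near one-hot
    cluster memberships; the source abstract does not specify this mapping.
    """
    centroids, labels = kmeans(V, k, seed=seed)
    H = centroids + eps                       # strictly positive start
    W = np.full((V.shape[0], k), 0.1)
    W[np.arange(V.shape[0]), labels] = 1.0
    for _ in range(n_iter):
        # Multiplicative updates for the Frobenius-norm objective.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic stand-in for a (companies x financial ratios) matrix.
rng = np.random.default_rng(42)
V = rng.random((60, 8))
W, H = nmf_kmeans_init(V, k=3)
print("reconstruction error:", np.linalg.norm(V - W @ H))
```

For the second step, the rows of `W` would serve as the reduced feature vectors passed to whichever classifier is chosen to separate healthy from distressed companies.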

  • ABSTRACT: Measuring the electrical consumption of individual appliances in a household has recently received renewed interest in the area of energy efficiency research and sustainable development. The unambiguous acquisition of information by a single monitoring point of the whole house's electrical signal is known as energy disaggregation or nonintrusive load monitoring. A novel way to look into the issue of energy disaggregation is to interpret it as a single-channel source separation problem. To this end, we analyze the performance of source modeling based on multiway arrays and the corresponding decomposition or tensor factorization. First, with the proviso that a tensor composed of the data for the several devices in the house is given, nonnegative tensor factorization is performed in order to extract the most relevant components. Second, the outcome is embedded in the test step, where only the measured consumption over the whole home is available. Finally, the per-device disaggregated data are obtained by factorizing the associated matrix using the learned models. In this paper, we compare this method with a recent approach based on sparse coding. The results are obtained using real-world data from household electrical consumption measurements. The analysis of the comparison results illustrates the relevance of the multiway array-based approach in terms of accurate disaggregation, as further endorsed by the statistical analysis performed.
    IEEE Transactions on Instrumentation and Measurement 02/2014; 63(2):364-373. DOI:10.1109/TIM.2013.2278596 · 1.71 Impact Factor
  • ABSTRACT: Non-negative Matrix Factorization (NMF) is an unsupervised technique that projects data into lower dimensional spaces, effectively reducing the number of features of a dataset while retaining the basis information necessary to reconstruct the original data. In this paper we present a semi-supervised NMF approach that reduces the computational cost while improving the accuracy of NMF-based models. The advantages inherent to the proposed method are supported by the results obtained in two well-known face recognition benchmarks.
    01/2011; DOI:10.1109/IJCNN.2011.6033543
  • ABSTRACT: Default risk models have lately attracted great interest due to the recent world economic crisis. Although many advanced techniques have been proposed, no comprehensive method incorporating a holistic perspective has hitherto been considered. Thus, the existing models for bankruptcy prediction lack full coverage of contextual knowledge, which may prevent decision makers such as investors and financial analysts from making the right decisions. The recently proposed SVM+ provides a formal way to incorporate additional information (not only training data) into the learning models, improving generalization. In financial settings, examples of such non-financial (though relevant) information are marketing reports, competitive landscape, economic environment, customer screening, industry trends, etc. By exploiting additional information able to improve classical inductive learning, we propose a prediction model where data is naturally separated into several structured groups clustered by the size and annual turnover of the firms. Experimental results on a heterogeneous data set of French companies demonstrated that the proposed default risk model achieved better predictive performance than the baseline SVM and multi-task learning with SVM.
    Expert Systems with Applications 09/2012; 39(11):10140–10152. DOI:10.1016/j.eswa.2012.02.142 · 1.97 Impact Factor