Extracting Discriminative Features Using Non-negative Matrix Factorization in Financial Distress Data

DOI: 10.1007/978-3-642-04921-7_55
In book: Adaptive and Natural Computing Algorithms, pp. 537-547
Source: DBLP


In the recent financial crisis, a number of high-profile bankruptcies led to growing interest in corporate bankruptcy
prediction models. Beyond building appropriate financial distress prediction models, it is also extremely important
to devise dimensionality reduction methods able to extract the most discriminative features. Here we show that Non-Negative
Matrix Factorization (NMF) is a powerful technique for extracting features in this financial setting. NMF decomposes
multivariate financial data into a few basis functions and encodings under non-negativity constraints.
We propose an approach that first initializes NMF from the original data using K-means clustering, and then
builds a bankruptcy prediction model using the discriminative financial ratios extracted by the NMF decomposition.
Predictive accuracies on a real database of French companies labeled as one of two classes (healthy and distressed)
illustrate the effectiveness of our approach.
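
The two steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which NMF update rule, which K-means variant, or which classifier was used, so the Lee-Seung multiplicative updates (Frobenius objective), the plain Lloyd K-means, and the nearest-centroid classifier below are all assumptions. The data are synthetic non-negative values standing in for financial ratios, and every function and variable name is hypothetical.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's K-means; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every sample to every centroid.
        d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids, labels

def nmf_kmeans_init(X, k, n_iter=200, eps=1e-9, seed=0):
    """Factor non-negative X (samples x features) as W @ H, with W, H >= 0.

    H (k x features) holds the basis vectors, W (samples x k) the encodings.
    Initialization from K-means is one possible reading of the abstract.
    """
    centroids, labels = kmeans(X, k, seed=seed)
    H = centroids + eps                    # basis starts at the cluster centroids
    W = np.full((len(X), k), 0.1)          # small positive background weight
    W[np.arange(len(X)), labels] = 1.0     # emphasize the assigned cluster
    errors = []
    for _ in range(n_iter):
        # Multiplicative updates (assumed rule); they preserve non-negativity.
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        errors.append(np.linalg.norm(X - W @ H))
    return W, H, errors

def nearest_centroid_predict(F_train, y_train, F_test):
    """Stand-in classifier: assign each sample to the closest class mean."""
    classes = np.unique(y_train)
    means = np.stack([F_train[y_train == c].mean(axis=0) for c in classes])
    d = ((F_test[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    return classes[d.argmin(axis=1)]

# Synthetic stand-in data: 60 "healthy" (0) and 60 "distressed" (1) firms,
# 6 non-negative ratios each. Not the French data set from the paper.
rng = np.random.default_rng(42)
healthy = rng.gamma(shape=2.0, scale=1.0, size=(60, 6))
distressed = rng.gamma(shape=2.0, scale=1.0, size=(60, 6))
distressed[:, :3] *= 3.0                   # shift three ratios for class 1
X = np.vstack([healthy, distressed])
y = np.array([0] * 60 + [1] * 60)

W, H, errors = nmf_kmeans_init(X, k=3)
pred = nearest_centroid_predict(W, y, W)   # evaluated on training data only
accuracy = (pred == y).mean()
print(f"reconstruction error: {errors[0]:.3f} -> {errors[-1]:.3f}")
print(f"training accuracy on NMF encodings: {accuracy:.2f}")
```

The encodings in W play the role of the extracted features that feed the prediction model. A real evaluation would hold out test data rather than score on the training set as this sketch does.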

  • "However, nonlinear projection methods have been successfully used [16] making them more suitable for this problem. With the same goal, non-negative matrix factorization (NMF) is used in [15] for extracting the most discriminative features. Despite the numerous papers dealing with the problem it is often difficult to compare the techniques due to different data sets, algorithms and approaches."
    ABSTRACT: In recent years the potential and programmability of Graphics Processing Units (GPUs) have attracted noteworthy interest in the research community for applications that demand high computational power. In particular, financial applications involving thousands of high-dimensional samples often rely on machine learning techniques such as neural networks. One of their main limitations is that the learning phase can be extremely time-consuming: the long training times required constitute a hard bottleneck for their use in practice. Implementing them on graphics hardware is therefore highly desirable as a way to speed up training. In this paper we present a bankruptcy prediction model based on a parallel implementation of the Multiple BackPropagation (MBP) algorithm, tested on a real data set of French companies (healthy and bankrupt). Results from running the MBP algorithm in a sequential CPU version and in a parallel GPU implementation show that the latter reduces computational cost while yielding very competitive performance.
    Conference Paper · Aug 2010
  • "ISOMAP) have been successfully used [14] making them more suitable for this problem. With the same goal, non-negative matrix factorization (NMF) is used in [12] for extracting the most discriminative features. While the forecast of bankruptcy is of paramount importance to all stakeholders, to estimate the probability of a corporate failure can prevent the adverse effects that such event can provoke."
    ABSTRACT: Financial distress prediction is of great importance to all stakeholders because it enables better decision-making when evaluating firms. In recent years the rate of bankruptcy has risen, and it is becoming harder to estimate as companies become more complex and the information asymmetry between banks and firms increases. Although a great variety of techniques have been applied over the years, no comprehensive method incorporating a holistic perspective had hitherto been considered. Recently, SVM+, a technique proposed by Vapnik [17], has provided a formal way to incorporate privileged information into learning models, improving generalization. By exploiting additional information to improve traditional inductive learning, we propose a prediction model in which data is naturally separated into several groups according to the size of the firm. Experimental results on a heterogeneous data set of French companies demonstrate that the proposed model achieves superior performance in bankruptcy prediction in terms of both prediction accuracy and misclassification cost.
    Conference Paper · Jul 2010
  • ABSTRACT: Text classification has received increasing interest over the past decades for its wide range of applications, driven by the ubiquity of textual information. The high dimensionality of those applications has led to pervasive use of dimensionality reduction methods, often black-box, non-linear feature extraction techniques. We show how Non-Negative Matrix Factorization (NMF), an algorithm able to learn a parts-based representation of data by imposing non-negativity constraints, can be used to represent and extract knowledge from a text classification problem. The resulting reduced set of features is tested with kernel-based machines on the Reuters-21578 benchmark, showing that the method performs competitively.
    Conference Paper · Sep 2009