A Fourier Spectral Pattern Analysis to Design Credit Scoring Models
Roberto Saia and Salvatore Carta
Dipartimento di Matematica e Informatica - Università di Cagliari
Via Ospedale 72, 09124 Cagliari - Italy
{roberto.saia,salvatore}@unica.it
ABSTRACT
The increase of consumer credit has made it necessary to research more and more effective credit scoring models. Such models are usually trained on the past loan applications and evaluate the new ones on the basis of certain criteria. Although the state of the art offers several different approaches for their definition, this process remains a hard challenge for several reasons. The most important ones are the data imbalance between the default and the non-default cases, which reduces the effectiveness of almost all techniques, and the data heterogeneity, which makes it difficult to define a model able to effectively evaluate all the new loan applications. The approach proposed in this paper faces the aforementioned problems by moving the evaluation process from the canonical time domain to a frequency one, using a model based only on the past non-default loan applications. This allows us to overcome the data imbalance problem by exploiting a single class of data, while also defining a model that is less influenced by the data heterogeneity. The performed experiments show interesting results, since the proposed approach achieves performance close to or better than that of one of the best state-of-the-art credit scoring approaches, random forests, although it operates in a proactive way, exploiting only the past non-default cases.
CCS CONCEPTS
• Information systems → Data stream mining; Clustering and classification; Business intelligence; • Theory of computation → Pattern matching; • General and reference → Metrics;
KEYWORDS
Business intelligence, credit scoring, imbalanced datasets, classifi-
cation, metrics
ACM Reference format:
Roberto Saia and Salvatore Carta. 2017. A Fourier Spectral Pattern Analy-
sis to Design Credit Scoring Models. In Proceedings of IML ’17, Liverpool,
United Kingdom, October 17-18, 2017, 10 pages.
DOI: 10.1145/3109761.3109779
1 INTRODUCTION
A credit scoring process aims to evaluate, in terms of reliability, a new loan application (from now on simply referred to as an instance),
and its outcome determines whether the application is accepted or rejected. It is clear that the effectiveness of such models is strongly related to the gains and losses of the involved financial operators [24], especially in recent years, characterized by an increasing use of consumer credit, which is obviously correlated with an increase of defaulted cases (i.e., loans that have been fully or partially not repaid). An ideal credit scoring approach should be able to correctly classify the new instances into two classes (accepted or rejected), on the basis of the information given by the past instances.
In other words, the credit scoring techniques can be considered
a set of statistical tools able to calculate the probability that an in-
stance leads toward a default case [26, 35], allowing the financial
operators to evaluate the credit risk [20] and to monitor the credit
activities [8].
The definition of effective credit scoring models represents a hard challenge due to several problems, the most important of which is the imbalance in the data used during the model training process [4]: such data sources are composed of a small number of default cases and a large number of non-default ones, an unbalanced configuration that reduces the effectiveness of almost all machine learning approaches [28].
The idea behind this paper is to move the process of evaluation of
the new instances from the canonical time domain to the frequency
one, performing the spectral analysis through the Fourier transfor-
mation [19]. It is performed by using the Fast Fourier Transform
(FFT) algorithm, which allows us to move a time series (i.e. in our
case, a sequence of discrete-time data represented by the feature
values of an instance) from its original time domain to a frequency
one, where we can study the data from a different point of view.
Considering that the model used in the process of evaluating the new instances is defined by using only one class of data (the non-default cases), such an approach offers a threefold advantage: it allows us to operate proactively, it faces the problems related to the cold-start issue (i.e., the scarcity or absence of default cases), and it reduces the issues related to the data heterogeneity, because the new representation in the frequency domain is less influenced by data variation. We compare our approach to the Random Forests one, since in most of the cases reported in the literature [5, 9, 32] it outperforms the other credit scoring approaches.
The main key contributions of this paper are listed below:
(i) definition of the time series to be used in the Fast Fourier
Transform (FFT) process, made on the basis of the data that
compose each instance in the considered datasets;
(ii) formalization of the Fourier spectral analysis comparison pro-
cess, performed in terms of frequency magnitude difference
measured between the time series in the set of the past non-
default instances and that in an unevaluated one;
(iii) definition of the Dynamic Feature Selection (DFS) process,
aimed to assign a different weight to each frequency compo-
nent of the previously extracted instance spectrum, on the ba-
sis of their relevance (in terms of entropy) in the evaluation
process;
(iv) formalization of the Fourier Spectrum Pattern (FSP) algo-
rithm able to classify the new instances as accepted or re-
jected by exploiting the previous spectral analysis, the DFS
process, and the tolerance range ρ.
The remainder of the paper is organized as follows: Section 2
discusses the background and related work; Section 3 provides a
formal notation, makes some premises, and defines the faced prob-
lem; Section 4 describes the implementation of the proposed ap-
proach; Section 5 provides details on the experimental environment,
on the used datasets and metrics, as well as on the adopted strategy
and the chosen competitor, reporting the experimental results at the
end; some concluding remarks and future work are given in the last
Section 6.
2 BACKGROUND AND RELATED WORK
Recent literature proposes numerous classification techniques able to operate in the credit scoring context [18], as well as a large number of studies aimed at evaluating their performance [32], also taking into account the optimal configuration of the involved parameters [2] and the metrics to be used to evaluate the performance [22].
The two most important advantages derived from the adoption
of credit scoring techniques [40] are the capability to infer when it
is reasonable (in terms of potential risks) to grant a loan to some-
one, and the capability to define models able to infer the customer
behavior, information that can be used to propose targeted financial
services. In the context of this paper we take into account only the
first one.
2.1 Credit Scoring Models
In order to perform a credit scoring process it is possible to exploit many state-of-the-art techniques commonly used in the statistics and data mining fields [1, 10]. Some significant examples are the linear discriminant models [38], the logistic regression models [26], the neural network models [6, 16], the genetic programming models [11, 36], the k-nearest neighbor models [25], and the decision tree models [14, 46].
It should be observed that in many cases these techniques can
be combined in order to define hybrid approaches of credit scoring.
Some examples are the techniques that exploit the neural networks
and the clustering methods, presented in [27], and the two-stage
hybrid modeling procedure with artificial neural networks and mul-
tivariate adaptive regression splines, proposed in [31, 45].
2.2 Imbalanced Class Distribution
One of the most important problems that makes it difficult the def-
inition of effective models for the credit scoring is the imbalanced
class distribution of data [23, 28]. This issue is given by the fact
that the data used in order to train the models are characterized by
a small number of default cases and a big number of non-default
ones, a distribution of data that limits the performance of the classi-
fication techniques [9, 28].
This problem leads toward misclassification costs, as reported
in [44], which proposes to preprocess the training data through an
over-sampling or under-sampling of the classes, as a possible solu-
tion.
The effect of such preprocessing activity on the performance has
been studied in [12, 34].
2.3 Cold Start
The cold start issue [17, 47] arises when there is not enough information to train a reliable model for a domain. In the context of credit scoring, such a scenario arises when the data used to train the model are not representative of all classes of data [3, 43] (i.e., default and non-default cases).
This kind of issue affects many areas, e.g., those related to the
recommender systems [21, 33, 42], since they are usually based on
models defined on the basis of the previous choices of the users
(user profiles), similarly to the credit scoring context, where the
past loan applications are taken into account.
In the approach proposed in this paper the cold start issue can be
reduced/overcome by using only a class of data during the model
definition process (i.e., only the non-default cases in the training
dataset).
2.4 Random Forests
Since its formalization [7], Random Forests represents one of the
most common techniques for data analysis, thanks to its better per-
formance compared to the other state-of-the-art techniques.
This technique is an ensemble learning method for classification and regression that builds a number of randomized decision trees during the training phase and infers conclusions by averaging their results.
It is able to face a wide range of prediction problems, without
performing any complex configuration, since it only requires the
tuning of two parameters: the number of trees and the number of
attributes used to grow each tree.
2.5 Fourier Transform and Spectral Analysis
The basic idea behind the approach proposed in this paper is to
move the process of evaluation of the new instances (time series)
from their canonical time domain to the frequency one, in order to
obtain a representative pattern composed by their frequency compo-
nents, as shown in Figure 1.
This operation is performed by means of the Discrete Fourier Transform (DFT), whose formalization is shown in Equation 1, where i is the imaginary unit.
F_n \overset{def}{=} \sum_{k=0}^{N-1} f_k \cdot e^{-2\pi i n k / N}, \quad n \in \mathbb{Z} \qquad (1)
The result of the Equation 1 is a set of sinusoidal functions, each
corresponding to a particular frequency component (i.e., the spec-
trum1).
1The spectrum of the frequency components is the frequency domain representation of
a signal.
[Figure 1: Time and Frequency Domains — a signal in the time domain and its frequency-domain representation as the magnitudes of the frequency components f1, f2, ..., fX.]
If it is necessary, we can use the inverse Fourier transform shown
in Equation 2 to return to the original time domain.
f_k = \frac{1}{N} \sum_{n=0}^{N-1} F_n \cdot e^{2\pi i k n / N}, \quad k \in \mathbb{Z} \qquad (2)
The Fast Fourier Transform (FFT) algorithm, used in the context of this paper to perform the Fourier transformations, rapidly computes the DFT, or its inverse (IDFT), by factorizing the DFT matrix into a product of sparse (mostly zero) factors. It is widely used because it reduces the computational complexity of the process from O(n^2) to O(n log n), where n denotes the data size.
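As a concrete illustration of Equation 1, the following minimal Java sketch evaluates the DFT of a real-valued sequence directly (in O(n^2)); it is meant only to make the formula explicit, while the approach itself relies on an FFT implementation (see Section 5.1). The class and method names are illustrative.

```java
// Minimal sketch of the DFT in Equation 1 (direct O(n^2) evaluation, for illustration
// only; an FFT implementation is used in practice for efficiency).
public final class Dft {

    /**
     * Computes F_n = sum_{k=0}^{N-1} f_k * e^{-2*pi*i*n*k/N} for a real-valued input.
     * Returns a 2 x N array: row 0 holds the real parts, row 1 the imaginary parts.
     */
    public static double[][] forward(double[] f) {
        int n = f.length;
        double[][] out = new double[2][n];
        for (int freq = 0; freq < n; freq++) {
            double re = 0.0, im = 0.0;
            for (int k = 0; k < n; k++) {
                double angle = -2.0 * Math.PI * freq * k / n;
                re += f[k] * Math.cos(angle);
                im += f[k] * Math.sin(angle);
            }
            out[0][freq] = re;
            out[1][freq] = im;
        }
        return out;
    }

    public static void main(String[] args) {
        // A toy "instance": the sequence of its feature values, treated as a time series.
        double[] instance = {1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0, 2.0};
        double[][] spectrum = forward(instance);
        for (int i = 0; i < instance.length; i++) {
            System.out.printf("F_%d = %.3f %+.3fi%n", i, spectrum[0][i], spectrum[1][i]);
        }
    }
}
```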
3 PRELIMINARIES
Formal notation, premises, and problem statement related to this
paper are stated in the following:
3.1 Notation
Given a set of classified instances I = {i_1, i_2, ..., i_N}, and a set of features V = {v_1, v_2, ..., v_M} that compose each i ∈ I, we denote as I+ ⊆ I the subset of non-default instances, and as I− ⊆ I the subset of default ones.
We also denote as Î = {î_1, î_2, ..., î_U} a set of unclassified instances and as O = {o_1, o_2, ..., o_U} these instances after the classification process, thus |Î| = |O|.
It should be observed that an instance can belong to only one class c ∈ C, where C = {accepted, rejected}.
Finally, we denote as F = {f_1, f_2, ..., f_X} the frequency components of each instance (its spectrum), obtained at the end of the DFT process.
3.2 Premises
Periodic waves are characterized by a frequency f and a wavelength λ (i.e., the distance in the medium between the beginning and the end of a cycle, λ = w / f_0, where w stands for the wave velocity), both defined by the repeating pattern; the non-periodic waves taken into account in the Discrete Fourier Transform process do not have a frequency or a wavelength. Their fundamental period T is the interval over which the wave values were taken, and sr denotes the number of values acquired over this time (i.e., the acquisition frequency).
Assuming that the time interval between the acquisitions is constant, and applying the previous definitions to the context of this paper, the considered non-periodic wave is given by the sequence of values v_1, v_2, ..., v_M, with v ∈ V, which compose each instance i ∈ I+ (i.e., the past non-default instances) and î ∈ Î (i.e., the unevaluated instances); this sequence represents the time series taken into account.
Their fundamental period T starts with v_1 and ends with v_M, thus sr = |V|; the sample interval si is instead given by the fundamental period T divided by the number of acquisitions, i.e., si = T / |V|.
Through the FFT algorithm we compute the Discrete Fourier Transform of each time series i ∈ I+ and î ∈ Î, converting their representation from the time domain to the frequency one. The obtained frequency-domain representation provides information about the signal's magnitude and phase at each frequency. For this reason, the output (denoted as x) of the FFT computation is a series of complex numbers composed of a real part x_r and an imaginary part x_i, thus x = x_r + i·x_i.
We can obtain the magnitude of x as |x| = \sqrt{x_r^2 + x_i^2} and its phase as φ(x) = \arctan(x_i / x_r), although in the context of this paper we take into account only the magnitude at each frequency.
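The magnitude and phase defined above follow directly from the real and imaginary parts produced by the transform. The small sketch below shows both (names are illustrative; atan2 is used instead of a plain arctangent so that all quadrants are handled correctly).

```java
// Magnitude and phase of a complex spectrum value x = x_r + i*x_i, as defined above.
public final class SpectrumUtils {

    /** |x| = sqrt(x_r^2 + x_i^2) for each frequency component. */
    public static double[] magnitudes(double[] re, double[] im) {
        double[] mag = new double[re.length];
        for (int k = 0; k < re.length; k++) {
            mag[k] = Math.hypot(re[k], im[k]);
        }
        return mag;
    }

    /** phi(x) = arctan(x_i / x_r), computed with atan2 to cover all quadrants. */
    public static double[] phases(double[] re, double[] im) {
        double[] phi = new double[re.length];
        for (int k = 0; k < re.length; k++) {
            phi[k] = Math.atan2(im[k], re[k]);
        }
        return phi;
    }
}
```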
3.3 Problem Statement
On the basis of the comparison of the spectral analysis λ performed by the FFT algorithm on the time series in i ∈ I+ and î ∈ Î, our FSP approach classifies each instance î ∈ Î as accepted or rejected.
Given a function eval(î, λ), created to evaluate the correctness of the classification of î, which returns a boolean value σ (0 = misclassification, 1 = correct classification), we formalize our objective as the maximization of the sum of the results, as shown in Equation 3.

\max_{0 \le \sigma \le |\hat{I}|} \sigma = \sum_{u=1}^{|\hat{I}|} eval(\hat{i}_u, \lambda) \qquad (3)
4 PROPOSED APPROACH
The implementation of our approach is carried out through the fol-
lowing steps:
(1) Time Series Definition (TSD): definition of the time series to use in the FFT algorithm, in terms of the sequence of instance feature values;
(2) Time Series Analysis (TSA): comparison of the Fourier spectral patterns of two instances, performed by processing their time series, defined in the previous step, through the FFT algorithm;
(3) Time Series Dynamic Feature Selection (DFS): determination of the weight of each frequency component in the instance spectrum, on the basis of the Shannon entropy metric;
(4) Time Series Classification (TSC): formalization of the FSP algorithm able to classify a new instance as accepted or rejected, on the basis of the TSA comparison, the DFS process, and the tolerance range ρ.
In the following, after introducing the high-level architecture of the proposed FSP approach, we provide a detailed description of each of these steps.
IML ’17, October 17-18, 2017, Liverpool, United Kingdom Roberto Saia and Salvatore Carta
[Figure 2: FSP Architecture — the pipeline goes from the Time Series Definition, through the Fourier Transform and the Instances Comparison (driven by the DFS and ρ calculation on F(I+)), to the Instances Classification, mapping the input sets I+ and Î through T(·) and F(·) to the classified set O.]
The high-level description shown in Figure 2 briefly introduces the processes involved in our approach, which are explained in detail in the following.
According to the notation given in Section 3.1, I+ and Î denote, respectively, the set of non-default instances and the set of instances to evaluate, while the set O denotes the instances in Î after their classification. We indicate with T(I+) and T(Î) the time series related, respectively, to the instances in I+ and Î, and with F(I+) and F(Î) the sets of frequency components obtained by processing these time series through the FFT algorithm.
At the beginning, the time series related to the sets of the unevalua-
ted instances and the previous non-default instances are extracted.
They are used as input in the Fourier Transform process, obtaining
as result the spectral pattern of each instance. The classification
of the instances to evaluate is based on a comparison process per-
formed between their spectral patterns and those of the previous
non-default ones, taking into account the importance of each fre-
quency component (in terms of entropy) and a tolerance range ρ
experimentally defined.
4.1 Time Series Definition
In the first step of our approach we define the time series to use in
the Discrete Fourier Transform process.
Formally, a time series represents a series of data points stored
by following the time order and usually it is a sequence captured at
successive equally spaced points in time, thus it can be considered
a sequence of discrete-time data.
In the context of the proposed approach, the time series taken
into account are defined by using the set of features Vthat com-
pose each instance in the I+and ˆ
Isets, as shown in Equation 4, by
following the criterion reported in Equation 5.
I^+ =
\begin{pmatrix}
v_{1,1} & v_{1,2} & \cdots & v_{1,M} \\
v_{2,1} & v_{2,2} & \cdots & v_{2,M} \\
\vdots  & \vdots  & \ddots & \vdots  \\
v_{N,1} & v_{N,2} & \cdots & v_{N,M}
\end{pmatrix}
\qquad
\hat{I} =
\begin{pmatrix}
v_{1,1} & v_{1,2} & \cdots & v_{1,M} \\
v_{2,1} & v_{2,2} & \cdots & v_{2,M} \\
\vdots  & \vdots  & \ddots & \vdots  \\
v_{U,1} & v_{U,2} & \cdots & v_{U,M}
\end{pmatrix}
\qquad (4)

(v_{1,1}, v_{1,2}, \dots, v_{1,M}), (v_{2,1}, v_{2,2}, \dots, v_{2,M}), \dots, (v_{N,1}, v_{N,2}, \dots, v_{N,M})
(v_{1,1}, v_{1,2}, \dots, v_{1,M}), (v_{2,1}, v_{2,2}, \dots, v_{2,M}), \dots, (v_{U,1}, v_{U,2}, \dots, v_{U,M})
\qquad (5)
The time series related to an item î ∈ Î will be compared to the time series related to all the items i ∈ I+, following the criteria explained in the next steps.
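In code, this definition is straightforward: the time series of an instance is simply the ordered sequence of its feature values, i.e., one row of the matrices in Equation 4. A minimal sketch, with hypothetical names:

```java
// The time series fed to the Fourier transform is one row of the matrices in
// Equation 4, i.e. the ordered feature values (v_{r,1}, ..., v_{r,M}) of an instance.
public final class TimeSeriesDefinition {

    /** Time series of the r-th instance: a defensive copy of its feature row. */
    public static double[] timeSeriesOf(double[][] instances, int r) {
        return instances[r].clone();
    }

    /** T(I+) or T(Î): the time series of every instance in the given set. */
    public static double[][] timeSeriesOf(double[][] instances) {
        double[][] series = new double[instances.length][];
        for (int r = 0; r < instances.length; r++) {
            series[r] = timeSeriesOf(instances, r);
        }
        return series;
    }
}
```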
4.2 Time Series Analysis
Before we describe the process of analysis based on the Fourier
transformation, it is useful to observe the spectral pattern of an in-
stance randomly taken from a dataset, with |V|=20 (Figure 3), be-
side its canonical representation in the time domain.
The frequency domain representation allows us to perform a data
(represented by the sequence of values assumed by the instance fea-
tures, as described in Section 4.1) analysis in terms of peaks (mag-
nitudes) of the spectral frequencies that compose it. This allows us
to detect some patterns in the features, which are not discoverable
in the time domain.
Comparing the two different domains, we can observe some in-
teresting properties for the context taken into account in this paper.
The most significant are the following:
The phase invariance property shown in Figure 4 proves that, even in the case of a translation² between instances, a specific pattern still exists in the frequency spectrum. More formally, it is one of the phase properties of the Fourier transform [41], i.e., a shift of a time series in the time domain leaves the magnitude unchanged in the frequency domain. This property allows us to detect a particular pattern in the user behavior, regardless of the involved instances that compose it. A concrete example is represented by the values in the features from 6 to 11, from 12 to 17, and from 18 to 23 of the DC dataset (described in Table 2 of Section 5.2). They report sequences of values that belong to three different types of information related to the loan applicant (i.e., past repayments, bill statements, and amounts paid), and by exploiting the spectrum pattern analysis we can detect a specific pattern (behavior) even when it shifts along the features that compose one of these subsets of values;
Another interesting aspect of the frequency domain is given by the amplitude correlation property shown in Figure 5. It proves the existence of a direct correlation between the values assumed by the features in the time domain and the magnitudes of the spectral components in the frequency domain. More formally, it is the homogeneity property of the Fourier transform [41], i.e., when the amplitude is altered in one domain, it is altered by the same factor in the other domain³.
2In terms of signal it represents a change of phase, considering that a translation in
time domain corresponds to a change in phase in the frequency domain.
[Figure 3: Time and Frequency Domains — the time-domain representation of a randomly chosen instance (feature values over the series) beside its frequency-domain representation (magnitude over frequency).]
[Figure 4: Phase Invariance Property — two time series that differ only by a shift in the time domain produce the same magnitude spectrum in the frequency domain.]
[Figure 5: Amplitude Correlation Property — scaling the values of a time series in the time domain scales the magnitudes of its spectral components by the same factor.]
This property assures us of the capability of the frequency representation to differentiate the instances on the basis of the size of the values in their features.
Practically, the process of analysis is performed by moving the time series of the instances to compare from their time domain to the frequency one, using the FFT approach introduced in Section 2.5.
In this context, although there are many algorithms able to calculate the FFT, the most used are those based on the Cooley-Tukey recursive algorithm. It grants us a decimation in time on the basis of the following consideration: when the number N of input data is even, it is possible to express it as N = 2·M, allowing us to split the N-element summation of the DFT formula into two M-element ones, one over the even indices n = 2m and another over the odd indices n = 2m+1, as shown in Equation 6.

X_k = \sum_{m=0}^{M-1} x_{2m}\, e^{-\frac{2\pi i m k}{M}} + e^{-\frac{2\pi i k}{N}} \sum_{m=0}^{M-1} x_{2m+1}\, e^{-\frac{2\pi i m k}{M}} \qquad (6)

³ Scaling in one domain corresponds to scaling in the other domain.

[Figure 6: Delta Difference — the difference ∆ between the magnitudes |f_x^1| and |f_x^2| of the same frequency component f_x in two instances.]
We implement the FFT by using the JTransforms Java library, as reported in Section 5.1.
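For illustration only, the following dependency-free Java sketch shows the recursive radix-2 decimation-in-time scheme of Equation 6, assuming the input length is a power of two; it is not the JTransforms implementation actually used in the experiments.

```java
// Textbook recursive radix-2 decimation-in-time FFT (Equation 6), assuming the
// input length is a power of two. Sketch for illustration of the even/odd split.
public final class RecursiveFft {

    /** Complex value holder. */
    public static final class Complex {
        final double re, im;
        Complex(double re, double im) { this.re = re; this.im = im; }
        Complex plus(Complex o)  { return new Complex(re + o.re, im + o.im); }
        Complex minus(Complex o) { return new Complex(re - o.re, im - o.im); }
        Complex times(Complex o) {
            return new Complex(re * o.re - im * o.im, re * o.im + im * o.re);
        }
    }

    public static Complex[] fft(Complex[] x) {
        int n = x.length;
        if (n == 1) return new Complex[] { x[0] };

        // Split into even-indexed (x_{2m}) and odd-indexed (x_{2m+1}) subsequences.
        Complex[] even = new Complex[n / 2];
        Complex[] odd  = new Complex[n / 2];
        for (int m = 0; m < n / 2; m++) {
            even[m] = x[2 * m];
            odd[m]  = x[2 * m + 1];
        }
        Complex[] e = fft(even);
        Complex[] o = fft(odd);

        // Combine: X_k = E_k + e^{-2*pi*i*k/N} * O_k and X_{k+N/2} = E_k - e^{-2*pi*i*k/N} * O_k.
        Complex[] result = new Complex[n];
        for (int k = 0; k < n / 2; k++) {
            double angle = -2.0 * Math.PI * k / n;
            Complex twiddle = new Complex(Math.cos(angle), Math.sin(angle));
            Complex t = twiddle.times(o[k]);
            result[k]         = e[k].plus(t);
            result[k + n / 2] = e[k].minus(t);
        }
        return result;
    }
}
```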
The process of comparison between an instance î ∈ Î to evaluate and a past non-default instance i ∈ I+ is performed by measuring the difference ∆ between the magnitudes |f| of each component f ∈ F in the frequency spectra of the involved instances.
It is shown in Equation 7, where f_x^1 and f_x^2 denote, respectively, the same frequency component of an item i ∈ I+ and of an item î ∈ Î. Such a comparison between the same frequency component of two instances is also graphically shown in Figure 6.

\Delta = |f_x^1| - |f_x^2|, \quad \text{with } |f_x^1| \ge |f_x^2| \qquad (7)
It should be noted that, as described in Section 4.4, for each instance î ∈ Î to evaluate, the aforementioned process is repeated by comparing it to each instance i ∈ I+. This allows us to evaluate the variation in the context of all the past non-default cases.
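A minimal sketch of the comparison of Equation 7, assuming the two magnitude spectra have already been computed as described in Section 3.2 (names are illustrative):

```java
// Magnitude difference (Equation 7) between the same frequency component of two
// instances: always the larger magnitude minus the smaller one.
public final class DeltaDifference {

    /** Per-component magnitude differences between two spectra of equal length. */
    public static double[] delta(double[] magA, double[] magB) {
        double[] d = new double[magA.length];
        for (int x = 0; x < magA.length; x++) {
            d[x] = Math.abs(magA[x] - magB[x]);  // equivalent to subtracting the smaller from the larger
        }
        return d;
    }
}
```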
4.3 Time Series Dynamic Feature Selection
In the context of machine learning and statistics, the feature se-
lection process is aimed to detect a subset of relevant features to
use during the model definition. It represents an important prepro-
cessing step, since it reduces the complexity of the final model, de-
creasing the training times, and increasing the generalization of the
model. It also reduces overfitting, a problem that occurs when a statistical model describes random error or noise instead of the underlying relationship; this frequently happens when excessively complex models are defined, since many parameters, with respect to the number of training data, are involved.
In the context of our approach, we perform the feature selection task in a dynamic way, by measuring the Shannon entropy (i.e., the metric described in Section 5.3.1) of each feature of the training datasets (Figure 7).
Since the entropy gives us a measure of the uncertainty of a random variable, the larger it is, the less a-priori information one has on its value: the entropy increases as the data become equally probable and decreases when their chances are unbalanced.
[Figure 7: Instance Features Entropy — the entropy measured for each feature of the GC dataset and of the DC dataset.]
The adoption of different weights during the evaluation process,
performed by the Dynamic Feature Selection (DFS) approach, is
aimed to differentiate the instance features on the basis of their pre-
dictive power.
For the needs of Algorithm 1, we formalize the DFS in terms of the inverse of the Shannon entropy (in order to have a high value when the feature is important, and a low value otherwise). Such formalization is shown in Equation 8, where P(f) indicates the probability that the frequency component f is present in the set F.
The obtained result represents the weight of the frequency component f (in terms of entropy) to be used in the evaluation process described in Section 4.4.

DFS(f) = \frac{1}{-\sum_{f \in F} P(f)\, \log_2[P(f)]} \qquad (8)
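The sketch below illustrates one possible reading of the DFS weighting: the magnitudes observed for each frequency component in the training spectra are discretized into equal-width bins (an assumption made here only for illustration, since the paper does not detail how P(f) is estimated), their Shannon entropy is computed, and the weight is taken as its inverse so that low-entropy components weigh more.

```java
import java.util.HashMap;
import java.util.Map;

// Hedged sketch of the DFS weighting: one inverse-entropy weight per frequency
// component, estimated from the training spectra. The binning scheme and the exact
// form of the inversion are assumptions.
public final class DynamicFeatureSelection {

    /** Shannon entropy H = -sum p*log2(p) of a sequence of discretized symbols. */
    static double entropy(int[] symbols) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int s : symbols) counts.merge(s, 1, Integer::sum);
        double h = 0.0;
        for (int c : counts.values()) {
            double p = (double) c / symbols.length;
            h -= p * (Math.log(p) / Math.log(2));
        }
        return h;
    }

    /** One weight per frequency component: low-entropy components weigh more. */
    public static double[] weights(double[][] trainingSpectra, int bins) {
        int nComponents = trainingSpectra[0].length;
        double[] w = new double[nComponents];
        for (int f = 0; f < nComponents; f++) {
            double min = Double.POSITIVE_INFINITY, max = Double.NEGATIVE_INFINITY;
            for (double[] spectrum : trainingSpectra) {
                min = Math.min(min, spectrum[f]);
                max = Math.max(max, spectrum[f]);
            }
            double range = Math.max(max - min, 1e-12);
            int[] symbols = new int[trainingSpectra.length];
            for (int i = 0; i < trainingSpectra.length; i++) {
                symbols[i] = (int) Math.min(bins - 1.0,
                        (trainingSpectra[i][f] - min) / range * bins);
            }
            w[f] = 1.0 / (entropy(symbols) + 1e-12);  // inverse-entropy weight (assumption)
        }
        return w;
    }
}
```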
4.4 Time Series Classification
This section formalizes the FSP algorithm used to perform the clas-
sification of new instances, together with the analysis of its asymp-
totic time complexity.
4.4.1 Algorithm. The proposed FSP approach is based on the
Algorithm 1. It takes as input the set I+ of non-default instances occurred in the past, the set Î of unevaluated instances, and the tolerance range ρ (determined as described in Section 5.4.2). It returns as output a set O that contains all the instances in Î, classified as accepted or rejected.
From step 2 to step 23 we process all unevaluated instances î ∈ Î, starting with the extraction of the time series of each instance (step 3), which is processed at step 4 in order to obtain the frequency spectrum. From step 5 to step 15 we instead process each non-default instance i ∈ I+, by extracting the time series of each instance (step 6) and obtaining its frequency spectrum (step 7).
Algorithm 1: FSP Instances classification
Input: I+ = Non-default instances, Î = Unevaluated instances, ρ = Tolerance range
Output: O = Set of classified instances
 1: procedure InstancesClassification(I+, Î, ρ)
 2:   for each î in Î do
 3:     ts1 = getTimeseries(î)
 4:     F1 = getFFT(ts1)
 5:     for each i in I+ do
 6:       ts2 = getTimeseries(i)
 7:       F2 = getFFT(ts2)
 8:       for each f in F do
 9:         if (|F2(f)| − |F1(f)| ∈ ρ) then
10:           reliable += DFS(f)
11:         else
12:           unreliable += DFS(f)
13:         end if
14:       end for
15:     end for
16:     if reliable > unreliable then
17:       O ← (î, accepted)
18:     else
19:       O ← (î, rejected)
20:     end if
21:     reliable = 0
22:     unreliable = 0
23:   end for
24:   return O
25: end procedure
The steps from 8 to 14 verify whether the difference between the magnitude of each frequency component f ∈ F of the non-default instance and the corresponding component of the current instance is within the ρ range.
On the basis of the result of this operation, in the steps from 9 to 13 the weight (in terms of entropy) of the current frequency component is used to increase either the reliable value (when the difference is within the ρ range) or the unreliable one (otherwise) (steps 10 and 12).
On the basis of these two values, the instance under evaluation is classified as accepted or rejected in the steps from 16 to 20, and the result of the classification process is returned by the algorithm at step 24, once all instances î ∈ Î have been processed.
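A compact Java sketch of the control flow of Algorithm 1 is given below; the helpers for time-series extraction, FFT, DFS weights, and the ρ bounds are placeholders for the steps of Sections 4.1-4.3 and 5.4.2, and the interface used to expose magnitudes is an illustrative abstraction.

```java
// Compact sketch of Algorithm 1: only the classification control flow is shown.
public final class FspClassifier {

    /** Illustrative abstraction of a frequency spectrum (magnitudes per component). */
    interface Spectrum { double magnitude(int f); int size(); }

    /** Classifies one unevaluated instance against all non-default spectra. */
    public static boolean isAccepted(Spectrum candidate,
                                     java.util.List<Spectrum> nonDefault,
                                     double[] dfsWeight,
                                     double[] rhoMin, double[] rhoMax) {
        double reliable = 0.0, unreliable = 0.0;
        for (Spectrum past : nonDefault) {
            for (int f = 0; f < candidate.size(); f++) {
                double delta = Math.abs(past.magnitude(f) - candidate.magnitude(f));
                if (delta >= rhoMin[f] && delta <= rhoMax[f]) {
                    reliable += dfsWeight[f];     // difference falls inside the tolerance range
                } else {
                    unreliable += dfsWeight[f];   // difference falls outside the tolerance range
                }
            }
        }
        return reliable > unreliable;             // accepted when the weighted evidence prevails
    }
}
```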
4.4.2 Asymptotic Time Complexity. Although the time needed to classify a single instance is of limited concern by itself, the possible implementation of the proposed FSP approach in a real-time scoring system [37], where the response time represents a crucial factor, suggests analyzing the theoretical complexity of the classification Algorithm 1.
Denoting by N the size of the training set I+ (i.e., N = |I+|), we define the asymptotic time complexity of the evaluation of a single instance (according to the Big O notation) by observing what follows:
(i) Algorithm 1 presents three nested loops: the outer loop that starts at step 2 executes N times the two inner loops (the first starting at step 5 and the second at step 8), plus other operations (getTimeseries, getFFT, comparisons, and assignments), with complexity O(n), O(n log n), O(1), and O(1), respectively;
(ii) the first inner loop executes once the same aforementioned operations, plus its inner loop, which executes operations of complexity O(1) (comparisons and assignments) a number of times smaller than N (i.e., |F| times);
On the basis of the previous considerations, we can conclude that
the asymptotic time complexity of the algorithm is O(N2).
It should also be noted that the computational time can be ade-
quately reduced by distributing the process over different machines,
by employing large scale distributed computing models like MapRe-
duce [15].
5 EXPERIMENTS
This section reports information about the experimental environ-
ment, the used datasets and metrics, the adopted strategy, the chosen
competitor, as well as the results of the performed experiments.
5.1 Environment
The proposed FSP approach was developed in Java, using the JTransforms⁴ library to perform the Fourier transformations.
The state-of-the-art approach used to evaluate its performance was implemented in R⁵, using the randomForest and ROCR packages, as detailed in Section 5.5.
The experiments have been conducted by using two real-world datasets, both characterized by a strongly unbalanced distribution of data.
It should be further added that we verified the existence of a
statistical difference between the results, by using the independent-
samples two-tailed Student's t-tests (p<0.05).
5.2 Datasets
The two real-world datasets used in the experiments (i.e., German
Credit and Default of Credit Card Clients datasets, both available
at the UCI Repository of Machine Learning Databases6) represent
two benchmarks in this research field. In the following we provide
a brief description of their characteristics:
5.2.1 German Credit (GC). It contains 1,000 instances: 700 of them are non-default instances (70.00%) and 300 are default instances (30.00%). Each instance is composed of 20 features (whose types are described in Table 1) and a binary class variable (accepted or rejected).
Table 1: Dataset GC Fields
Feature Description Feature Description
01 Status of checking account 11 Present residence since
02 Duration 12 Property
03 Credit history 13 Age
04 Purpose 14 Other installment plans
05 Credit amount 15 Housing
06 Savings account/bonds 16 Existing credits
07 Present employment since 17 Job
08 Installment rate 18 Maintained people
09 Personal status and sex 19 Telephone
10 Other debtors/guarantors 20 Foreign worker
5.2.2 Default of Credit Card Clients (DC). It contains 30,000 instances: 23,364 of them are non-default instances (77.88%) and 6,636 are default instances (22.12%). Each instance is composed of 23 features (whose types are described in Table 2) and a binary class variable (accepted or rejected).
4https://sites.google.com/site/piotrwendykier/software/jtransforms
5https://www.r-project.org/
6ftp://ftp.ics.uci.edu/pub/machine-learning-databases/statlog/
Table 2: Dataset DC Fields
Feature Description Feature Description
01 Credit amount 13 Bill statement in Aug-2005
02 Gender 14 Bill statement in Jul-2005
03 Education 15 Bill statement in Jun-2005
04 Marital status 16 Bill statement in May-2005
05 Age 17 Bill statement in Apr-2005
06 Past repayments in Sep-2005 18 Amount paid in Sep-2005
07 Past repayments in Aug-2005 19 Amount paid in Aug-2005
08 Past repayments in Jul-2005 20 Amount paid in Jul-2005
09 Past repayments in Jun-2005 21 Amount paid in Jun-2005
10 Past repayments in May-2005 22 Amount paid in May-2005
11 Past repayments in Apr-2005 23 Amount paid in Apr-2005
12 Bill statement in Sep-2005
5.3 Metrics
This section introduces the metrics used in the context of this paper.
5.3.1 Shannon Entropy. The Shannon entropy, formalized by
Claude E. Shannon in [39], is one of the most important metrics
used in information theory. It reports the uncertainty associated
with a random variable, allowing us to evaluate the average mini-
mum number of bits needed to encode a string of symbols, based
on their frequency.
More formally, given a set of values v ∈ V, the entropy H(V) is defined as shown in Equation 9, where P(v) is the probability that the element v is present in the set V.
In the context of the classification methods, the entropy-based metrics are frequently used during the feature selection [13, 29, 30] process, which is aimed to detect a subset of relevant features (variables, predictors) to use during the definition of the classification model. We use it for this task, dynamically, as described in Section 4.3.

H(V) = -\sum_{v \in V} P(v)\, \log_2[P(v)] \qquad (9)
5.3.2 Accuracy. The Accuracy metric reports the number of in-
stances correctly classified, compared to the total number of them.
More formally, given a set of instances X to be classified, it is calculated as shown in Equation 10, where |X| stands for the total number of instances, and |X^{(+)}| for the number of those correctly classified.

Accuracy(X) = \frac{|X^{(+)}|}{|X|} \qquad (10)
5.3.3 Sensitivity. Differently from the accuracy metric previously described, which takes into account all kinds of classifications, through the Sensitivity we only obtain information about the number of instances correctly classified as reliable. It gives us an important piece of information, since it evaluates the predictive power of our FSP approach in terms of capability to detect the reliable loan applications, offering a crucial decision support in real-world contexts.
More formally, given a set of instances X to be classified, the Sensitivity is calculated as shown in Equation 11, where |X^{(TP)}| stands for the number of instances correctly classified as reliable and |X^{(FN)}| for the number of reliable instances wrongly classified as unreliable.

Sensitivity(X) = \frac{|X^{(TP)}|}{|X^{(TP)}| + |X^{(FN)}|} \qquad (11)
5.3.4 F-measure. The F-measure is the harmonic mean of the precision and recall metrics. It is a widely used metric in the statistical analysis of binary classification, returning a value in the range [0, 1], where 0 is the worst value and 1 the best one.
More formally, given two sets X and Y, where X denotes the set of performed classifications of instances and Y the set that contains their actual classifications, this metric is defined as shown in Equation 12.

F\text{-}measure(X, Y) = \frac{2 \cdot (precision(X, Y) \cdot recall(X, Y))}{precision(X, Y) + recall(X, Y)}
\quad \text{with} \quad
precision(X, Y) = \frac{|Y \cap X|}{|X|}, \quad recall(X, Y) = \frac{|Y \cap X|}{|Y|} \qquad (12)
5.4 Strategy
This section reports information about the strategy adopted during
the execution of the experiments.
5.4.1 Cross-validation. In order to reduce the impact of data
dependency, improving the reliability of the obtained results, all
the experiments have been performed by using the k-fold cross-
validation criterion, with k=10.
Each dataset is randomly shuffled and then divided into k subsets; each subset is used once as the test set, while the remaining k−1 subsets are used as the training set. The final result is given by the average of all the results.
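A minimal sketch of this protocol, with illustrative names: shuffle the instance indices once, cut them into k folds, and use each fold in turn as the test set.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Minimal k-fold cross-validation index split (k = 10 in the experiments).
public final class KFold {

    /** Returns k index folds after a random shuffle of 0..n-1. */
    public static List<List<Integer>> folds(int n, int k, long seed) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < n; i++) indices.add(i);
        Collections.shuffle(indices, new java.util.Random(seed));

        List<List<Integer>> folds = new ArrayList<>();
        for (int f = 0; f < k; f++) folds.add(new ArrayList<>());
        for (int i = 0; i < n; i++) folds.get(i % k).add(indices.get(i));
        return folds;   // fold f = test set; the union of the other folds = training set
    }
}
```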
5.4.2 Tolerance Range. Considering that we have introduced a
tolerance range ρin the evaluation process performed by the Al-
gorithm 1 (Section 4.4.1), we need to define its upper and lower
bounds in the context of each dataset.
This range is used in the spectrum comparison process in order
to determine when a value, i.e., the difference between the mag-
nitude of the same frequency component of two instances (one of
them that belongs to the non-default cases and the other one that
represents the instance to evaluate), as shown in Figure 6, must be
considered acceptable or not (the classification of an instance as
accepted or rejected depends on the results of these evaluations).
For each frequency component f ∈ F, measured in the set of past non-default instances I+, we calculate the difference in terms of magnitude between each possible pair (f, f̂), with f, f̂ ∈ F.
Denoting as |f − f̂|_{I+} the aforementioned calculation of the differences between the magnitudes assumed by the same frequency component f in the dataset I+, we define the tolerance range ρ of each f ∈ F as shown in Equation 13.

\rho = [\rho_{min}, \rho_{max}], \quad \text{with} \quad \rho_{min} = \min(|f - \hat{f}|_{I^+}), \quad \rho_{max} = \max(|f - \hat{f}|_{I^+}) \qquad (13)
Differently from our competitor approach (i.e., Random Forests), for which we can determine the parameter values that lead to its best performance, our approach adopts a dynamic method to determine the optimal range (minimum and maximum value) for each frequency component, instead of using a single range for all of them. For this reason, we cannot determine these ranges of values a priori, since they are strictly related to the dataset taken into account, according to Equation 13.
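Following the reading of Equation 13 given above (differences of the same frequency component across pairs of non-default training instances), the bounds can be computed as in the sketch below; note that this pairwise computation is quadratic in |I+| and that the names are illustrative.

```java
// Tolerance range rho (Equation 13): for every frequency component f, take the
// minimum and maximum magnitude difference observed over all pairs of non-default
// training spectra. trainingMagnitudes[i][f] is the magnitude of component f in
// the i-th non-default instance; at least two instances are assumed.
public final class ToleranceRange {

    /** Returns a 2 x |F| array: row 0 = rho_min, row 1 = rho_max per component. */
    public static double[][] rho(double[][] trainingMagnitudes) {
        int nComponents = trainingMagnitudes[0].length;
        double[][] bounds = new double[2][nComponents];
        for (int f = 0; f < nComponents; f++) {
            double min = Double.POSITIVE_INFINITY, max = 0.0;
            for (int a = 0; a < trainingMagnitudes.length; a++) {
                for (int b = a + 1; b < trainingMagnitudes.length; b++) {
                    double diff = Math.abs(trainingMagnitudes[a][f] - trainingMagnitudes[b][f]);
                    min = Math.min(min, diff);
                    max = Math.max(max, diff);
                }
            }
            bounds[0][f] = min;
            bounds[1][f] = max;
        }
        return bounds;
    }
}
```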
[Figure 8: Random Forests Tuning — Accuracy as a function of the mtry parameter for the GC and DC datasets.]
5.5 Competitor
Here, we describe the state-of-the-art approach chosen as competi-
tor in order to evaluate the performance of our approach, beside the
parameter tuning process aimed to optimize its performance.
5.5.1 Description. As mentioned previously, the implementation of the state-of-the-art approach to which we compare our approach was made in R, using the randomForest and ROCR packages. For reproducibility reasons, we fix the seed of the random number generator by calling the R function set.seed().
5.5.2 Parameters Tuning. In order to get the best performance from the RF approach, we need to perform a tuning process aimed to detect the optimal values of its configuration parameters.
The caret package in R provides excellent functionality to perform this type of operation. Considering that caret supports only those algorithm parameters that have a crucial role in the tuning process, such as mtry in RF (the number of variables randomly sampled as candidates at each split), we use caret to tune this parameter. The operation was performed by following the grid search approach, where each axis of the grid is an algorithm parameter and the points in the grid are specific parameter combinations.
The tests were stopped as soon as the measured accuracy did not improve further. Although the differences are minimal beyond certain values, as can be seen in Figure 8, the experiments indicate as optimal mtry values 27 for the GC dataset and 8 for the DC dataset, since these values lead to the maximum Accuracy (i.e., 75.36% and 81.26%, respectively).
5.6 Results
This section reports, presents and discusses the results of the per-
formed experiments.
5.6.1 Overview. A first analysis of the experimental results (re-
ported in Figure 9, Figure 10, and Figure 11) shows that:
(i) the performance of our FSP approach is very similar to that of RF, in terms of Accuracy, on both the GC and DC datasets;
(ii) the FSP approach achieves better performance than RF, in terms of F-measure, on the DC dataset, and very close to it on the GC dataset;
(iii) the FSP approach outperforms RF, in terms of Sensitivity, on both the GC and DC datasets.
The above aspects are discussed in more depth in Section 5.6.2 and Section 5.6.3.
[Figure 9: Accuracy Performance — Accuracy of RF and FSP on the GC and DC datasets.]
[Figure 10: F-measure Performance — F-measure of RF and FSP on the GC and DC datasets.]
[Figure 11: Sensitivity Performance — Sensitivity of RF and FSP on the GC and DC datasets.]
5.6.2 Discussion. The first observation that arises from examining the experimental results in more detail is that our FSP approach achieves performance very close to (or better than) that of the RF approach, although it does not exploit the past default cases during the training process.
Another observation is instead related to the F-measure results,
which show that the effectiveness of the FSP approach increases
with the number of past non-default instances involved in the train-
ing process (DC dataset), differently from the RF approach, where
this does not happen, although its training process involves both the
default and the non-default past cases.
It should be noted that, in the light of the obtained results, the
proactivity that characterizes our approach can reduce/overcome
the cold-start problem described in Section 2.3, allowing a real-
world system to operate even in the absence of previous cases of
default instances, with all the advantages that derive from it.
The last, but no less important, observation is related to the results in terms of sensitivity, which show significant improvements compared to the state-of-the-art RF approach taken into account. This means that the number of correct true-positive classifications of instances is higher than that obtained by the RF approach, which provides a clear benefit in a real-world context.
5.6.3 Benefits and Limitation. The experimental results presented
and discussed before show that our approach performs similarly to
one of the best performing state-of-the-art approaches such as Ran-
dom Forests, although it operates in a proactive manner.
In addition, it is able to outperform Random Forests when the
training process involves a large number of previous non-default
cases, proving to be more effective than its competitor in the identi-
fication of the reliable instances.
On the basis of these results, a benefit related to the adoption of
our credit scoring approach is its ability to face the data unbalance
problem that reduces the effectiveness of the canonical approaches,
since it exploits only a class of data in the model definition process
(i.e., the previous non-default instances). Such proactive strategy
also reduces/overcomes the well-known cold-start problem.
Another benefit is instead related to the fact that the model used
during the evaluation process, based on the spectral pattern of the
instances, is more stable than the canonical one, because the fre-
quency components are less influenced by the data heterogeneity.
For the aforementioned reasons, our approach can be used in
order to create hybrid approaches able to operate in all contexts, by
combining its capability to operate proactively with the advantages
offered by the non-proactive state-of-the-art approaches.
We can also identify, as the main limitation of our approach, its limited benefits in those cases where a balanced data distribution exists, with enough default and non-default cases to use during the model training. However, it should be underlined that this represents an uncommon real-world scenario.
6 CONCLUSIONS AND FUTURE WORK
The credit scoring techniques cover a crucial role in many finan-
cial contexts (i.e., bank loans, mortgage lending, insurance policies,
etc.). They are adopted by the financial operators in order to as-
sess the potential risks related to the customer applications, allow-
ing them to reduce the losses due to default.
In this paper we proposed a novel approach of credit scoring
able to classify the new instances as accepted or rejected by eval-
uating them in terms of frequency spectral pattern. This operation
is performed by moving the evaluation process from the canonical
domain to a frequency one, where the evaluation model is defined
by using only the past non-default loan applications.
Such strategy presents two main advantages, the first of them is
related to its ability to face the data unbalance issue, facing at the
same time the cold-start problem, and the second one is related to
its capability to define a model only by exploiting the non-default
previous instances, allowing a system to operate proactively.
Future work would explore the effect, in terms of performance,
of the inclusion of the default past instances in the model definition
process, evaluating the advantages and disadvantages of the adop-
tion of such non-proactive strategy.
Another interesting study would be to experiment with the exploitation of other characteristics of the instances represented in the frequency domain, with the objective of improving the effectiveness of the classification algorithm.
A secondary but also interesting future work would be the evalu-
ation of our approach in the context of heterogeneous environments,
where numerous types of financial data are involved (e.g., the elec-
tronic commerce environment).
IML ’17, October 17-18, 2017, Liverpool, United Kingdom Roberto Saia and Salvatore Carta
Acknowledgments
This research is partially funded by Regione Sardegna under project
“Next generation Open Mobile Apps Development” (NOMAD), “Pacchetti Integrati di Agevolazione” (PIA) - Industria Artigianato e Servizi - Annualità 2013.
REFERENCES
[1] Mahmood Alborzi and Mohammad Khanbabaei. 2016. Using data min-
ing and neural networks techniques to propose a new hybrid customer be-
haviour analysis and credit scoring model in banking services based on
a developed RFM analysis method. IJBIS 23, 1 (2016), 1–22.
DOI:
http://dx.doi.org/10.1504/IJBIS.2016.078020
[2] Shawkat Ali and Kate A. Smith. 2006. On learning algorithm selec-
tion for classification. Appl. Soft Comput. 6, 2 (2006), 119–138.
DOI:
http://dx.doi.org/10.1016/j.asoc.2004.12.002
[3] Josh Attenberg and Foster J. Provost. 2010. Inactive learning?: difficulties em-
ploying active learning in practice. SIGKDD Explorations 12, 2 (2010), 36–41.
DOI:
http://dx.doi.org/10.1145/1964897.1964906
[4] Gustavo EAPA Batista, Ronaldo C Prati, and Maria Carolina Monard. 2004. A
study of the behavior of several methods for balancing machine learning training
data. ACM Sigkdd Explorations Newsletter 6, 1 (2004), 20–29.
[5] Siddhartha Bhattacharyya, Sanjeev Jha, Kurian K. Tharakunnel, and J. Christo-
pher Westland. 2011. Data mining for credit card fraud: A compar-
ative study. Decision Support Systems 50, 3 (2011), 602–613.
DOI:
http://dx.doi.org/10.1016/j.dss.2010.08.008
Antonio Blanco-Oliver, Rafael Pino-Mejías, Juan Lara-Rubio, and Salvador
Rayo. 2013. Credit scoring models for the microfinance industry using neural
networks: Evidence from Peru. Expert Syst. Appl. 40, 1 (2013), 356–364.
DOI:
http://dx.doi.org/10.1016/j.eswa.2012.07.051
[7] Leo Breiman. 2001. Random Forests. Machine Learning 45, 1 (2001), 5–32.
DOI:
http://dx.doi.org/10.1023/A:1010933404324
[8] Jeff Brill. 1998. The importance of credit scoring models in improving cash flow
and collections. Business Credit 100, 1 (1998), 16–17.
[9] Iain Brown and Christophe Mues. 2012. An experimental comparison of classifi-
cation algorithms for imbalanced credit scoring data sets. Expert Syst. Appl. 39,
3 (2012), 3446–3453.
DOI:
http://dx.doi.org/10.1016/j.eswa.2011.09.033
[10] Sherry Y Chen and Xiaohui Liu. 2004. The contribution of data mining to infor-
mation science. Journal of Information Science 30, 6 (2004), 550–558.
[11] Bo-Wen Chi and Chiun-Chieh Hsu. 2012. A hybrid approach to integrate
genetic algorithm into dual scoring model in enhancing the performance of
credit scoring model. Expert Syst. Appl. 39, 3 (2012), 2650–2661.
DOI:
http://dx.doi.org/10.1016/j.eswa.2011.08.120
[12] Sven F Crone and Steven Finlay. 2012. Instance sampling in credit scoring: An
empirical study of sample size and balancing. International Journal of Forecast-
ing 28, 1 (2012), 224–238.
[13] Manoranjan Dash and Huan Liu. 1997. Feature Selection for
Classification. Intell. Data Anal. 1, 1-4 (1997), 131–156.
DOI:
http://dx.doi.org/10.1016/S1088-467X(97)00008-5
[14] RH Davis, DB Edelman, and AJ Gammerman. 1992. Machine-learning algo-
rithms for credit-card applications. IMA Journal of Management Mathematics 4,
1 (1992), 43–51.
[15] Jeffrey Dean and Sanjay Ghemawat. 2008. MapReduce: simplified data pro-
cessing on large clusters. Commun. ACM 51, 1 (2008), 107–113.
DOI:
http://dx.doi.org/10.1145/1327452.1327492
[16] Vijay S Desai, Jonathan N Crook, and George A Overstreet. 1996. A comparison
of neural networks and linear scoring models in the credit union environment.
European Journal of Operational Research 95, 1 (1996), 24–37.
[17] Pinar Donmez, Jaime G. Carbonell, and Paul N. Bennett. 2007. Dual Strat-
egy Active Learning. In ECML (Lecture Notes in Computer Science), Vol. 4701.
Springer, 116–127.
[18] Michael Doumpos and Constantin Zopounidis. 2014. Credit Scoring. In Multi-
criteria Analysis in Finance. Springer, 43–59.
[19] Pierre Duhamel and Martin Vetterli. 1990. Fast Fourier transforms: a tutorial
review and a state of the art. Signal processing 19, 4 (1990), 259–299.
Albert Fensterstock. 2005. Credit scoring and the next step. Business Credit 107,
3 (2005), 46–49.
Ignacio Fernández-Tobías, Paolo Tomeo, Iván Cantador, Tommaso Di Noia, and
Eugenio Di Sciascio. 2016. Accuracy and Diversity in Cross-domain Recommen-
dations for Cold-start Users with Positive-only Feedback. In Proceedings of the
10th ACM Conference on Recommender Systems, Boston, MA, USA, September
15-19, 2016, Shilad Sen, Werner Geyer, Jill Freyne, and Pablo Castells (Eds.).
ACM, 119–122.
DOI:
http://dx.doi.org/10.1145/2959100.2959175
[22] David J. Hand. 2009. Measuring classifier performance: a coherent alternative
to the area under the ROC curve. Machine Learning 77, 1 (2009), 103–123.
DOI:
http://dx.doi.org/10.1007/s10994-009-5119-5
[23] Haibo He and Edwardo A. Garcia. 2009. Learning from Imbalanced
Data. IEEE Trans. Knowl. Data Eng. 21, 9 (2009), 1263–1284.
DOI:
http://dx.doi.org/10.1109/TKDE.2008.239
[24] WE Henley and others. 1997. Construction of a k-nearest-neighbour credit-
scoring system. IMA Journal of Management Mathematics 8, 4 (1997), 305–
321.
[25] WE Henley and David J Hand. 1996. A k-nearest-neighbour classifier for assess-
ing consumer credit risk. The Statistician (1996), 77–95.
[26] William Edward Henley. 1994. Statistical aspects of credit scoring. Ph.D. Dis-
sertation. Open University.
[27] Nan-Chen Hsieh. 2005. Hybrid mining approach in the design of credit scoring
models. Expert Systems with Applications 28, 4 (2005), 655–665.
[28] Nathalie Japkowicz and Shaju Stephen. 2002. The class imbalance problem: A
systematic study. Intell. Data Anal. 6, 5 (2002), 429–449.
[29] Feng Jiang, Yuefei Sui, and Lin Zhou. 2015. A relative decision entropy-based
feature selection approach. Pattern Recognition 48, 7 (2015), 2151–2163.
DOI:
http://dx.doi.org/10.1016/j.patcog.2015.01.023
[30] Nojun Kwak and Chong-Ho Choi. 2002. Input feature selection for classifi-
cation problems. IEEE Trans. Neural Networks 13, 1 (2002), 143–159.
DOI:
http://dx.doi.org/10.1109/72.977291
[31] Tian-Shyug Lee and I-Fei Chen. 2005. A two-stage hybrid credit scoring model
using artificial neural networks and multivariate adaptive regression splines. Ex-
pert Systems with Applications 28, 4 (2005), 743–752.
[32] Stefan Lessmann, Bart Baesens, Hsin-Vonn Seow, and Lyn C. Thomas. 2015.
Benchmarking state-of-the-art classification algorithms for credit scoring: An
update of research. European Journal of Operational Research 247, 1 (2015),
124–136.
DOI:
http://dx.doi.org/10.1016/j.ejor.2015.05.030
[33] Blerina Lika, Kostas Kolomvatsos, and Stathes Hadjiefthymiades. 2014. Facing
the cold start problem in recommender systems. Expert Syst. Appl. 41, 4 (2014),
2065–2073.
DOI:
http://dx.doi.org/10.1016/j.eswa.2013.09.005
Ana Isabel Marqués, Vicente García, and José Salvador Sánchez. 2013.
On the suitability of resampling techniques for the class imbalance prob-
lem in credit scoring. JORS 64, 7 (2013), 1060–1070.
DOI:
http://dx.doi.org/10.1057/jors.2012.120
Loretta J Mester and others. 1997. What's the point of credit scoring? Business
review 3 (1997), 3–16.
[36] Chorng-Shyong Ong, Jih-Jeng Huang, and Gwo-Hshiung Tzeng. 2005. Building
credit scoring models using genetic programming. Expert Systems with Applica-
tions 29, 1 (2005), 41–47.
[37] Jon T. S. Quah and M. Sriganesh. 2008. Real-time credit card fraud detection
using computational intelligence. Expert Syst. Appl. 35, 4 (2008), 1721–1732.
DOI:
http://dx.doi.org/10.1016/j.eswa.2007.08.093
[38] Alan K Reichert, Chien-Ching Cho, and George M Wagner. 1983. An exam-
ination of the conceptual issues involved in developing credit-scoring models.
Journal of Business & Economic Statistics 1, 2 (1983), 101–114.
[39] Claude E. Shannon. 2001. A mathematical theory of communication. Mobile
Computing and Communications Review 5, 1 (2001), 3–55.
[40] Mohammad Siami, Zeynab Hajimohammadi, and others. 2013. Credit scoring in
banks and financial institutions via data mining techniques: A literature review.
Journal of AI and Data Mining 1, 2 (2013), 119–129.
[41] Steven W Smith and others. 1997. The scientist and engineer’s guide to digital
signal processing. (1997).
[42] Le Hoang Son. 2016. Dealing with the new user cold-start problem in recom-
mender systems: A comparative review. Inf. Syst. 58 (2016), 87–104.
DOI:
http://dx.doi.org/10.1016/j.is.2014.10.001
[43] V Thanuja, B Venkateswarlu, and GSGN Anjaneyulu. 2011. Applications of
Data Mining in Customer Relationship Management. Journal of Computer and
Mathematical Sciences Vol 2, 3 (2011), 399–580.
[44] Veronica Vinciotti and David J Hand. 2003. Scorecard construction with unbal-
anced class sizes. Journal of Iranian Statistical Society 2, 2 (2003), 189–205.
[45] Gang Wang, Jinxing Hao, Jian Ma, and Hongbing Jiang. 2011. A comparative assessment of ensemble learning for credit scoring. Expert Syst. Appl. 38, 1 (2011), 223–230. DOI: http://dx.doi.org/10.1016/j.eswa.2010.06.048
[46] Gang Wang, Jian Ma, Lihua Huang, and Kaiquan Xu. 2012. Two credit scoring models based on dual strategy ensemble trees. Knowl.-Based Syst. 26 (2012), 61–68. DOI: http://dx.doi.org/10.1016/j.knosys.2011.06.020
[47] Jingbo Zhu, Huizhen Wang, Tianshun Yao, and Benjamin K. Tsou. 2008. Active Learning with Sampling by Uncertainty and Density for Word Sense Disambiguation and Text Classification. In COLING 2008, 22nd International Conference on Computational Linguistics, Proceedings of the Conference, 18-22 August 2008, Manchester, UK, Donia Scott and Hans Uszkoreit (Eds.). 1137–1144.