ESTIMATING INFORMATION VALUE
FOR CREDIT SCORING MODELS
ŘEZÁČ Martin, (CZ)
Abstract. Assessing the predictive power of credit scoring models is an important question for financial institutions. Because it is impossible to use a scoring model effectively without knowing how good it is, quality indexes like the Gini index, the Kolmogorov-Smirnov statistic and the Information value are used to address this problem. The paper deals with the Information value, which enjoys high popularity in the industry. Commonly it is computed by discretisation of the data into intervals using deciles, in which case one constraint is required to be met: the number of cases has to be nonzero for all intervals. If this constraint is not fulfilled, there are some issues to solve in order to preserve reasonable results. To avoid these computational issues, I propose an alternative algorithm for estimating the Information value, named the empirical estimate with supervised interval selection. This advanced estimate is based on the requirement to have at least k, where k is a positive integer, observations of scores of both good and bad clients in each considered interval. A simulation study with normally distributed scores shows a high dependency on the choice of the parameter k. If we choose too small a value, we get an overestimated value of the Information value, and vice versa. The quality of the estimate was assessed using the MSE. According to this criterion, the adjusted square root of the number of bad clients seems to be a reasonable compromise.
Keywords: Credit scoring, Quality indexes, Information value, Empirical estimate, Normally distributed scores
Mathematics Subject Classification: Primary 62G05, 62P05; Secondary 65C60.
1 Introduction
Credit scoring is a set of statistical techniques used to determine whether to extend credit (and
if so, how much) to a borrower. When performing credit scoring, a creditor will analyze a relevant
data sample to see what factors have the most effect on creditworthiness. Once these factors and
their importance are known, a model is developed to calculate a credit score for new applicants.
Methodology of credit scoring models and some measures of their quality were discussed in
works like Hand and Henley (1997) or Crook et al. (2007) and books like Anderson (2007), Siddiqi
(2006), Thomas et al. (2002) and Thomas (2009). Further remarks connected to credit scoring
issues can be found there as well.
Once a scoring model is available, it is natural to ask how good it is. To measure the partial
processes of a financial institution, especially their components like scoring models or other
predictive models, it is possible to use quantitative indexes such as the Gini index, the K-S statistic, Lift,
the Information value and so forth. They can be used for comparing several developed models at the
moment of development, as well as for monitoring the quality of models after deployment
into real business. See Wilkie (2004) or Siddiqi (2006) for more details.
The paper deals primarily with the Information value. Commonly it is computed by
discretisation of the data into bins using deciles, with the requirement of a nonzero number of cases for
all bins. As an alternative to the empirical estimates one can use kernel smoothing
theory, which allows one to estimate the unknown densities and consequently, using some numerical
method for integration, to estimate the Information value. See Koláček and Řezáč (2010)
for more details.
The main objective of this paper is a description of the empirical estimate with supervised
interval selection. This advanced estimate is based on the requirement to have at least k, where k is a
positive integer, observations of scores of both good and bad clients in each considered interval.
A simulation study with normally distributed scores shows a high dependency on the choice of the
parameter k. If we choose too small a value, we get an overestimated value of the Information value, and
vice versa. The quality of the estimate is assessed using the MSE. According to this criterion, I propose
a rule for the choice of k which seems to be a reasonable compromise.
2 Basic notations
Consider that a realization $s \in \mathbb{R}$ of a random variable $S$ (score) is available for each client. Let $D$ be the indicator of whether a client is good or bad,

$$D = \begin{cases} 1, & \text{client is good,} \\ 0, & \text{client is bad,} \end{cases} \qquad (1)$$

and let $F_B$, $F_G$ denote the cumulative distribution functions of the score of bad and good clients, i.e.

$$F_B(a) = P(S \le a \mid D = 0), \qquad F_G(a) = P(S \le a \mid D = 1), \qquad a \in \mathbb{R}. \qquad (2)$$

Assume $F_B$, $F_G$ and their corresponding densities $f_B$, $f_G$ are continuous on $\mathbb{R}$.

In practice, the empirical distribution functions are used,

$$\hat F_B(a) = \frac{1}{m}\sum_{i=1}^{N} I(s_i \le a \wedge D_i = 0), \qquad \hat F_G(a) = \frac{1}{n}\sum_{i=1}^{N} I(s_i \le a \wedge D_i = 1), \qquad a \in [L, H], \qquad (3)$$

where $s_i$ is the score of the $i$-th client, $n$ and $m$ are the numbers of good and bad clients, respectively, and $N = n + m$. $L$ is the minimum value of the given score and $H$ is the maximum value. Finally, we denote $p_B = m/N$ the proportion of bad clients.
3 The Information value
A very popular quality index, which is based on the densities of scores of good and bad clients, is the Information value (Information statistic) defined as

$$I_{val} = \int_{-\infty}^{\infty} f_{IV}(x)\, dx, \qquad (4)$$

where

$$f_{IV}(x) = \left( f_G(x) - f_B(x) \right) \ln \frac{f_G(x)}{f_B(x)}. \qquad (5)$$
Note that the Information value is also called Divergence. See Wilkie (2004), Hand and Henley
(1997) or Thomas (2009) for more details. The example of $f_{IV}$ for 10% of bad clients with scores distributed $N(0,1)$ and 90% of good clients with scores distributed $N(4,2)$ is illustrated in Figure 1.
Figure 1: Contribution to Information value.
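For illustration, the contribution (5) and the integral (4) can be approximated numerically once the densities are known. The short Python sketch below does this for the example of Figure 1, assuming $N(0,1)$ for bad and $N(4,2)$ for good scores with the second parameter read as a standard deviation; this reading, the grid and the use of SciPy's normal density are assumptions of the sketch, not taken from the paper.

```python
import numpy as np
from scipy.stats import norm

# Numerical approximation of (4)-(5) for the Figure 1 example, assuming
# bad scores ~ N(0, 1) and good scores ~ N(4, 2) (scale = standard deviation).
x = np.linspace(-10.0, 20.0, 20001)           # grid wide enough to cover both densities
f_b = norm.pdf(x, loc=0.0, scale=1.0)         # density of bad clients' scores
f_g = norm.pdf(x, loc=4.0, scale=2.0)         # density of good clients' scores
f_iv = (f_g - f_b) * np.log(f_g / f_b)        # contribution f_IV(x), formula (5)
i_val = np.sum(f_iv) * (x[1] - x[0])          # Riemann approximation of the integral (4)
print(i_val)
```

The same value can be cross-checked against the closed form for normally distributed scores given in Section 3.1.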
However, in practice, the procedure of computation of the Information value can be a little
complicated. Firstly, we generally do not know the right form of the densities $f_B$, $f_G$, and secondly,
we mostly do not know how to compute the integral. I show some approaches to solving these
computational problems.
3.1 Estimates for normally distributed data
In the case of normally distributed data, we know everything that is needed. We just have to
distinguish between two cases. Firstly, we consider that the scores of good and bad clients have a
common variance. In this case we have

$$I_{val} = \left( \frac{\mu_G - \mu_B}{\sigma} \right)^2, \qquad (6)$$

where $\mu_G$ and $\mu_B$ are the expectations of the scores of good and bad clients and $\sigma$ is the common
standard deviation, see Wilkie (2004) for more details. When equality of variances is not
assumed, then in Řezáč (2009) one can find a generalized form of $I_{val}$ given by

$$I_{val} = \frac{(\mu_G - \mu_B)^2}{2} \left( \frac{1}{\sigma_G^2} + \frac{1}{\sigma_B^2} \right) + \frac{1}{2} \left( \frac{\sigma_G^2}{\sigma_B^2} + \frac{\sigma_B^2}{\sigma_G^2} \right) - 1, \qquad (7)$$
where $\mu_G$, $\mu_B$ are again the means and $\sigma_G^2$, $\sigma_B^2$ are the variances of the scores of good and bad clients.
A similar formula can be found in Thomas (2009). For given data, estimation of $I_{val}$ is done by
replacing the theoretical means and variances in (6) or (7) by their appropriate empirical counterparts.
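A minimal Python sketch of this plug-in estimate (the function name and interface are illustrative, not taken from the paper) uses formula (7) as written above, which reduces to (6) when the two sample variances coincide:

```python
import numpy as np

def iv_normal(scores_good, scores_bad):
    """Information value under the normality assumption: empirical means and
    variances are plugged into formula (7); with equal variances the result
    reduces to formula (6)."""
    mu_g, mu_b = np.mean(scores_good), np.mean(scores_bad)
    var_g, var_b = np.var(scores_good, ddof=1), np.var(scores_bad, ddof=1)
    diff_sq = (mu_g - mu_b) ** 2
    return (diff_sq / 2.0) * (1.0 / var_g + 1.0 / var_b) \
        + 0.5 * (var_g / var_b + var_b / var_g) - 1.0
```

Calling iv_normal on large samples simulated from $N(1,1)$ and $N(0,1)$ should return a value near 1, the theoretical $I_{val}$ of formula (6).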
To explore the behaviour of expression (7) it is possible to use the tools offered by the Maple system.
See Hřebíček and Řezáč (2008) for more details. An example of the usage of the Exploration Assistant
is given in Figure 2. We can see a quadratic dependence on the difference of the means in part (a).
Furthermore, it is clear from (7) that $I_{val}$ takes quite high values when both variances are
approximately equal and smaller than or equal to 1, and that it grows to infinity if the ratio of the variances
tends to infinity or approaches zero. These properties of $I_{val}$ are illustrated in Figure 2, part (b).
Figure 2: Maple Exploration Assistant for the 3D plot of $I_{val}$. Dependence of $I_{val}$ (a) on $\mu_G$ and $\mu_B$ for fixed $\sigma_G$ and $\sigma_B$, (b) on $\sigma_G$ and $\sigma_B$ for fixed $\mu_G$ and $\mu_B$.
3.2 Empirical estimates
The main idea of this chapter is to replace the unknown densities by their empirical estimates.
Let us have score values $s_i^B$, $i = 1, \dots, m$, for bad clients and score values $s_i^G$, $i = 1, \dots, n$, for
good clients, and denote $L$ (resp. $H$) the minimum (resp. maximum) of all values. Let us divide the
interval $[L, H]$ into $r$ subintervals $[q_0, q_1), [q_1, q_2), \dots, [q_{r-1}, q_r]$, where $q_0 = L$, $q_r = H$ and
$q_j$, $j = 1, \dots, r-1$, are appropriate quantiles of the score of all clients. Set

$$n_j^B = \#\left\{ s_i^B \in [q_{j-1}, q_j) \right\}, \qquad n_j^G = \#\left\{ s_i^G \in [q_{j-1}, q_j) \right\}, \qquad j = 1, \dots, r, \qquad (8)$$
the observed counts of bad or good clients in each interval. Denote $IV_j$ the contribution to the
Information value on the $j$-th interval, calculated by

$$IV_j = \left( \frac{n_j^G}{n} - \frac{n_j^B}{m} \right) \ln \frac{n_j^G / n}{n_j^B / m}, \qquad j = 1, \dots, r. \qquad (9)$$

Then the empirical Information value is given by

$$\hat I_{val} = \sum_{j=1}^{r} IV_j. \qquad (10)$$
However, computational problems can occur in practice. The Information value index
becomes infinite in cases when some of the $n_j^B$ or $n_j^G$ are equal to 0. When this arises, there are
numerous practical procedures for preserving finite results; for example, one can replace the zero
entry of the numbers of goods or bads by a minimum constant of, say, 0.0001. The choice of the number
of bins is also very important. In the literature, and also in many applications in credit scoring, the
value $r = 10$ (deciles) is preferred.
3.3 Empirical estimates with supervised interval selection
This approach follows the ideas of the previous chapter. The estimation of the Information value is given
again by formulas (8) to (10). The main difference lies in the construction of the intervals. Because we
want to avoid zero values of $n_j^B$ and $n_j^G$, I simply look for such a selection of intervals which
provides values $n_j^B$ and $n_j^G$ that are all positive. This leads to the situation when all
fractions and logarithms in (9) are defined and finite.

More generally, I propose to require at least $k$, where $k$ is a positive integer,
observations of scores of both good and bad clients in each interval, i.e. $n_j^B \ge k$ and $n_j^G \ge k$ for
$j = 1, \dots, r$. Set

$$q_0 = L, \qquad q_j = \hat F_B^{-1}\!\left( \frac{j \cdot k}{m} \right), \quad j = 1, \dots, \left\lfloor \frac{m}{k} \right\rfloor - 1, \qquad q_{\lfloor m/k \rfloor} = H, \qquad (11)$$

where $\hat F_B^{-1}$ is the empirical quantile function appropriate to the empirical cumulative distribution
function of the scores of bad clients and $\lfloor x \rfloor$ means the lower integer part of a number $x$. The usage of the quantile
function of the scores of bad clients is motivated by the assumption that the number of bad clients is less
than the number of good clients, which is quite a natural assumption. If $m$ is not divisible by $k$, it is
necessary to adjust our intervals, because the number of scores of bad clients in the last
interval is less than $k$. In this case, we have to merge the last two intervals. This leads to the
situation when $n_j^B \ge k$ holds for all computed intervals of scores.
Furthermore, we need to ensure that the number of scores of good clients is as required in
each interval. To do so, we compute $n_j^G$ for all actual intervals. If we obtain $n_j^G < k$ for the $j$-th interval,
we merge this interval with its neighbour on the right side. This is equivalent to the removal of
$q_j$ from the sequence of borders of the intervals. This can be done for all intervals except the last
one. If we have $n_j^G < k$ for the last interval, then we have to merge it with its neighbour on the left
side, i.e. we merge the last two intervals. However, this situation is not very probable. If we have a
reasonable scoring model, we can assume that good clients have higher scores than bad clients. It
means that we can expect the number of scores of good clients to be higher than the number of scores
of bad clients in the last interval. Due to the construction of the intervals, the number of scores of bad
clients in the last interval is at least $k$. Thus, it is natural to expect that the number of scores of
good clients in the last interval is also at least $k$. After all, we obtain $n_j^B \ge k$ and $n_j^G \ge k$ for
all created intervals.
Very important is the choice of $k$. If we choose too small a value, we get an overestimated value of
the Information value, and vice versa. A reasonable compromise seems to be the adjusted square root
of the number of bad clients given by

$$k_a = \left\lceil \sqrt{m} \right\rceil, \qquad (12)$$

where $\lceil x \rceil$ means the upper integer part of a number $x$.
Denote $IV_j^{ESIS}$ the contribution to the Information value on the $j$-th interval, calculated by

$$IV_j^{ESIS} = \left( \frac{n_j^G}{n} - \frac{n_j^B}{m} \right) \ln \frac{n_j^G / n}{n_j^B / m}, \qquad j = 1, \dots, r, \qquad (13)$$

where $n_j^G$ and $n_j^B$ correspond to the observed counts of good and bad clients in the intervals created
according to the procedure described in this chapter. The empirical Information value with
supervised interval selection is now given by

$$\hat I_{val}^{ESIS} = \sum_{j=1}^{r} IV_j^{ESIS}. \qquad (14)$$
4 Simulation results
It is clear, and easy to show, that $\hat I_{val}^{ESIS}$ outperforms $\hat I_{val}$. However, this chapter is focused
on the properties of $\hat I_{val}^{ESIS}$ depending on the choice of the parameter $k$, on the proportion of bad clients,
and on the difference of the means of the scores of bad and good clients, $\mu_G - \mu_B$. Consider 10000 clients,
$p_B \cdot 100\%$ of bad clients with scores distributed $N(\mu_B, 1)$ and $(1 - p_B) \cdot 100\%$ of good clients with scores distributed $N(\mu_G, 1)$.
Set $\mu_B = 0$ and consider $\mu_G - \mu_B = 0.5$, 1 and 1.5, and $p_B = 0.02$, 0.05, 0.1 and 0.2. The case $\mu_G - \mu_B = 0.5$,
i.e. $I_{val} = 0.25$ in our settings, represents weak, $\mu_G - \mu_B = 1$ high and $\mu_G - \mu_B = 1.5$
very high performance of a given scoring model. A 2% bad rate ($p_B = 0.02$) represents a low-risk
portfolio, e.g. mortgages (before the current crisis). A 20% bad rate represents a very high-risk portfolio, e.g.
subprime cash loans.

Appropriate data sets for the simulation were randomly generated 1000 times. The quality of $\hat I_{val}^{ESIS}$ was
assessed using the mean squared error given by

$$MSE\!\left( \hat I_{val}^{ESIS} \right) = \frac{1}{1000} \sum_{l=1}^{1000} \left( \hat I_{val,l}^{ESIS} - I_{val} \right)^2. \qquad (15)$$
Given this measure, denote
.
(16)
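The simulation can be sketched in Python as follows, reusing the iv_esis function from Section 3.3; the replication count, the grid of candidate $k$ and the helper name are assumptions made only to keep the example short and fast.

```python
import numpy as np

def mse_esis(k, mu_diff=1.0, p_b=0.1, n_clients=10000, n_rep=200, seed=0):
    """Monte Carlo estimate of the MSE (15) of the ESIS estimator for a given k,
    assuming N(0, 1) bad and N(mu_diff, 1) good scores, so that the true
    Information value equals mu_diff**2 by formula (6). The replication count
    is kept smaller than the paper's 1000 only to shorten the run time."""
    rng = np.random.default_rng(seed)
    m = int(round(p_b * n_clients))              # number of bad clients
    n = n_clients - m                            # number of good clients
    true_iv = mu_diff ** 2
    errs = []
    for _ in range(n_rep):
        bad = rng.normal(0.0, 1.0, m)
        good = rng.normal(mu_diff, 1.0, n)
        errs.append((iv_esis(good, bad, k=k) - true_iv) ** 2)
    return float(np.mean(errs))

# k_opt of (16): the k with the smallest simulated MSE over a grid of candidates
k_grid = range(5, 61, 5)
k_opt = min(k_grid, key=mse_esis)
```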
The following Table 1 contains $k_{opt}$ for all considered values of $p_B$ and $\mu_G - \mu_B$. The proposed values of
$k$, $k_a = \lceil \sqrt{m} \rceil$, are presented in the last row of the table.
Table 1: $k_{opt}$ depending on $p_B$ and $\mu_G - \mu_B$.

  $\mu_G - \mu_B$ \ $p_B$          0.02    0.05    0.1    0.2
  0.5                                29      42      62     84
  1                                  12      18      23     32
  1.5                                 6       9       8      9
  $k_a = \lceil\sqrt{m}\rceil$       15      23      32     45
We can see that $k_{opt}$ is increasing with $p_B$. This is maybe somewhat surprising, but it is
quite natural. Increasing $p_B$ means an increasing number of bad clients, because the number of all
clients was fixed at 10000. If we have enough bad clients, then too small a $k$ leads to too many bins
and consequently to overestimated results. What is surprising, however, is the dependence on $\mu_G - \mu_B$.
While for weak models it is optimal to take a very high number of observations in each bin, the
contrary holds for high-performing models. Overall, $k_a = \lceil \sqrt{m} \rceil$ seems to be a reasonable
compromise.
For completeness, Table 2 contains the average numbers of bins for all considered values of $p_B$
and $\mu_G - \mu_B$. We can see that they took values from 8 to 127.
Table 2: Average number of bins depending on $p_B$ and $\mu_G - \mu_B$.

  $\mu_G - \mu_B$ \ $p_B$          0.02     0.05     0.1      0.2
  0.5                               8.00    13.00    18.00    24.90
  1                                18.00    28.80    42.76    51.88
  1.5                              33.62    50.20    95.96   127.67
The dependence of $\hat I_{val}^{ESIS}$ on $k$ is illustrated in Figures 3 to 5. The highlighted circles correspond
to the values of $k$ where the minimal value of the MSE is obtained. The diamonds correspond to the values of $k$
given by (12).
Figure 3: Dependence of (a) $\hat I_{val}^{ESIS}$ and (b) MSE on $k$; 10000 clients, $\mu_G - \mu_B = 0.5$.
We can see that $\hat I_{val}^{ESIS}$ is decreasing as $k$ increases. In the case of $\mu_G - \mu_B = 0.5$, the speed of this
decrease is very high for small values of $k$, while it is nearly negligible for values of $k$ higher than
some critical value. The same holds for the MSE.
Figure 4: Dependence of (a) $\hat I_{val}^{ESIS}$ and (b) MSE on $k$; 10000 clients, $\mu_G - \mu_B = 1$.
When $\mu_G - \mu_B = 1$, the speed of the decrease is lower compared to the previous case. Furthermore,
the MSE is not so flat, especially for $p_B = 2\%$. What is interesting and important here is that our choice
of $k$ is nearly optimal according to the MSE. Moreover, this is valid for all considered values of $p_B$.
Figure 5: Dependence of (a) $\hat I_{val}^{ESIS}$ and (b) MSE on $k$; 10000 clients, $\mu_G - \mu_B = 1.5$.
The last considered difference of the means of the scores of good and bad clients was $\mu_G - \mu_B = 1.5$. In
this case, the speed of the decrease of $\hat I_{val}^{ESIS}$ is the lowest compared to the previous two cases. The
novelty, relative to the previous two cases, is the shape of the MSE. Especially for the highest
considered proportion of bad clients, i.e. $p_B = 20\%$, we can see that the MSE has a really sharp
minimum.
Overall, Figures 3 and 4 show that the curves of the MSE are quite flat near their minima. It
means that a small deviation of $k$ from $k_{opt}$ causes only a small change in the MSE. On the other hand, Figure
5 shows a strong dependence on the choice of $k$.
5 Conclusions
I focused on the Information value and described the difficulties of its estimation. The most
popular method is the empirical estimator using deciles of the given score. But it can lead to infinite
values of $\hat I_{val}$, and so a remedy is necessary. To avoid these difficulties I proposed an adjustment
of the empirical estimate, called the empirical estimate with supervised interval selection. It is
based on the requirement to have at least some positive number of observed scores of both good and bad clients in each
interval. This directly leads to the situation when all fractions and all logarithms are defined and finite.
Consequently, $\hat I_{val}^{ESIS}$ is defined and finite.
The simulation study was focused on the properties of $\hat I_{val}^{ESIS}$ depending on the choice of the parameter $k$,
on the proportion of bad clients, and on the difference of the means of the scores of bad and good
clients. The quality of $\hat I_{val}^{ESIS}$ was assessed using the mean squared error, which is easy to compute for
normally distributed scores. Moreover, the optimal value of $k$, $k_{opt}$, was computed.
It was shown that $k_{opt}$ was increasing with $p_B$. This is maybe somewhat surprising,
but it is quite natural. Increasing $p_B$ means an increasing number of bad clients, because the
number of all clients was fixed in our case. If we have enough bad clients, then too small a $k$ leads
to too many bins and consequently to overestimated results. What was surprising, however, was the
dependence on $\mu_G - \mu_B$. While for weak models it is optimal to take a very high number of
observations in each bin, the contrary holds for high-performing models. Overall, $k_a = \lceil \sqrt{m} \rceil$ seems to
be a reasonable compromise.
On the other hand, the obtained results open additional possibilities for research. In particular, it
seems that including $\mu_G - \mu_B$, represented by an appropriate estimate, in the rule for the choice of $k$
could lead to significantly better estimates of $I_{val}$ when using the proposed empirical estimate with
supervised interval selection.
References
[1] ANDERSON, R.: The Credit Scoring Toolkit: Theory and Practice for Retail Credit Risk
Management and Decision Automation. Oxford : Oxford University Press, 2007.
[2] CROOK, J.N., EDELMAN, D.B., THOMAS, L.C.: Recent developments in consumer
credit risk assessment. European Journal of Operational Research, 183 (3), 1447-1465,
2007.
[3] HAND, D.J. and HENLEY, W.E.: Statistical Classification Methods in Consumer Credit
Scoring: a review. Journal of the Royal Statistical Society, Series A. 160 (3), 523-541,
1997.
[4] HŘEBÍČEK, J., ŘEZÁČ, M.: Modelling with Maple and MapleSim. In: 22nd European
Conference on Modelling and Simulation ECMS 2008 Proceedings, Dudweiler, 60-66,
2008.
[5] KOLÁČEK, J., ŘEZÁČ, M.: Assessment of Scoring Models Using Information Value. In:
Compstat’ 2010 proceedings. Paris, 1191-1198, 2010.
[6] ŘEZÁČ, M.: Indexy kvality normálně rozložených skóre [Quality indexes of normally
distributed scores]. Forum Statisticum Slovacum, Bratislava, 2009.
[7] SIDDIQI, N.: Credit Risk Scorecards: developing and implementing intelligent credit
scoring. New Jersey: Wiley, 2006.
[8] TERRELL, G.R.: The Maximal Smoothing Principle in Density Estimation. Journal of
the American Statistical Association, 85, 470-477, 1990.
[9] THOMAS, L.C.: Consumer Credit Models: Pricing, Profit, and Portfolio. Oxford:
Oxford University Press, 2009.
[10] THOMAS, L.C., EDELMAN, D.B., CROOK, J.N.: Credit Scoring and Its Applications.
Philadelphia: SIAM Monographs on Mathematical Modeling and Computation, 2002.
[11] WAND, M.P. and JONES, M.C.: Kernel smoothing. London: Chapman and Hall, 1995.
[12] WILKIE, A.D.: Measures for comparing scoring systems. In: Thomas, L.C., Edelman,
D.B., Crook, J.N. (Eds.), Readings in Credit Scoring. Oxford: Oxford University Press,
pp. 51-62, 2004.
Current address
Martin Řezáč, Mgr., Ph.D.,
Department of Mathematics and Statistics,
Faculty of Science, Masaryk University,
611 37 Brno, Czech Republic, tel. +420 549 493 919,
e-mail: mrezac@math.muni.cz