ESTIMATING INFORMATION VALUE
FOR CREDIT SCORING MODELS
ŘEZÁČ Martin, (CZ)
Abstract. Assessing the predictive power of credit scoring models is an important question for
financial institutions. Because it is impossible to use a scoring model effectively without
knowing how good it is, quality indexes like the Gini index, the Kolmogorov-Smirnov statistic and
the Information value are used to address this problem. The paper deals with the Information value,
which enjoys high popularity in the industry. Commonly it is computed by discretising the data
into intervals using deciles, with one constraint required to be met: the number of cases
has to be nonzero for all intervals. If this constraint is not fulfilled, some issues have to be
solved to preserve reasonable results. To avoid these computational issues, I propose an
alternative algorithm for estimating the Information value, named the empirical estimate with
supervised interval selection. This advanced estimate is based on the requirement of having at least k
observations of the scores of both good and bad clients in each considered interval, where k is a
positive integer. A simulation study with normally distributed scores shows a strong dependence
on the choice of the parameter k. If we choose too small a value, we obtain an overestimate of the
Information value, and vice versa. The quality of the estimate was assessed using the MSE.
According to this criterion, the adjusted square root of the number of bad clients seems to be a
reasonable compromise.
Keywords: credit scoring, quality indexes, Information value, empirical estimate, normally
distributed scores
Mathematics Subject Classification: Primary 62G05, 62P05; Secondary 65C60.
Aplimat – Journal of Applied Mathematics, volume 4 (2011), number 3

1 Introduction

Credit scoring is a set of statistical techniques used to determine whether to extend credit (and
if so, how much) to a borrower. When performing credit scoring, a creditor analyzes a relevant
data sample to see which factors have the most effect on creditworthiness. Once these factors and
their importance are known, a model is developed to calculate a credit score for new applicants.

The methodology of credit scoring models and some measures of their quality were discussed in
works like Hand and Henley (1997) or Crook et al. (2007) and in books like Anderson (2007), Siddiqi
(2006), Thomas et al. (2002) and Thomas (2009). Further remarks connected to credit scoring
issues can be found there as well.
Once a scoring model is available, it is natural to ask how good it is. To measure the partial
processes of a financial institution, especially components like scoring models or other
predictive models, it is possible to use quantitative indexes such as the Gini index, the
Kolmogorov-Smirnov (K-S) statistic, Lift, the Information value and so forth. They can be used to
compare several developed models at the moment of development, and for monitoring the quality of
models after deployment into real business as well. See Wilkie (2004) or Siddiqi (2006) for more details.
The paper deals primarily with the Information value. Commonly it is computed by
discretising the data into bins using deciles, with the requirement that the number of cases be
nonzero for all bins. As an alternative to the empirical estimates, one can use kernel smoothing
theory, which allows one to estimate the unknown densities and consequently, using some numerical
method for integration, to estimate the Information value. See Koláček and Řezáč (2010)
for more details.

The main objective of this paper is a description of the empirical estimate with supervised
interval selection. This advanced estimate is based on the requirement of having at least k
observations of the scores of both good and bad clients in each considered interval, where k is a
positive integer. A simulation study with normally distributed scores shows a strong dependence on
the choice of the parameter k. If we choose too small a value, we obtain an overestimate of the
Information value, and vice versa. The quality of the estimate is assessed using the MSE. According
to this criterion, I propose a rule for the choice of k which seems to be a reasonable compromise.
2 Basic notations
Consider that a realization $s \in \mathbb{R}$ of a random variable (the score) is available for each client. Let $D$ be
the indicator of a good or bad client,

$$D = \begin{cases} 1, & \text{client is good}, \\ 0, & \text{client is bad}, \end{cases} \qquad (1)$$

and let $F_B$, $F_G$ denote the cumulative distribution functions of the scores of bad and good clients, i.e.

$$F_B(a) = P(s \le a \mid D = 0), \qquad F_G(a) = P(s \le a \mid D = 1), \qquad a \in \mathbb{R}. \qquad (2)$$

Assume that $F_B$, $F_G$ and their corresponding densities $f_B$, $f_G$ are continuous on $\mathbb{R}$.

In practice, the empirical distribution functions are used:

$$\hat{F}_B(a) = \frac{1}{n_B}\sum_{i=1}^{n} I(s_i \le a \wedge D_i = 0), \qquad \hat{F}_G(a) = \frac{1}{n_G}\sum_{i=1}^{n} I(s_i \le a \wedge D_i = 1), \qquad a \in [L, H], \qquad (3)$$

where $s_i$ is the score of the $i$-th client, $n_G$, $n_B$ are the numbers of good and bad clients, respectively,
$n = n_G + n_B$, $L$ is the minimum value of the given scores and $H$ is the maximum value. Finally, we denote by
$p_B = n_B / n$ the proportion of bad clients.
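The empirical distribution functions of (3) can be sketched directly from data. The following is an illustrative sketch, not the author's code; it follows the indicator convention reconstructed above ($D = 1$ good, $D = 0$ bad), with hypothetical variable names:

```python
import numpy as np

def empirical_cdfs(scores, D, a):
    """Empirical CDFs of bad- and good-client scores evaluated at a,
    as in eq. (3): the share of bad (resp. good) clients with score <= a.
    D is the client indicator, here D = 1 for good and D = 0 for bad."""
    scores = np.asarray(scores, dtype=float)
    D = np.asarray(D, dtype=int)
    F_B = np.mean(scores[D == 0] <= a)   # \hat{F}_B(a)
    F_G = np.mean(scores[D == 1] <= a)   # \hat{F}_G(a)
    return F_B, F_G

scores = [0.2, 0.5, 0.7, 0.9, 0.4]
D = [0, 1, 1, 1, 0]                      # two bad clients, three good clients
F_B, F_G = empirical_cdfs(scores, D, 0.5)
p_B = np.mean(np.asarray(D) == 0)        # proportion of bad clients, n_B / n
```

Here `F_B(0.5)` is 1.0 (both bad scores are below 0.5) and `F_G(0.5)` is 1/3.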
3 The Information value
A very popular quality index, which is based on the densities of the scores of good and bad clients, is
the Information value (information statistic), defined as

$$I_{val} = \int_{-\infty}^{\infty} f_{IV}(x)\,dx, \qquad (4)$$

where

$$f_{IV}(x) = \left(f_G(x) - f_B(x)\right)\ln\frac{f_G(x)}{f_B(x)}. \qquad (5)$$

Note that the Information value is also called Divergence. See Wilkie (2004), Hand and Henley
(1997) or Thomas (2009) for more details. An example of $f_{IV}$ for 10% of bad clients with
$f_B: N(0,1)$ and 90% of good clients with $f_G: N(4,2)$ is illustrated in Figure 1.
Figure 1: Contribution to Information value.
However, in practice, the computation of the Information value can be a little
complicated. Firstly, we generally do not know the true form of the densities $f_G$, $f_B$, and
secondly, we mostly do not know how to compute the integral. I show some approaches to solving these
computational problems.
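When the densities are known, the integral (4) can be approximated numerically. The following is a minimal sketch, not from the paper; it assumes the Figure 1 example uses $f_B: N(0,1)$ and a good-client density with mean 4 and variance 2 (my reading of the extracted parameters):

```python
import numpy as np

def norm_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2)."""
    return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

def info_value_numeric(f_G, f_B, lo, hi, n=20001):
    """Approximate I_val = int (f_G - f_B) ln(f_G / f_B) dx on [lo, hi]
    with the trapezoidal rule; tails outside [lo, hi] are neglected."""
    x = np.linspace(lo, hi, n)
    g, b = f_G(x), f_B(x)
    y = (g - b) * np.log(g / b)          # contribution f_IV(x) from eq. (5)
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2)

# bad clients N(0, 1); good clients mean 4, variance 2 (assumed reading)
iv = info_value_numeric(lambda x: norm_pdf(x, 4.0, np.sqrt(2.0)),
                        lambda x: norm_pdf(x, 0.0, 1.0),
                        lo=-10.0, hi=15.0)
```

For these two normals the symmetrized Kullback-Leibler divergence is exactly 12.25, and the numerical value agrees to several decimal places.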
3.1 Estimates for normally distributed data

In the case of normally distributed data, we know everything that is needed. We just have to
distinguish between two cases. Firstly, we consider that the scores of good and bad clients have a
common variance. In this case we have

$$I_{val} = \left(\frac{\mu_G - \mu_B}{\sigma}\right)^2, \qquad (6)$$

where $\mu_G$ and $\mu_B$ are the expectations of the scores of good and bad clients and $\sigma$ is the common
standard deviation; see Wilkie (2004) for more details. When equality of variances is not
assumed, one can find in Řezáč (2009) a generalized form of $I_{val}$ given by

$$I_{val} = \frac{(\mu_G - \mu_B)^2}{\sigma_*^2} + \frac{1}{2}\left(\frac{\sigma_G^2}{\sigma_B^2} + \frac{\sigma_B^2}{\sigma_G^2}\right) - 1, \qquad (7)$$

where

$$\sigma_*^2 = \frac{2\,\sigma_G^2\,\sigma_B^2}{\sigma_G^2 + \sigma_B^2}$$

and $\sigma_G^2$, $\sigma_B^2$ are the variances of the scores of good and bad clients.
A similar formula can be found in Thomas (2009). For given data, $I_{val}$ is estimated by
replacing the theoretical means and variances in (6) or (7) by their appropriate empirical expressions.

To explore the behaviour of expression (7), it is possible to use tools offered by the Maple system;
see Hřebíček and Řezáč (2008) for more details. An example of the usage of the Exploration Assistant
is given in Figure 2. We can see a quadratic dependence on the difference of the means in part (a).
Furthermore, it is clear from (7) that $I_{val}$ takes quite high values when both variances are
approximately equal and smaller than or equal to 1, and that it grows to infinity if the ratio of the
variances tends to infinity or to zero. These properties of $I_{val}$ are illustrated in Figure 2, part (b).
(a) (b)
Figure 2: Maple Exploration Assistant 3D-plots of $I_{val}$: dependence of $I_{val}$ (a) on $\mu_G$ and $\mu_B$ for fixed $\sigma_G$ and $\sigma_B$, (b) on $\sigma_G$ and $\sigma_B$ for fixed $\mu_G$ and $\mu_B$.
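The closed forms (6) and (7) amount to a few lines of code. This is an illustrative sketch under my reading of the formulas, where $\sigma_*^2$ is taken as the harmonic mean of the two variances (a reconstruction of the garbled expression), not the author's implementation:

```python
import numpy as np

def iv_normal(mu_G, mu_B, sigma_G, sigma_B):
    """Information value of two normal score distributions, eq. (7);
    sigma_star^2 is taken as the harmonic mean of the two variances
    (a reconstruction). With equal standard deviations this reduces
    to eq. (6): ((mu_G - mu_B) / sigma) ** 2."""
    vG, vB = sigma_G ** 2, sigma_B ** 2
    sigma_star_sq = 2 * vG * vB / (vG + vB)
    return (mu_G - mu_B) ** 2 / sigma_star_sq + 0.5 * (vG / vB + vB / vG) - 1

iv_weak = iv_normal(0.5, 0.0, 1.0, 1.0)   # equal-variance case of eq. (6): 0.25
iv_high = iv_normal(1.0, 0.0, 1.0, 1.0)   # equal-variance case of eq. (6): 1.0
```

The two usage lines correspond to the weak and high-performance settings of the simulation study in Section 4.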
3.2 Empirical estimates

The main idea of this chapter is to replace the unknown densities by their empirical estimates.
Let us have score values $s_i^B$, $i = 1, \ldots, n_B$, for bad clients and score values $s_i^G$, $i = 1, \ldots, n_G$, for
good clients, and denote $L$ (resp. $H$) the minimum (resp. maximum) of all values. Let us divide the
interval $[L, H]$ into $r$ subintervals $[q_0, q_1], (q_1, q_2], \ldots, (q_{r-1}, q_r]$, where $q_0 = L$,
$q_r = H$, and $q_j$, $j = 1, \ldots, r-1$, are appropriate quantiles of the scores of all clients. Set

$$n_{Bj} = \#\{i: s_i^B \in (q_{j-1}, q_j]\}, \qquad n_{Gj} = \#\{i: s_i^G \in (q_{j-1}, q_j]\}, \qquad j = 1, \ldots, r, \qquad (8)$$

the observed counts of bad and good clients in each interval. Denote by $\hat{f}_{IV}(j)$ the contribution to the
Information value on the $j$-th interval, calculated as

$$\hat{f}_{IV}(j) = \left(\frac{n_{Gj}}{n_G} - \frac{n_{Bj}}{n_B}\right)\ln\frac{n_{Gj}\,n_B}{n_{Bj}\,n_G}, \qquad j = 1, \ldots, r. \qquad (9)$$

Then the empirical Information value is given by

$$\hat{I}_{val} = \sum_{j=1}^{r} \hat{f}_{IV}(j). \qquad (10)$$
In practice, however, computational problems may occur: the Information value index
becomes infinite when some of the observed counts of goods or bads in a bin are equal to 0. When
this arises, there are numerous practical procedures for preserving finite results. For example, one
can replace a zero count of goods or bads by a minimum constant of, say, 0.0001. The choice of the
number of bins is also very important. In the literature, and also in many applications in credit
scoring, the value 10 is preferred.
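The decile-based estimate (8)-(10), with the zero-count remedy just mentioned, can be sketched as follows. This is illustrative code, not the author's implementation; the 0.0001 floor follows the remedy quoted in the text:

```python
import numpy as np

def iv_empirical(scores_G, scores_B, r=10, floor=1e-4):
    """Empirical Information value, eqs (8)-(10): bin all scores into r
    intervals by quantiles of the pooled sample (deciles for r = 10) and
    sum the per-interval contributions. Zero counts are replaced by a
    small constant `floor` to keep the result finite."""
    scores_G = np.asarray(scores_G, float)
    scores_B = np.asarray(scores_B, float)
    n_G, n_B = len(scores_G), len(scores_B)
    pooled = np.concatenate([scores_G, scores_B])
    edges = np.quantile(pooled, np.linspace(0, 1, r + 1))
    edges[0], edges[-1] = -np.inf, np.inf        # intervals cover the real line
    iv = 0.0
    for j in range(r):
        in_G = (scores_G > edges[j]) & (scores_G <= edges[j + 1])
        in_B = (scores_B > edges[j]) & (scores_B <= edges[j + 1])
        nG_j = max(np.sum(in_G), floor)          # goods in interval j
        nB_j = max(np.sum(in_B), floor)          # bads in interval j
        iv += (nG_j / n_G - nB_j / n_B) * np.log((nG_j * n_B) / (nB_j * n_G))
    return iv

# one draw from the simulation setting of Section 4 (true I_val = 1)
rng = np.random.default_rng(0)
iv_hat = iv_empirical(rng.normal(1.0, 1.0, 9000), rng.normal(0.0, 1.0, 1000))
```

With 10 000 simulated clients and a unit mean difference, the decile estimate lands close to the true value of 1.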
3.3 Empirical estimates with supervised interval selection

This approach follows the ideas of the previous chapter. The estimate of the Information value is
again given by formulas (8) to (10); the main difference lies in the construction of the intervals.
Because we want to avoid zero values of the counts $n_{Gj}$ and $n_{Bj}$, I simply looked for a selection of
intervals that makes all these counts positive. This leads to a situation where all fractions and
logarithms in (9) are defined and finite.

More generally, I propose to require at least $k$ observations of the scores of both good and bad
clients in each interval, where $k$ is a positive integer, i.e. $n_{Gj} \ge k$ and $n_{Bj} \ge k$ for
$j = 1, \ldots, r$. Set

$$q_0 = L, \qquad q_j = \hat{F}_B^{-1}\!\left(\frac{j \cdot k}{n_B}\right), \; j = 1, \ldots, \left\lfloor \frac{n_B}{k} \right\rfloor, \qquad q_{\lfloor n_B/k \rfloor + 1} = H, \qquad (11)$$

where $\hat{F}_B^{-1}$ is the empirical quantile function appropriate to the empirical cumulative distribution
function of the scores of bad clients, and $\lfloor x \rfloor$ denotes the lower integer part of a number $x$. The use
of the quantile function of the scores of bad clients is motivated by the assumption that the number of bad
clients is less than the number of good clients, which is a quite natural assumption. If $n_B$ is not
divisible by $k$, it is necessary to adjust the intervals, because the number of scores of bad clients in
the last interval is then less than $k$. In this case, we have to merge the last two intervals. This leads
to a situation where $n_{Bj} \ge k$ holds for all computed intervals.

Furthermore, we need to ensure that the number of scores of good clients is as required in
each interval. To do so, we compute $n_{Gj}$ for all current intervals. If we obtain $n_{Gj} < k$ for the $j$-th
interval, we merge this interval with its neighbour on the right side. This is equivalent to removing
$q_j$ from the sequence of interval borders. This can be done for all intervals except the last one. If
$n_{Gj} < k$ holds for the last interval, then we have to merge it with its neighbour on the left side, i.e. we
merge the last two intervals. However, this situation is not very probable. If we have a reasonable
scoring model, we can assume that good clients have higher scores than bad clients. It means that we
can expect the number of scores of good clients in the last interval to be higher than the number of
scores of bad clients there, which, due to the construction of the intervals, is greater than $k$. Thus, it
is natural to expect that the number of scores of good clients in the last interval is also greater than
$k$. After all these steps, we obtain $n_{Gj} \ge k$ and $n_{Bj} \ge k$ for all created intervals.
Very important is the choice of $k$. If we choose too small a value, we obtain an overestimate of
the Information value, and vice versa. A reasonable compromise seems to be the adjusted square root
of the number of bad clients, given by

$$k = \left\lceil \sqrt{n_B} \right\rceil, \qquad (12)$$

where $\lceil x \rceil$ denotes the upper integer part of a number $x$.
Denote by $\hat{f}_{IV}^{ESIS}(j)$ the contribution to the Information value on the $j$-th interval, calculated as

$$\hat{f}_{IV}^{ESIS}(j) = \left(\frac{n_{Gj}}{n_G} - \frac{n_{Bj}}{n_B}\right)\ln\frac{n_{Gj}\,n_B}{n_{Bj}\,n_G}, \qquad j = 1, \ldots, r, \qquad (13)$$

where $n_{Gj}$ and $n_{Bj}$ correspond to the observed counts of good and bad clients in the intervals created
according to the procedure described in this chapter. The empirical Information value with
supervised interval selection is now given by

$$\hat{I}_{val}^{ESIS} = \sum_{j=1}^{r} \hat{f}_{IV}^{ESIS}(j). \qquad (14)$$
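The interval construction and merging procedure described above can be sketched as follows. This is an illustrative implementation under my reading of (11) and (12), with `np.quantile` standing in for the empirical quantile function; it is not the author's code:

```python
import numpy as np

def iv_esis(scores_G, scores_B, k=None):
    """Empirical IV with supervised interval selection, eqs (11)-(14):
    interval borders are (j*k/n_B)-quantiles of the bad-client scores;
    borders are then dropped (intervals merged) until every interval
    holds at least k good and k bad scores."""
    s_G = np.asarray(scores_G, float)
    s_B = np.sort(np.asarray(scores_B, float))
    n_G, n_B = len(s_G), len(s_B)
    if k is None:
        k = int(np.ceil(np.sqrt(n_B)))           # rule (12)
    # inner borders from the empirical quantile function of bad scores;
    # stopping before the last multiple of k builds in the final merge
    # of the last two intervals described in the text
    inner = [np.quantile(s_B, j * k / n_B) for j in range(1, n_B // k)]

    def counts(borders):
        # interval j is (q_j, q_{j+1}]; searchsorted assigns each score
        idx_G = np.searchsorted(borders, s_G, side="left")
        idx_B = np.searchsorted(borders, s_B, side="left")
        m = len(borders) + 1
        return np.bincount(idx_G, minlength=m), np.bincount(idx_B, minlength=m)

    # merging: drop a border whenever some interval has < k goods or bads
    while inner:
        nG_j, nB_j = counts(np.array(inner))
        short = np.flatnonzero((nG_j < k) | (nB_j < k))
        if len(short) == 0:
            break
        j = short[0]
        inner.pop(min(j, len(inner) - 1))        # merge right (left if last)

    nG_j, nB_j = counts(np.array(inner))
    return float(np.sum((nG_j / n_G - nB_j / n_B)
                        * np.log((nG_j * n_B) / (nB_j * n_G))))

# one draw from the simulation setting of Section 4 (true I_val = 1)
rng = np.random.default_rng(1)
iv_hat_esis = iv_esis(rng.normal(1.0, 1.0, 9000), rng.normal(0.0, 1.0, 1000))
```

Merging from the left is a design choice here; only the leftmost intervals are short of good scores when good clients score higher than bad ones, so the loop terminates after a few passes.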
4 Simulation results

It is clear, and easy to show, that $\hat{I}_{val}^{ESIS}$ outperforms $\hat{I}_{val}$. However, this chapter focuses
on the properties of $\hat{I}_{val}^{ESIS}$ depending on the choice of the parameter $k$, on the proportion of bad clients $p_B$,
and on the difference of the means of the scores of bad and good clients, $\mu_G - \mu_B$. Consider 10 000 clients,
$100 \cdot p_B\%$ of bad clients with $f_B: N(\mu_B, 1)$ and $100 \cdot (1 - p_B)\%$ of good clients with $f_G: N(\mu_G, 1)$.
Set $\mu_B = 0$ and consider $\mu_G - \mu_B = 0.5, 1$ and $1.5$, and $p_B = 0.02, 0.05, 0.1$ and $0.2$. The case $\mu_G - \mu_B
= 0.5$, i.e. $I_{val} = 0.25$ in our setting, represents weak, $\mu_G - \mu_B = 1$ high and $\mu_G - \mu_B = 1.5$
very high performance of the given scoring model. A 2% bad rate ($p_B = 0.02$) represents a low-risk
portfolio, e.g. mortgages (before the current crisis); a 20% bad rate represents a very high-risk portfolio, e.g.
subprime cash loans.
Appropriate data sets for the simulation were randomly generated 1000 times. The quality of $\hat{I}_{val}^{ESIS}$ was
assessed using the mean squared error, given by

$$MSE(k) = E\left(\hat{I}_{val}^{ESIS} - I_{val}\right)^2, \qquad (15)$$

estimated by the average of the squared deviations over the simulation runs. Given this measure, denote

$$k_{opt} = \arg\min_k MSE(k). \qquad (16)$$
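The criterion (15) and the selection (16) amount to a few lines. An illustrative sketch, with function and variable names that are mine rather than the paper's:

```python
import numpy as np

def mse(estimates, true_iv):
    """Monte Carlo MSE of eq. (15): mean squared deviation of the
    simulated estimates from the true Information value."""
    e = np.asarray(estimates, float)
    return float(np.mean((e - true_iv) ** 2))

def k_opt(mse_by_k):
    """k_opt of eq. (16): the candidate k with the smallest estimated MSE.
    `mse_by_k` maps each candidate k to its simulated MSE."""
    return min(mse_by_k, key=mse_by_k.get)

m = mse([1.1, 0.9, 1.05], 1.0)                   # (0.01 + 0.01 + 0.0025) / 3
best = k_opt({8: 0.040, 16: 0.012, 32: 0.019})   # -> 16
```

In the study itself, `estimates` would hold the 1000 simulated values of the ESIS estimator for each candidate `k`.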
Table 1 presents $k_{opt}$ for all considered values of $p_B$ and $\mu_G - \mu_B$. The proposed values of
$k$, $k = \lceil\sqrt{n_B}\rceil$, are presented in the last row of the table.

Table 1: $k_{opt}$ depending on $p_B$ and $\mu_G - \mu_B$.

$\mu_G - \mu_B$ \ $p_B$     | 0.02 | 0.05 | 0.1 | 0.2
0.5                         |  29  |  42  |  62 |  84
1                           |  12  |  18  |  23 |  32
1.5                         |   6  |   9  |   8 |   9
$\lceil\sqrt{n_B}\rceil$    |  15  |  23  |  32 |  45

We can see that $k_{opt}$ increases with $p_B$. This may be somewhat surprising, but it is
quite natural: increasing $p_B$ means an increasing number of bad clients, because the number of all
clients was fixed at 10 000. If we have enough bad clients, then too small a $k$ leads to too many bins
and consequently to overestimated results. What is surprising is the dependence on $\mu_G - \mu_B$:
while for weak models it is optimal to take a very high number of observations in each bin, the
contrary holds for highly performing models. Overall, $k = \lceil\sqrt{n_B}\rceil$ seems to be a reasonable
compromise.

For completeness, Table 2 presents the average numbers of bins for all considered values of $p_B$
and $\mu_G - \mu_B$. We can see that they took values from 8 to 127.

Table 2: Average number of bins depending on $p_B$ and $\mu_G - \mu_B$.

$\mu_G - \mu_B$ \ $p_B$     |  0.02 |  0.05 |  0.1  |  0.2
0.5                         |  8.00 | 13.00 | 18.00 |  24.90
1                           | 18.00 | 28.80 | 42.76 |  51.88
1.5                         | 33.62 | 50.20 | 95.96 | 127.67

The dependence of $\hat{I}_{val}^{ESIS}$ on $k$ is illustrated in Figures 3 to 5. The highlighted circles correspond
to the values of $k$ where the minimal value of the MSE is obtained; the diamonds correspond to the values of $k$
given by (12).
(a) (b)
Figure 3: Dependence of (a) $\hat{I}_{val}^{ESIS}$ and (b) MSE on $k$; 10 000 clients, $\mu_G - \mu_B = 0.5$.

We can see that $\hat{I}_{val}^{ESIS}$ decreases as $k$ increases. In the case of $\mu_G - \mu_B = 0.5$, the speed of this
decrease is very high for small values of $k$, while it is nearly negligible for values of $k$ higher than
some critical value. The same holds for the MSE.
(a) (b)
Figure 4: Dependence of (a) $\hat{I}_{val}^{ESIS}$ and (b) MSE on $k$; 10 000 clients, $\mu_G - \mu_B = 1$.

When $\mu_G - \mu_B = 1$, the speed of the decrease is lower compared to the previous case. Furthermore,
the MSE is not so flat, especially for $p_B = 2\%$. What is interesting and important here is that our choice
of $k$ is nearly optimal according to the MSE. Moreover, this holds for all considered values of $p_B$.
(a) (b)
Figure 5: Dependence of (a) $\hat{I}_{val}^{ESIS}$ and (b) MSE on $k$; 10 000 clients, $\mu_G - \mu_B = 1.5$.

The last considered difference of the means of the scores of good and bad clients was $\mu_G - \mu_B = 1.5$. In
this case, the speed of the decrease of $\hat{I}_{val}^{ESIS}$ is the lowest compared to the previous two cases. The
novelty, relative to the previous two cases, is the shape of the MSE: especially for the highest
considered proportion of bad clients, i.e. $p_B = 20\%$, we can see that the MSE has a really sharp
minimum.

Overall, Figures 3 and 4 show that the MSE curves are quite flat near their minima. This
means that a small deviation of $k$ from $k_{opt}$ causes only a small change in the MSE. On the other hand,
Figure 5 shows a strong dependence on the choice of $k$.
5 Conclusions

I focused on the Information value and described the difficulties of its estimation. The most
popular method is the empirical estimate using deciles of the given score, but it can lead to infinite
values of $\hat{I}_{val}$, so a remedy is necessary. To avoid these difficulties, I proposed an adjustment
of the empirical estimate, called the empirical estimate with supervised interval selection. It is
based on the requirement of having at least some positive number of observed scores in each
interval. This directly leads to a situation where all fractions and all logarithms are defined and finite.
Consequently, $\hat{I}_{val}^{ESIS}$ is defined and finite.

The simulation study was focused on the properties of $\hat{I}_{val}^{ESIS}$ depending on the choice of the parameter $k$
and on the proportion of bad clients and the difference of the means of the scores of bad and good
clients. The quality of $\hat{I}_{val}^{ESIS}$ was assessed using the mean squared error, which is easy to compute for
normally distributed scores. Moreover, the optimal value $k_{opt}$ was computed.

It was shown that $k_{opt}$ increases with $p_B$. This may be somewhat surprising,
but it is quite natural: increasing $p_B$ means an increasing number of bad clients, because the
number of all clients was fixed in our case. If we have enough bad clients, then too small a $k$ leads
to too many bins and consequently to overestimated results. What was surprising was the
dependence on $\mu_G - \mu_B$: while for weak models it is optimal to take a very high number of
observations in each bin, the contrary holds for highly performing models. Overall, $k = \lceil\sqrt{n_B}\rceil$ seems to
be a reasonable compromise.
On the other hand, the obtained results open additional possibilities for research. In particular, it
seems that including $\mu_G - \mu_B$, represented by appropriate estimates, in the rule for the choice of $k$
could lead to significantly better estimates of $I_{val}$ when using the proposed empirical estimate with
supervised interval selection.
References
[1] ANDERSON, R.: The Credit Scoring Toolkit: Theory and Practice for Retail Credit Risk
Management and Decision Automation. Oxford : Oxford University Press, 2007.
[2] CROOK, J.N., EDELMAN, D.B., THOMAS, L.C.: Recent developments in consumer
credit risk assessment. European Journal of Operational Research, 183 (3), 1447-1465,
2007.
[3] HAND, D.J. and HENLEY, W.E.: Statistical Classification Methods in Consumer Credit
Scoring: a review. Journal of the Royal Statistical Society, Series A. 160 (3), 523-541,
1997.
[4] HŘEBÍČEK, J., ŘEZÁČ, M.: Modelling with Maple and MapleSim. In: 22nd European
Conference on Modelling and Simulation ECMS 2008 Proceedings, Dudweiler, 60-66,
2008.
[5] KOLÁČEK, J., ŘEZÁČ, M.: Assessment of Scoring Models Using Information Value. In:
Compstat’ 2010 proceedings. Paris, 1191-1198, 2010.
[6] ŘEZÁČ, M.: Indexy kvality normálně rozložených skóre. Forum Statisticum Slovacum,
Bratislava, 2009.
[7] SIDDIQI, N.: Credit Risk Scorecards: developing and implementing intelligent credit
scoring. New Jersey: Wiley, 2006.
[8] TERRELL, G.R.: The Maximal Smoothing Principle in Density Estimation. Journal of
the American Statistical Association, 85, 470-477, 1990.
[9] THOMAS, L.C.: Consumer Credit Models: Pricing, Profit, and Portfolio. Oxford:
Oxford University Press, 2009.
[10] THOMAS, L.C., EDELMAN, D.B., CROOK, J.N.: Credit Scoring and Its Applications.
Philadelphia: SIAM Monographs on Mathematical Modeling and Computation, 2002.
[11] WAND, M.P. and JONES, M.C.: Kernel smoothing. London: Chapman and Hall, 1995.
[12] WILKIE, A.D.: Measures for comparing scoring systems. In: Thomas, L.C., Edelman,
D.B., Crook, J.N. (Eds.), Readings in Credit Scoring. Oxford: Oxford University Press, pp.
51-62, 2004.
Current address
Martin Řezáč, Mgr., Ph.D.,
Department of Mathematics and Statistics,
Faculty of Science, Masaryk University,
611 37 Brno, Czech Republic, tel. +420 549 493 919,
e-mail: mrezac@math.muni.cz