Chapter 3
Why Unbiased Computational Processes Can
Lead to Discriminative Decision Procedures
Toon Calders
TU Eindhoven, NL
Indrė Žliobaitė
Bournemouth University, UK
Abstract Nowadays, more and more decision procedures are being supported or
even guided by automated processes. An important technique in this automation is
data mining. In this chapter we study how such automatically generated decision
support models may exhibit discriminatory behavior towards certain groups based
upon, e.g., gender or ethnicity. Surprisingly, such behavior may even be observed
when sensitive information is removed or suppressed and the whole procedure is
guided by neutral arguments such as predictive accuracy only. The reason for this
phenomenon is that most data mining methods are based upon assumptions that
are not always satisfied in reality, namely, that the data is correct and represents
the population well. In this chapter we discuss the implicit modeling assumptions
made by most data mining algorithms and show situations in which they are not
satisfied. Then we outline three realistic scenarios in which an unbiased process
can lead to discriminatory models. The effects of the implicit assumptions not be-
ing fulfilled are illustrated by examples. The chapter concludes with an outline of
the main challenges and problems to be solved.
3.1 Introduction
Data mining is becoming an increasingly important component in the construction
of decision procedures (See Chapter 2 of this book). More and more historical data
is becoming available, from which decision procedures can automatically be derived. For example, based on historical data, an insurance company could apply
data mining techniques to model the risk category of customers based on their age,
profession, type of car, and history of accidents. This model can then be used to
advise the agent on pricing when a new client applies for car insurance.
In this chapter we will assume that a data table is given for learning a model, for
example, data about past clients of an insurance company and their claims. Every
row of the table represents an individual case, called an instance. In the insurance
company example, every row could correspond to one historical client. The in-
stances are described by their characteristics, called attributes or variables. The at-
tributes of a client could for example be his or her gender, age, years of driving
experience, type of car, and type of insurance policy. For every client the exact same set of attributes is specified. Usually there is also one special target attribute, called the class attribute, which the company is interested in predicting. For the in-
surance example, this could, e.g., be whether or not the client has a high accident
risk. The value of this attribute can be determined by the insurance claims of the
client. Clients with a lot of accidents will be in the high risk category, the others in
the low risk category. When a new client arrives, the company wants to predict his
or her risk as accurately as possible, based upon the values of the other attributes.
This process is called classification. For classification we need to model the dependency of the class attribute on the other attributes. For that purpose many classification algorithms have been developed in the machine learning, data mining and pattern recognition fields, e.g., decision trees, support vector machines, and logistic regression. For a given classification task a model that relates the value of the class at-
tribute to the other attributes needs to be learned on the training data; i.e., instanc-
es of which the class attribute is known. A learned model for a given task could be
for example a set of rules such as:
IF Gender=male and car type=sport THEN risk=high.
Once a model is learned, it can be deployed for classifying new instances of which
the class attribute is unknown. The process of learning a classifier from training
data is often referred to as Classifier Induction. For a more detailed overview of
classifiers and how they can be derived from historical data, see Chapter 2.
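To make the notion of classifier induction concrete, the following minimal sketch trains a decision tree on a tiny, made-up insurance-style dataset and prints the learned rules. All attribute names and values are hypothetical, and the pandas and scikit-learn libraries are assumed to be available.

# Minimal sketch of classifier induction on a made-up insurance-style dataset.
# All attribute names and values are hypothetical.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

train = pd.DataFrame({
    "gender":   ["male", "male", "female", "female", "male", "female"],
    "car_type": ["sport", "sedan", "sport", "sedan", "sport", "sedan"],
    "risk":     ["high", "low", "low", "low", "high", "low"],  # class attribute
})

# Encode the categorical attributes as 0/1 indicator columns.
X = pd.get_dummies(train[["gender", "car_type"]]).astype(int)
y = train["risk"]

model = DecisionTreeClassifier().fit(X, y)                 # classifier induction
print(export_text(model, feature_names=list(X.columns)))   # rule-like model

# Deploy the model on a new instance whose class attribute is unknown.
new_client = pd.DataFrame([{"gender": "male", "car_type": "sport"}])
new_X = pd.get_dummies(new_client).reindex(columns=X.columns, fill_value=0).astype(int)
print(model.predict(new_X))

On this toy data the printed tree essentially encodes the rule given above: male drivers of sports cars are predicted to be high risk.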
In this chapter we will show that data mining and classifier induction can lead to
the same problems as human decision makers face, including basing decisions
upon discriminatory generalizations. This can be particularly harmful since data
mining methods are often seen as solidly based upon statistics and hence purely
rational and without prejudice. Discrimination is the prejudiced treatment of an
individual based on their membership in a certain group or category. In most Eu-
ropean and Northern-American countries, it is forbidden by law to discriminate
against certain protected-by-law groups (See Chapter 4 of this book for an over-
view). Although we do not explicitly refer to the anti-discrimination legislation of
a particular country, most of our examples will directly relate to EU directives and
legislation. The European Union has one of the strongest anti-discrimination legis-
lations (See, e.g., Directive 2000/43/EC, Directive 2000/78/EC, Directive 2002/73/EC, Article 21 of the Charter of Fundamental Rights, and Protocol 12/Article 14 of the European Convention on Human Rights), describing discrimination on the basis of race, ethnicity, religion, nationality, gender, sexuality, disa-
bility, marital status, genetic features, language and age. It does so in a number of
settings, such as employment and training, access to housing, public services, edu-
cation and health care; credit and insurance; and adoption. European efforts on the
non-discrimination front make clear the fundamental importance for Europe's citi-
zens of the effective implementation and enforcement of non-discrimination
norms. As recent European Court of Justice case-law on age discrimination sug-
gests, non-discrimination norms constitute fundamental principles of the European
legal order. (See, e.g., Case 144/04 [2005] ECR I-9981 (ECJ), Judgment of the
Court of 22 November 2005, Werner Mangold v Rüdiger Helm; Case C-555/07
[2010], Judgment of the Court (Grand Chamber) of 19 January 2010, Seda
Kücükdeveci v Swedex GmbH & Co. KG.) Therefore it is in the best interest of banks,
insurance companies, employment agencies, the police and other institutions that
employ computational models for decision making about individuals to ensure that
these computational models are free from discrimination. In this chapter, discrimi-
nation is considered to be present if a model outputs different decisions for two individuals who have the same characteristics relevant to the decision making and differ only in the sensitive attribute (e.g., gender or race).
The main reason that data mining can lead to discrimination is that the computa-
tional model construction methods are often based upon assumptions that turn out
not to be true in practice. For example, in general it is assumed that the data on
which the model is learned follows the same distribution as the data on which the
classifier will have to work; i.e., the situation will not change. In Section 3.2 we
elaborate on the implicit assumptions made during classifier construction and il-
lustrate with fictitious examples how they may be violated in real situations. In
Section 3.3 we move on to show how this mismatch between reality and the as-
sumptions could lead to discriminatory decision processes. We show three types
of problems that may occur: sampling bias, incomplete data, or incorrect labeling.
We show detailed scenarios in which the problems are illustrated. In Section 3.4
we discuss some simple solutions to the discrimination problem, and show why
these straightforward approaches do not always solve the problem. Section 3.5
then concludes the chapter by giving an overview of the research problems and
challenges in discrimination-aware data mining and connects them to the other
chapters in this book.
We would like to stress that all examples in this chapter are purely fictitious; they
do not represent our experiences with discrimination in real life, or our belief of
where these processes are actually happening. Instead this chapter is a purely me-
chanical study of how we believe such processes occur.
3.2 Characterization of the Computational Modeling Process
Computational models are mathematical models that predict an outcome from
characteristics of an object. For example, banks use computational models (classi-
fiers) for credit scoring. Given characteristics of an individual, such as age, in-
come, and credit history, the goal is to predict whether a given client will repay the
loan. Based on that prediction a decision whether to grant a credit is made. Banks
build their models using their historical databases of customer performance. The
objective is to achieve the best possible accuracy on unseen new data. Accuracy
is the share of correct predictions in the total number of predictions.
Computational models are built and trained by data mining experts using historical
data. The performance and properties of a model depend, among other factors, on
the historical data that has been used to train it. This section provides an overview
of the computational modeling process and discusses the expected properties of
the historical data. The next section will discuss how these properties translate into
models that may result in biased decision making.
3.2.1 Modeling Assumptions
Computational models typically rely on the assumptions that (1) the characteris-
tics of the population will stay the same in the future when the model is applied,
and (2) the training data represents the population well. These assumptions are
known as the i.i.d. setting, which stands for independent and identically distributed
random variables (see e.g. Duda, Hart and Stork, 2001).
The first assumption is that the characteristics of the population from which the
training sample is collected are the same as the characteristics of the population on
which the model will be applied. If this assumption is violated, models may fail to
perform accurately (Kelly, Hand and Adams, 1999). For instance, the repayment
patterns of people working in the car manufacturing industry may be different at
times of economic boom as compared to times of economic crisis. A model
trained at times of boom may not be that accurate at times of crisis. Or, a model trained on data collected in Brazil may not correctly predict the performance of
customers in Germany.
The second assumption is satisfied if our historical dataset closely resembles the
population of the applicants in the market. That means, for instance, that our train-
ing set needs to have the same share of good and bad clients as the market, the
same distribution of ages as in the market, the same proportions of males and females, and the same proportions of high-skilled and low-skilled labor. In short, the second
assumption implies that our historical database is a small copy of a large popula-
tion out there in the market. If the assumption is violated, then our training data is
incomplete and a model trained on such data may perform sub-optimally (Zadroz-
ny, 2004).
The representation of the population in our database may be inaccurate in two
ways. Either the selection of people to be included may be biased or the selection
of attributes by which people are described in our database may be incomplete.
Suppose that a bank collects a dataset consisting only of people that live in a ma-
jor city. A model is trained on this data and then it is applied to all incoming cus-
tomers, including the ones that live in remote rural areas, and have different em-
ployment opportunities and spending habits. The model may not perform well on
the rural customers, since the training was forced to focus on the city customers.
Or suppose that a bank collects a representative sample of clients, but does not ask
about the stability of income of people, which is considered to be one of the main
factors in credit performance. Without this information the model will treat all the
individuals as if they earn the same and thus lose the opportunity to improve upon
accuracy for people with very high and very low income stability.
If the two assumptions are satisfied, it is reasonable to expect that models will
transfer the knowledge from the historical data to the future decision making. On
the other hand, however, if the historical data is prejudiced, the models trained on
this data can be expected to yield prejudiced decisions. As we will see in the fol-
lowing subsection, the assumptions may not hold in reality due to the origins of da-
ta. If the i.i.d. assumptions are not satisfied, the computational models built in
such settings might still be valid; however, possible effects of these breaches need
to be taken into account when interpreting the results.
3.2.2 Origins of Training Data
In order to identify the sources of possible discrimination in trained models we
need to analyze the origins and the characteristics of the training data.

Data Collection
First of all, the data collection process may be intentionally or unintentionally bi-
ased. For instance, Turner & Skidmore (1999) discuss different stages of the
mortgage lending process that potentially may lead to racial discrimination. Ad-
vertising and promotions can be sent to selected neighborhoods. Pre-application
consultancy may be offered on a biased basis. These actions may lead to a situa-
tion when the historical database of applicants does not represent the potential cli-
ents. Other examples of biased data collection include racial profiling of crime
suspects or selecting people for further security checks at airports. If people of
particular ethnic backgrounds are stopped for searches more often, even if they
were never convicted for carrying forbidden items, the historical database will
contain a skewed representation of a population.

Relations between Attributes in Data
Second, the attributes that characterize our subjects may not be independent from
each other. For example, a postal code of a person may be highly correlated with
ethnicity, since people may tend to choose to live close to relatives, acquaintances
or a community (see Rice, 1996 for more examples in lending). A marital status
may be correlated with gender; for instance, the statuses “wife” and “husband” directly encode gender, while “divorced” does not relate to gender.
If the attributes are independent, every attribute contributes its separate share to
the decision making in the model. If variables are related to each other, it is not
straightforward to identify and control which variable contributes to what extent to
the final prediction. Moreover, it is often impossible to collect all the attributes of
a subject or take all the environmental factors into account with a model. There-
fore our data may be incomplete, i.e., missing some information, and some hidden information may be transferred indirectly via correlated attributes.

Data Labeling
Third, the historical data to be used for training a model contains the true labels,
which in certain cases may be incorrect and contain prejudices. Labels are the tar-
gets that an organization wants to predict for new incoming instances. The true la-
bels in the historical data may be objective or subjective. The labels are objective
when no human interpretation was involved in assigning them; the labels are
hard in the sense that there can be no disagreement about their correctness be-
tween different human observers. Examples of objective labels include the indica-
tors whether an existing bank customer repaid a credit or not, whether a suspect
was wearing a concealed weapon, or whether a driver tested positive or negative
for alcohol intoxication. Examples of subjective labels include the assessment of a
human resource manager as to whether a job candidate is suitable for a particular job, whether a client of a bank should get a loan, whether to admit a student to a university, or whether or not to detain a suspect. For the subjective labels there
is a gray area in which human judgment may have influenced the labeling, result-
ing in a bias in the target attribute. In contrast to the objective labels, here there
may be disagreement between different observers; different people may assess a
job candidate or student application differently; the notion of what is the correct
label is fuzzy.
The distinction between subjective and objective labels is important in assessing
and preventing discrimination. Only the subjective labels can be incorrect due to
biased decision making in the historical data. For instance, if females have been
discriminated against in university admission, some labels in our database saying whether
persons should be admitted will be incorrect according to the present non-
discriminatory regulations. Objective labels, on the other hand, will be correct
even if our database is collected in a biased manner. For instance, we may choose
to detain suspects selectively, but the resulting true label whether a given suspect
actually carried a gun or not will be measurable and is thus objectively correct.
The computational modeling process requires an insightful analysis of the origins
and properties of training data. Due to the origins of the data, the computational models
trained on this data may be based on incorrect assumptions, and as a result, as we
will see in the next section, may lead to biased decision making.
3.3 Types of Problems
In this section we discuss three scenarios that show how the violation of the as-
sumptions sketched in the previous section may affect the validity of models
learned on data and lead to discriminatory decision procedures. In all three scenar-
ios we explicitly assume that the only goal of data mining is to optimize accuracy
of predictions, i.e. there is no incentive to discriminate based on taste. Before we
go into the scenarios, we first recall the important notion of accuracy of predic-
tions and we explain how we will assess discrimination of a classifier. Then we
will deal with three scenarios illustrating the following situations:
Labels are incorrect: due to historical discrimination the labels are biased. Even
though the labels accurately represent decisions of the past, for the future task
they are no longer appropriate. Reasons could be, e.g., explicit discrimination,
or a change in labeling in the future. This corresponds to assumption 1 of Section 3.2.1 being violated.
The sampling procedure is biased: the labels are correct and unbiased, but par-
ticular groups are under- or overrepresented in the data, leading to incorrect in-
ferences by the classifier induction. This corresponds to assumption 2 of Section 3.2.1 being violated in the first way (a biased selection of people).
The data is incomplete; there are hidden attributes: often not all attributes that determine the label are being monitored, for reasons of privacy or simply because they are difficult to observe. In such a situation it may happen that sensitive attributes are used as a proxy and indirectly lead to discriminatory models. This corresponds to assumption 2 of Section 3.2.1 being violated in the second way (an incomplete selection of attributes).
3.3.1 Accuracy and Discrimination
Suppose that the task is to learn a classifier that divides new bank customers into
two groups: likely to repay and unlikely to repay. Based on historical data of exist-
ing customers and whether or not they repaid their loans, we learn a classifier. A
classifier is a mathematical model that allows us to make predictions based on observable attributes such as gender, age, profession, education, income, address, and outstanding loans. Recall that the accuracy of a classifier learned
on such data is defined as the percentage of predictions of the classifier that are
correct. To assess this key performance measure before actually deploying the
model in practice, usually some labeled data (i.e., instances of which we already
know the outcome) is used that has been put aside for this purpose and has not been used during the learning process.
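As an illustration of this evaluation step, the sketch below (on synthetic data with made-up attributes) sets aside part of the labeled data before learning and measures accuracy only on that held-out part; numpy and scikit-learn are assumed to be available.

# Sketch: measuring accuracy on labeled data that was put aside and
# not used during learning. The data here is synthetic and purely illustrative.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.RandomState(0)
X = rng.normal(size=(1000, 5))                                          # hypothetical attributes
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=1000) > 0).astype(int)   # hypothetical labels

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
model = LogisticRegression().fit(X_train, y_train)                      # learning step

# Accuracy = share of correct predictions among all predictions on the held-out data.
print("held-out accuracy:", accuracy_score(y_test, model.predict(X_test)))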
Our analysis is based upon the following two assumptions about classification:
Assumption 1: the classifier learning process is only aimed at obtaining an accu-
racy as high as possible. No other objective is pursued during the data mining process.
Assumption 2: A classifier discriminates with respect to a sensitive attribute, e.g.
gender, if for two persons who differ only in their gender (and maybe some
characteristics irrelevant for the classification problem at hand) that classifier pre-
dicts different labels.
Note that the two persons in assumption 2 only need to agree on relevant charac-
teristics. Otherwise one could easily circumvent the definition by claiming that a
person was not discriminated against based on gender, but instead because she was wear-
ing a skirt. Although people “wearing a skirt” do not constitute a protected-by-law
subpopulation, using such an attribute would be unacceptable given its high corre-
lation with gender and the fact that characteristics such as “wearing a skirt” are considered
to be irrelevant for credit scoring. Often, however, it is far less obvious to separate
relevant and irrelevant attributes. For instance, in a mortgage application an ad-
dress may at the same time be important to assess the intrinsic value of a property,
and reveal information about the ethnicity of a person. As we will see in Chapter 8
on explainable and non-explainable discrimination, however, it is not at all easy to
measure and assess such possibilities for indirect discrimination in practical cases.
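A direct, if naive, way to operationalize Assumption 2 is to flip only the sensitive attribute of each test instance and count how often the prediction changes. The sketch below assumes a fitted scikit-learn style classifier, a numeric feature matrix, and a 0/1 coded sensitive column; as discussed above, such a check deliberately leaves correlated attributes untouched and is therefore blind to indirect discrimination.

# Sketch: check the working definition of discrimination by flipping only the
# sensitive attribute (assumed to be a 0/1 coded column) and counting how often
# the model's decision changes. Correlated attributes are left untouched, so
# indirect discrimination is not detected by this check.
def flipped_decision_rate(model, X_test, sensitive_col):
    X_flipped = X_test.copy()
    X_flipped[:, sensitive_col] = 1 - X_flipped[:, sensitive_col]
    changed = model.predict(X_test) != model.predict(X_flipped)
    return changed.mean()

# Example usage with any fitted classifier `model` and numpy array `X_test`:
# share = flipped_decision_rate(model, X_test, sensitive_col=0)
# print(f"{share:.1%} of decisions change when only the sensitive attribute is flipped")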
The legal review in Chapter 4 shows that our definition of discrimination is in line
with current legislation forbidding direct as well as indirect discrimination. Article
2 of Directive 2000/43/EC by the European Commission explicitly deals with indirect discrimination: “indirect discrimination shall be taken to occur where an ap-
parently neutral provision, criterion or practice would put persons of a racial or
ethnic origin at a particular disadvantage compared with other persons, unless
that provision, criterion or practice is objectively justified by a legitimate aim and
the means of achieving that aim are appropriate and necessary.”
3.3.2 Scenario 1: Incorrect Labels
In this scenario the labels do not accurately represent the population that we are
interested in. In many cases there is a difference in the labels in the training data
and the labels that we want to predict on the basis of test data.
The labels in the historical data are the result of a biased and discriminative
decision making process. Sample selection bias exists when, instead of simply
missing information on characteristics important to the process under study, the
researcher is also systematically missing subjects whose characteristics vary
from those of the individuals represented in the data (Blank et al., 2004). For
example, an employment bureau wants to implement a module to suggest suit-
able jobs to unemployed people. For this purpose, a model is built based upon
historical records of former applicants successfully acquiring a job by linking
characteristics such as their education and interests to the job profile. Suppose,
however, that historically women have been treated unfairly by denying them higher board functions. A data mining model will pick up this relation be-
tween gender and higher board functions and use it for prediction.
Labeling changes in time. Imagine a bank wanting to make special offers to its
more wealthy customers. For many customers only partial information is avail-
able, because, e.g., they have accounts and stock portfolios with other banks as
well. Therefore, a model is learned that, based solely upon demographic char-
acteristics, decides if a person is likely to have a high income or not. Suppose
that one of the rules found in the historical data states that, overall, men are
likely to have a higher income than women. This fact can be exploited by the
classifier to deny the special offer to women. Recently, however, gender
equality programs and laws have resulted in closing the gender gap in income,
such that this relation between gender and income that exists in the historical
data is expected to vanish, or at least become less apparent than in the historical
data. For instance, the Distance Learning Center (2009) provides data indicating
the earning gap between male and female employees. Back in 1979 women
earned 59 cents for every dollar of income that men earned. In 2009 that figure
had risen to 81 cents for every dollar of income that men earned. In this exam-
ple, the target attribute changes between the training data and the new data to
which the learned model is applied, i.e. the dependence on the attribute gender
decreases. Such background knowledge may encourage an analyst to apply dis-
crimination-aware techniques that try to learn the part of the relation between
the demographic features and the income that is independent of the gender of
that person. In this way the analyst kills two birds with one stone: the classifier
will be less discriminatory and at the same time more accurate.
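The mechanism of this first scenario can be reproduced on purely synthetic data: below, a made-up qualification score is generated independently of gender, the historical label penalizes women, and a classifier trained only for accuracy on those labels then scores an equally qualified woman lower than a man. All numbers and the 0/1 gender coding are invented for illustration, and numpy and scikit-learn are assumed to be available.

# Purely synthetic illustration of Scenario 1 (incorrect labels): the true
# qualification is independent of gender, but the historical label penalizes
# women, and a classifier trained on those labels reproduces the penalty.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(1)
n = 5000
gender = rng.binomial(1, 0.5, n)                      # 1 = female (hypothetical coding)
skill = rng.normal(size=n)                            # qualification, independent of gender
hired = (skill - 0.8 * gender + rng.normal(scale=0.3, size=n) > 0).astype(int)  # biased labels

X = np.column_stack([gender, skill])
model = LogisticRegression().fit(X, hired)            # optimizes accuracy on the biased labels

# Two applicants with identical qualification, differing only in gender:
applicants = np.array([[0, 0.3], [1, 0.3]])
print(model.predict_proba(applicants)[:, 1])          # the female applicant scores lower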
3.3.3 Scenario 2: Sampling Bias
In this scenario training data may be biased, i.e. some groups of individuals may
be over- or underrepresented, even though the labels themselves are correct. As
we will show, such a sample bias may lead to biased decisions.
Let us consider the following example of over- and underrepresented groups in
studies. To reduce the number of car accidents, the police increases the number of
alcohol checks in a particular area. It is generally accepted that young drivers
cause more accidents than older drivers; for example, a study by Jonah (1986)
confirms that young (16-25) drivers (a) are at greater risk of being involved in a casualty accident than older drivers and (b) this greater risk is primarily a function of their propensity to take risks while driving. Because of that, the police of-
ten specifically targets this group of young drivers in their checks. People in the
category “over 40” are checked only sporadically, when there is a strong incentive
or suspicion of intoxication. After the campaign, it is decided to analyze the data
in order to find specific groups in society that are particularly prone to alcohol
abuse in traffic. A classification model is learned on the data to predict, given the
age, ethnicity, social class, car type, and gender, whether a person is more or less like-
ly to drive while being intoxicated. Since only the labels are known for those peo-
ple that were actually checked, only this data is used in the study. Due to the data col-
lection procedure there is a clear sample bias in the training data: only those
people that were checked are in the dataset, while this is not a representative sam-
ple of all people that participate in traffic. Analysis of this dataset could surprisingly conclude that women over 40 in particular represent a danger of being intoxicated while driving. Such a finding is explained by the fact that, according to the examples presented to the classifier, middle-aged women are intoxicated more often than average. A factor that was disregarded in this analysis, however, is
that middle-aged women were only checked by the police when there was a more
than serious suspicion of intoxication. Even though in this example it is obvious
what went wrong in the analysis, sample bias is a very common and hard to solve
problem. Think, e.g., of medical studies only involving people exhibiting certain
symptoms, or enquiries by telephone that are only conducted for people whose
phone number appeared on the list used by the marketing bureau. Depending on
the source of the list that may have been purchased from other companies, particu-
lar groups may be over- or underrepresented.
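The following small simulation, again with entirely made-up numbers and assuming numpy is available, shows the effect described in this scenario without any learning step: both age groups have the same true intoxication rate, but because drivers over 40 only enter the dataset when there is already a strong suspicion, the rate recorded for them is far higher than their true rate.

# Purely synthetic illustration of Scenario 2 (sampling bias): labels are
# correct, but who ends up in the dataset is biased. Both groups have the same
# true intoxication rate of 3%.
import numpy as np

rng = np.random.RandomState(2)
n = 100_000
young = rng.binomial(1, 0.5, n)
intoxicated = rng.binomial(1, 0.03, n)                # same true rate in both groups

p_checked = np.where(young == 1, 0.30,                # young drivers: routine checks
                     np.where(intoxicated == 1, 0.50, 0.01))  # over 40: mostly on suspicion
checked = rng.binomial(1, p_checked).astype(bool)

for name, grp in [("young", young == 1), ("over 40", young == 0)]:
    recorded = intoxicated[checked & grp]
    print(name, "recorded intoxication rate:", round(recorded.mean(), 3))
# The recorded rate for the "over 40" group ends up far above the true 3%,
# purely because of how the sample was collected.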
3.3.4 Scenario 3: Incomplete Data
In this scenario training data contains only partial information about the factors that
influence the class label. Often important characteristics are not present because
of, e.g., privacy reasons, or because that data is hard to collect. In such situations a
classifier will use the remaining attributes and get the best accuracy out of them, often
overestimating the importance of the factors that are present in the dataset. Next
we discuss an example of such a situation.
Consider an insurance company that wants to determine the risk category of new
customers, based upon their age, gender, car type, years of driving experience etc.
An important factor that the insurance company cannot take into account, howev-
er, is the driving style of the person. The reason for the absence of this information
is obvious: gathering it, e.g., by questioning his or her relatives, following the per-
son while he or she is driving, getting information on the number of fines the per-
son had during the last few years, would not only be extremely time-consuming,
but would also invade that person’s privacy. As a consequence, the data
is often incomplete and the classifier will have to base its decisions on other avail-
able attributes. Based upon the historical data it is observed that in our example
next to the horsepower of the car, age and gender of a person are highly correlated
to the risk (the driving style is hidden from the company); see Table 1.
Table 1. Example (fictitious) dataset on risk assessment for car insurances based on demographic features; for each client the gender, age, driving style and risk category are listed. The attribute Driving style is hidden from the insurance company.
From this dataset it is clear that the true decisive factor is the driving style of the
driver, rather than gender or age; all high risk drivers have an aggressive driving
style, and vice versa, only one aggressive driver does not have a high risk. There is
an almost perfect correlation between being an aggressive driver and presenting a
high accident risk in traffic. The driving style, however, is tightly connected to
gender and age. Young male drivers will thus, according to the insurance compa-
ny, present a higher danger and hence receive a higher premium. In such a situa-
tion we say that the gender of a person is a so-called proxy for the difficult to ob-
serve attribute driving style. In statistics, a proxy variable describes something that is probably not in itself of any great interest, but from which a variable of interest can be obtained (Wikipedia: Proxy (statistics)). An important side effect of this treatment, however, will be that
a calm male driver will actually receive a higher insurance premium than an ag-
gressive female driving the same car and being of the same age. The statistical
discrimination theory (see Fang and Moro, 2010) states that inequality may exist
between demographic groups even when economic agents (consumers, workers,
employers) are rational and non-prejudiced, as stereotypes may be based on the
discriminated group’s average behavior (Wikipedia: Statistical discrimination (economics)). Even if that is rational, according to anti-
discrimination laws, this may constitute an act of discrimination, as the male per-
son is discriminated against on the basis of a characteristic that pertains to males as a
group, but not to that person individually. Of course, a classifier will have to base
its decisions upon some characteristics, and the incompleteness of the data will in-
evitably lead to similar phenomena; e.g., an exaggerated importance in the deci-
sion procedure on the color of the car, the horsepower, the city the person lives in,
etc. The key issue here, however, is that some attributes are considered by law to be inappropriate to generalize upon, such as gender, age, religion, etc., but others, such as horsepower or the color of a car, are not.
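The proxy effect of this scenario can also be reproduced on synthetic data. In the sketch below (all probabilities are invented, and numpy and scikit-learn are assumed), the hidden attribute driving style determines the risk and correlates with gender; since the model cannot see driving style, it assigns every male the same elevated risk, including calm male drivers, and every female the same lower risk, including aggressive ones.

# Purely synthetic illustration of Scenario 3 (incomplete data): driving style
# determines the risk but is hidden, and correlates with gender, so gender
# becomes a proxy. All probabilities are made up.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(3)
n = 10_000
male = rng.binomial(1, 0.5, n)
aggressive = rng.binomial(1, np.where(male == 1, 0.5, 0.2))        # correlated with gender
high_risk = rng.binomial(1, np.where(aggressive == 1, 0.8, 0.1))   # risk follows driving style

# The insurer cannot observe "aggressive"; only gender is available here.
model = LogisticRegression().fit(male.reshape(-1, 1), high_risk)

print("predicted risk for any male:  ", round(model.predict_proba([[1]])[0, 1], 2))
print("predicted risk for any female:", round(model.predict_proba([[0]])[0, 1], 2))
# A calm male driver receives the (higher) male estimate and an aggressive
# female driver the (lower) female estimate: gender acts as a proxy.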
3.4 Potential Solutions for Discrimination Free Computation
We argued that unbiased computational processes may lead to discriminatory de-
cisions due to historical data being incorrect or incomplete. In this section we dis-
cuss the main principles of how to organize computational modeling in such a way
that discrimination in decision making is prevented. In addition, we outline the
main challenges and problems to be solved for such modeling.
3.4.1 Basic Techniques that do not Solve the Problem
We start with discussing the limitations of several basic solutions for training
computational models.
Removing the Sensitive Attribute
Table 2. Example (fictitious) dataset on lending decisions, listing for each customer the work experience, ethnicity, postal code and loan decision.
The first possible solution is to remove the sensitive attribute from the training da-
ta. For example, if gender is the sensitive attribute in university admission deci-
sions, one would first think of excluding the gender information from the training
data. Unfortunately, as we saw in the previous section (Table 1), this solution does
not help if some other attributes are correlated with the sensitive attribute.
Consider an extreme example on a fictitious lending decisions dataset in Table 2.
If we remove the column “Ethnicity” and learn a model over the remaining da-
taset, the model may learn that if the postal code starts with 12 then the decision
should be positive, otherwise the decision should be negative. We see that, for in-
stance, customers #4 and #5 have identical characteristics except the ethnicity, and
they will be offered different decisions. Such a situation is generally considered to
be discriminatory.
The next step would be to remove the correlated attributes as well. This seems
straightforward in our example dataset; however, it is problematic if the attribute
to be removed also carries some objective information about the label. Suppose a
postal code is related to ethnicity, but also carries information about real estate
prices in the neighborhood. A bank would like to use the information about the
neighborhood, but not information about the ethnicity in deciding for a loan. If the
ethnicity is removed from the data, a computational model still can predict the
ethnicity (internally) indirectly, based on the postal code. If we remove the postal
code, we also remove the objective information about real estate prices that would
be useful for decision making. Therefore, more advanced discrimination handling
techniques are required.
Building Separate Models for the Sensitive Groups
The next solution that comes to mind is to train separate models for individual
sensitive groups, for example, one for males, and one for females. It may seem
that each model is objective, since individual models do not include gender infor-
mation. Unfortunately, this does not solve the problem either if the historical deci-
sions are discriminatory.
Table 3. Example (fictitious) dataset on university admissions, listing for each applicant the gender, test score and admission decision.
Consider a simplified example of a university admission case in Table 3. If we
build a model for females using only data from females, the model will learn that
every female that scores at least 80 in the test should be accepted. Similarly, a
model trained only on male data will learn that every male that scores over 70 in
the test should be accepted. We see that, for instance, applicants #3 and #4 will
have identical characteristics except the gender, yet they will be offered different
decisions. This situation is generally considered to be discriminatory as well.
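The effect described for Table 3 can be reproduced with the following sketch on synthetic admission data (the thresholds of 80 and 70 points and all other details are invented, and numpy and scikit-learn are assumed): each per-group model faithfully learns its own group's historical threshold, so two applicants with the same score of 75 receive different decisions.

# Sketch of the "separate models per group" idea on synthetic admission data
# with a biased history: females were admitted from 80 points, males from 70.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.RandomState(4)
score = rng.uniform(50, 100, 2000)
female = rng.binomial(1, 0.5, 2000)
admitted = ((female == 1) & (score >= 80)) | ((female == 0) & (score >= 70))

model_f = DecisionTreeClassifier(max_depth=1).fit(score[female == 1].reshape(-1, 1),
                                                  admitted[female == 1])
model_m = DecisionTreeClassifier(max_depth=1).fit(score[female == 0].reshape(-1, 1),
                                                  admitted[female == 0])

# Two applicants with the same test score of 75, differing only in gender:
print("female, score 75:", model_f.predict([[75]])[0])   # not admitted
print("male,   score 75:", model_m.predict([[75]])[0])   # admitted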
3.4.2 Computational Modeling for Discrimination Free Decision Making
Two main principles can be employed for making computational models discrimi-
nation free when historical data is biased. A data miner can either correct the train-
ing data or impose constraints on the model during training.
Correcting the Training Data
The goal of correcting the training data is to make the dataset discrimination free
and/or unbiased. If the training data is discrimination free and unbiased, then we
expect a learned computational model to be discrimination free.
Different techniques, or combinations of those techniques, can be employed for modifying the data; these include, but are not limited to:
1. modifying labels of the training data,
2. duplicating or deleting individual samples (see the sketch after this list),
3. adding synthetic samples,
4. transforming the data into a new representation space.
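As a simplified illustration of the second option, the sketch below computes instance weights that make group membership and label statistically independent in the training set, which is the continuous counterpart of duplicating or deleting samples. It is not one of the specific published algorithms cited below; the data is synthetic and numpy is assumed to be available.

# Simplified sketch of data correction by re-weighting (the continuous
# counterpart of duplicating or deleting samples): each (group, label) cell
# gets weight expected-size / observed-size, so that after weighting the
# positive rate is the same in both groups. Synthetic data, illustrative only.
import numpy as np

def reweigh(y, s):
    w = np.ones(len(y), dtype=float)
    for g in (0, 1):
        for c in (0, 1):
            cell = (s == g) & (y == c)
            if cell.any():
                w[cell] = (s == g).mean() * (y == c).mean() * len(y) / cell.sum()
    return w

rng = np.random.RandomState(5)
s = rng.binomial(1, 0.5, 1000)                       # sensitive attribute
y = rng.binomial(1, np.where(s == 1, 0.2, 0.5))      # biased labels: fewer positives for s=1
w = reweigh(y, s)

for g in (0, 1):
    grp = s == g
    print("group", g, "raw positive rate:", round(y[grp].mean(), 3),
          "weighted:", round(np.average(y[grp], weights=w[grp]), 3))
# The weights can be passed to any learner that accepts instance weights
# (e.g. the sample_weight argument of scikit-learn classifiers).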
Several existing approaches for discrimination free computational modeling use data correction techniques (Kamiran & Calders, 2010; Kamiran & Calders, 2009). For more information see Chapter 12, where selected data correcting techniques are discussed in more detail.

Imposing Constraints on the Model Training
Alternatively to correcting the training data, a model training process can be di-
rected in such a way that anti-discrimination constraints are enforced. The techniques for doing so will depend on the specific computational models employed.
Several approaches for imposing such constraints while training exist (Calders & Verwer, 2010; Kamiran, Calders, & Pechenizkiy, 2010). For more information
see Chapter 14, where selected techniques for model training with constraints are
discussed in more detail.
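As a minimal illustration of this second principle, the sketch below trains a plain logistic regression by gradient descent while adding a penalty on the gap between the mean predicted scores of the two groups. It is only meant to show the general idea of an in-training constraint, not any of the specific algorithms cited above; the data is synthetic and numpy is assumed to be available.

# Sketch of imposing a non-discrimination constraint during training: logistic
# regression whose loss carries an extra penalty on the difference between the
# mean predicted scores of the two groups. Illustrative only; synthetic data.
import numpy as np

def train_constrained_logreg(X, y, s, lam=5.0, lr=0.5, epochs=3000):
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))        # predicted probabilities
        gw = X.T @ (p - y) / len(y)                   # gradient of the logistic loss
        gb = np.mean(p - y)
        gap = p[s == 1].mean() - p[s == 0].mean()     # group gap in mean score
        dp = p * (1 - p)
        dgap_w = (X[s == 1] * dp[s == 1, None]).mean(0) - (X[s == 0] * dp[s == 0, None]).mean(0)
        dgap_b = dp[s == 1].mean() - dp[s == 0].mean()
        w -= lr * (gw + lam * 2 * gap * dgap_w)       # penalty term: lam * gap^2
        b -= lr * (gb + lam * 2 * gap * dgap_b)
    return w, b

rng = np.random.RandomState(6)
s = rng.binomial(1, 0.5, 2000)
x = rng.normal(size=2000) + 0.8 * s                   # feature correlated with s
y = (x + rng.normal(scale=0.5, size=2000) > 0.4).astype(int)
X = x.reshape(-1, 1)

for lam in (0.0, 5.0):
    w, b = train_constrained_logreg(X, y, s, lam=lam)
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    print("lam =", lam, " gap in mean predicted score:",
          round(p[s == 1].mean() - p[s == 0].mean(), 3))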
3.5 Conclusion and Open Problems
We discussed the mechanisms that may produce computational models yielding discriminatory decisions. A purely statistics-based, unbiased learning algo-
rithm may produce biased computational models if our training data is biased, in-
complete or incorrect due to discriminatory decisions in the past or due to
properties of the data collection. We have outlined how different implicit assump-
tions in the computational techniques for inducing classifiers are often violated,
and how this leads to discrimination problems. Because of the opportunities pre-
sented by the growing amounts of data available for analysis, automatic classification is gaining importance. Therefore, it is necessary to develop classification techniques
that prevent this unwanted behavior.
Building discrimination free computational models from biased, incorrect or in-
complete data is still in its early stages, however, even though a number of case studies searching for discrimination evidence are available (see, e.g., Turner &
Skidmore, 1999). Removing discrimination from computational models is chal-
lenging. Due to incompleteness of data and underlying relations between different
variables it is not sufficient to remove the sensitive attribute or apply separate
treatment to the sensitive groups.
In the last few years several non-discriminatory computational modeling techniques have been developed, but there are still large challenges ahead. In our view,
two challenges require urgent research attention in order to bring non-
discriminatory classification techniques to deployment in applications. The first
challenge is how to measure discrimination in real, complex data with a lot of at-
tributes. According to the definition, a model is discriminatory if it outputs differ-
ent predictions for candidates that differ only in the sensitive attribute and other-
wise are identical. If real application data is complex, it is unlikely that for every data point an “identical twin” can be found that differs only in the value of the sensi-
tive attribute. To solve this problem, notions and approximations of similarity between individuals for non-discriminatory classification need to be established that are legally grounded and sensible from a data mining perspective. The second major challenge is
how to find out which part of the information carried by a sensitive (or correlated) at-
tribute is sensitive and which is objective, as in the example of a postal code carry-
ing the ethnicity information and the real estate information. Likewise, the notions
of partial explainability of decisions by individual or groups of attributes need to
be established, and they need to be legally grounded and sensible from a data mining perspective.
References

Blank, R., Dabady, M., & Citro, C. (2004). Measuring Racial Discrimination. National Academy Press.
Jonah, B. A. (1986). Accident risk and risk-taking behavior among young drivers. Accident Analysis & Prevention, 18(4), 255-271.
Calders, T., & Verwer, S. (2010). Three Naive Bayes Approaches for
Discrimination-Free Classification. Data Mining and Knowledge Discovery,
21(2), 277-292.
Distance Learning Center (2009). Internet Based Benefit and Compensation Administration: Discrimination in Pay (Chapter 26). Accessed: November 2011.
Duda, R. O., Hart, P. E. & Stork, D. G. (2001). Pattern Classification (2nd
edition), John Wiley & Sons.
Fang, H. & Moro, A. (2010). Theories of Statistical Discrimination and
Affirmative Action: A Survey In J. Benhabib, A. Bisin and M., Jackson (Ed.)
Handbook of Social Economics (pp. 133-200).
Kamiran, F., & Calders, T. (2010). Classification with no discrimination by
preferential sampling. Proceedings of the 19th Annual Machine Learning
Conference of Belgium and the Netherlands (BENELEARN’10), 1-6.
Kamiran, F., & Calders, T. (2009). Classifying without Discrimination. IEEE
International Conference on Computer, Control and Communication (IEEE-IC4).
Kamiran, F., Calders, T., & Pechenizkiy, M. (2010). Discrimination Aware
Decision Tree Learning. Proceedings IEEE ICDM International Conference on
Data Mining (ICDM’10), 869 - 874.
Kelly, M.G., Hand, D.J., and Adams, N.M. (1999). The Impact of Changing
Populations on Classifier Performance. Proceedings of the fifth ACM SIGKDD in-
ternational conference on Knowledge discovery and data mining (KDD’99), 367-
Rice, W. (1996). Race, Gender, Redlining, and the Discriminatory Access to
Loans, Credit, and Insurance: An Historical and Empirical Analysis of Consumers
Who Sued Lenders and Insurers in Federal and State Courts, 1950-1995, San Die-
go Law Review 33(1996), 637-46.
Turner, A., & Skidmore, F. (1999). Introduction, Summary, and
Recommendations. In A. Turner, & F. Skidmore, Mortgage Lending
Discrimination: A Review of Existing Evidence (Urban Institute Monograph
Series on Race and Discrimination) (pp. 1-22). Washington, DC: Urban Institute.
Widmer, G., & Kubat, M. (1996). Learning in the presence of concept drift and
hidden contexts. Machine Learning 23(1), 69-101.
Zadrozny, B. (2004). Learning and Evaluating Classifiers under Sample Selec-
tion Bias. Proceedings of the 21st International Conference on Machine Learning
(ICML'04), 903-910.