The Effects of Coding Bias on Estimates of Behavioural
Similarity in Crime Linking Research of Homicides
TOM PAKKANEN¹*, ANGELO ZAPPALÀ¹,², CAROLINE GRÖNROOS¹ and PEKKA SANTTILA¹
¹Åbo Akademi University, Åbo (Turku), Finland
²Centre of Forensic Science, Turin, Italy
Abstract
This study explored whether a coding bias due to knowledge of which crimes have been
committed by the same offender exists when behavioural variables are coded in serial murder
cases. The study used an experimental approach where the information given to the
participants (N = 60) concerning correct linkages between a number of murder series was
manipulated. The participants were divided into three different groups (n= 20 in each). These
three groups received correct, incorrect, or no information about the linked series prior to the
coding. The results showed that there is no clear evidence to support the hypothesis of a bias
in the coding. The risk of expectancy effects and suggestions on how to minimise them in
behavioural crime linking research were discussed, and suggestions on how to improve
the validity of possible future replications of the experiment were given. The practical
implications of expectancy effects on behavioural crime linking decisions for the justice
system were also discussed. Copyright © 2012 John Wiley & Sons, Ltd.
Key words: crime linking; expectancy effect; coding bias; serial homicide; offender profiling
INTRODUCTION
In police investigations, behavioural similarity of crimes is sometimes used to identify a
crime series suspected to have been committed by the same offender (Woodhams, Hollin,
& Bull, 2007). Correctly identifying series of crimes is an effective investigative strategy
because a minority of perpetrators commit the majority of crimes. Methods used to identify
crimes committed by the same offender based on the analysis of behavioural similarity are
referred to as behavioural crime linking (Grubin, Kelly, & Brunsdon, 2001; Santtila,
Pakkanen, Zappalà, Bosco, Valkama, & Mokros, 2008; Woodhams et al., 2007).
Research carried out in behavioural crime linking, where coders of the data use coding
booklets to identify behavioural crime scene variables, has the potential problem of giving
biased results, as the coders are usually aware of which crimes have been
committed by the same offender. As a result, the research might overestimate behavioural
similarity in serial crime, distorting behavioural science experts’ conclusions regarding the
potential of crime linking.
*Correspondence to: Tom Pakkanen, Åbo Akademi University, Åbo (Turku), Finland.
E-mail: tom.pakkanen@abo.fi
Journal of Investigative Psychology and Offender Profiling
J. Investig. Psych. Offender Profil. (2012)
Published online in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/jip.1366
The present experiment tested whether prior knowledge of which crimes have
been committed by the same offender biases the coders’ decisions, when coding individual
behavioural variables, towards increased perceived (i.e. coded) behavioural similarity
between the linked murders.
Behavioural crime linking
When utilising crime linking, one tries to draw conclusions about whether several crimes have
been committed by the same perpetrator (Grubin et al., 2001). The advantage of behavioural
crime linking over more traditional methods, such as DNA analyses, is that the latter are
highly expensive and time-consuming (Craik & Patrick, 1994), and less readily available at
most crime scenes (Grubin, Kelly, & Ayis, 1997).
Behavioural crime linking, like offender profiling, has its roots in theories of
personality (Woodhams et al., 2007). Two critical assumptions underlie behavioural
crime linking: first, that offenders’ behaviour across crimes is consistent, and second,
that there is variation between individual offenders; in other words, they behave
differently from each other (Alison, Bennell, Mokros, & Ormerod, 2002; Canter,
1995). The central hypothesis in behavioural crime linking is that these assumptions
of consistency and variability hold for the behaviour of offenders in a particular
crime type (Bennell & Canter, 2002; Crabbé, Decoene, & Vertommen, 2008).
A number of studies have looked at behavioural crime linking in several types of
crimes, using a variety of different methodologies. For example, Santtila, Fritzon,
and Tamelander (2004) found that 33% of cases of arson were correctly linked on
the basis of behaviour. Behavioural crime linking has also been studied in cases
of rape (Bennell, Jones, & Melnyk, 2009; Canter, 1995; Grubin et al., 1997; Grubin
et al., 2001; Santtila, Junkkila, & Sandnabba, 2005) and burglary (Bennell & Canter,
2002; Bennell & Jones, 2005; Goodwill & Alison, 2006; Green, Booth, & Biderman,
1976). Some studies have also successfully linked together cases of murder, analysing
crime scene behaviour. Salfati and Bateman (2005) were able to classify serial homicides
committed in the US into expressive and instrumental themes, and found that the
offenders were consistent with regard to this classification in their first three known
offences. In their study of Italian serial murders (N = 116), Santtila et al. (2008) managed
to correctly assign 63% of the cases to their correct series, using seven identified
dimensions of variation in the offenders’ crime scene behaviour. Analysing the same
data further with Bayesian reasoning, Salo and his research team (2012) correctly
linked 84% of the cases. These results lend support to the notion that behavioural
crime linking is possible and that it would make crime investigation more effective
if used appropriately (Santtila et al., 2005).
The expectancy effect: A potential problem
The expectancy effect, also known as the observer effect, experimenter effect, and
experimenter bias, refers to situations where the expectations of the researchers lead
to biased conclusions about their findings in favour of their expectations (Rosenthal,
1966, 1994). Although the discussion about the development of behavioural crime
linking methodology has been mostly concerned with different theoretical models
and statistical methods (e.g. Alison et al., 2002; Crabbé et al., 2008; Woodhams
et al., 2007), the issue of experimenter effect has yet to receive attention. In a landmark
study on experimenter expectancy effect, Rosenthal and Fode (1963) demonstrated that
experimenters, with a manipulated prior knowledge of how well rats would perform in a
maze, influenced their results in favour of the experimenters’ expectations. In another study,
Rosenthal and Jacobson (1968) showed that teachers’ expectations influence the academic
and classroom behaviour of their students. Known as the Pygmalion effect, it holds that the
greater the expectation placed on the student, the better they perform. The pervasiveness of
observer effects has since been demonstrated across a wide range of experiments and conditions
(Risinger, Saks, Thompson, & Rosenthal, 2002). In behavioural crime linking research, this
could mean that the coders’ prior knowledge of which crimes belong to the same series
might affect the behavioural similarity they perceive in the cases they code, thus introducing
a systematic bias that weakens the validity and reliability of the research and subsequently the
conclusions drawn from it.
Sheldrake (1998) reviewed 72 scientific publications of experiments in psychology
and found that only a small minority of them (6.9%) had used blind or double-blind
methods to guard against the experimenter effect. The authors of the present study
found no studies of behavioural crime linking where the researchers report blind
methods with regard to their coding procedures. This suggests that the expectancy
effect has not been taken into consideration and that researchers (or, in studies that
utilised pre-existing data, those who originally coded the data) commonly know
which crimes are linked when coding the behavioural variables. On the basis of systematic
interviews, Sheldrake (1998) concluded that researchers tend to think of blind techniques
as guarding mainly against biases introduced by human subjects rather than the experimenters
themselves and that there often is a tacit assumption that the experimenter effects
are negligible.
Even with stringent coding schemes, where the behavioural analysis of the crime scene
concentrates on observable behaviour that is coded using dichotomous variables (e.g. Salfati,
1998; Salfati & Bateman, 2005; Santtila et al., 2005; Santtila et al., 2008), there is still room
for interpretation. It may, for example, be difficult to make decisions about variables such
as ‘the victim was found naked’ and ‘the body of the victim was covered’: how many pieces of
clothing have to be removed for the victim to be considered naked, or what percentage of
the body has to be covered for the body to be considered covered? Having already coded
homicides committed by the same offender, the coder, perhaps expecting consistency in the
offender’s behaviour, may be prone to code certain behaviours in the same way they presented
in the offender’s previous homicides.
The expectancy effect also poses a problem on a more practical level, when crime
linking research is applied in the justice system and forensic experts are asked to
comment on whether a particular crime is linked to one or several other crimes. In
Daubert v. Merrell Dow Pharmaceuticals, Inc. (1993), the US Supreme Court stipulated that
to be admissible, the testimony of an expert witness has to be the product of reliable scientific
principles and methods. The Daubert decision, along with a few subsequent rulings, brought
about a substantial amendment to the Federal Rules of Evidence 702, which sets the standard
for the admission of testimony by expert witnesses in federal courts in the US (FRE 702,
2011). Bosco, Zappalà, and Santtila (2010) reviewed court cases where expert testimony in
the field of linkage analysis had been given and concluded that the most common reason
for excluding expert testimony in these cases was the lack of demonstrable reliability of the
methods used. In one particular ruling where the expert’s testimony had been excluded,
the court noted that the expert had ignored the many differences between the two cases,
concentrated on the similarities, and thus overestimated them. In their review of observer
effects in forensic science, Risinger et al. (2002) proposed blind testing as the principal
method of preventing distortions caused by expectation.
Aim and hypotheses
Because information about case linkage and perceived behavioural similarity lie at the very
core of behavioural crime linking, and because the authors know of no study that has
investigated this issue so far, an experiment was devised to test for experimenter bias.
To test whether coders’ prior knowledge of which crimes had been committed by the same
offender affected the similarity these coders perceived in the behaviour of the offender in
linked crimes, 60 Italian university students were randomly assigned into three groups of
20 persons each. Each group was then given 10 cases of Italian serial murders, where five
offenders had committed two offences each, and a list of behavioural crime scene variables
to code from the cases. Prior to the coding task, the information about linkage status was
manipulated between the groups. The first group, the Correctly Informed Group, was told
which offenders had committed each murder. The second group, the Incorrectly Informed
Group, was given false linkage information, and the third group, the Not Informed Group,
was not given any information regarding linkage status.
The experiment was set up to test two hypotheses. The first hypothesis was that the
Correctly Informed Group would code more similarity in the murders they knew belonged
together, than the Not Informed Group. This would provide evidence for a coding bias, as
the only difference between the groups was the fact that the Correctly Informed Group knew
the linkages. Any additional similarity coded by them would, therefore, be due to a bias.
The second hypothesis was that the Incorrectly Informed Group would code less similarity
in the two murders that were actually linked, than the Not Informed Group would. This would
also provide evidence for a coding bias, as the incorrect information would have led to
more perceived similarity between actually unlinked offences. In other words, the perceived
similarity would be smaller in the actual series for the Incorrectly Informed Group than the
Not Informed Group.
METHOD
Participants
The participants in the crime linking experiment consisted of 60 students from three
different areas in Italy. The age of the participants ranged from 23 to 55 years (M= 28
years). One-quarter (25%) of the participants were first year students from Reggio
Emilia, studying child and adolescent psychotherapy. Another quarter (25%) were
master’s degree students from the La Sapienza University in Rome, taking part in a
course on theories and methods in crime investigation. Half of the participants
(50%) were from the University of Pontificia Salesiana Roma Torino, Turin, who
participated in an extension course on psychology in criminology and criminal inves-
tigation. All participants were students of Dr Angelo Zappalà, and none of them had
prior knowledge of the aim or hypotheses of the experiment. None of the participants
were familiar with the concept of behavioural consistency, and only a few had heard
of the idea of linking crimes utilising the offenders’ observed crime scene behaviour
(these concepts were taught in the courses after the completion of the experiment).
Materials
Each of the participants was given information on 10 murders in the form of a brief
description: approximately 15 lines of text extracted, edited, and summarised from
court transcripts of Italian cases of serial murder. The vignettes included information
about the victim (age and gender), the modus operandi of the offender, the weapon
used in the killing, wound pattern(s) on the victims, and the place where the body
was found (Table 1). The cases were chosen from a larger set of Italian serial murders
used by Santtila et al. (2008). The data consisted of murders committed by five offen-
ders: two offences were chosen randomly from each offender’s series. All data were
edited for maximum uniformity, and all unique information that could identify a case,
such as dates, names of persons, and places, was removed. The participants were also
given a coding scheme with 92 dichotomous variables (available on request from the
authors) including situational variables (pertaining to the when and where of the homicide),
behaviour observable at the crime scene (the use of weapons, binds, and gags;
injuries of the victim; post-mortem activity such as moving, hiding, or destroying
the body; etc.), and victim characteristics (age, gender, marital status, employment,
known health issues, etc.). The coding scheme was based on research by Salfati
(1998) and subsequently developed by Pakkanen, Santtila, Mokros, and Sandnabba
(2006) and Santtila et al. (2008).
Table 1. Three examples of case vignettes excerpted from the court transcripts
Case 1
According to the reconstruction, the offender had approached his victim, a 64-year-old prostitute,
and together they had gone to the aggressor’s home. After having had sexual intercourse, the
offender strangled his victim with a rope, at a moment when the victim had turned her back to the
offender. He then put the body in a bag, carried it to his car, and drove to the river XXX, a place
not far from his home. Standing on the riverbank, the offender threw the body into the water.
Case 2
After getting the woman, a 44-year-old prostitute into the car, the murderer had asked her to come
home with him. She refused and he had agreed to stay with her in the car. Before the intercourse
even started, while she was still undressing, the offender took advantage of the situation and shot
a single shot that hit her in the head, killing her instantaneously. After the murder, he had carried
the body to a nearby river. There was a cabin by the river, where he laid the body on a sofa and
set it on fire. Everything was destroyed: the cabin, the sofa, the victim’s personal belongings and
the body of the victim.
Case 3
In the afternoon of Sunday the 12th of April, a body was found in a train toilet by an employee of the
railways. The victim was a 32-year-old, married, Italian nurse. The body was found fully clothed and
the victim’s head was covered with a jacket. The offender had shot the victim in the head after
covering her with a blanket. The toilet door was locked from the inside, and the victim was taken
away by the train. The autopsy reported, that the external examination of the body immediately
revealed a lesion to the left retro auricular region of the head. It showed the typical features of a
gunshot, a single bullet shot into the skin. The autopsy did not find any other signs of violence on
the victim’s body. The corpse, however, still had a full bladder of urine which indicates that the
attack took place immediately after the victim entered the toilet.
The first two homicides were committed by the same offender, whereas the third was committed by a second
offender. The vignettes have been translated from Italian into English.
Procedure
All the participants (N = 60) were randomly assigned to three different groups, with 20
students in each group. Two of the groups were told that the offences had been committed
by five offenders, and the groups were given information on which murderer
had committed which offence. The first group got correct information about which
pairs of murders belonged together (Correctly Informed Group), and the second
group got incorrect information about which pairs of murders had been committed
by the same offender (Incorrectly Informed Group). All participants in the latter group
received the same erroneous information about linkage status; in other words, the pairs
of murders they were given were actually not committed by the same perpetrator. The
third and final group did not get any information about which murders were linked
(Not Informed Group) and, presumably, thought that all the murders were single
murders. The hypotheses were that by manipulating the information given to the
groups about linkage status (independent variable), there would be a measurable difference
in the groups’ perceived (i.e. coded) behavioural similarity of the cases
(dependent variable).
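As an illustration only, the random allocation of 60 participants into three equally sized groups could be sketched as follows. The function name, the participant identifiers, and the seed are hypothetical; the paper does not describe how the randomisation was actually carried out.

```python
import random


def assign_groups(participant_ids, group_names, seed=None):
    """Randomly split participants into equally sized groups.

    Hypothetical helper: shuffles the identifiers and cuts the shuffled
    list into consecutive blocks, one block per group.
    """
    ids = list(participant_ids)
    if len(ids) % len(group_names) != 0:
        raise ValueError("participants cannot be split into equal groups")
    rng = random.Random(seed)  # a seed makes the allocation reproducible
    rng.shuffle(ids)
    size = len(ids) // len(group_names)
    return {
        name: ids[i * size:(i + 1) * size]
        for i, name in enumerate(group_names)
    }


GROUPS = [
    "Correctly Informed Group",
    "Incorrectly Informed Group",
    "Not Informed Group",
]
allocation = assign_groups(range(1, 61), GROUPS, seed=2012)
```

Every participant ends up in exactly one group, and each group holds 20 participants, matching the design described above.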
Next, all the groups were asked to read the descriptions of one murder at a time and
to code the variables in the provided scheme as present (1), absent (0), or missing (99)
for each of the 10 cases. After the coding task was completed, the third group
(Not Informed Group) was told that the murders had been committed by only five
murderers and asked to identify the five pairs of homicides. This was done to check
how easily the series could be identified and whether behavioural similarity was used
intuitively as a clue for linking crimes.
The students from Rome and Turin got their instructions verbally from the leader of the
experiment and filled out the forms in about 4 hours. These participants had no
opportunity to discuss the task with each other. The students from Reggio Emilia received
their instructions and returned their results by e-mail. Although it is impossible to prove
that these participants did not discuss the task with each other, on the basis of discussions
with a subset of the students, the experimenters assumed they did not.
Statistical analyses
The coded behavioural variables were analysed with regard to behavioural similarity in
the cases that were actually linked, and this similarity was compared between the groups.
To calculate the behavioural similarity in the linked cases, the phi coefficient was used.
Correlations were calculated separately for each subject and case in each experiment group,
and the groups were compared pairwise to check for any differences between the groups.
This was done using a generalised linear model repeated over subjects, checking whether
group allocation predicted differences in coded behavioural similarity.
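As a minimal sketch of the similarity measure, the phi coefficient for one participant's coding of two linked cases can be computed from the 2 × 2 table of the dichotomous codes. The function name is hypothetical, and skipping variables coded as missing (99) is an assumption; the paper does not state how missing codes were treated.

```python
import math

MISSING = 99  # code for "missing" in the coding scheme


def phi_coefficient(case_a, case_b):
    """Phi coefficient between two equally long lists of 0/1 codes.

    Assumption: variables coded MISSING in either case are skipped.
    Returns None when a marginal total is zero (phi is undefined).
    """
    n11 = n10 = n01 = n00 = 0
    for a, b in zip(case_a, case_b):
        if a == MISSING or b == MISSING:
            continue
        if a == 1 and b == 1:
            n11 += 1  # behaviour present in both cases
        elif a == 1 and b == 0:
            n10 += 1  # present in the first case only
        elif a == 0 and b == 1:
            n01 += 1  # present in the second case only
        else:
            n00 += 1  # absent in both cases
    denominator = math.sqrt(
        (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    )
    if denominator == 0:
        return None
    return (n11 * n00 - n10 * n01) / denominator
```

A value of 1 means the two cases were coded identically, 0 means no association, and negative values mean the codes tend to disagree.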
Inter-rater reliabilities were calculated separately for each group and case, using the Kuder–
Richardson Formula 20, to check for variance in the coding. The correct linking decisions
made by the Not Informed Group were calculated and compared with coded behavioural
similarity, again using a generalised linear model, repeated over subjects, to check if coded
behavioural similarity predicted whether a series was correctly linked or not.
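The Kuder–Richardson Formula 20 can be sketched as below. Treating each coded variable as a row and each rater as a column (that is, raters take the role of "items") is an assumption about how the formula was applied to inter-rater reliability here; the function name is hypothetical, and the sketch assumes there are no missing (99) codes.

```python
def kr20(scores):
    """Kuder-Richardson Formula 20 for a table of 0/1 codes.

    scores: list of rows; each row holds one 0/1 code per rater
    (assumed layout: rows are coded variables, columns are raters).
    Returns None when the coefficient is undefined.
    """
    n_rows = len(scores)
    k = len(scores[0])  # number of raters (columns)
    if n_rows < 2 or k < 2:
        return None
    # Sum of p * q over columns, p being the proportion of 1s in a column.
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in scores) / n_rows
        sum_pq += p * (1 - p)
    # Population variance of the row totals.
    totals = [sum(row) for row in scores]
    mean_total = sum(totals) / n_rows
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_rows
    if var_total == 0:
        return None
    return (k / (k - 1)) * (1 - sum_pq / var_total)
```

With perfect agreement between raters the coefficient equals 1; the more the raters disagree on individual variables, the lower it falls.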
RESULTS
There were five series of two murders for each of the groups to code. The mean coded
behavioural similarity over all the five series was the variable studied. The means,
measured using the phi coefficient, and standard errors of the mean for behavioural similar-
ity are displayed in Table 2 and Figure 1.
The results show no significant difference in the coded behavioural similarity in the series
between the groups (Wald χ² = 4.13, p = .127). Contrary to the hypothesis, the Correctly
Informed Group (M = .33, SE = .02) did not code more similarity in the series compared with
the group with no information (M = .37, SE = .02). The highest perceived within-series
similarity was found in the Not Informed Group, but as the pairwise comparison between this
group and the Correctly Informed Group shows (Table 3), the difference was not significant
(Wald χ² = 2.94, p = .087).
The Correctly Informed Group had a slightly higher perceived similarity than the
Incorrectly Informed Group (M = .32, SE = .02), but the difference was small and also
not statistically significant (Wald χ² = .07, p = .797). The biggest difference in
perceived similarity between two groups was found in the comparison of the Incorrectly
Informed Group and the Not Informed Group. Although the difference
approached statistical significance (Wald χ² = 3.34, p = .068), it remained a tendency.
Hence, no evidence for the experimenter bias could be found.
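The reported p values can be checked against the Wald statistics with a short sketch. The degrees of freedom are an assumption (2 for the overall comparison of three groups, 1 for each pairwise comparison), since the paper does not report them; the function name is hypothetical. Only the closed forms for 1 and 2 degrees of freedom are implemented.

```python
import math


def chi_square_upper_tail(statistic, df):
    """Upper-tail probability of a chi-square statistic for df = 1 or 2."""
    if statistic < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    if df == 1:
        # Chi-square with 1 df is a squared standard normal variable.
        return math.erfc(math.sqrt(statistic / 2.0))
    if df == 2:
        # Chi-square with 2 df is exponential with mean 2.
        return math.exp(-statistic / 2.0)
    raise ValueError("this sketch only covers df = 1 and df = 2")


p_overall = chi_square_upper_tail(4.13, df=2)         # three groups
p_correct_vs_none = chi_square_upper_tail(2.94, df=1)
p_incorrect_vs_none = chi_square_upper_tail(3.34, df=1)
```

Under these assumed degrees of freedom, the computed values agree with the reported p values to within rounding of the published statistics.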
Inter-rater reliability was calculated separately for each group and case, and ranged from
.74 (acceptable) to .99 (excellent). A summary of the inter-rater reliabilities can be seen in
Table 4. The very high overall inter-rater reliability (M= .93) would suggest that the case
excerpts were (too) easy to code.
Linking decisions were available for 17 participants of the Not Informed Group. A total
of 52 correct linking decisions were made, and 33 series were linked erroneously.
The mean number of correctly linked series per participant was three. The hardest series to
link seemed to be the second one, with only 41% of the participants linking the two
murders correctly, whereas series 4 was the easiest with roughly three-quarters (76%) of
the group getting it correct. See Table 5 for an overview.
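The percentages above follow directly from the counts of correct decisions per series (Table 5) and the 17 participants for whom linking decisions were available. A brief sketch of the arithmetic, with hypothetical variable names:

```python
# Correct linking decisions per series, as reported in Table 5.
correct_decisions = {
    "Series 1": 11,
    "Series 2": 7,
    "Series 3": 9,
    "Series 4": 13,
    "Series 5": 12,
}
n_participants = 17  # Not Informed Group members with linking decisions

# Percentage of participants who linked each series correctly.
percent_correct = {
    series: round(100 * count / n_participants)
    for series, count in correct_decisions.items()
}

total_correct = sum(correct_decisions.values())
total_decisions = n_participants * len(correct_decisions)
total_incorrect = total_decisions - total_correct
overall_percent = round(100 * total_correct / total_decisions)
mean_correct_per_participant = total_correct / n_participants
```

The totals reproduce the 52 correct and 33 erroneous decisions, and the mean of roughly three correctly linked series per participant.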
When comparing the linking decisions of the different series to the coded similarity of
the same, there was a statistically significant relationship (B = −.9, SE = .04, p = .041),
meaning that the participants’ linking decision outcome was dependent on the perceived
behavioural similarity of the series. In other words, the cases with higher perceived
similarity were the ones that were easier to link together.
Table 2. Means and standard errors of the mean for coded behavioural similarity, measured using
the phi coefficient, for each pair of two murders in the three experiment groups
Group                         M     SE
Correctly Informed Group      .33   .02
Incorrectly Informed Group    .32   .02
Not Informed Group            .37   .02
Wald χ² = 4.13, p = .127.
Figure 1. Mean behavioural similarity in the linked offences for the three groups. Error bars represent the
standard errors of the mean.
[Figure not reproduced. Vertical axis: mean behavioural similarity in the linked offences, shown from .20 to .40.
Horizontal axis: Correctly Informed Group, Incorrectly Informed Group, Not Informed Group.]
Table 3. Pairwise comparisons of the differences of the means of the phi coefficients for the
experiment groups
Compared groups                                         Difference in means   SD     Wald χ²   p
Correctly Informed Group versus Not Informed Group      .047                  .027   2.94      .087
Incorrectly Informed Group versus Not Informed Group    .054                  .030   3.34      .068
Table 4. Mean and range of the inter-rater reliabilities (Kuder–Richardson Formula 20) in the three
experiment groups
Group                         Min   Max   M
Correctly Informed Group      .81   .92   .87
Incorrectly Informed Group    .74   .98   .95
Not Informed Group            .97   .99   .98
Table 5. The amount and percentage of correct linking decisions in the Not Informed Group (n = 17)
Murder series   Amount of correct linking decisions   Percentage of correct linking decisions
Series 1        11                                    65
Series 2        7                                     41
Series 3        9                                     53
Series 4        13                                    76
Series 5        12                                    71
Series 1–5      52                                    61
DISCUSSION
Evaluating the hypotheses and the results
The aim of the present study was to explore whether a coding bias exists using an experimental
approach where the information given to the participants concerning correct
linkages between a number of murder series was manipulated. Table 6 compares the results
to the hypotheses of the present study.
The first hypothesis was that the correct information about the linked murders would
result in more perceived similarity for the series compared with having no such information.
In contrast to this hypothesis, the Not Informed Group coded more similarity between
actually linked murders than the Correctly Informed Group. Although not
formally reaching statistical significance, there was a tendency for this counterintuitive
finding. On the basis of Rosenthal’s (1966, 1994) description and findings of the experimenter
effect, one could assume that prior knowledge about which crimes had been committed
by the same offender in the studied murder excerpts would distort participants’
perceived (i.e. coded) behavioural similarity in favour of their expectations. The results
of testing the first hypothesis contradict this assumption, thus providing no evidence
for the presence of a coding bias.
The second hypothesis, that the Incorrectly Informed Group would code less similarity
for linked murders compared with the Not Informed Group, seems to gain some support
from the results. The Incorrectly Informed Group did indeed code less similarity for the
linked cases than the Not Informed Group; however, the difference did not reach formal
levels of statistical significance. This finding contradicts the conclusion of the first hypothesis,
as it provides some evidence for a coding bias. It would seem that the Incorrectly
Informed Group coded more similarity in the series they incorrectly thought were linked
on the basis of the false information they received, thus leading to lower similarities in the
actually linked murders. Taken together, the results indicate that there is no unequivocal
evidence for a coding bias.
When considering the coding results of the three experiment groups, the odd finding seems
to be that the Not Informed Group coded more similarity (not significant) in the linked
series than both the other groups. One possible explanation could be that not having any
prior information about case linkage, and thus not spending any mental effort on the question
‘who committed which murder?’, enabled them to better concentrate on the task of
coding the behavioural variables, resulting in a more accurate coding. The higher inter-rater
reliabilities within this group would seem to support this idea. Although contradicting
the expectancy effect, perhaps this finding, in line with the recommendations of Risinger
et al. (2002), Sheldrake (1998), and Wilkinson (1999), shows that no expectation (blind
testing) is the most efficient way to go.
Table 6. Main results compared with the hypotheses of the study
Hypothesis   Result (hypothesis)                                     Evidence for coding bias
1            Correctly Informed Group < Not Informed Group (>)       No bias, p = .087
2            Incorrectly Informed Group < Not Informed Group (<)     Bias, p = .068
The signs (< and >) indicate the level of coded behavioural similarity between the experiment groups; ‘<’ stands
for less coded similarity in the linked murders, and ‘>’ stands for more coded similarity. The hypotheses of the
study are shown in parentheses. The results of the hypotheses and their significance levels are displayed in the last
column.
The amount of correct linking decisions for the Not Informed Group varied from 41% for
the second series to 76% for the fourth series (Table 5). This finding is in line with previous
research on offenders’ behavioural consistency, where consistency has been found to vary both
between offenders and within the series of one offender. Because the present study used two cases,
randomly picked from each of the five offenders’ series, a certain variance in behavioural
consistency (and perceived similarity) between the crimes was to be expected. Another question is
how conscious a lay person is about the fact that they ‘should’ be looking for behavioural
similarities in crimes that they know are linked. The statistically significant relationship
between the linking decisions of the Not Informed Group and their perceived behavioural
similarity would suggest that the participants, who could be considered laymen in terms of
forensic expertise, did intuitively make their decisions concerning linkage on the basis of their
perceived similarity of the cases. Santtila et al. (2008) found that their statistical model was
able to correctly link 63% of the same Italian serial murders (N = 116) when analysing
the whole set of 23 serial killers with 2 to 17 victims each. Using a Bayesian method for their
analysis, Salo et al. (2012) reached an even higher proportion of 84% with the same sample.
When only one prior murder was known, as in the present study, Salo et al.’s model correctly
linked 59% of the cases. The total amount of correct linking decisions in the present study was
61%. Because the excerpts of the present study were clearly easier to code, thus making
the linking task easier, this comparison could be taken to indicate that the aforementioned
statistical models are more efficient at linking crimes than university students.
Limitations of the study
The biggest limitation of the present study lies in the validity of the murder excerpts used.
Using the same coding scheme, but with complete pre-trial investigation protocols rather than
excerpts, Pakkanen et al. (2006) had a significantly lower inter-rater reliability (.72) than the
present study. The high inter-rater reliability in the present study (.93) would suggest that
the murder excerpts were easy to code, thus reducing the variance of the perceived similarity
in the participants and groups. The heavily edited and summarised vignettes of the court
transcripts might have constrained the participants to code in a more reliable manner, making it
less probable for any bias to have noticeable effects. More thorough documentation, better
reflecting the operational situation of the police handling comprehensive amounts of pre-trial
investigation data, could have made the coding task more challenging, leaving more room for
variation and for bias to emerge. In making the material more realistic for the participants, and
hence more ecologically valid, the trade-off would be less uniformity and a significantly more
laborious experiment to conduct.
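As an illustration of what an inter-rater reliability figure summarises, the following minimal sketch computes Cohen's kappa for two coders. It is an assumption made here for illustration that a chance-corrected agreement coefficient of this kind is meant; this section does not state which coefficient was used, and the function name and the codings below are hypothetical, not data from the study.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders rating the same items (illustrative helper)."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must rate the same non-empty set of items")
    n = len(coder_a)
    # Proportion of items on which the two coders gave the same code.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's marginal frequencies.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codings of ten behavioural variables (1 = present, 0 = absent).
coder_a = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
coder_b = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]
kappa = cohens_kappa(coder_a, coder_b)
print(round(kappa, 2))
```

With these hypothetical codings the coders agree on nine of the ten variables, and kappa comes to 0.8 once the 0.5 agreement expected by chance is removed. Very high values leave little disagreement between coders, and therefore little room for an expectancy effect to show up in the coded data.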
Bennell and Jones (2005) pointed out that solved cases might show higher levels of
consistency and inter-individual variation than unsolved cases. It might be partly because of
these characteristics that the cases are easier to solve. There might also be a deeper inherent
problem in using court transcripts in linkage research: they might overestimate behavioural
similarity, as court clerks summarise solved cases that are known to be linked. Thus, the
expectancy effect could already have taken place in court, when the vast data of the pre-trial
investigation protocols presented during the trial have been compiled and summarised,
possibly inflating the similarity of the offences in the courts’ transcripts. Much conscious
effort has also gone into developing the coding schemes. One possibility is that the variables
are defined well enough for there not to be enough variation in the coding for the experimenter
effect to take place. The ideal situation would perhaps be one where the crime investigators
tick off the coding scheme right at the beginning of the investigation, before any information
about the suspect is even available, making the coding blind with regard to prior knowledge of
case linkage. It is also worth noting that the present study was carried out using only cases
of serial murder. Different types of crimes utilise different coding schemes, and the question
posed in the present study concerning a possible bias in the coding would, therefore, have to
be tested separately for other crimes as well.
Bosco et al.’s (2010) recommendation is to increase the reliability of forensic research to
add to the chance of expert testimony being admitted in the courts. The more specific
recommendation of Wilkinson (1999) is for researchers to describe the specific methods
used to deal with experimenter bias, especially if the researchers have gathered their data
themselves. Sheldrake (1998) went on to state that there is plenty of evidence for the
experimenter effect, but scarce evidence for the lack of it, and therefore proposed testing
possible experimenter effects by comparing results of an experiment under both open
and blind conditions, as the present study has done. Risinger et al. (2002) agreed, proposing
blind testing as the principal method of preventing distortions caused by expectation in
forensic science.
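A comparison of open and blind conditions of the kind described above can be sketched as a permutation test on the difference in mean perceived similarity between two groups of coders. This is a minimal sketch under stated assumptions, not the analysis used in the present study: the function name, the rating scale, and all scores below are hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(open_scores, blind_scores, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in group means (illustrative)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(open_scores) - mean(blind_scores))
    pooled = list(open_scores) + list(blind_scores)
    n_open = len(open_scores)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Reassign the condition labels at random and recompute the difference.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_open]) - mean(pooled[n_open:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical similarity ratings (1 to 7) from coders who were told which
# crimes were linked (open) and coders who were told nothing (blind).
open_ratings = [5, 6, 4, 5, 6, 5, 4, 6]
blind_ratings = [5, 5, 4, 6, 5, 4, 5, 5]
p_value = permutation_p_value(open_ratings, blind_ratings)
print(f"estimated p-value: {p_value:.3f}")
```

A small p-value would indicate that ratings differ between the open and blind conditions by more than random assignment of coders to conditions would explain, which is the pattern an expectancy effect would produce.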
Conclusion and suggestions for future research
In police investigations, behavioural similarity of crimes is used to identify series suspected
of having been committed by the same offender (Woodhams et al., 2007). Studies of crime
linking, where the coders of the data use coding schemes, have the potential problem of
giving biased results, as the coders usually are aware of which crimes are committed by
the same offender. The studies might thereby overestimate behavioural similarity in serial
crime, distorting the conclusions drawn by behavioural science experts and automated
computer systems about crime linkage. An even more acute problem with the lack of blind
testing arises when forensic experts are asked to give testimony about whether two or more
crimes have been committed by the same offender. An expert who knows that the police suspect the
same offender of committing the crimes, and that behavioural similarity is the key to linking
them, runs a considerable risk of overestimating similarity and making a false linking decision.
This issue, however, needs to be studied separately using police pre-trial investigation
reports, in order to grasp the extent of a possible expectancy effect on estimates of behavioural
similarity in testimonies given by forensic experts.
The aim of the present study was to explore whether a coding bias exists when using a
coding scheme to record crime scene behaviour in serial murder cases. It seems that there is
no clear evidence to support the hypothesis of such a bias in the coding; none of the results
of the tested hypotheses were significant.
The present study would have the power to confirm a strong bias but not to exclude a weak
one; the results suggest the lack of a confounding bias, but the experiment of the
present study might not be sensitive enough to detect a smaller one. Replications of the experiment
are needed, with specific consideration to the discussed validity of the material used,
especially the length and complexity of the vignettes and the source of the data (court
transcripts versus pre-trial investigation protocols). Also, the present study was carried out
using only cases of serial murder. Replications of the experiment could benefit from studying
other types of crimes as well. It is the view of the authors that the issue of expectancy effects in
behavioural crime linking needs to be studied further and addressed more systematically by
reporting inter-rater reliabilities and favouring blind methods with regard to data coding and
giving expert testimony to the courts on the issue.
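The point about being able to confirm a strong bias but not exclude a weak one can be illustrated with an approximate power calculation. The sketch below is not the authors' analysis: it assumes a simple two-sided comparison of two groups of 20 participants (the group size reported in the abstract), uses a normal approximation, and takes Cohen's conventional benchmarks for small, medium, and large standardised effects. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size_d, n_per_group, alpha=0.05):
    """Normal-approximation power for a two-sided, two-group mean comparison."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative hypothesis.
    shift = effect_size_d * sqrt(n_per_group / 2)
    # Probability of landing in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Cohen's conventional benchmarks for standardised mean differences.
for label, d in [("small", 0.2), ("medium", 0.5), ("large", 0.8)]:
    print(f"{label} effect (d = {d}): power ~ {approx_power(d, 20):.2f}")
```

Under these assumptions, power is roughly 0.7 for a large effect but only roughly 0.1 for a small one, so a non-significant result with groups of this size says little about whether a weak bias is present.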
REFERENCES
Alison, L., Bennell, C., Mokros, A., & Ormerod, D. (2002). The personality paradox in offender
profiling: A theoretical review of the processes involved in deriving background characteristics
from crime scene actions. Psychology, Public Policy, and Law, 8, 115–135.
Bennell, C., & Canter, D. (2002). Linking commercial burglaries by modus operandi: Tests using
regression and ROC analysis. Science & Justice, 42, 153–164.
Bennell, C., & Jones, N. (2005). Between a ROC and a hard place: A method for linking serial burglaries
by modus operandi. Journal of Investigative Psychology and Offender Profiling, 2, 23–41.
Bennell, C., Jones, N., & Melnyk, T. (2009). Addressing problems with traditional crime linking
methods using receiver operating characteristic analysis. Legal and Criminological Psychology,
14, 293–310.
Bosco, A., Zappalà, A., & Santtila, P. (2010). The admissibility of offender profiling in courtroom:
A review of legal issues and court opinions. International Journal of Law and Psychiatry, 33,
184–191.
Canter, D. (1995). Psychology of offender profiling. In R. Bull, & D. Carson (Eds.), Handbook of
psychology in legal contexts (pp. 343–355). New York: John Wiley & Sons Ltd.
Crabbé, A., Decoene, S., & Vertommen, H. (2008). Profiling homicide offenders: A review of the
assumptions and theories. Aggression and Violent Behavior, 13, 88–106.
Craik, M., & Patrick, A. (1994). Linking serial offences. Policing, 10, 181–187.
Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579 (1993).
Federal Rules of Evidence 702. (2011). Testimony by expert witnesses. Retrieved May 13, 2012
from http://www.law.cornell.edu/rules/fre/rule_702
Goodwill, A., & Alison, L. (2006). The development of a filter model for prioritizing suspects in
burglary offences. Psychology, Crime & Law, 12, 395–416.
Green, E., Booth, C., & Biderman, M. (1976). Cluster analysis of burglary M/Os. Journal of Police
Science and Administration, 4, 382–388.
Grubin, D., Kelly, P., & Ayis, S. (1997). Linking serious sexual assault. London: Home Office.
Grubin, D., Kelly, P., & Brunsdon, C. (2001). Linking serious sexual assaults through behavior.
London: Home Office.
Pakkanen, T., Santtila, P., Mokros, A., & Sandnabba, K. (2006). Profiling hard-to-solve homicides.
Identifying dimensions of offending and associating them with situational variables and offender
characteristics. Unpublished manuscript.
Risinger, M., Saks, M., Thompson, W., & Rosenthal, R. (2002). The Daubert/Kumho implications of
observer effects in forensic science: Hidden problems of expectation and suggestion. California
Law Review, 90, 1–56.
Rosenthal, R. (1966). Experimenter effects in behavioral research. New York: Appleton-Century-Crofts.
Rosenthal, R. (1994). Interpersonal expectancy effect: A 30-year perspective. Current Directions in
Psychological Science, 3, 176–179.
Rosenthal, R., & Fode, K. (1963). The effect of experimenter bias on performance of the albino rat.
Behavioral Science, 8, 183–189.
Rosenthal, R., & Jacobson, L. (1968). Pygmalion in the classroom. The Urban Review, 3, 16–20.
Salfati, G. (1998). Homicide: A behavioural analysis of crime scene actions and associated offender
characteristics. Unpublished doctoral dissertation. University of Liverpool, UK.
Salfati, G., & Bateman, A. (2005). Serial homicide: An investigation of behavioral consistency.
Journal of Investigative Psychology and Offender Profiling, 2, 121–144.
Salo, B., Sirén, J., Corander, J., Zappalà, A., Bosco, D., Mokros, A., & Santtila, P. (2012). Using
Bayes’ theorem in behavioral crime linking of serial homicide. Legal and Criminological
Psychology. DOI: 10.1111/j.2044-8333.2011.02043.x
Santtila, P., Fritzon, K., & Tamelander, A. (2004). Linking arson incidents on the basis of crime
scene behavior. Journal of Police and Criminal Psychology, 19, 1–16.
Santtila, P., Junkkila, J., & Sandnabba, K. (2005). Behavioural linking of stranger rapes. Journal of
Investigative Psychology and Offender Profiling, 2, 87–103.
Santtila, P., Pakkanen, T., Zappalà, A., Bosco, D., Valkama, M., & Mokros, A. (2008). Behavioral
crime linking in serial homicide. Psychology, Crime & Law, 14, 245–265.
Sheldrake, R. (1998). Experimenter effects in scientific research: How widely are they neglected?
Journal of Scientific Exploration, 12, 73–78.
Wilkinson, L. (Task Force on Statistical Inference, Board of Scientific Affairs, APA) (1999).
Statistical methods in psychology journals: Guidelines and explanations. American Psychologist,
54, 594–604.
Woodhams, J., Hollin, C., & Bull, R. (2007). The psychology of linking crimes: A review of the
evidence. Legal and Criminological Psychology, 12, 233–249.