
Addressing problems with traditional crime linking methods using receiver operating characteristic analysis



Purpose. Through an examination of serial rape data, the current article presents arguments supporting the use of receiver operating characteristic (ROC) analysis over traditional methods in addressing challenges that arise when attempting to link serial crimes. Primarily, these arguments centre on the fact that traditional linking methods do not take into account how linking accuracy will vary as a function of the threshold used for determining when two crimes are similar enough to be considered linked.
Methods. Considered for analysis were 27 crime scene behaviours exhibited in 126 rapes, which were committed by 42 perpetrators. Similarity scores were derived for every possible crime pair in the sample. These measures of similarity were then subjected to ROC analysis in order to (1) determine threshold-independent measures of linking accuracy and (2) set appropriate decision thresholds for linking purposes.
Results. By providing a measure of linking accuracy that is not biased by threshold placement, the analysis confirmed that it is possible to link crimes at a level that significantly exceeds chance (AUC = .75). The use of ROC analysis also allowed for the identification of decision thresholds that resulted in the desired balance between various linking outcomes (e.g. hits and false alarms).
Conclusions. ROC analysis is exclusive in its ability to circumvent the limitations of threshold-specific results yielded from traditional approaches to linkage analysis. Moreover, results of the current analysis provide a basis for challenging common assumptions underlying the linking task.
Craig Bennell*, Natalie J. Jones and Tamara Melnyk
Carleton University, Ottawa, Ontario, Canada
Of paramount importance in police investigations is the ability to accurately link crimes
committed by the same offender. The correct identification of an offence series allows
investigators to pool information from all relevant crime scenes, thus resulting in a more
efficient use of investigative resources (Grubin, Kelly, & Brunsdon, 2001). Despite the
practical importance of this task, it has been the subject of limited empirical research. In
fact, it has only been in the last decade that any notable effort has been made to
*Correspondence should be addressed to Dr Craig Bennell, Department of Psychology, Carleton University, Ottawa, Ont.,
Canada, K1S 5B6 (e-mail:
Legal and Criminological Psychology (2009), 14, 293–310
© 2009 The British Psychological Society
understand the processes underlying the linking task and to systematically determine
the degree to which it is possible to successfully link a series of crimes (e.g. Bennell &
Canter, 2002; Bennell & Jones, 2005; Ewart, Oatley, & Burn, 2005; Grubin et al., 2001;
Santtila, Fritzon, & Tamelander, 2005; Santtila, Junkkila, & Sandnabba, 2005; Santtila,
Korpela, & Hakkanen, 2004; Woodhams, Grant, & Price, 2007; Woodhams, Hollin, &
Bull, 2007; Woodhams & Toye, 2007).
Recently, Woodhams, Hollin et al. (2007) conducted a comprehensive review
of empirical research that has examined the linking task. This review generally found
that there was support for the practice of linkage analysis and it concluded by
recommending the use of an analytical method for studying/conducting linkage analysis
that was originally proposed by the first author on the present article (Bennell, 2002;
Bennell & Canter, 2002; Bennell & Jones, 2005). This method, borrowed directly from
the field of signal detection theory (Green & Swets, 1966), is known as receiver
operating characteristic (ROC) analysis. The principles underlying this analytical
technique have been discussed elsewhere (Swets, 1996), as has its relevance to the area
of policing (Bennell, 2005). The purpose of the current article is rather to: (1) present
theoretical and practical arguments supporting the use of this approach for
studying/conducting linkage analysis over alternative methods; (2) illustrate the
practical application of this approach to the linking task through an empirical analysis of
serial rape data; and (3) challenge commonly held assumptions about linkage analysis
based on the empirical findings that emerge from this analysis.
The major problem with traditional approaches to linkage analysis
In order for it to be possible to accurately link a series of crimes, it is generally thought that
two assumptions must be supported (Bennell & Canter, 2002; Canter, 1995; Grubin et al.,
2001; Woodhams, Hollin et al., 2007). First, it is assumed that offenders must exhibit
reasonably high levels of behavioural stability across their respective crime series, reflecting
the degree to which each individual manifests the same behaviours across his/her own
crimes (the behavioural stability assumption). Second, it is assumed that offenders must
also exhibit reasonably high levels of behavioural distinctiveness, whereby the actions
that a given serial offender exhibits across his/her crimes differ from those exhibited
by other offenders committing similar types of crimes (the behavioural distinctiveness
assumption). In general, research from the field of personality psychology supports
the notion that individuals will exhibit individual differences in behaviour in a relatively
stable fashion across similar (but not necessarily different) situations (see Mischel, 2004,
for a review). Likewise, a substantial degree of evidence for behavioural stability
and distinctiveness exists within the forensic domain when considering the crime
scene behaviours exhibited by serial offenders (see Woodhams, Hollin et al., 2007, for a review).
Despite a consensus on the importance of these assumptions for the linking task,
there is disagreement amongst researchers with respect to the approach that should
be used to study the linking task. As Woodhams, Hollin et al. (2007) illustrate, a range of
methods are available for this purpose. These include, but are not limited to, the
use of across-crime similarity coefficients (e.g. Canter et al., 1991), cluster analysis
techniques (e.g. Green, Booth, & Biderman, 1976), multidimensional scaling
procedures (e.g. Canter et al., 1991; Santtila, Junkkila et al., 2005), discriminant
function analysis (e.g. Santtila et al., 2004), and logistic regression modelling
(e.g. Bennell & Canter, 2002; Bennell & Jones, 2005; Woodhams & Toye, 2007).
For reasons to be discussed shortly, we argue that each of these various approaches is
limited in its ability to provide basic information about the stability and distinctiveness
of offending behaviour. For example, in our opinion, none of the approaches can
provide a valid measure of the extent to which serial offenders exhibit behavioural
stability or distinctiveness. Further, we contend that these approaches offer only limited
utility in resolving key practical concerns encountered in law enforcement settings as
related to the linking task. For example, none of the analytical methods listed above
yield information on the specific degree of similarity required between two crimes in
order for those crimes to be considered linked.
To illustrate our point, consider Canter et al.’s (1991) examination of the linking task.
Their sample was comprised of 12 solved serial crimes committed by four different
offenders (three crimes per offender). For each pair of crimes in their sample, some
representing crimes that were linked in reality and some representing crimes that were
unlinked in reality, 74 dichotomously coded crime scene behaviours were used to
calculate an across-crime similarity score (see Table 1). The particular similarity
coefficient employed ranged from 0 to 1, with values closer to 1 indicating a greater
degree of behavioural stability between a given pair of crimes. This approach appears to
provide a relatively simple, yet direct test of the degree to which a sample of serial
offenders may exhibit stability and distinctiveness across their crimes.
The idea behind the use of across-crime similarity scores is that high intra-series
(‘same offender’) scores indicate stability while low inter-series (‘different offenders’)
Table 1. Across-crime similarity coefficients reported by Canter et al. (1991)

     Same offender Different offenders
     2      3      B1     B2     B3     C1     C2     C3     D1     D2     D3
A1   .11    .42*   .27    .32*   .27    .15    .17    .06    .43*   .17    .26
A2          .14    .29    .29    .11    .15    .07    .12    .29    .26    .18
A3                 .27    .27    .23    .09    .11    .05    .27    .14    .33*
B1   .45*   .26                         .06    .08    .05    .21    .08    .18
B2          .41*                        .07    .07    .02    .27    .07    .16
B3                                      .27    .31*   .27    .12    .14    .06
C1   .38*   .48*                                             .22    .33*   .16
C2          .36*                                             .10    .11    .02
C3                                                           .07    .20    .11
D1   .21    .46*
D2          .17

Note. Within this table, the letters A, B, C, and D refer to different serial offenders, and the numbers
1, 2, and 3 refer to different crimes. Thus, A1 refers to the first crime committed by offender A, A2
refers to the second crime committed by offender A, and so on. As an example of how the table should
be read, the cell in the upper-left corner of the table (A1-2) refers to the degree of behavioural stability
(.11) exhibited across the first and second crimes of offender A (high similarity scores across crimes
committed by the same offender equate to high levels of behavioural similarity). In contrast, the cell
corresponding to A1–B1 refers to the degree of behavioural stability (.27) exhibited across the first
crime of offender A and the first crime of offender B (low similarity scores across crimes committed by
different offenders equate to high levels of behavioural distinctiveness). The * in this table indicates
those instances where the similarity score exceeds an imposed threshold of ≥ .30.
scores signify distinctiveness. The degree to which it is possible to accurately link
crimes in the sample is then determined by (1) selecting a decision threshold (i.e. a
specific across-crime similarity score) for deciding when two crimes are similar enough
to be considered linked and (2) calculating the proportion of correct and incorrect
linking decisions made when applying that threshold. Canter et al. (1991) selected a
threshold of ≥ .30 for determining linkages. When applying this threshold to their data,
they correctly classified 7 out of 12 (58.33%) crime pairs that were committed by the
same offender and 49 out of 54 (90.74%) crime pairs that were committed by different
offenders. Thus, based on the percentage of correct decisions, the overall linking
accuracy achieved in this study was 84.84%.
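The threshold-based calculation described above can be sketched in a few lines of Python. The scores are transcribed from Table 1; the function name `linking_outcomes` and the variable names are illustrative assumptions and are not part of Canter et al.'s (1991) procedure.

```python
# Across-crime similarity scores transcribed from Table 1 (Canter et al., 1991).
same_offender = [.11, .42, .14, .45, .26, .41, .38, .48, .36, .21, .46, .17]
different_offenders = [
    .27, .32, .27, .15, .17, .06, .43, .17, .26,  # A1 vs. B, C, D
    .29, .29, .11, .15, .07, .12, .29, .26, .18,  # A2 vs. B, C, D
    .27, .27, .23, .09, .11, .05, .27, .14, .33,  # A3 vs. B, C, D
    .06, .08, .05, .21, .08, .18,                 # B1 vs. C, D
    .07, .07, .02, .27, .07, .16,                 # B2 vs. C, D
    .27, .31, .27, .12, .14, .06,                 # B3 vs. C, D
    .22, .33, .16,                                # C1 vs. D
    .10, .11, .02,                                # C2 vs. D
    .07, .20, .11,                                # C3 vs. D
]


def linking_outcomes(linked, unlinked, threshold):
    """Count decision outcomes when pairs scoring >= threshold are called linked."""
    hits = sum(score >= threshold for score in linked)
    false_alarms = sum(score >= threshold for score in unlinked)
    misses = len(linked) - hits
    correct_rejections = len(unlinked) - false_alarms
    accuracy = (hits + correct_rejections) / (len(linked) + len(unlinked))
    return hits, misses, false_alarms, correct_rejections, accuracy


# Threshold used by Canter et al.: 7 hits, 49 correct rejections, 56/66 correct.
at_30 = linking_outcomes(same_offender, different_offenders, 0.30)
# A more lenient threshold on the same scores: 12 hits, 15 correct rejections.
at_10 = linking_outcomes(same_offender, different_offenders, 0.10)
```

Running the same function at different thresholds on identical scores makes the threshold dependence of the accuracy figure concrete.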
Although these results appear to be promising, it is important to consider what they
actually convey. For instance, to what extent are the serial offenders in Canter et al.’s
(1991) sample actually exhibiting behavioural stability and distinctiveness? To what
extent can one actually use the proposed linking approach to distinguish between
crimes committed by different offenders? Practically speaking, what does this study
actually indicate with respect to the degree of similarity that must exist between two
crimes before investigators should consider them part of the same series?
The primary problem in answering such questions is that, arguably, the results
provided by Canter et al. (1991) are invalid for the purpose of addressing these issues.
Indeed, the interpretation of their results is confined to the decision threshold that they
adopted (i.e. ≥ .30). In fact, answers to each of the questions posed above would vary
considerably depending on the location of the threshold. With respect to linking
accuracy, for example, although the decision threshold of ≥ .30 yielded an accuracy
score of 84.84%, adopting a threshold of ≥ .10 would result in an accuracy score of
40.90%, while a threshold of ≥ .50 would result in an accuracy score of .00%. To our way
of thinking, a significant problem is posed by the fact that Canter et al.’s results are so
obviously biased by the placement of the decision threshold. To the best of our
knowledge, the same problem exists with every other commonly used approach for
tackling the linking task.
For example, in the study conducted by Grubin et al. (2001), each crime in their
sample of serial crimes was treated as a target offence. A pre-specified percentage (10%)
of the remaining sample that was most behaviourally similar to each target offence was
examined to determine how many offences belonging to the target offence series (and
not belonging to the series) were included in the subsample. This number was then
compared to the number of links that would be expected by chance. In these cases, the
pre-specified percentage cut-off acts as the threshold (i.e. it indicates the degree of
similarity required between crimes in the subsample and the target offence for one to
consider them part of the same crime series). As in Canter et al.’s (1991) study, if this
threshold were altered, the number of correct (and incorrect) links would change.
The point of this argument is not that decision thresholds should be circumvented.
On the contrary, in research and practical contexts alike, threshold setting is an inherent
and unavoidable step of the linking process. Rather, we are arguing that the decision of
where to place a threshold for linking purposes has a fundamental impact on the
empirical results that are generated. By addressing this issue explicitly, it will be possible
to increase the validity of research in the area and, by extension, research can better
inform practical decision-making in investigative contexts. Thus, the solution is to use a
method of analysis that can quantify the degree of linking accuracy achieved under any
given set of conditions, unbiased by threshold placement. From a practical standpoint,
the method of analysis would ideally also guide decisions about threshold placement
such as to optimize performance on the linking task. It is our contention that ROC
analysis adheres to these criteria and, more generally, that signal detection theory
provides a productive way of re-conceptualizing the linking task.
Addressing the problem of threshold-specific results
ROC analysis was originally developed in the field of signal detection (Green & Swets,
1966), but it is now commonly employed to evaluate and improve decision-making
performance in a variety of diagnostic fields ranging from radiology to psychiatry
(Swets, 1996). In its inception, signal detection theory literally involved the presentation
of a signal (e.g. on a radar screen), which had to be distinguished from random
background noise. Later, ‘signal detection’ assumed a more generic meaning and it
began to include almost any event of interest that had to be distinguished from other,
typically less important events. For example, a doctor might be faced with the task of
diagnosing a diseased eye amongst a background of normal eyes (Swets, Dawes, &
Monahan, 2000).
We have previously argued that linkage analysis can be conceptualized as a signal
detection problem, at least when the linking task involves the consideration of whether
a pair of crimes has been committed by the same offender (Bennell, 2005; Bennell &
Canter, 2002; Bennell & Jones, 2005). Indeed, there are many similarities between this
linking task and other diagnostic decisions. For example, the goal in this linking task is
very similar to the goal for any diagnostic task, which is to identify a relatively rare signal
(a linked crime) against a background of noise (unlinked crimes). In addition, linking
decisions of this type must often be based on ambiguous evidence, such as a high
across-crime similarity score that can arise from an examination of both linked and unlinked
crimes. Reliance on ambiguous evidence is the norm in many diagnostic tasks (Swets
et al., 2000). Moreover, the types of decision outcomes in this linking task are similar to
those that emerge when making other diagnostic decisions. When faced with a pair of
crimes, two predictions can be made (linked/unlinked), while two potential realities
exist (linked/unlinked). Combining these possibilities results in the four decision
outcomes that are present in all yes–no type diagnostic tasks, namely hits (predict
linked/actually linked), correct rejections (predict unlinked/actually unlinked), false
alarms (predict linked/actually unlinked), and misses (predict unlinked/actually linked).
Finally, the primary objective for the decision maker faced with this linking task is the
same as for any diagnostician. The decision maker must attempt to maximize the
probability of rendering a correct decision while minimizing the probability of making
an incorrect decision.
In signal detection theory, diagnostic decisions are often conceptualized using a pair
of probability distributions (Swets et al., 2000), and this may prove to be a useful way of
thinking about the linking task. For our purposes, consider a larger scale, hypothetical
version of Canter et al.’s (1991) study that yields a higher number of across-crime
similarity scores than are currently in Table 1. If this data were turned into a table (like
Table 1) and scores from the left and right side of this new table were plotted separately
on a graph, with the x-axis representing the degree of similarity (from 0 to 1) between
crime pairs and the y-axis representing the probability (from 0 to 1) that a crime pair
possesses a given degree of similarity, two distributions like those in Figure 1 might emerge.
As suggested above, the right-hand (‘same offender’) distribution is an indication of
behavioural stability and the left-hand (‘different offenders’) distribution signifies
behavioural distinctiveness. In a general sense then, the degree to which one can
distinguish between crimes committed by different offenders is indicated by the extent
of overlap between these two distributions. A lower degree of overlap between
distributions signals an increased ability to distinguish between crimes committed by
the same offender versus different offenders and, by extension, a lower degree of
overlap enhances one’s potential to perform the linking task successfully.
As indicated above, most researchers specify one particular point along the x-axis
as their decision threshold (e.g. ≥ .30). Following this, they determine the likelihood
of rendering both correct and incorrect decisions. When taking this approach, the
probabilities of making the various linking decisions can easily be calculated (see
Swets et al., 2000). The probability of a hit (pH) when using a particular threshold is
equal to the frequency of hits divided by the frequency of hits and misses. This value
would be represented in Figure 1 by the area under the ‘same offender’ distribution
to the right of the (≥ .30) threshold. The probability of making a false alarm (pFA)
when using a particular threshold is equal to the frequency of false alarms divided by
the frequency of false alarms and correct rejections. This value is represented in
Figure 1 by the area under the ‘different offenders’ distribution to the right of the
(≥ .30) threshold. The probabilities of misses (pM) and correct rejections (pCR)
are simply the complements of pH and pFA, respectively.
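As a minimal sketch of these definitions, the four probabilities can be computed directly from outcome frequencies. The function name `outcome_probabilities` is an assumption introduced for illustration; the counts used are those reported for Canter et al.'s (1991) data at the ≥ .30 threshold (7 of 12 linked pairs and 49 of 54 unlinked pairs correctly classified).

```python
def outcome_probabilities(hits, misses, false_alarms, correct_rejections):
    """Convert outcome frequencies into pH, pFA, pM, and pCR."""
    p_h = hits / (hits + misses)                           # hit rate
    p_fa = false_alarms / (false_alarms + correct_rejections)  # false alarm rate
    # Misses and correct rejections are the complements of pH and pFA.
    return {"pH": p_h, "pFA": p_fa, "pM": 1 - p_h, "pCR": 1 - p_fa}


# Counts at the >= .30 threshold: 7 hits, 5 misses, 5 false alarms, 49 correct rejections.
rates = outcome_probabilities(hits=7, misses=5, false_alarms=5, correct_rejections=49)
```

Here pH is 7/12 and pFA is 5/54, which correspond to the 58.33% and (by complement) 90.74% figures reported for that threshold.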
Upon examination of Figure 1, it becomes clear that the likelihood of making
particular linking decisions will vary across different thresholds, even when basing
decisions on the exact same evidence (i.e. when a stable degree of overlap exists).
Consequently, results that emerge from the use of only one threshold are likely to
provide an extremely distorted picture of one’s ability to link crimes. ROC analysis is
unique in its ability to resolve this issue. In the current context, ROC analysis illustrates
how the probabilities of making the various types of linking decisions are subject to
change as thresholds are varied from strict to lenient. Essentially, one calculates and
plots the coordinates of pH as a function of pFA across a range of thresholds (Swets,
1988). When the points are connected on the graph, the result is typically a concave
downward curve, known as a ROC curve. This curve starts at the lower left corner of
the graph (where the thresholds are strict) and ends in the upper right corner (where
the thresholds are lenient).
Figure 1. Hypothetical distributions of across-crime similarity scores for crimes committed by the
same offender versus different offenders. The x-axis represents the degree of similarity (from 0 to 1)
between crime pairs and the y-axis represents the probability (from 0 to 1) that a crime pair possesses
any given degree of similarity.
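The construction of a ROC curve from two sets of similarity scores can be sketched as follows. The scores are hypothetical values chosen for illustration only, and the function name `roc_points` is an assumption; the sketch simply sweeps the threshold from strict to lenient and records the (pFA, pH) coordinates at each step.

```python
def roc_points(linked, unlinked):
    """Return (pFA, pH) coordinates, ordered from the strictest to the most lenient threshold."""
    thresholds = sorted(set(linked) | set(unlinked), reverse=True)
    # A threshold above every observed score classifies no pair as linked.
    points = [(0.0, 0.0)]
    for threshold in thresholds:
        p_h = sum(score >= threshold for score in linked) / len(linked)
        p_fa = sum(score >= threshold for score in unlinked) / len(unlinked)
        points.append((p_fa, p_h))
    return points


# Hypothetical similarity scores (not taken from any study).
linked_scores = [0.45, 0.40, 0.35, 0.20]
unlinked_scores = [0.30, 0.15, 0.10, 0.05]
curve = roc_points(linked_scores, unlinked_scores)
```

The first coordinate is the lower left corner of the ROC graph and the last is the upper right corner, because the most lenient threshold classifies every pair as linked.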
The area under the curve, commonly referred to as the AUC, acts as a measure
of linking accuracy for the particular linking approach (or linking evidence) that gave
rise to that curve. The smaller the degree of overlap between the two probability
distributions representing crimes committed by the same versus different offenders, the
higher the resulting curve in the ROC graph and the greater the linking accuracy.
This area measure can range from 1.00 (perfect discrimination) to .50 (chance
discrimination). An AUC of 1.00 represents a ROC curve that follows the left and upper axes
of the graph, whereas an AUC of .50 corresponds to a ROC curve that follows
the positive diagonal on the graph, going from the bottom left corner to the upper
right corner.
The primary advantage of using the AUC as a measure of linking accuracy is that it is
independent of the particular threshold adopted (Swets, 1988). This is the case because
the AUC represents the position of the entire ROC curve rather than any single point
along it. Thus, a ROC curve generated from Canter et al.’s (1991) data would provide a
measure of linking accuracy that is not specific to their threshold of ≥ .30, but rather to
their general approach of using an across-crime similarity coefficient (derived from a
specific set of crime scene behaviours) to link crimes. Thus, using the AUC as a measure
of linking accuracy is the only way to determine whether performance on the linking
task is due to the inherent discriminatory power of the approach (or evidence) under
investigation, or simply to the threshold that was adopted.
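One way to see why the AUC does not depend on a threshold is to compute it without one. The sketch below relies on the probabilistic reading of the AUC (the chance that a randomly selected same-offender pair has a larger similarity score than a randomly selected different-offender pair, with ties counted as one half). The function name `auc` and the scores are hypothetical, and this pairwise calculation is an illustration rather than the procedure used in the present study.

```python
def auc(linked, unlinked):
    """Proportion of (linked, unlinked) score comparisons won by the linked pair."""
    wins = 0.0
    for linked_score in linked:
        for unlinked_score in unlinked:
            if linked_score > unlinked_score:
                wins += 1.0
            elif linked_score == unlinked_score:
                wins += 0.5  # ties count as half a win
    return wins / (len(linked) * len(unlinked))


# Hypothetical similarity scores (not taken from any study).
linked_scores = [0.45, 0.40, 0.35, 0.20]
unlinked_scores = [0.30, 0.15, 0.10, 0.05]
area = auc(linked_scores, unlinked_scores)  # 15 of 16 comparisons won: 0.9375
```

No decision threshold appears anywhere in the function, so the resulting value describes the scores themselves rather than any single cut-off.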
Once a ROC curve has been constructed, one can attempt to identify a point along
the curve (i.e. a decision threshold) that will result in the desired balance between the
various decision outcomes. Such a threshold can be selected via any number of
procedures (Swets, 1992). For example, one common method, although it is not always
appropriate, is to select a threshold that maximizes pH while minimizing pFA. For any
given ROC curve, this threshold falls at a point on the curve that is closest to the upper
left corner of the graph (where pH = 1.00 and pFA = .00). Another approach, which is
illustrated by Swets et al. (2000), is to identify a threshold according to pre-determined
cut-off values for pH or pFA. For instance, one might hypothetically argue that due to
limited investigative resources, a police force may not wish to exceed a FA rate of .20
when making decisions about potential burglary series. This constraint would thus
dictate the parameters for establishing the decision threshold in an attempt to produce
as many hits as possible without exceeding this pre-determined rate of false alarms.
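The two threshold-selection rules described above can be sketched as follows. All names (`candidate_thresholds`, `closest_to_corner`, `best_under_fa_cap`) and all scores are hypothetical, and the tie-breaking rule in the second function (preferring the lower pFA when hit rates are equal) is an assumption rather than something specified in the text.

```python
import math


def candidate_thresholds(linked, unlinked):
    """Return (threshold, pH, pFA) for every observed score used as a cut-off."""
    rows = []
    for threshold in sorted(set(linked) | set(unlinked)):
        p_h = sum(score >= threshold for score in linked) / len(linked)
        p_fa = sum(score >= threshold for score in unlinked) / len(unlinked)
        rows.append((threshold, p_h, p_fa))
    return rows


def closest_to_corner(rows):
    """Threshold whose ROC point lies nearest the upper left corner (pH = 1, pFA = 0)."""
    return min(rows, key=lambda row: math.hypot(row[2], 1.0 - row[1]))


def best_under_fa_cap(rows, max_p_fa=0.20):
    """Highest-pH threshold whose pFA does not exceed the cap (None if no threshold qualifies)."""
    allowed = [row for row in rows if row[2] <= max_p_fa]
    if not allowed:
        return None
    # Assumption: among equal hit rates, prefer the lower false alarm rate.
    return max(allowed, key=lambda row: (row[1], -row[2]))


# Hypothetical similarity scores (not taken from any study).
linked_scores = [0.55, 0.45, 0.40, 0.30, 0.15]
unlinked_scores = [0.35, 0.25, 0.20, 0.15, 0.10, 0.10, 0.05, 0.05, 0.05, 0.00]
rows = candidate_thresholds(linked_scores, unlinked_scores)
corner_choice = closest_to_corner(rows)
capped_choice = best_under_fa_cap(rows, max_p_fa=0.20)
```

For these particular hypothetical scores both rules select a threshold of .30 (pH = .80, pFA = .10), but with other data the two rules can disagree, which is why the choice of rule should reflect the costs attached to each outcome.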
Current study
As argued above, ROC analysis is exclusive in its ability to circumvent the limitations of
threshold-specific results yielded from traditional approaches to linkage analysis. The
following empirical study of serial rape data aims to demonstrate the practical
application of ROC analysis to the linking task. Moreover, the authors illustrate some
basic procedures for selecting appropriate decision thresholds. Based on the results
of the analysis, a discussion ensues on the fundamental assumptions underlying
The following hypothetical scenario serves to illustrate the practical interpretation of the AUC. An across-crime similarity
score based on a set of 20 crime scene behaviours is calculated across pairs of crimes that are either the work of the same
offender or different offenders (under the assumption that larger similarity scores will be found for crimes committed by the
same offender). These scores are subjected to ROC analysis and result in an AUC of .80. For this sample of crimes, this means
that there is an 80% chance that a randomly selected pair of crimes committed by the same offender will have a larger
similarity score than a randomly selected pair of crimes committed by different offenders.
linkage analysis. Specifically, the authors challenge the assumptions that high levels of
behavioural stability and distinctiveness are required for successful linking to occur.
The current investigation is based on data originally collected for a previous linking
study (Canter, Wilson, Jack, & Butterworth, 1996). The data consist of 126 offences of
rape committed across the UK, which were perpetrated by a total of 42 convicted serial
rapists. The original sampling procedure limited the data to three crimes per offender.
Common practice in linking research, this restriction is typically imposed to ensure that
analyses are not biased by undue weight being assigned to highly prolific offenders
displaying particularly high or low levels of behavioural stability and/or distinctiveness
(e.g. Bennell & Canter, 2002; Santtila, Junkkila et al., 2005; Woodhams & Toye, 2007).
All of the data were extracted directly from victim statements, which were prepared by
police officers in the context of criminal investigations.
For the purpose of the present study, 27 variables relating directly to the behaviour of
the offender at the scene of the crime were extracted from the original data set. These
variables were originally identified by trained researchers through a content analysis of
victim statements. The content categories were initially derived from the published
literature on rape and from a thorough analysis of the victim statements. A detailed
content dictionary was developed and applied to the sample (see Appendix). For each
crime, behaviours were either coded as 1 (indicating their presence) or 0 (indicating
their absence). Although levels of inter-rater agreement are unavailable for the original
data, Alison and Stein (2001) have reported that similar data has been coded with a high
level of reliability (average levels of disagreement in the 3–4% range). As such, the
27 dichotomous variables coded across the 126 offences provided the data matrix upon
which the present analysis was conducted.
As others have noted, there are potential limitations associated with the use of victim
statements as data sources and the results from this study should therefore be viewed
with an appropriate level of caution (Alison, Snook, & Stein, 2001). For example, when
describing their experiences, rape victims may emphasize particular aspects of the
crime over others, potentially highlighting behaviours depicting the traumatic nature of
the assault. Moreover, victims may omit salient details from their reports due to factors
such as memory impairment and/or embarrassment. In addition, victim statements are
obviously only representative of rapes that have been reported to the police and may
reveal little about the large number of rapes that remain unreported and unsolved.
However, it should also be recognized that every source of investigative data will be
biased in a variety of ways. Unlike other data sources, victim statements are
advantageous not only because they provide information from the victim’s perspective, but
also because they are collected under conditions in which the testimony could be
challenged in court (Bennell, Alison, Stein, Alison, & Canter, 2001). As such, there is a
certain degree of pressure placed upon the investigating officer to record information
reliably and in sufficient detail to withstand legal scrutiny.
In order to derive across-crime similarity scores, the dichotomously coded variables were
entered into a computer program, which was specifically designed to calculate similarity
coefficients between every pair of crimes in a manner consistent with Canter et al. (1991).
The particular similarity coefficient employed in the current study was Jaccard’s coefficient
(Jaccard, 1908), which was used in Canter et al.’s study and many other studies since that
time (e.g. Bennell & Canter, 2002; Bennell & Jones, 2005; Goodwill & Alison, 2006; Salfati,
2000; Salfati & Bateman, 2005; Woodhams & Toye, 2007). When calculating across-crime
similarity for a pair of crimes, Jaccard’s coefficient (J) is calculated as a/(a + b + c), where
a refers to the frequency of behaviours present in both crimes, and b and c refer to the
frequency of behaviours present in one crime but absent in the other.
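The formula for J can be written as a short function operating on two dichotomously coded behaviour vectors. This is a minimal sketch, not the computer program used in the study; the function name `jaccard`, the example vectors, and the decision to return 0 when no behaviour is present in either crime are assumptions made for illustration.

```python
def jaccard(crime_x, crime_y):
    """Jaccard's coefficient J = a / (a + b + c) for two 0/1 coded behaviour vectors."""
    a = sum(1 for x, y in zip(crime_x, crime_y) if x == 1 and y == 1)  # present in both
    b = sum(1 for x, y in zip(crime_x, crime_y) if x == 1 and y == 0)  # present in x only
    c = sum(1 for x, y in zip(crime_x, crime_y) if x == 0 and y == 1)  # present in y only
    if a + b + c == 0:
        # Assumption: with no behaviours present in either crime, treat similarity as 0.
        return 0.0
    return a / (a + b + c)


# Hypothetical coding of five behaviours for two crimes: a = 2, b = 1, c = 1.
crime_1 = [1, 1, 0, 0, 1]
crime_2 = [1, 0, 1, 0, 1]
similarity = jaccard(crime_1, crime_2)  # 2 / (2 + 1 + 1) = 0.5
```

Behaviours coded 0 in both crimes do not enter the calculation, so appending further jointly absent behaviours to both vectors leaves J unchanged.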
Jaccard’s coefficient is often regarded as the similarity coefficient of choice in the linking
context because (as is evident by the formula) this measure ignores joint non-occurrences
of an event (Woodhams, Hollin et al., 2007). It is rationalized that the recorded absence of a
behaviour in a given crime may be due to factors other than the actual non-occurrence of
the event and therefore across-crime similarity should not increase as a result of joint
non-occurrences. For example, the victim may not remember the behaviour or the interviewer
may fail to elicit and/or record the information. Despite this potentially useful feature of
Jaccard’s, and its general popularity, it should be noted that this coefficient is a crude
measure of across-crime similarity that may not result in optimal linking performance.
Unfortunately, research has just begun to emerge that compares the degree of linking
accuracy achieved with coefficients other than Jaccard’s (Bennell, Gauthier, & Gauthier,
2008; Bennell, Jones, & Melnyk, 2007; Woodhams, Grant et al., 2007). Given this lack of
research, and the fact that the existing research does not provide clear support for any one
coefficient, there is really no basis for choosing one coefficient over another at this point in
time. Ultimately, our decision to use Jaccard’s was based on its simplicity and the fact that it
was used by Canter et al. (1991), which is the study upon which we are building to
demonstrate the utility of ROC analysis in the linking context.
Unlike previous studies that have attempted to identify subsets of crime scene
behaviours that are best suited to the linking task (e.g. Bennell & Canter, 2002; Bennell &
Jones, 2005; Grubin et al., 2001; Woodhams & Toye, 2007), all 27 behaviours in our dataset
were simultaneously used to calculate J. Again, this approach was adopted because we felt
that it made most sense to stay in line with Canter et al.’s original procedure. There is of
course nothing inherently wrong with this procedure, although it obviously does prevent
one from comparing the relative linking accuracy that is achieved when focusing on various
behavioural domains that exist in the data. Having said that, we conducted some
exploratory analyses on our rape data and found that an inclusive method (i.e. including all
behaviours in the analysis) resulted in greater linking accuracy compared to using various
subsets of behaviour (these analyses can be obtained by request from the first author).
For every crime pair in the sample, a similarity coefficient was derived from the
computational procedure outlined above. Distributions of the similarity scores associated
with crime pairs committed by the same offender and different offenders were plotted
separately. These scores were then used to construct an empirical ROC graph in order to
evaluate the degree to which the crime scene behaviours under examination, and their
corresponding similarity scores, are conducive to successful linkage analysis. The ROC
analysis was performed using the ROC subroutine in the SPSS software package (version 15).
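The pairing step described above can be sketched in Python as follows. With the 126 rapes in the present sample, comparing every crime with every other crime gives 126 × 125 / 2 = 7,875 pairs. The original analysis was run in SPSS, so this is not the authors’ code; the function names, the offender labels, and the four-behaviour toy vectors are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    both = sum(x and y for x, y in zip(a, b))
    either = sum(x or y for x, y in zip(a, b))
    return both / either if either else 0.0


def pair_scores(crimes):
    """crimes: list of (offender_id, behaviour_vector) tuples."""
    same, different = [], []
    for (off_a, vec_a), (off_b, vec_b) in combinations(crimes, 2):
        target = same if off_a == off_b else different
        target.append(jaccard(vec_a, vec_b))
    return same, different


def roc_points(same, different):
    """(threshold, pFA, pH) for each observed score used as a '>=' threshold."""
    points = []
    for t in sorted(set(same + different), reverse=True):
        p_hit = sum(s >= t for s in same) / len(same)
        p_fa = sum(d >= t for d in different) / len(different)
        points.append((t, p_fa, p_hit))
    return points


# Hypothetical toy sample: two offenders, two crimes each, four behaviours.
crimes = [
    ("A", [1, 1, 0, 0]),
    ("A", [1, 1, 1, 0]),
    ("B", [0, 0, 1, 1]),
    ("B", [0, 1, 1, 1]),
]
same, different = pair_scores(crimes)
points = roc_points(same, different)
for threshold, p_fa, p_hit in points:
    print(f"threshold >= {threshold:.2f}: pFA = {p_fa:.2f}, pH = {p_hit:.2f}")
```

Plotting pH against pFA for these points gives an empirical ROC graph of the kind the text describes.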
Descriptive analysis
Prior to conducting the ROC analysis, a descriptive analysis of the similarity scores was
conducted (see Table 2). Specifically, descriptive statistics were calculated across all
crime pairs committed by the same offender and different offenders. Significance testing
revealed that crimes committed by the same offender are associated with significantly
higher similarity scores compared to crimes committed by different offenders.
Therefore, a degree of behavioural stability and distinctiveness is exhibited by the serial
rapists represented in the present sample. Nonetheless, it is also clear from these results
that crimes committed by the same offender are occasionally characterized by relatively
low levels of across-crime similarity, and crimes committed by different offenders are
not absolutely distinct.
The implication of this last point is apparent in the graphical representation of
similarity scores (see Figure 2). As indicated previously, distributions with minimal
overlap are the most apt at discriminating between crimes committed by the same
offender versus different offenders. The fact that there is a substantial degree of overlap
between the diagnostic alternatives with respect to their across-crime similarity scores
suggests that it will not be possible to achieve perfect discrimination accuracy with this
sample. This is true regardless of where the decision threshold is placed. The degree to
which it is actually possible to discriminate between crimes committed by the same
offender versus different offenders in the present sample can only be determined by the
AUC yielded through ROC analysis. Importantly, this analysis also provides the necessary
information to select an appropriate decision threshold.
ROC analysis
A ROC curve derived from the similarity scores is presented in Figure 3. As would be
expected given the distributions presented in Figure 2, the results of the ROC analysis
confirm that it is indeed possible to discriminate between crimes committed by the
same offender and different offenders at a level that significantly exceeds chance
(AUC = .75, SE = 0.03, 95% CI = .70–.80). However, as was also expected given the
degree of distribution overlap, this AUC is significantly less than 1.00. According to
criteria set out by Swets (1988), this AUC represents a good level of accuracy.
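One way to read an AUC is as the probability that a randomly chosen same-offender pair has a higher similarity score than a randomly chosen different-offender pair, with .50 corresponding to chance and 1.00 to perfect discrimination. The sketch below computes that quantity directly by counting correctly ordered pairs. It is an illustration only: the scores are hypothetical, and counting ties as one half is a common convention assumed here rather than something stated in the text.

```python
def auc(same, different):
    """Share of (same, different) score pairs ranked correctly; ties count half."""
    wins = 0.0
    for s in same:
        for d in different:
            if s > d:
                wins += 1.0
            elif s == d:
                wins += 0.5
    return wins / (len(same) * len(different))


# Hypothetical similarity scores, not the study data.
same_offender = [0.60, 0.45, 0.40, 0.20]
different_offender = [0.50, 0.30, 0.25, 0.20, 0.10]
value = auc(same_offender, different_offender)
print(value)  # 14.5 correctly ordered pairs out of 20
```

No decision threshold appears anywhere in the calculation, which is why the resulting value does not depend on where a threshold is placed.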
Figure 2. Distributions of across-crime similarity scores for crimes committed by the same offender
versus different offenders using Jaccard’s coefficient.
Table 2. Descriptive analysis of across-crime similarity scores using Jaccard’s coefficient
Type of crime pair                  Mean   SD     Range
Committed by the same offender      0.41   0.17   .00–.80
Committed by different offenders    0.27   0.13   .00–1.00
With respect to identifying an appropriate decision threshold for determining the
point at which two crimes should be considered linked, both of the procedures
discussed above were used (i.e. a procedure that maximizes pH while minimizing pFA
and a procedure that maximizes pH while not exceeding a pre-determined limit on
pFA). In order to maximize pH and minimize pFA, the threshold falling at the point on
the curve that is closest to the upper left corner of the graph was adopted. Formally, this
point was identified by drawing a negative diagonal on the graph (from the upper left
corner to the lower right corner) and finding the point at which this diagonal intersects the
ROC curve. The threshold at this point on the curve corresponds to a similarity score of
≥ .33, which is only slightly higher than Canter et al.’s (1991) threshold of ≥ .30. The pH
and pFA values that resulted when this threshold was adopted were .72 and .32,
respectively. To illustrate the alternative procedure for setting a decision threshold, a
limit of .20 was set on the FA rate. The threshold resulting in the maximum pH possible
(.61), while also respecting the pre-determined ceiling for pFA (.20), was ≥ .37.
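The two threshold-setting procedures can be sketched as follows. This is a rough illustration under stated assumptions: the scores are hypothetical, the function names are invented, and the first function uses Euclidean distance to the upper left corner as a stand-in for the graphical negative-diagonal construction described above, which need not select exactly the same point. The ceiling of .10 is chosen only because the toy data are so small; the study itself used .20.

```python
import math


def roc_points(same, different):
    """(threshold, pFA, pH) for each observed score used as a '>=' threshold."""
    points = []
    for t in sorted(set(same + different)):
        p_hit = sum(s >= t for s in same) / len(same)
        p_fa = sum(d >= t for d in different) / len(different)
        points.append((t, p_fa, p_hit))
    return points


def closest_to_corner(points):
    """ROC point nearest to the upper left corner (pFA = 0, pH = 1)."""
    return min(points, key=lambda p: math.hypot(p[1], 1.0 - p[2]))


def best_under_fa_limit(points, fa_limit):
    """ROC point with the highest pH among those with pFA <= fa_limit."""
    allowed = [p for p in points if p[1] <= fa_limit]
    return max(allowed, key=lambda p: p[2]) if allowed else None


# Hypothetical similarity scores, not the study data.
same_offender = [0.60, 0.45, 0.40, 0.20]
different_offender = [0.50, 0.30, 0.25, 0.20, 0.10]
points = roc_points(same_offender, different_offender)

corner = closest_to_corner(points)                    # balances pH against pFA
capped = best_under_fa_limit(points, fa_limit=0.10)   # respects a ceiling on pFA
print("closest to corner:", corner)
print("best with pFA <= .10:", capped)
```

As in the study, imposing a ceiling on the false alarm rate pushes the threshold to a stricter value and lowers the hit rate that can be achieved.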
Figure 3. A ROC graph representing the degree of linking accuracy associated with serial rape
Despite the practical importance of linkage analysis to investigative settings, it has only
recently become the subject of empirical examination. Although a number of analytical
approaches are currently available to study the linking task, results yielded from these
techniques are inherently biased by the placement of decision thresholds. Moreover,
traditional approaches to the linking task fail to address important practical issues, such
as the determination of an appropriate threshold to mark a criterion of similarity that
must be achieved for two crimes to be considered linked. In the current paper, the
authors advocated ROC analysis as a method for studying/conducting linkage analysis
due to its unique ability to circumvent the above limitations. Having demonstrated the
application of ROC analysis to the linking task, further detail is now presented on the
various advantages associated with this technique. Based on the results of the current
empirical study, we will also reconsider the importance of central linking assumptions.
We conclude by providing suggestions for future research.
Advantages of receiver operating characteristic analysis
Establishes an unbiased measure of linking accuracy
As discussed, the most obvious advantage of using ROC analysis in the linking context is
its ability to produce a pure measure of linking accuracy (i.e. the AUC) that is
independent of decision threshold placement. Thus, the AUC of .75 achieved in the
present study reflects the inherent linking power of the approach under examination,
which involved the use of Jaccard’s coefficient to calculate across-crime similarity scores
on the basis of 27 serial rape behaviours. It does not reflect any sort of arbitrary
threshold selected for the purpose of linking rapes. Consequently, the AUC can be
considered a more valid measure for the purpose of linkage analysis compared to
alternative measures that are biased by threshold placement (e.g. percentage correct).
This benefit also extends beyond the current study. Indeed, the usefulness of having
an unbiased measure of linking accuracy can perhaps be best appreciated if one
considers a study in which the primary goal is to compare the relative performance of
different decision makers on the linking task. Consider a recent study by Bennell,
Bloomfield, Snook, Taylor, and Barnes (in press), for example, involving a comparison
of university students and police professionals’ ability to effectively discriminate
between crimes committed by the same offender versus different offenders. If the two
groups were hypothetically shown to differ with respect to their linking decisions, one
may be tempted to attribute this finding to group differences in the ability to accurately
link crimes. However, this disparity in linking decisions could just as readily be
attributable to group differences in the use of decision thresholds (e.g. students may be
more liberal than professionals in their criteria for deciding whether two crimes are
linked). Without subjecting such data to ROC analysis, it would be difficult to determine
whether the groups fall at different points along the same ROC curve (i.e. same level of
accuracy, different decision thresholds) or on different ROC curves (i.e. different levels
of accuracy).
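The distinction between accuracy and threshold placement can be illustrated with a textbook signal detection model. The sketch below assumes an equal-variance Gaussian observer, which is a simplifying assumption introduced here for illustration and not a model used in the present study. The two groups, their sensitivity (d′ = 1.0), and their criteria are all hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def rates(d_prime, criterion):
    """Hit and false alarm rates for an equal-variance Gaussian observer.

    Evidence for unlinked pairs ~ N(0, 1) and for linked pairs ~ N(d_prime, 1);
    the observer says 'linked' whenever the evidence exceeds `criterion`.
    """
    p_hit = 1.0 - NormalDist(d_prime, 1.0).cdf(criterion)
    p_fa = 1.0 - STANDARD_NORMAL.cdf(criterion)
    return p_hit, p_fa


def recovered_d_prime(p_hit, p_fa):
    """z(pH) - z(pFA): the sensitivity implied by a single (pH, pFA) point."""
    return STANDARD_NORMAL.inv_cdf(p_hit) - STANDARD_NORMAL.inv_cdf(p_fa)


# Two hypothetical groups with identical accuracy but different thresholds.
liberal = rates(d_prime=1.0, criterion=0.0)
strict = rates(d_prime=1.0, criterion=1.0)

for label, (p_hit, p_fa) in (("liberal", liberal), ("strict", strict)):
    print(f"{label}: pH = {p_hit:.2f}, pFA = {p_fa:.2f}, "
          f"d' = {recovered_d_prime(p_hit, p_fa):.2f}")
```

The two groups produce clearly different hit and false alarm rates even though they lie on the same ROC curve, which is the situation the paragraph above warns against misreading as a difference in linking ability.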
Permits appropriate setting of decision thresholds
As suggested above, a second advantage of applying ROC analysis to the linking task is
that the technique can be used to identify appropriate decision thresholds for
determining whether a given crime pair has been committed by the same perpetrator.
Interestingly, the importance of this issue has largely been ignored by researchers in the
linking field. Instead, attention is more often accorded to identifying the crime scene
behaviours that are best suited to linkage analysis. In contrast, we argue that it is futile
to recognize the general utility of a particular set of crime scene behaviours for linking
two crimes together without also considering the degree of similarity that must exist
between the crimes for the two offences to be considered linked.
The importance of the threshold issue is elucidated by the presentation of Canter
et al.’s (1991) findings as well as the findings from the current study. Both of these
studies make it clear that there will rarely exist complete separation between the
distributions of similarity scores derived from crimes committed by the same offender
versus different offenders. Under these suboptimal conditions, a specific decision
threshold will be required. Given how linking performance can vary across thresholds,
the choice of location for that threshold is crucial. In the current study, two different
procedures were used for identifying an appropriate threshold. The first procedure
allowed one to maximize pH while also minimizing pFA, resulting in a threshold of
≥ .33. The second procedure, resulting in a threshold of ≥ .37, allowed one to maximize
the hit rate while not exceeding a pre-specified limit on the rate of false alarms. Both
methods are rational and, arguably, both yield results that are more sensible in
producing the desired balance of decision outcomes than would occur if an arbitrary
threshold were selected.
Technically, however, neither of these approaches can be considered optimal
(Bennell & Jones, 2005). The optimal approach for selecting a threshold would ideally
account for the base-rate probabilities of encountering crimes committed by the same
offender versus different offenders in the jurisdiction under consideration, along with
the costs and benefits of the various linking decisions (Swets, 1992). Unfortunately, at
the moment, it is difficult to assign quantitative values to some of these terms (e.g. what
is the cost of making a false alarm in the linking task?). However, if these issues could be
resolved in the future by careful study, advances are highly likely to emerge in the area of
Affords flexibility
Beyond its ability to produce an unbiased measure of linking accuracy and its capacity
to allow for the identification of appropriate decision thresholds, ROC analysis is
also advantageous in its flexibility. One way in which ROC analysis is flexible is that
it can be used to measure linking accuracy regardless of the linking approach
under consideration. To date, the ROC procedure has most commonly been used in
combination with logistic regression analysis (Bennell & Canter, 2002; Bennell & Jones,
2005; Woodhams & Toye, 2007), and in the current study the procedure was
applied directly to across-crime similarity coefficients. However, there is no reason why
ROC analysis could not also be applied to results emerging from techniques like
multidimensional scaling (whereby the proximities between variables would act as
thresholds) or any other potential linking procedure.
The flexibility of the ROC procedure is also apparent through yet another
application. In a very direct manner, ROC analysis can be used to examine a wide range
of moderator variables. From a signal detection perspective, moderator effects are
represented by the degree of overlap between distributions like those illustrated in
Figure 1. By extension, these effects are reflected as ROC curves with different AUCs.
Thus, it is possible to illustrate a moderator effect on a single ROC graph with multiple
ROC curves, each reflecting a different level of the moderator variable. Potential
moderators of interest might include the linking approach under examination, the
nature of crime scene behaviours used to assess across-crime similarity, the type of
similarity coefficient adopted, and so on. Having the flexibility to compare different
moderators, alone or in combination, is a highly attractive feature of ROC analysis.
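A moderator comparison of this kind can be sketched by computing one AUC per level of the moderator. The example below treats the type of similarity coefficient as the moderator and compares Jaccard’s coefficient with the simple matching coefficient, which (unlike Jaccard’s) counts joint non-occurrences as agreement. The choice of simple matching as the comparison, the function names, and the tiny toy sample are assumptions made for illustration; the sample is far too small to say anything about which coefficient performs better.

```python
from itertools import combinations


def jaccard(a, b):
    both = sum(x and y for x, y in zip(a, b))
    either = sum(x or y for x, y in zip(a, b))
    return both / either if either else 0.0


def simple_matching(a, b):
    """Share of behaviours coded identically, joint absences included."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def auc(same, different):
    """Share of correctly ordered (same, different) pairs; ties count half."""
    wins = sum((s > d) + 0.5 * (s == d) for s in same for d in different)
    return wins / (len(same) * len(different))


def auc_for(coefficient, crimes):
    """AUC obtained when `coefficient` is used to score every crime pair."""
    same, different = [], []
    for (off_a, vec_a), (off_b, vec_b) in combinations(crimes, 2):
        target = same if off_a == off_b else different
        target.append(coefficient(vec_a, vec_b))
    return auc(same, different)


# Hypothetical toy sample: two offenders, two crimes each, five behaviours.
crimes = [
    ("A", [1, 1, 0, 0, 0]),
    ("A", [1, 0, 1, 0, 0]),
    ("B", [1, 1, 1, 0, 0]),
    ("B", [0, 1, 1, 0, 1]),
]
coefficients = {"jaccard": jaccard, "simple matching": simple_matching}
aucs = {name: auc_for(fn, crimes) for name, fn in coefficients.items()}
for name, value in aucs.items():
    print(f"{name}: AUC = {value:.2f}")
```

The same loop could be run over subsets of behaviours or over different linking approaches, with each resulting AUC corresponding to one curve on a shared ROC graph.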
The importance of behavioural stability and distinctiveness
The last issue to be addressed is the importance of behavioural stability and
distinctiveness as underlying assumptions of the linking process. As highlighted in the
introduction, the existence of high levels of stability and distinctiveness is generally
viewed as a prerequisite for successful linking (Canter, 1995; Grubin et al., 2001;
Woodhams, Hollin et al., 2007). The present authors have adopted a similar view in the
past (Bennell & Canter, 2002; Bennell & Jones, 2005). However, conceptualizing linkage
analysis as a signal detection task leads one to re-evaluate the validity of these
Consider Figure 1 for the purpose of illustration. In this hypothetical situation,
stability and distinctiveness are both relatively high, with the right distribution
positioned to the far right of the x-axis and the left distribution positioned to the far left.
However, the distributions need not be in these positions in order to achieve a high
rate of linking accuracy. Indeed, large AUCs may emerge when there is little overlap
between these underlying distributions, regardless of where the distributions lie along
the x-axis. Within the current study, for example, it can hardly be said that a high rate of
behavioural stability exists. The mean similarity score for crimes committed by the same
offender was just 0.41. Yet, a respectable level of linking accuracy was achieved
(AUC = .75). Therefore, despite prior assumptions, it seems that high levels of stability
and distinctiveness are not absolutely necessary for achieving a high rate of linking
accuracy. Rather, it is a low level of distribution overlap that is crucial.
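The point that overlap, rather than position on the x-axis, drives the AUC can be checked with a small numerical sketch. The scores below are hypothetical and were chosen as exact binary fractions so that shifting them introduces no rounding; the function names are illustrative.

```python
def auc(same, different):
    """Share of correctly ordered (same, different) pairs; ties count half."""
    wins = sum((s > d) + 0.5 * (s == d) for s in same for d in different)
    return wins / (len(same) * len(different))


def mean(values):
    return sum(values) / len(values)


# Hypothetical scores with relatively high same-offender similarity.
same_high = [0.875, 0.75, 0.625, 0.5]
diff_high = [0.625, 0.5, 0.375, 0.25]

# Slide both distributions down the x-axis by the same amount.
shift = 0.25
same_low = [s - shift for s in same_high]
diff_low = [d - shift for d in diff_high]

print("mean same-offender similarity:", mean(same_high), "->", mean(same_low))
print("AUC:", auc(same_high, diff_high), "->", auc(same_low, diff_low))
```

The mean same-offender similarity drops after the shift, yet the AUC is unchanged because the ordering of scores across the two distributions, and therefore their overlap, is the same.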
If this is true, a reconceptualization of the linking task may be required, carrying with
it important implications. For example, attempts to identify procedures to enhance the
degree of behavioural stability that can be uncovered in a given sample of crimes (e.g. by
using different types of similarity coefficients) are unlikely to positively impact linking
accuracy unless these procedures also result in less distribution overlap (e.g. by also
increasing the degree of behavioural distinctiveness that can be uncovered).
Directions for future research
A number of potential avenues for future research warrant consideration, beyond the
obviously important next step of replicating the results reported here on a much larger
sample of rapes to ensure that our results are generalizable. First, given the frequent
application of linking approaches such as logistic regression modelling, discriminant
function analysis, and multidimensional scaling, it would be sensible to use ROC analysis
in order to evaluate the relative effectiveness of these methods. Second, applying ROC
analysis to different crime types would be beneficial. Serial burglary has been the focus
of most previous research (Bennell, 2002; Bennell & Canter, 2002; Bennell & Jones,
2005), but the recent study by Woodhams and Toye (2007) on commercial robbery
suggests that this is changing.
Third, efforts should be made to identify the types of behaviours most effective for
linking purposes. In the current study, we simply relied on a single across-crime
similarity score calculated for each crime pair that was based on all 27 rape behaviours
in our data. However, ROC analysis can be used to examine a variety of factors that are
potentially important in maximizing linking accuracy, including the role of behavioural
frequencies and the degree to which behaviours are situationally driven. It has been
argued that each of these factors may play a role in linkage analysis (Bennell & Canter,
2002; Bennell & Jones, 2005; Canter, Bennell, Alison, & Reddy, 2003; Santtila, Junkkila
et al., 2005; Woodhams & Toye, 2007). Fourth, using ROC analysis to explore the
potential impact of different similarity coefficients on linking performance is warranted.
Woodhams, Grant et al. (2007) have made important advances in this area, although we
have recently failed to replicate their findings (Bennell et al., 2008).
Fifth, in an effort to derive optimal decision thresholds, it would be extremely useful
to start conducting formal analyses in an attempt to quantify the costs and benefits
associated with the various linking decisions. As suggested above, an optimal threshold
must additionally account for the base-rate probabilities of encountering linked
versus unlinked crimes in a particular jurisdiction (Swets, 1992). While undoubtedly a
challenging endeavour, a systematic approach to threshold selection would be of
tremendous practical value to police investigators. Finally, given the consistent evidence
in favour of empirically based decision aids over unstructured human judgment (Dawes,
Faust, & Meehl, 1989; Grove & Meehl, 1996), the development of actuarial tools for
linkage analysis should be considered and these tools should be compared to alternative
decision-making approaches. As argued above, ROC analysis is necessary for making
such comparisons in an appropriate and valid manner.
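The way base rates and payoffs would enter a formal threshold analysis can be sketched with the expected-value criterion from signal detection theory (Green & Swets, 1966; Swets, 1992), under which the optimal operating point is where the ROC curve has a slope equal to the prior odds of an unlinked pair multiplied by a ratio of payoffs. The function name, argument names, and every numerical value below are hypothetical; as noted above, realistic costs and benefits for the linking task have not yet been quantified.

```python
def optimal_roc_slope(p_linked, benefit_hit, cost_miss,
                      benefit_correct_rejection, cost_false_alarm):
    """Target ROC slope under an expected-value criterion.

    All costs are entered as positive magnitudes. A steep target slope points
    to a strict threshold (lower left of the ROC graph); a shallow slope
    points to a lenient threshold (upper right).
    """
    if not 0.0 < p_linked < 1.0:
        raise ValueError("p_linked must lie strictly between 0 and 1")
    prior_odds_unlinked = (1.0 - p_linked) / p_linked
    payoff_ratio = ((benefit_correct_rejection + cost_false_alarm)
                    / (benefit_hit + cost_miss))
    return prior_odds_unlinked * payoff_ratio


# Hypothetical scenario 1: linked and unlinked pairs equally likely, equal payoffs.
balanced = optimal_roc_slope(0.5, 1.0, 1.0, 1.0, 1.0)

# Hypothetical scenario 2: linked pairs are rare (2% of pairs), equal payoffs.
rare_links = optimal_roc_slope(0.02, 1.0, 1.0, 1.0, 1.0)

# Hypothetical scenario 3: rare links, but a miss is ten times as costly.
costly_miss = optimal_roc_slope(0.02, 1.0, 10.0, 1.0, 1.0)

print(balanced, rare_links, costly_miss)
```

Under these assumptions, a low base rate of linked pairs favours a stricter threshold, whereas a high cost attached to missing a link pulls the threshold back towards leniency.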
Much of this future research will allow researchers to determine the extent to
which the results presented in the current study generalize to conditions beyond those
examined here. By comparing the results that emerge across linking approaches, crime
types, behavioural domains, and similarity coefficients, the exact conditions under which
linking accuracy is maximized can ultimately be determined. This new knowledge
will likely help in answering important questions about offending behaviour.
In addition, these new findings may result in better linking decisions being made in
naturalistic settings.
References
Alison, L. J., Snook, B., & Stein, K. L. (2001). Unobtrusive measurement: Using police information for forensic research. Qualitative Research, 1(2), 241–254.
Alison, L. J., & Stein, K. L. (2001). Vicious circles: Accounts of stranger sexual assault reflect abusive variants of conventional interactions. Journal of Forensic Psychiatry, 12(3), 515–538.
Bennell, C. (2002). Behavioural consistency and discrimination in serial burglary. Unpublished doctoral dissertation, University of Liverpool, Liverpool, UK.
Bennell, C. (2005). Improving police decision making: General principles and practical applications of receiver operating characteristic analysis. Applied Cognitive Psychology, 19(9), 1157–1175.
Bennell, C., Alison, L. J., Stein, K. L., Alison, E. K., & Canter, D. V. (2001). Sexual offenses against children as the abusive exploitation of conventional adult–child relationships. Journal of Social and Personal Relationships, 18(2), 155–171.
Bennell, C., Bloomfield, S., Snook, B., Taylor, P. J., & Barnes, C. (in press). Discriminating between linked and unlinked burglaries: Comparing the performance of university students, police professionals, and a logistic regression model. Psychology, Crime, and Law.
Bennell, C., & Canter, D. V. (2002). Linking commercial burglaries by modus operandi: Tests using regression and ROC analysis. Science and Justice, 42(3), 153–164.
Bennell, C., Gauthier, D., & Gauthier, D. (2008). Does a taxonomic measure of similarity increase our ability to identify serial crimes? Unpublished manuscript.
Bennell, C., & Jones, N. J. (2005). Between a ROC and a hard place: A method for linking serial burglaries by modus operandi. Journal of Investigative Psychology and Offender Profiling, 2(1), 23–41.
Bennell, C., Jones, N. J., & Melnyk, T. (2007). Linking serial rapes: A test of the behavioural frequency hypothesis. Poster presented at the annual meeting of the Canadian Psychological Association, Ottawa, Ontario, Canada.
Canter, D. V. (1995). The psychology of offender profiling. In R. Bull & D. Carson (Eds.), Handbook of psychology in legal contexts (pp. 343–355). Chichester, UK: Wiley.
Canter, D. V., Bennell, C., Alison, L. J., & Reddy, S. (2003). Differentiating sex offences: A behaviorally based thematic classification of stranger rapes. Behavioural Sciences and the Law, 21(2), 157–174.
Canter, D. V., Heritage, R., Wilson, M., Davies, A., Kirby, S., Holden, R., et al. (1991). A facet approach to offender profiling. London, UK: Home Office.
Canter, D. V., Wilson, M., Jack, K., & Butterworth, D. (1996). The psychology of rape investigations: A study in police decision making. Liverpool, UK: University of Liverpool.
Dawes, R., Faust, D., & Meehl, P. E. (1989). Clinical versus actuarial judgment. Science, 243,
Ewart, B. W., Oatley, G. C., & Burn, K. (2005). Matching crimes using burglars’ modus operandi: A test of three models. International Journal of Police Science and Management, 7(3), 160–174.
Goodwill, A. M., & Alison, L. J. (2006). The development of a filter model for prioritizing suspects in burglary offences. Psychology, Crime and Law, 12(4), 395–416.
Green, D. M., & Swets, J. A. (1966). Signal detection theory and psychophysics. New York: Wiley.
Green, E. J., Booth, C. E., & Biderman, M. D. (1976). Cluster analysis of burglary M/O’s. Journal of Police Science and Administration, 4(4), 382–388.
Grove, W. M., & Meehl, P. E. (1996). Comparative efficiency of informal (subjective, impressionistic) and formal (mechanical, algorithmic) prediction procedures: The clinical-statistical controversy. Psychology, Public Policy, and Law, 2(2), 293–323.
Grubin, D., Kelly, P., & Brunsdon, C. (2001). Linking serious sexual assaults through behaviour. London, UK: Home Office.
Jaccard, P. (1908). Nouvelle recherches sur la distribution florale. Bulletin de la Societe Vaudoise des Sciences Naturelles, 44, 223–270.
Mischel, W. (2004). Toward an integrative science of the person. Annual Review of Psychology,
Salfati, C. G. (2000). Profiling homicide: A multidimensional approach. Homicide Studies, 4(3),
Salfati, C. G., & Bateman, A. L. (2005). Serial homicide: An investigation of behavioural consistency. Journal of Investigative Psychology and Offender Profiling, 2(2), 121–144.
Santtila, P., Fritzon, K., & Tamelander, A. L. (2005). Linking arson incidents on the basis of crime scene behavior. Journal of Police and Criminal Psychology, 19(1), 1–16.
Santtila, P., Junkkila, J., & Sandnabba, N. K. (2005). Behavioural linking of stranger rapes. Journal of Investigative Psychology and Offender Profiling, 2(2), 87–103.
Santtila, P., Korpela, S., & Hakkanen, H. (2004). Expertise and decision-making in the linking of car crime series. Psychology, Crime and Law, 10(2), 97–112.
Swets, J. A. (1988). Measuring the accuracy of diagnostic systems. Science, 240(4857), 1285–1293.
Swets, J. A. (1992). The science of choosing the right decision threshold in high-stake diagnostics. American Psychologist, 47(4), 522–532.
Swets, J. A. (1996). Signal detection theory and ROC analysis in psychology and diagnostics. Mahwah, NJ: Erlbaum.
Swets, J. A., Dawes, R. M., & Monahan, J. (2000). Psychological science can improve diagnostic decisions. Psychological Science in the Public Interest, 1, 1–26.
Woodhams, J., Grant, T., & Price, A. (2007). From marine ecology to crime analysis: Improving the detection of serial sexual offences using a taxonomic similarity measure. Journal of Investigative Psychology and Offender Profiling, 4(2), 17–27.
Woodhams, J., Hollin, C. R., & Bull, R. (2007). The psychology of linking crimes: A review of the evidence. Legal and Criminological Psychology, 12(2), 233–249.
Woodhams, J., & Toye, K. (2007). An empirical test of the assumptions of case linkage and offender profiling with serial commercial robberies. Psychology, Public Policy, and Law, 13(1), 59–85.
Received 17 January 2007; revised version received 11 July 2008
Content dictionary
Twenty-seven variables were created from a content analysis of victim statements in
order to provide a list of elements common to offences. All variables are dichotomous
with values based on the presence (1) or absence (0) of each category of behaviour.
A description of the categorization scheme in alphabetical order is given below.
(1) Anal penetration. This variable refers to the offender penetrating or attempting to
penetrate the victim’s anus.
(2) Binds victim. This variable refers to the use, at any time during the attack, of any
article to bind the victim (excluding restraint by the offender’s hands).
(3) Blindfolds victim. This variable refers to the use, at any time during the attack, of
any physical interference with the victim’s ability to see (excluding verbal threats
to the victim to close her eyes or the use of the offender’s hands).
(4) Compliments victim. This variable refers to the offender complimenting the victim
(e.g. on her appearance).
(5) Cunnilingus. This variable refers to the offender performing a sexual act on the
victim’s genitalia or attempting to perform such a sex act using his mouth.
(6) Demands goods. This variable refers to the offender approaching the victim
with a demand for goods or money. This variable specifically relates to initial
(7) Demeans victim. This variable refers to the offender demeaning or insulting
the victim (e.g. using profanities directed against the victim or women in
(8) Disguise. This variable refers to the offender wearing any form of disguise.
(9) Fellatio. This variable refers to the offender forcing the victim to perform oral
(10) Forces victim participation. This variable refers to the offender forcing the
victim to physically participate in the sexual aspects of the offence.
(11) Forces victim sexual comment. This variable refers to the offender forcing the
victim to make sexual comments.
(12) Gags victim. This variable refers to the use, at any time during the attack, of
any article to prevent the victim from making noise (excluding the temporary
use of the offender’s hand).
(13) Identifies victim. This variable refers to the offender taking steps to obtain
from the victim details that would identify her (e.g. examining the victim’s
(14) Implies knowing victim. This variable refers to the offender implying that he
knows the victim.
(15) Kisses victim. This variable refers to the offender kissing or attempting to kiss
the victim.
(16) Multiple violence. This variable refers to the offender perpetrating multiple
acts of violence against the victim (e.g. multiple punches).
(17) Offender sexual comment. This variable refers to the offender making sexual
comments during the attack.
(18) Single violence. This variable refers to the offender perpetrating a single act of
violence against the victim (e.g. a single slap).
(19) Steals identifiable. This variable refers to the offender stealing items from the
victim that are recognizable as belonging to the victim.
(20) Steals personal. This variable refers to the offender stealing items from the
victim that are personal to the victim but not necessarily of any great value in
terms of re-saleable goods (e.g. photographs or letters).
(21) Steals unidentifiable. This variable refers to the offender stealing items from
the victim that are not recognizable as belonging to the victim (
(22) Surprise attack. This variable refers to the offender using a method of
approach consisting of an immediate attack on the victim.
(23) Tears clothing. This variable refers to the offender forcibly removing the
victim’s clothing in a violent manner.
(24) Threatens no report. This variable refers to the offender threatening the victim
that she should not report the incident to the police or to any other person.
(25) Vaginal penetration. This variable refers to the offender penetrating or
attempting to penetrate the victim’s vagina.
(26) Verbal violence. This variable refers to the offender threatening the victim at
some time during the attack (excluding threats not to report the incident).
(27) Weapon use. This variable refers to the offender displaying a weapon in order
to control the victim.
... Researchers who have used ROC analysis to examine the pairwise crime linkage task have highlighted numerous benefits associated with this approach (e.g., Bennell et al., 2009). The primary benefit is that ROC analysis provides a measure of linking accuracy (the AUC) that applies across different decision thresholds, rather than being specific to any single threshold, which may or may not result in desirable decisions (Bennell et al., 2009). ...
... Researchers who have used ROC analysis to examine the pairwise crime linkage task have highlighted numerous benefits associated with this approach (e.g., Bennell et al., 2009). The primary benefit is that ROC analysis provides a measure of linking accuracy (the AUC) that applies across different decision thresholds, rather than being specific to any single threshold, which may or may not result in desirable decisions (Bennell et al., 2009). Since the AUC is independent of any single threshold on a ROC curve, it provides an index of overall linkage performance, which is a more valid approach for assessing linkage accuracy than using threshold specific measures (e.g., percentage of correct decisions made when using a particular decision threshold). ...
... made. As discussed in more detail by Bennell et al. (2009), this approach is problematic because the accuracy estimate only applies to the specific threshold that Canter and his colleagues adopted, and that threshold may not be desirable. As an alternative, a ROC curve could have been generated and an AUC calculated by assessing decision outcomes for multiple thresholds. ...
Deciding whether two crimes have been committed by the same offender or different offenders is an important investigative task. Crime linkage researchers commonly use receiver operating characteristic (ROC) analysis to assess the accuracy of linkage decisions. Accuracy metrics derived from ROC analysis—such as the area under the curve (AUC)—offer certain advantages, but also have limitations. This paper describes the benefits that crime linkage researchers attribute to the AUC. We also discuss several limitations in crime linkage papers that rely on the AUC. We end by presenting suggestions for researchers who use ROC analysis to report on crime linkage. These suggestions aim to enhance the information presented to readers, derive more meaningful conclusions from analyses, and propose more informed recommendations for practitioners involved in crime linkage tasks. Our reflections may also benefit researchers from other areas of psychology who use ROC analysis in a wide range of prediction tasks.
... Evidence is most often available to collect if adequately designed methods to investigate the crime scene are used. Any collected crime scene data may be interpreted using behavioural and criminal profiling [15][16][17] in a manner that reflects the perpetrator's cognitions [18], degree of risk-taking and planning [19,20], and thus provide a richer description of an offender's actions that will further aid in prediction of when and where they will commit crimes again. However, in reality, such analyses are often conducted by means of individual law enforcement officers and are rarely undertaken using systematic data-driven comparisons across cases, cities and regions. ...
... As a consequence of the collection procedure, comparisons between crime scenes were easily performed, as comparable information was collected from all crime scenes. Pairwise comparisons between crimes could be analyzed using the Jaccard index [16,17]. The pairwise comparisons could also be performed on subgroups of features, to correspond to the natural division of data-a commonly accepted approach in related research [16,17]. ...
... Pairwise comparisons between crimes could be analyzed using the Jaccard index [16,17]. The pairwise comparisons could also be performed on subgroups of features, to correspond to the natural division of data-a commonly accepted approach in related research [16,17]. In this study, the subgroups of features described in the paragraph above were used, as well as the combination of all subgroups (denoted MO). ...
The evidence that burglaries cluster spatio-temporally is strong. However, research is unclear on whether clustered burglaries (repeats/near-repeats) should be treated as qualitatively different crimes compared to spatio-temporally unrelated burglaries (non-repeats). This study, therefore, investigated if there were differences in modus operandi-signatures (MOs, the habits and methods employed by criminals) between near-repeat and non-repeat burglaries across 10 Swedish cities, as well as whether MO-signatures can aid in predicting if a burglary is classified as a near-repeat or a non-repeat crime. Data consisted of 5744 residential burglaries, with 137 MO features characterizing each case. Descriptive data of repeats/non-repeats is provided together with Wilcoxon tests of MO-differences between crime pairs, while logistic regressions were used to train models to predict if a crime scene was classified as a near-repeat or a non-repeat crime. Near-repeat crimes were rather stylized, showing heterogeneity in MOs across cities, but showing homogeneity within cities at the same time, as there were significant differences between near-repeat and non-repeat burglaries, including subgroups of features, such as differences in mode of entering, target selection, types of goods stolen, as well as the traces that were left at the crime scene. Furthermore, using logistic regression models, it was possible to predict near-repeat and non-repeat crimes with a mean F1-score of 0.8155 (0.0866) based on the MO. Potential policy implications are discussed in terms of how data-driven procedures can facilitate analysis of spatio-temporal phenomena based on the MO-signatures of offenders, as well as how law enforcement agencies can provide differentiated advice and response when there is suspicion that a crime is part of a series as opposed to an isolated event.
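For readers unfamiliar with the F1-score reported in this abstract, the sketch below shows how it is computed from true and predicted class labels as the harmonic mean of precision and recall. The labels are hypothetical (1 standing for a near-repeat, 0 for a non-repeat) and are not data from the study; the function name is likewise an assumption made for illustration.

```python
def f1_score(true_labels, predicted_labels, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    tp = fp = fn = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if guess == positive and truth == positive:
            tp += 1   # correctly flagged as positive
        elif guess == positive:
            fp += 1   # flagged as positive but actually negative
        elif truth == positive:
            fn += 1   # positive case that was missed
    if tp == 0:
        return 0.0    # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels, invented for illustration only.
true_labels = [1, 1, 1, 0, 0, 1]
predicted_labels = [1, 1, 0, 0, 1, 1]

score = f1_score(true_labels, predicted_labels)
print(score)
```

Unlike the AUC, an F1-score is tied to one particular classification threshold, so the two metrics are not directly comparable.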
... A range of different statistical techniques are used to assess the underlying principles of crime linkage with samples of sexual offenses. Some studies measure the degree of similarity between two or more crimes using all crime scene behavior in one measurement (e.g., when calculating a similarity coefficient, such as Jaccard's coefficient, from all modus operandi (MO) behavior; e.g., Bennell et al., 2009; Davidson & Petherick, 2020; Woodhams et al., 2019), whereas other studies first group individual crime scene behaviors into domains or themes, which then form the basis for descriptive statistics regarding consistency (e.g., Grubin et al., 2001; Sorochinski & Salfati, 2018). Further, some studies have calculated a similarity coefficient for each individual crime scene behavior (Deslauriers-Varin & Beauregard, 2014; Harbers et al., 2012). ...
... Further, some studies have calculated a similarity coefficient for each individual crime scene behavior (Deslauriers-Varin & Beauregard, 2014; Harbers et al., 2012). Studies that have compared the relative merit of including all crime scene behavior in one measurement of behavioral similarity to using domains or themes of behavior have concluded that predictions of crime linkage are more accurate when using all crime scene behaviors together (e.g., Bennell et al., 2009; Oziel et al., 2015). ...
... In 2018, Davies, Woodhams, and Rainbow summarised the findings for studies of serial sexual offending that have investigated the relative behavioral similarity seen for same-offender and different-offender crime pairs. They noted that these studies (Bennell et al., 2009; Oziel et al., 2015; Slater et al., 2015; Woodhams & Komarzynska, 2014; Woodhams & Labuschagne, 2012) have been conducted using samples from the United Kingdom, Canada, and South Africa. In all studies, same-offender crime pairs were characterised by significantly greater similarity in crime scene behavior than different-offender crime pairs. ...
Crime linkage can be a useful tool in the investigation of sexual offenses when other, physical evidence is unavailable or too costly to process. It involves identifying behavior that is both consistent and distinctive, and thus forms an identifiable pattern through which a series of offenses committed by the same offender can be distinguished. While there is a substantial body of research to support the principles of crime linkage, samples often contain only one type of sexual offense, and further research is needed into offenses such as voyeurism and exhibitionism. In practice, there are a number of ways in which crime linkage can be conducted, and a variety of terms are used to describe these different processes. While writings from practitioners provide insight into how crime linkage is conducted, research now needs to focus more on systematically mapping its practice and documenting procedural differences. There are also a number of additional considerations that require further research attention where the practice of crime linkage is concerned, such as the utility of computerised databases designed to assist with the process, the human decision-making element of linking and how bias can affect this, and the effects of expertise and training on linkage efficacy.
... This information could then be used to identify a behavioural pattern of offending across a series of offences based on an offender's modus operandi (MO), which allows for linking crimes committed by the same offender and more effectively identifying serial perpetrators. In the Global North, research has found that information about consistent and distinctive perpetrator behaviours established through the victim's description of the offence to the police can be used to link crimes committed by the same perpetrator [21,22]. More recent work indicates that these techniques are promising in the Global South in helping the police to solve serial offences [23,24]. ...
... First, doing so does not decrease accuracy. Second, behaviourally relevant details can be used to link crimes together, helping to provide evidence and aid investigations to bring serial offenders to apprehension, thus preventing further offences [21][22][23][24]. Additionally, and as noted above, behaviourally relevant details can also be important for other types of analyses, such as indicating geographical and temporal crime patterns to inform situational crime prevention strategies. ...
Police interviews gather detailed information from witnesses about the perpetrator that is crucial for solving crimes. Research has established that interviewing witnesses immediately after the crime maintains memory accuracy over time. However, in some contexts, such as in conflict settings and low-income countries, witness interviews occur after long delays, which decreases survivors’ access to vital services and justice. We investigated whether an immediate interview via a mobile phone application (SV_CaseStudy Mobile Application, hereafter MobApp) developed by the Kenyan Survivors of Sexual Violence Network preserves people’s memory accuracy over time. Participants (N = 90) viewed a mock burglary and were then interviewed either immediately using Mo