Data Mining at the West Midlands Police: A Study of Bogus Official Burglaries

Richard Adderley
West Midlands Police,
Bournville Lane Police Station
email:
r.adderley@west-midlands.police.uk
P. B. Musgrove
School of Computing and Information Technology,
University of Wolverhampton
email: P.B.Musgrove@wlv.ac.uk
Abstract
Bogus Official burglaries are crimes in which one or more persons gain access to
premises by deception with the intention of stealing property. Experience has
shown that such offenders tend to commit several similar crimes. The offender(s)
may also be active across a wide geographical region covered by different Police
areas. This, together with the sheer volume of such crimes, makes it very
difficult for the Police to link crimes together in order to form composite
descriptions of offender(s) and identify patterns in their activities.
This paper describes the results of applying a Kohonen Self Organising Map
(SOM) to a set of data derived from reported bogus official crimes with the
objective of linking crimes committed by the same offender. The issues involved
with how crime data is selected, cleaned and coded are also discussed.
The results were independently validated and show that the SOM has found some
links that warrant follow-up investigations. Some problems with data quality were
experienced; their effect on the map produced by the SOM algorithm is also
discussed.
1 Introduction
Today computers are pervasive in all areas of business activities. This enables the
recording of all business transactions making it possible not only to deal with
record keeping and control information for management but also via the analysis
of those transactions to improve business performance. This has led to the
development of the area of Computing known as Data Mining [1].
The Police Force, like any other business, now relies heavily on the use of
computers. In the Police Force, business transactions consist of the reporting of
crimes. A great deal of use is made of computers for providing management
information via monitoring statistics that can be used for resource allocation. The
information stored has also been used for tackling major serious crimes (usually
crimes such as serial murder or rape). The primary techniques used are
specialised database management systems and data visualisation [2]. However,
comparatively little use has been made of stored information for the detection of
volume crimes such as burglary. This is partly because major crimes can justify
greater resources on grounds of public safety but also because there are relatively
few major crimes making it easier to establish links between offences. With
volume crimes the sheer number of offences, the paucity of information, the
limited resources available and the high degree of similarity between crimes
renders major crime techniques ineffective.
There have been a number of academic projects that have attempted to apply AI
techniques, primarily Expert Systems, to detecting volume crimes such as
burglary [3],[4]. Whilst usually proving effective as prototypes for the specific
problem being addressed, they have not made the transfer into practical working
systems. This is because they have been stand-alone systems that do not integrate
easily into existing Police systems, thereby leading to high running costs. They
tended to use a particular expert’s line of reasoning with which the detective using
the system might disagree. Also they lacked robustness and could not adapt to
changing environments. All this has led to wariness within the Police Force
regarding the efficacy of AI techniques for policing.
The objective of the current research project is therefore to evaluate the merit of
data mining techniques for crime analysis. The commercial data mining package
Clementine (SPSS) is being used in order to speed development and facilitate
experimentation. Clementine also has the capability of interfacing with existing
Police computer systems. The requirement for purpose written software outside
the Clementine environment is being kept to a minimum.
In this paper we report the results from applying one specific data mining
technique, the Self Organising Map (SOM) [5] to descriptions of offenders for a
particular type of crime, bogus official burglaries. The stages of data selection,
coding and cleaning are described, together with an interpretation of the
resulting map. The merit of the map was independently validated by a
Police Officer who was not part of the research team.
2 The Application Task
The specific application task reported here consists of a particular type of
burglary. A ‘bogus officials’ offence (sometimes known as a distraction burglary)
refers to a burglary where the offender gains access to a premises by deception.
The offender(s) may pose as a member(s) of the utilities, Police, Social Services,
salespersons, even children who are looking for pets or toys, to gain entry into the
property. Typically once inside, the victim is engaged in conversation whilst an
accomplice searches for and steals property. In this type of burglary the victim
always meets the offender(s) and therefore should be able to provide a
description.
A problem with this type of crime is that the sheer number of offences committed
over a wide geographical area makes it difficult to link crimes committed by the
same offender(s). The objective in this study is to see whether a SOM can be used
to link crimes based on offender descriptions. This will result in a map (more
accurately a matrix) where each cell represents a cluster of offender descriptions.
The ideal solution would be a SOM where each cell contains various descriptions
of all the crimes involving a single offender. Neighbouring cells in the map would
contain descriptions of different offenders who bear a physical resemblance.
The ideal solution will always be unattainable due to the same offender being
described differently by victims of different crimes. In addition, the high degree of
similarity between some offenders (e.g. young, average build, average height) will
inevitably mean the same cell will contain descriptions of different offenders. Just
how far the map derived in practice differs from the ideal would help determine
the efficacy of the technique. Unfortunately, few of the crimes have been
successfully detected (i.e. solved) and hence there is no perfect solution to act as a
comparison. Consequently, a subjective assessment of the merit of the resulting
map needs to be made. This subjective assessment can be supported/influenced by
information from those crimes that have been detected.
3 Data Selection, Cleaning and Coding
The victims of this type of crime tend to be elderly. Their age together with the
distressed state brought on by the crime might be thought to lead to unreliable
descriptions being provided. However, a recent study commissioned by the
Metropolitan Police concluded that: -
"There is no evidence that their attention, language, recognition, recency
judgements or memory for the past is affected by age" [6].
The study included an experiment on older persons that indicated that the
offender's characteristics most likely to be accurately recalled are (in the order of
most common to least frequently mentioned): - Gender, Accent, Race, Age,
General facial appearance, Build, Voice, Shoes, Eyes, Clothes, Hair colour &
length.
When a bogus official crime is reported, a Police Officer attends the scene and
takes a number of witness statements and then completes a paper based report
called the “crime report” which includes information abstracted from the witness
statements. The crime reports are then summarised by civilian data entry clerks
when they enter details of the crime into the computerised database system. The
crime record contains numerous fields. Fixed fields contain names, addresses,
beat number and other administrative data. In addition, there are two free text
fields: the first contains a description of the offender(s) and the second
describes how the crime was committed (the modus operandi, MO).
Whilst providing valuable information, the free text nature of these fields makes
automated analysis difficult. Consequently it was necessary to write a simple
parser program to pick out key words and phrases. This proved more difficult than
expected due to the widely varying styles used by police officers and data entry
clerks. Spelling mistakes were common, abbreviations were inconsistent and word
sequencing varied (for example, accent might be described as “Birmingham
acent”, “Birmingham accent”, “Bham accent”, “local accent”, “accent:
Birmingham”, or even “not local accent”). As a consequence, the coding was part
automatic and part manual.
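The kind of key-word coding described above can be sketched as follows. This is a minimal illustration in Python and not the parser used in the study: the pattern list, the output codes (birmingham, local, not_local) and the function name code_accent are assumptions made for the example, and only the accent variants quoted in the text are covered.

```python
import re

# Hypothetical patterns, ordered so that "not local accent" is tested
# before "local accent". "ac+ent" tolerates the misspelling "acent".
ACCENT_PATTERNS = [
    (re.compile(r"\bnot\s+local\s+ac+ent\b"), "not_local"),
    (re.compile(r"\blocal\s+ac+ent\b"), "local"),
    (re.compile(r"\b(?:birmingham|b'?ham)\s+ac+ent\b"
                r"|\bac+ent\s*:\s*birmingham\b"), "birmingham"),
]


def code_accent(description):
    """Return an accent code for a free text description, or None.

    None means no pattern matched, so the record would be left for
    manual coding.
    """
    text = description.lower()
    for pattern, code in ACCENT_PATTERNS:
        if pattern.search(text):
            return code
    return None
```

Descriptions that match none of the patterns return None, which mirrors the part automatic, part manual coding reported above.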
Once key words had been abstracted from the description field, they showed some
agreement with the findings of Barber [6], with the exception of shoes and eyes,
which were rarely mentioned. Because of the diversity of possible clothing and
the likelihood of it changing between crimes, it was decided to omit clothing from
the coded descriptions. This provided fields for age, gender, height, hair colour,
hair length, build, accent, and race. Fields not mentioned by Barber but included
in this study are the person’s height and the number of accomplices.
Care needs to be taken when encoding data from its symbolic form to the numeric
form required by the SOM. Data could be a number on a continuous scale (such
as age), binary (such as gender), nominal (such as hair colour), and ordinal (such
as hair length which can be ordered as short, medium or long). Nominal and
ordinal variables can each be represented by a set of binary variables although
some information could be lost (i.e. order information) [7]. A further problem
with continuous variables is that those with a greater range can swamp the effect
of the others. It is common to standardise variables, but this can itself cause
problems, particularly for unsupervised techniques (such as SOM), because the
discriminating effect of the variable is lost. For example, scaling age, which
might range from 15 to 65, into the range 0 to 1 would lead to a 20 year old being
scaled as 0.1 and a 30 year old as 0.3. Thus a difference of ten years in age (a
value of 0.2) would be ten times less important than a difference in an attribute
such as build, which is coded as a strict 0 or 1 (NB. a difference in build would
score two, one for each binary variable that differs). For example, if offender A
was described as being aged
about 30 with medium build and offender B as being aged about 30 but with small
build it would have a difference of 2 between the descriptions. However if
offender A was described as medium build aged about 20 (scaled to 0.1) and
offender B described as medium build but aged about 30 (scaled as 0.3) the
difference would be 0.2. Due to these problems, it was decided to code the
continuous variables age and height using a binary encoding, thereby placing them
on a similar level of importance to the other binary variables.
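The arithmetic in the scaling example above can be checked with a short Python sketch. The helper name min_max and the two-variable coding of build (build_small, build_medium) are assumptions made for illustration; the age range of 15 to 65 and the ages 20 and 30 are taken from the text.

```python
def min_max(value, low, high):
    """Scale value linearly from the range [low, high] into [0, 1]."""
    return (value - low) / (high - low)


# Ages 20 and 30 scaled from the range 15..65: about 0.1 and 0.3.
age_gap = abs(min_max(30, 15, 65) - min_max(20, 15, 65))

# Build coded as one binary variable per category (hypothetical names).
medium = {"build_small": 0, "build_medium": 1}
small = {"build_small": 1, "build_medium": 0}

# Two binary variables differ, so the difference in build scores two.
build_gap = sum(abs(medium[name] - small[name]) for name in medium)

# How many times larger the build difference is than the age difference.
ratio = build_gap / age_gap
```

Under these assumptions the ten year age gap contributes roughly 0.2 and the change of build contributes 2, which is the factor of ten referred to in the text.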
The continuous attributes age and height are each expressed as ranges split into a
number of intervals. If the height given in the description lay within a specified
interval, that interval was coded as a one and the other intervals as zero. In
order to allow for
slight discrepancies between descriptions of the same offender, and to incorporate
some aspect of ordering, two sets of overlapping intervals were used for each
variable. This means that each height was encoded as a set of binary variables,
two of which would be set to one for any given height and the remainder set to
zero, similarly for age.
To illustrate, consider an offender who is estimated as being about 5’ 5” (people
still do not think in metric units). This would be encoded as a one for the interval
5’2” to 5’6” (i.e. H11=0, H12=1 and H13=0) and also a one for the interval 5’4”
to 5’8” (i.e. H21=0, H22=1, H23=0 and H24=0). This incorporates a degree of
fuzziness in the description of age and height. However, it is at the cost of
effectively giving age and height a double count when it comes to comparing
similarities between descriptions.
Figure 1 – Illustrative example of the encoding of height as zero or one
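The overlapping interval encoding can be sketched in a few lines of Python. Only the two intervals quoted in the text (5’2” to 5’6” for H12 and 5’4” to 5’8” for H22) come from the study; the remaining interval boundaries, the half-open treatment of boundaries and the function name encode_height are assumptions made for illustration.

```python
# Interval boundaries in inches. Only H12 (5'2"-5'6") and H22 (5'4"-5'8")
# are given in the text; all other boundaries are assumed.
SET_1 = {"H11": (58, 62), "H12": (62, 66), "H13": (66, 70)}
SET_2 = {"H21": (60, 64), "H22": (64, 68), "H23": (68, 72), "H24": (72, 76)}


def encode_height(height_in):
    """Encode a height in inches as binary variables over both interval sets.

    Intervals are treated as half-open [low, high). A missing height
    (None) leaves every variable at zero.
    """
    codes = {}
    for name, (low, high) in {**SET_1, **SET_2}.items():
        inside = height_in is not None and low <= height_in < high
        codes[name] = 1 if inside else 0
    return codes
```

Under these assumed boundaries a height of 5’5” (65 inches) sets H12 and H22 to one, and a missing height leaves all seven variables at zero.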
A further problem encountered in producing comparable descriptions is that of
missing attributes. Sometimes attributes such as build are not recorded. This
means in our system of encoding that all build binary variables will be set to zero.
This does not mean that the person does not have a build! The problem of missing
values is notorious in statistical data analysis. There is no universal solution for
dealing with this problem adequately. What is of interest is how robust the
technique is faced with the inevitable missing values.
[Figure 1: height axis marked at 5’, 5’6” and 6’, with two overlapping sets of intervals labelled H11 to H13 and H21 to H24.]
Over the three year period under consideration there were 800 bogus official
crimes involving 1,292 offenders in the Police areas under consideration. Dealing
with all 800 would generate a solution that was intractable regarding analysis and
validation. Consequently it was decided to deal with a subset of the crimes. Those
crimes involving female offenders were selected as they represented a reasonable
time cross section and consisted of just 105 offender descriptions associated with
89 crimes. The SOM algorithm was provided with records consisting of offender
descriptions. There could be more than one description associated with a
particular crime (i.e. in crimes where more than one female is involved). Each of
these descriptions was represented by up to eight attributes: Race, Age, Height,
number of accomplices, build, hair colour, hair length and accent. When
translated into a binary encoding this resulted in 46 binary variables out of which
at most ten would be given a value of one (each height and age being represented
by two binary variables). In practice, due to incomplete descriptions, the number
of binary variables per description taking a value of one varied between three and
nine with an average of 7.5.
4 Application Construction
The SOM [5] is an unsupervised neural network training method. It takes data
consisting of a number of unordered records (in this task the 105 offender
descriptions) each of which is measured by a variety of attribute values (in this
task 46 binary variables). It iteratively organises the input records by grouping
them into clusters. The clusters are themselves ordered into a two dimensional
spatial configuration where the members of one cluster bear a resemblance to
neighbouring clusters but not as strong a resemblance as they do to members of
their own. The SOM can be viewed as a dimension reduction visualisation
technique in this case reducing from 46 dimensional space to two dimensions. The
resulting two dimensional configuration is a topological map rather than spatial
(i.e. it is like a London underground map rather than a road map). The
implementation of the SOM algorithm used was that provided by the Clementine
data mining package.
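For readers unfamiliar with the technique, the training loop of a SOM can be sketched in plain Python as below. This is a generic textbook formulation and not the Clementine implementation, whose internals are not described here; the linear decay schedules, the Gaussian neighbourhood, the default parameter values and all function names are assumptions made for illustration. Cells are keyed as (column, row).

```python
import math
import random
from collections import Counter


def best_matching_unit(weights, record):
    """Return the (column, row) cell whose weight vector is nearest the record."""
    return min(
        weights,
        key=lambda cell: sum((w - x) ** 2 for w, x in zip(weights[cell], record)),
    )


def train_som(records, rows=5, cols=7, epochs=20, lr0=0.5, radius0=3.0, seed=0):
    """Train a rectangular SOM on equal-length numeric records."""
    rng = random.Random(seed)
    dim = len(records[0])
    # One randomly initialised weight vector per cell of the grid.
    weights = {
        (c, r): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    order = list(range(len(records)))
    for epoch in range(epochs):
        remaining = 1.0 - epoch / epochs
        lr = lr0 * remaining                    # learning rate decays linearly
        radius = max(radius0 * remaining, 0.5)  # neighbourhood shrinks
        rng.shuffle(order)
        for i in order:
            record = records[i]
            bc, br = best_matching_unit(weights, record)
            for (c, r), w in weights.items():
                # Cells close to the winner on the grid move furthest.
                grid_d2 = (c - bc) ** 2 + (r - br) ** 2
                step = lr * math.exp(-grid_d2 / (2.0 * radius * radius))
                for k in range(dim):
                    w[k] += step * (record[k] - w[k])
    return weights


def cluster_sizes(weights, records):
    """Count how many records fall in each cell of the trained map."""
    return Counter(best_matching_unit(weights, rec) for rec in records)
```

Calling train_som on the binary offender descriptions and then cluster_sizes would give a table of counts per cell, with cells that attract no records simply absent from the result.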
A design consideration when constructing a SOM is to decide on the dimensions
of the resulting grid. Too many cells would see various descriptions of the same
offender being split across a number of cells each with a highly specialised
description. Too few cells would see cells formed containing a large number of
different offenders potentially with a high degree of variability between
descriptions. It was decided to construct a five row by seven column map. This
allows for a potential of 35 different offenders each committing three crimes. If
there were more than 35 offenders, it would force offenders with similar
descriptions to be clustered together. If there are fewer than 35 offenders the
SOM algorithm could place descriptions of the same offender across a number of
cells. The SOM algorithm is free to put as many descriptions as it likes in a cell
(i.e. more or less than 3) depending upon how similar they are to each other.
5 Findings
The results produced by using the SOM option of Clementine can be seen in
figure 2. The cells in the table show the number of offenders placed in the cluster
associated with the cell. The blacked out cells indicate empty clusters. Their
presence in the SOM tends to indicate large spatial differences between clusters
on opposite sides.
Row
 4     5   2   6   4   5   4   5
 3     2   ■   ■   ■   1   ■   2
 2     5   2   5   1   7   2   9
 1     2   ■   2   1   ■   1   ■
 0     5   4   3   3   7   4   6
       0   1   2   3   4   5   6   Column
Figure 2 – Derived cluster sizes (■ = empty cluster)
In order to interpret this map a symbolic description of each cluster was derived
by finding the average value for each attribute in a cluster. Provided the average
value was greater than 0.5 then, that binary variable name was assigned as the
cluster’s attribute value. This interpretation of the SOM can be seen in figure 3.
Blank fields are due either to great variability in the values of the attribute, or the
absence of a description for that attribute in the crime report for the majority of
cluster members.
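The labelling rule described above amounts to a thresholded average over the binary variables, which can be sketched in Python as follows. The function name describe_cluster and the variable names used in the usage note are hypothetical.

```python
def describe_cluster(members, names, threshold=0.5):
    """Return the names of binary variables whose cluster average exceeds threshold.

    members: list of equal-length 0/1 lists, one per offender description.
    names:   list of binary variable names, in the same order.
    """
    count = len(members)
    averages = [
        sum(member[k] for member in members) / count
        for k in range(len(names))
    ]
    return [name for name, avg in zip(names, averages) if avg > threshold]
```

For example, three descriptions coded over the hypothetical variables IC1, Slim and Dark as [1, 0, 1], [1, 0, 0] and [1, 1, 0] have averages of 1, 1/3 and 1/3, so only IC1 is assigned and the build and hair colour fields are left blank.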
Row 4
  Accompl: 0, 0, 1, 1, 1, 1, 2
  Race:    IC1, IC1, IC1, IC1, IC1, IC1, IC4
  Height:  5'4", 5'4"
  Age:     20, 32, 9, 9
  Build:   (blank)
  H. Col:  Dark, Dark, Dark
  H. Len:  (blank)
  Accent:  (blank)
Row 3
  Accompl: 0, 1
  Race:    IC1, IC1
  Height:  5'0"
  Age:     24, 14, 13
  Build:   Slim
  H. Col:  Dark, Fair, Dark
  H. Len:  (blank)
  Accent:  Irish
Row 2
  Accompl: 0, 1, 1, 1
  Race:    IC1, IC1, IC1, IC1, IC1
  Height:  5'8", 5'8", 5'8", 5'6", 5'0"
  Age:     24, 24, 32, 14, 14, 14
  Build:   Medium, Medium
  H. Col:  Dark, Dark, Dark, Dark, Dark
  H. Len:  Long
  Accent:  (blank)
Row 1
  Accompl: 1, 0, 0
  Race:    IC1, IC1, IC1
  Height:  5'5", 5'8", 5'6", 5'2"
  Age:     24, 21, 29, 17
  Build:   Slim, Slim, Slim
  H. Col:  Dark, Dark, Dark
  H. Len:  Long, Long
  Accent:  (blank)
Row 0
  Accompl: 1, 1, 1, 1, 1
  Race:    IC1, IC1, IC1, IC1, IC1, IC1
  Height:  5'4", 5'6", 5'6", 5'6", 5'4", 5'4"
  Age:     24, 23, 19, 19, 17, 17, 17
  Build:   Slim, Slim, Slim, Slim
  H. Col:  Dark, Dark, Dark, Dark, Dark
  H. Len:  Long, Short, Long, Long, Short
  Accent:  Local, Local
Columns: 0 to 6
Figure 3 – Symbolic descriptions of clusters. Within each row, values are listed in column order; blank fields are not shown, so the position of a value in a list does not identify its column.
6 Validation Process
The SOM labelled map together with the crime numbers appertaining to each
description in a SOM cell were passed to a police sergeant who was not part of
the research team for independent verification. The sergeant had access to more
information than had been made available to the SOM algorithm. This included
full witness statements (often more than one for each crime), information on
the Modus Operandi (MO) and information as to which crimes had been solved.
Time permitted the sergeant to analyse 17 of the 24 non-empty clusters. The
sergeant was given the brief to decide if there is sufficient evidence in the witness
statements, and for those crimes that had been solved, to say there is a possible
link between some of the crimes in each cluster. Clusters were analysed
individually with no attempt made to look for links between neighbouring
clusters.
Of the 17 clusters analysed one contained insufficient details to make a
judgement, five had no apparent links between offenders in them. The remaining
eleven in the judgement of the sergeant contained subsets of offender descriptions
that could be linked based on the extra sources of information.
An example of a description provided by the sergeant, for cluster (6,0), is:
“6 crimes; 3 with 1 male and 1 female, 2 with 2 female and 1 with 1 female
and 2 males. One crime was detected to Mr. X. The female’s ages range
from 13 yrs to 25 yrs across the cluster, only one not being described as
slim/thin. The heights range from 5’2” to 5’5”. Short hair. In three crimes
the MO was very similar in that social services and food parcel were
mentioned but this did not occur for the detected crime.”
The independent evidence provided by the social services MO provides
suggestive evidence for linking three of the six crimes. The descriptions for these
three crimes could be consolidated to form a composite picture of the female
offender.
These results are encouraging, as links between crimes have been established that
had not been previously made. However, the sergeant mentioned two negative
aspects that need addressing. First, many of the cells analysed contained members
that were in his opinion clearly different to the majority of members of the cell.
Second, some of the solved crimes appertained to offenders appearing in widely
differing cells on the map. He suggested that one possible cause is the wide
variance in descriptions of the same offender (in those cases where a definite link
can be made, this is contrary to Barber’s findings; see above). To illustrate, he
provided the following example, again for cluster (6,0).
“2 crimes in this cluster were committed next door to each other 3 1/2
hours apart on the same day, the same MO was used and 1 male and 1
female were the offenders. In the first crime the offenders were described
as female, IC1, 18 yrs, local accent, 5’5” thin build with blond bobbed
hair; male IC1, 25 yrs, 6’ thin build with short ginger hair. In the other
crime the offenders were described as female, IC1, 20 - 25 yrs, 5’2”, slim
build with short dark hair; male, IC1, 25 - 30 yrs, 5’8”, robust build with
fair hair. In the case papers, the Officer who attended the scene commented
that the victim, in the second crime, was confused and forgetful and could
not be regarded as a reliable witness.”
7 Discussion and further work
Whilst generally encouraging, the validation process indicated a number of areas
where there is room for improvement. One would be to consider removing
descriptions from the analysis where a number of attribute values are missing.
Such descriptions were the main contributor to the clusters where the sergeant
could not find any links. This does not mean these crimes would be ignored. Once
the SOM is
derived from the more complete descriptions the less complete descriptions can
be matched against the stereotype description for each cell and then ranked in
terms of the goodness of the match. Possibly, these vague descriptions could be
considered as being “secondary” members of more than one cell.
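One way the matching of incomplete descriptions against cell stereotypes might be carried out is sketched below in Python. The goodness of match used here (the fraction of the attributes present in the description that agree with the stereotype) is an assumption, as no particular measure is specified above; the function name rank_cells and the representation of descriptions as sets of binary variable names are also hypothetical.

```python
def rank_cells(description, stereotypes):
    """Rank cells by how well an incomplete description matches each stereotype.

    description: set of binary variable names set to one in the description.
    stereotypes: dict mapping (column, row) to the set of variable names in
                 that cell's symbolic description.
    Returns a list of (cell, score) pairs, best match first.
    """
    ranked = []
    for cell, stereotype in stereotypes.items():
        if description:
            score = len(description & stereotype) / len(description)
        else:
            score = 0.0  # nothing recorded, so nothing can match
        ranked.append((cell, score))
    # Highest score first; ties broken by cell coordinates for repeatability.
    return sorted(ranked, key=lambda pair: (-pair[1], pair[0]))
```

A vague description would typically score equally well against several cells, which is consistent with treating it as a secondary member of more than one cell.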
Another possible improvement is to merge some of the neighbouring clusters to
make allowance for slight variations in descriptions. The 5 row by 7 column SOM
was an arbitrary selection. Possibly it is too big. One way of merging clusters
suggested in [8] is to use the vector of average values representing each cluster
and apply hierarchical agglomerative clustering [7]. This basically means
sequentially merging the two nearest clusters (distance can be measured in many
ways; here we used the standard squared Euclidean distance), recalculating the
new cluster average and then merging the next two nearest. The
agglomerative clustering was performed using the SPSS statistical package. The
results are displayed in the dendrogram in figure 4. (A dendrogram is a graphical
way of showing the hierarchical merging process.)
This dendrogram shows that cluster (3,4) should be the first to be merged with
(4,4). As these both had the same symbolic description in figure 3 this is no
surprise. The next two clusters to be merged would be (4,0) and (5,0). This
process could be continued until there is only one cluster. Ripley [8]
suggests stopping the merging process when a merging is suggested between two
clusters that are not contiguous on the map. This occurs when (0,2) is suggested
as being merged with the supercluster formed from (2,4), (3,4) and (4,4).
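The merging procedure and the contiguity stopping rule can be sketched together in Python. This is an illustration under stated assumptions and not the SPSS procedure used in the study: the merged average is taken as the size-weighted mean of the two cluster averages, two superclusters are treated as contiguous when any of their cells are horizontal, vertical or diagonal neighbours on the grid, and all function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def contiguous(cells_a, cells_b):
    """True if any cell in one group neighbours any cell in the other.

    Diagonal neighbours count as contiguous (an assumption).
    """
    return any(
        abs(ca - cb) <= 1 and abs(ra - rb) <= 1
        for ca, ra in cells_a
        for cb, rb in cells_b
    )


def merge_until_non_contiguous(centres, sizes):
    """Merge nearest clusters until the nearest pair is not contiguous.

    centres: dict mapping (column, row) to the cluster's average vector.
    sizes:   dict mapping (column, row) to the number of cluster members.
    Returns a list of dicts with keys "cells", "centre" and "size".
    """
    clusters = [
        {"cells": [cell], "centre": list(vec), "size": sizes[cell]}
        for cell, vec in centres.items()
    ]
    while len(clusters) > 1:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda p: sq_dist(clusters[p[0]]["centre"],
                                                clusters[p[1]]["centre"]))
        a, b = clusters[i], clusters[j]
        if not contiguous(a["cells"], b["cells"]):
            break  # stopping rule: nearest pair is not adjacent on the map
        total = a["size"] + b["size"]
        centre = [(a["size"] * x + b["size"] * y) / total
                  for x, y in zip(a["centre"], b["centre"])]
        merged = {"cells": a["cells"] + b["cells"], "centre": centre, "size": total}
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters
```

Each returned entry lists the map cells that make up a supercluster together with its combined size, which is the information needed to redraw the map after merging.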
C = column R = Row
C R
3 4 -+-------+
4 4 -+ +-------+
2 4 ---------+ +---+
0 2 -----------------+ +---+
0 4 ---------------------+ +-----+
1 4 -------------------------+ +-------+
0 3 -------------------------------+ I
4 0 ---+---------+ +---+
5 0 ---+ +-----------+ I I
6 0 -------------+ +-------------+ +---+
5 1 -------------------------+ I I
3 1 -------------------------------------------+ +-+
5 4 -------------------+---------------+ I I
6 4 -------------------+ I I I
6 2 -------+---+ +-----------+ I
6 3 -------+ +---+ I I
5 2 -----------+ +-------+ I I
4 2 ---------------+ +-----------+ I
4 3 -----------------------+ I
1 2 ---------------------+-------+ I
2 2 ---------------------+ +---------+ I
3 2 -----------------------------+ I I
0 0 -------+-----+ +---------+
1 0 -------+ +-------------+ I
0 1 -------------+ +-----+ I
2 0 ---------------+-----------+ +-----+
3 0 ---------------+ I
2 1 ---------------------------------+
Figure 4- Dendrogram for hierarchical agglomerative clustering of SOM cluster centres.
The effect of applying hierarchical clustering on the SOM can be seen in figure 5.
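Row
 4     5   2   A   A   A   4   5
 3     2   ■   ■   ■   1   ■   C
 2     5   2   5   1   C   C   C
 1     D   ■   2   1   ■   1   ■
 0     D   D   3   3   B   B   B
       0   1   2   3   4   5   6   Column
A = 15 descriptions, merged from (2,4), (3,4) and (4,4)
B = 17 descriptions, merged from (4,0), (5,0) and (6,0)
C = 20 descriptions, merged from (4,2), (5,2), (6,2) and (6,3)
D = 11 descriptions, merged from (0,0), (1,0) and (0,1)
Figure 5 – SOM map following merging of spatially near neighbours (■ = empty cluster).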
Merging to avoid missing possible links with neighbours will undoubtedly mean
merging some unrelated crime descriptions together. However, the numbers are
still at a tractable level for manual analysis. Also it is possible to apply a splitting
criterion (e.g. race) to members of a specific supercluster. Different superclusters
might use different splitting criteria.
The above merging will address some of the problems where descriptions vary
slightly; however, for more radical variations it will not help. These are best
addressed outside the context of software tools. If an indication of the reliability
of the witness statement could be obtained then only reliable data could be used.
Also some variability is due to the time span. The data used in this study covered
a three year period. During that time, the appearance of one particular teenage
offender, who was convicted for a number of the crimes, changed radically. When
dealing with larger collections of data (i.e. male offenders) crimes committed
within a smaller time window should be used.
A valuable source of information not included in this study is the Modus
Operandi (MO). The diversity of MOs together with the variety of ways of
describing them precluded their use within the timescales and budget of the
current study. However, this information was utilised for validation purposes by
an independent Police Officer (see above). The loss of this information initially
appears restrictive but it does lend extra generality to results obtained as they
would be applicable to descriptions for crimes other than bogus official
burglaries. An illustration of the type of information available but omitted can be
seen in figure 6.
MO Field
PERSON UNKNOWN POSING AS COUNCIL WATERBOARD WORKER
GAINED ENTRY TO PREMISES. KEPT IP ENGAGED IN KITCHEN WHILST
SECOND MALE ENTERED PREMISES AND MADE SEARCH OF FLAT AND
STOLE PROPERTY (2ND PERSON NOT SEEN IN PREMISES) BOGUS
WORKER MADE EXCUSES AND LEFT PREMISES.
OFFENDER ATTENDED PREMISES SHOWED "HOUSING DEPARTMENT" ID
CARD WITH PHOTO ON IT & SAID HE NEED TO CHECK THE WATER
OFFENDER WAS ALLOWED IN BY ELDERLY IP WHO WAS THEN TOLD TO
RUN THE KITCHEN TAPS OFFENDER STAYED FOR A FEW MINUTES
BEFORE LEAVING DURING WHICH HE WAS ALLOWED ACCESS TO ALL
ROOMS UNACCOMPANIED AFTER OFFENDER HAD LEFT PREMISES IP
DISCOVERED PROPERTY MISSING
Figure 6 – Illustrative examples of the Modus Operandi free text field.
A further use of the SOM could be to link crimes based on pairing of offenders.
For example if a crime was committed by two offenders and the description of
one offender is in, say, cell (0,0) and the description of the other offender is in
cell (4,4) then look for other crimes committed by pairs belonging to these two
cells or their near neighbours. This will be the subject of further investigation.
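The pairing idea can be sketched in Python as follows. Since this is proposed as future work, everything here is an assumption made for illustration: the data layout (a mapping from crime number to the cells of its offender descriptions), the definition of a near neighbour as a cell at most one step away in any direction, and the function names.

```python
from itertools import permutations


def near(cell_a, cell_b, reach=1):
    """True if two (column, row) cells are within reach steps of each other."""
    return (abs(cell_a[0] - cell_b[0]) <= reach
            and abs(cell_a[1] - cell_b[1]) <= reach)


def crimes_with_similar_pair(crime_cells, target_crime, reach=1):
    """Find other crimes whose offender pair falls in or near the same two cells.

    crime_cells:  dict mapping a crime number to the list of cells holding
                  its offender descriptions.
    target_crime: a crime committed by exactly two described offenders.
    """
    first, second = crime_cells[target_crime]
    return [
        crime
        for crime, cells in crime_cells.items()
        if crime != target_crime
        and any(near(x, first, reach) and near(y, second, reach)
                for x, y in permutations(cells, 2))
    ]
```

Crimes with only one described offender never match, because no pair of descriptions can be formed from them.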
8 Conclusions
We have described how the SOM algorithm can be used to cluster offender
descriptions for a particular type of crime, the bogus official burglary.
Independent validation has shown that interesting links have been found within
clustered descriptions. Some problems have been identified and solutions
suggested. Some of these problems are to do with the data and the need for
cleaner fuller descriptions being selected before being used by the SOM
algorithm. Others are to do with modifying the final map in order to facilitate the
search for links with descriptions belonging to neighbouring clusters.
References
1. Adriaans, P. & Zantinge, D. Data Mining, pub: Addison-Wesley, 1996.
2. Adderley, R. & Musgrove, P.B. General Review of Police Crime Recording and
Investigation Systems, Submitted to: Policing: An International Journal of Police
Strategies and Management.
3. Lucas, R. An Expert System to Detect Burglars using a Logic Language and a
Relational Database, 5th British National Conference on Databases, Canterbury,
1986.
4. Charles, J. AI and law enforcement, IEEE Intelligent Systems, pp 77-80, Jan/Feb
1998.
5. Kohonen, T. The Self-Organizing Map, Proceedings of the IEEE, Vol. 78 no. 9, pp
1464-1480, 1990.
6. Baber M., Brough P. Identification Evidence of Elderly Victims and Witnesses,
Police Research Group, Home Office, 1997.
7. Gordon, A.D. Classification, pub: Chapman and Hall, 1981.
8. Ripley, B. D. Pattern Recognition and Neural Networks, pub: Cambridge University
Press, 1996.
... In criminology, determining a possible suspect is based on clues found on the crime scene, and from the information gathered from witnesses and other sources. In recent years police forces have been enhancing their traditional method of crime information gathering and reporting with new technological advancements to increase their output by efficiently recording crimes to aid their investigations (Adderley, R.;Musgrove, P.B., 1999). A large set of data is not just a bulk of records but valuable information that can be useful to buildup crime detection and prevention strategies. ...
Research
Full-text available
An unpublished Computational Approach to Criminal Shortlisting based on Modus Operandi which was changed lately to hierarchical clustering.
... In criminology, determining a possible suspect is done based on clues found on the crime scene, and the information gathered from witnesses and other sources. In recent years police forces have been enhancing their traditional method of information gathering and reporting with new technological advancements to increase their output by efficiently recording crimes to aid their investigations [3]. A large set of data is not just a bulk of records but valuable information that can be useful to buildup crime detection and prevention strategies. ...
Conference Paper
Full-text available
One of the most challenging problems faced by crime analysts is identifying sets of crimes committed by the same individual or group. Amount of criminal records piling up daily has made it cumbersome to manually process connections between crimes. These Crime series' possess certain attributes that are characteristic of the criminal(s) involved in them, which are useful in defining their modus operandi (MO). After a careful study in the grave crime category of House breaking and Theft in Sri Lanka, we have identified certain MO attributes which we have used to collect from past crime scene data from police records. Then we have explored whether it is possible to group suspects who have similar MO patterns through a machine learning approach and give a short list for a new crime from the existing data. The evaluation of the research presented an accuracy above 75% which proved that Machine Learning is capable of short listing criminals based on their Modus Operandi features.
... Building on previous work by the authors (Adderley and Musgrove 1999) a further level of refinement to the modelling process was used. Features of the MO, spatial and temporal analysis from the Primary Network crimes were used in a Kohonen self organising map algorithm to cluster the similarities. ...
Article
This article looks at the application of data-mining techniques, principally the multi-layer perceptron, radial basis function and self-organising map, to the recognition of burglary offences committed by a network of offenders. The aim is to suggest a list of currently undetected crimes that may be attributed to one or more members of the network and improve on the time taken to complete the task manually and the relevancy of the list of crimes. The data were drawn from four years of burglary offences committed within an area of the West Midlands police. They were encoded from text by a small team of specialists working to a well-defined protocol and analysed using the above techniques contained within the data-mining workbench of SPSS/Clementine. Within minutes, three months of undetected crimes were analysed through the Clementine stream, producing a list of offences that might be attributed to the network of offenders. The results were analysed by two police sergeants not associated with the development process who determined that 85 per cent of the nominated crimes could be attributed to the network of offenders. To produce a manual list would take between one-and-a-half and two hours and be between 5 per cent and 10 per cent accurate.
... The role of computers has increased in all walks of life, from the finance sector to supermarkets. In recent years police forces have been enhancing their traditional method of crime reporting with new technological advancements to increase their output by efficiently recording crimes to aid their investigation (Adderley and Musgrove 1999). Data is not just a record of crimes; it also contains valuable information that could be used to link crime scenes based on the modus operandi (MO) of the offender(s), suggest which offenders may be responsible for the crime and also identify those offenders who work in teams (offender networks) etc. ...
Chapter
Police analysts are required to unravel the complexities in data to assist operational personnel in arresting offenders and directing crime prevention strategies. However, the volume of crime that is being committed and the awareness of modern criminals make this a daunting task. The ability to analyse this amount of data with its inherent complexities without using computational support puts a strain on human resources. This paper examines the current techniques that are used to predict crime and criminality. Over time, these techniques have been refined and have achieved limited success. They are concentrated into three categories: statistical methods, which mainly relate to the journey to crime, age of offending and offending behaviour; techniques using geographical information systems that identify crime hot spots, repeat victimisation, crime attractors and crime generators; and a miscellaneous group which includes machine learning techniques to identify patterns in criminal behaviour and studies involving reoffending. The majority of current techniques involve the prediction of either a single offender’s criminality or a single crime type’s next offence. These results are of only limited use in practical policing. It is our contention that Knowledge Discovery in Databases should be used on all crime types together with offender data, as a whole, to predict crime and criminality within a small geographical area of a police force.
... Typically an OCU of a Police force will only have one crime analyst, and so the benefit of empowering the more numerous Police officers is clear. Various artificial intelligence algorithms have been brought to bear upon areas of criminal investigation, which include the following as examples of the diversity of the approaches: neural networks for clustering of crimes ('crime and disorder': Wilson, Corcoran, & Ware, 2002; 'bogus official' reports: Adderley & Musgrove, 1999), case-based reasoning for predictions based on series of crimes (Ribaux & Margot, 1999), probabilistic analysis of modus operandi (Yokota & Watanabe, 2002), and spatio-temporal analysis of crimes (Ratcliffe, 2002). However, these are research applications, not tools that OCU managers would be able to use. ...
Article
The OVER Project was a collaboration between West Midlands Police, UK, the Centre for Adaptive Systems, and Psychology Division, from the University of Sunderland. The Project was developed primarily to assist the Police with the high volume crime, burglary from dwelling houses. A developed software system enables the trending of historical data, the testing of ‘short term’ hunches, and the development of ‘medium’ and ‘long term’ strategies to burglary and crime reduction, based upon victim, offender, location and details of victimisations. The software utilises mapping and visualisation tools and is capable of a range of sophisticated predictions, tying together statistical techniques with theories from forensic psychology and criminology. The statistical methods employed (including multi-dimensional scaling, binary logistic regression) and ‘data-mining’ technologies (including neural networks) are used to investigate the impact of the types of evidence available and to determine the causality in this domain. The final predictions on the likelihood of burglary are calculated by combining all of the varying sources of evidence into a Bayesian belief network. This network is embedded in the developed software system, which also performs data cleansing and data transformation for presentation to the developed algorithms. It is important that derived statistics from the software and predictions are interpretable by the intended users of the decision support system, namely Police sector managers, and this paper includes some of the design decisions based upon the forensic psychology and criminology literature, including the graphical representation of geographic data and presentation of results of analyses.
... A self organizing map (SOM) was selected because it has the ability both to cluster similar records into the same cell and to produce a two dimensional topological map showing the relationship of those records to near neighbors. This can be used to form larger clusters by merging neighboring cells [1]. It also aids in determining the relationship between broad categories of crime. ...
Conference Paper
This paper looks at the use of a Self Organizing Map (SOM) to link records of crimes of serious sexual attacks. Once linked, a profile can be derived of the offender(s) responsible. The data was drawn from the major crimes database at the National Crime Faculty of the National Police Staff College Bramshill UK. The data was encoded from text by a small team of specialists working to a well-defined protocol. The encoded data was analyzed using SOMs. Two exercises were conducted. These resulted in the linking of several offences into clusters, each of which was sufficiently similar to have possibly been committed by the same offender(s). A number of clusters were used to form profiles of offenders. Some of these profiles were confirmed by independent analysts as either belonging to known offenders or appearing sufficiently interesting to warrant further investigation. The prototype was developed over 10 weeks. This contrasts with an in-house study using a conventional approach, which took 2 years to reach similar results. As a consequence of this study the NCF intends to pursue an in-depth follow up study.
... However, before presentation to the Kohonen network, significant hand preprocessing/relabelling of text was required. In a similar fashion, their work with "bogus official" reports (Adderley and Musgrove, 1999) also required significant recoding. Corcoran et al. (2001, 2003): SOMs for hotspot prediction, with "explanation" of the clusters using rule abduction. SOMs were also employed for offender predictions among the West Midlands burglary data. ...
Article
The paper sets out the challenges facing the Police in respect of the detection and prevention of the volume crime of burglary. A discussion of data mining and decision support technologies that have the potential to address these issues is undertaken and illustrated with reference to the authors' work with three Police Services. The focus is upon the use of "soft" forensic evidence, which refers to modus operandi and the temporal and geographical features of the crime, rather than "hard" evidence such as DNA or fingerprint evidence. Three objectives underpin this paper. Firstly, given the continuing expansion of forensic computing and its role in the emergent discipline of Crime Science, it is timely to present a review of existing methodologies and research. Secondly, it is important to extract some practical lessons concerning the application of computer science within this forensic domain. Finally, from the lessons to date, a set of conclusions will be advanced, including the need for multidisciplinary input to guide further developments in the design of such systems. The objectives are achieved by first considering the task performed by the intended systems users. The discussion proceeds by identifying the portions of these tasks for which automation would be both beneficial and feasible. The knowledge discovery from databases process is then described, starting with an examination of the data that police collect and the reasons for storing it. The discussion progresses to the development of crime matching and predictive knowledge which are operationalised in decision support software. The paper concludes by arguing that computer science technologies which can support criminal investigations are wide ranging and include geographical information systems displays, clustering and link analysis algorithms and the more complex use of data mining technology for profiling crimes or offenders and matching and predicting crimes. We also argue that knowledge from disciplines such as forensic psychology, criminology and statistics is essential to the efficient design of operationally valid systems.
Chapter
This chapter gives an account of the nine Laws of Data Mining, and proposes two hypotheses about data mining and cognition. The nine Laws describe key properties of the data mining process, and their explanations explore the reasons behind these properties. The first hypothesis is that data mining is a kind of intelligence amplifier, because the data mining process enables the data miner to see things which they could not see unaided, as stated in the sixth law of data mining. The second hypothesis is that machine learning algorithms have a special value to data mining because they represent knowledge in a way which is cognitively plausible, and this makes them more suitable for intelligence amplification.
Thesis
Modus operandi (MO), also known as mode of operation, is a trait that most living beings have. Comparing several crimes performed by the same criminal will reveal a specific MO pattern for an individual or a group. This research suggests a model for shortlisting criminal suspects by identifying these patterns using a specific MO feature set and a model derived from a machine learning approach. The model creates a number of clusters, each of which represents a unique MO classification. Unlike the classifications used so far, this classification is made of a collection of single MO behaviors which were exhibited in unison. This classification method is important because it is unique and is robust to changes that can happen in a single MO behavior. The model achieved an accuracy above 75%. It validates both internal and external accuracy of the created MO classifications within the clusters. However, identifying a criminal depends on prior records of that criminal being present in the database. As future extensions of this research, a prediction model and a system implementation architecture are proposed to deploy the system with this model.
Article
The self-organized map, an architecture suggested for artificial neural networks, is explained by presenting simulation experiments and practical applications. The self-organizing map has the property of effectively creating spatially organized internal representations of various features of input signals and their abstractions. One result of this is that the self-organization process can discover semantic relationships in sentences. Brain maps, semantic maps, and early work on competitive learning are reviewed. The self-organizing map algorithm (an algorithm which orders responses spatially) is reviewed, focusing on best matching cell selection and adaptation of the weight vectors. Suggestions for applying the self-organizing map algorithm, demonstrations of the ordering process, and an example of hierarchical clustering of data are presented. Fine tuning the map by learning vector quantization is addressed. The use of self-organized maps in practical speech recognition and a simulation experiment on semantic mapping are discussed.
Book
Ripley brings together two crucial ideas in pattern recognition: statistical methods and machine learning via neural networks. He brings unifying principles to the fore, and reviews the state of the subject. Ripley also includes many examples to illustrate real problems in pattern recognition and how to overcome them.
Article
The self-organizing map (SOM) is an automatic data-analysis method. It is widely applied to clustering problems and data exploration in industry, finance, natural sciences, and linguistics. The most extensive applications, exemplified in this paper, can be found in the management of massive textual databases and in bioinformatics. The SOM is related to the classical vector quantization (VQ), which is used extensively in digital signal processing and transmission. As in VQ, the SOM represents a distribution of input data items using a finite set of models. In the SOM, however, these models are automatically associated with the nodes of a regular (usually two-dimensional) grid in an orderly fashion such that more similar models become automatically associated with nodes that are adjacent in the grid, whereas less similar models are situated farther away from each other in the grid. This organization, a kind of similarity diagram of the models, makes it possible to obtain an insight into the topographic relationships of data, especially of high-dimensional data items. If the data items belong to certain predetermined classes, the models (and the nodes) can be calibrated according to these classes. An unknown input item is then classified according to that node, the model of which is most similar to it in some metric used in the construction of the SOM. A new finding introduced in this paper is that an input item can even more accurately be represented by a linear mixture of a few best-matching models. This becomes possible by a least-squares fitting procedure where the coefficients in the linear mixture of models are constrained to nonnegative values.
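The SOM behaviour described in the abstracts above (best-matching cell selection, adaptation of the weight vectors, and similar records landing in the same or adjacent grid cells) can be illustrated with a minimal sketch. This is not the implementation used in any of the listed works: the function names `train_som` and `best_matching_unit`, the 3 x 3 grid, the linear decay schedules, and the binary-coded records are all assumptions chosen for illustration, and the nonnegative linear-mixture refinement mentioned in the last abstract is not covered.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid node whose model is closest to x (squared Euclidean distance)."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )


def train_som(data, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a small SOM; returns a dict mapping (row, col) -> model vector."""
    rng = random.Random(seed)
    dim = len(data[0])
    # One model (weight vector) per grid node, initialised at random in [0, 1).
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    sigma0 = max(rows, cols) / 2.0
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)  # learning rate decays linearly (assumed schedule)
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighbourhood radius shrinks
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                # Nodes close to the best-matching cell on the grid move more.
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


# Hypothetical binary-coded records (illustrative only, not real crime data).
records = [
    [1, 1, 0, 0], [1, 1, 0, 0], [1, 0, 0, 0],
    [0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 1, 1],
]
som = train_som(records, rows=3, cols=3)
cells = [best_matching_unit(som, r) for r in records]
print(cells)  # grid cell assigned to each record
```

Records that map to the same cell, or to adjacent cells, are candidates for being grouped together; in a linking exercise such groups would still need independent validation.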
Adderley, R. & Musgrove, P.B. General Review of Police Crime Recording and Investigation Systems. Submitted to: Policing: An International Journal of Police Strategies and Management.
Baber, M. & Brough, P. Identification Evidence of Elderly Victims and Witnesses. Police Research Group, Home Office, 1997.
Gordon, A.D. Classification. Chapman and Hall, 1981.
Charles, J. AI and law enforcement. IEEE Intelligent Systems, pp. 77-80, Jan/Feb 1998.