Crime Analysis using K-Means Clustering

International Journal of Computer Applications (0975-8887)
Volume 83, No. 4, December 2013
Jyoti Agarwal
M.Tech CSE
Amity University, Noida
Renuka Nagpal
Assistant Professor
Amity University, Noida
Rajni Sehgal
Assistant Professor
Amity University, Noida
ABSTRACT
In today's world, security is given high priority by political
and government bodies worldwide, which aim to reduce the
incidence of crime. Data mining is an appropriate field to
apply to high-volume crime datasets, and the knowledge gained
from data mining approaches can support police forces. In this
paper, crime analysis is performed by applying k-means
clustering to a crime dataset using the RapidMiner tool.
Keywords
Cluster, Crime Analysis, RapidMiner
1. INTRODUCTION
At present, criminals are becoming technologically
sophisticated in committing crime, and one challenge faced by
intelligence and law enforcement agencies is the difficulty of
analyzing the large volume of data involved in criminal and
terrorist activities. Agencies therefore need techniques that
help them catch criminals and stay ahead in the eternal race
between criminals and law enforcement. An appropriate field
must be chosen for crime analysis. Because data mining refers
to extracting or mining knowledge from large amounts of data,
it is used here on a high-volume crime dataset, and the
knowledge gained can support police forces. An appropriate
data mining approach must also be chosen. Clustering groups a
set of objects in such a way that objects in the same group are
more similar to each other than to those in other groups; it
covers various algorithms that differ significantly in their
notion of what constitutes a cluster and how to find clusters
efficiently. In this paper, the k-means clustering technique is
used to extract useful information from the high-volume crime
dataset and to interpret the data, which assists police in
identifying and analyzing crime patterns so as to reduce
further occurrences of similar incidents. K-means clustering is
implemented here using an open source data mining tool. Among
the available open source data mining suites, such as R,
Tanagra, WEKA, KNIME, Orange and RapidMiner, the RapidMiner
tool is used; it is an open source statistical and data mining
package written in Java with flexible data mining support
options. The dataset used is the crime dataset of offences
recorded by the police in England and Wales, by offence and
police force area, from 1990 to 2011-12. This paper analyzes
homicide, the crime of one human killing another.
This paper is divided into 7 sections: Introduction, Related
Work, Proposed System Architecture, Experimental Setup and
Results, Conclusion, Future Scope, and References.
1.1 Crime analysis
Crime analysis is defined as a set of analytical processes that
provide relevant information on crime patterns and trend
correlations to assist personnel in planning the deployment of
resources for the prevention and suppression of criminal
activities.
It is important to analyze crime for the following reasons:
1. To inform law enforcers about general and specific crime
trends in a timely manner.
2. To take advantage of the abundance of information existing
in the justice system and the public domain.
Crime rates change rapidly, and improved analysis finds hidden
patterns of crime, if any, without explicit prior knowledge of
these patterns.
The main objectives of crime analysis include:
1. Extraction of crime patterns by analysis of available
crime and criminal data
2. Prediction of crime based on the spatial distribution of
existing data, and anticipation of the crime rate using
different data mining techniques
3. Detection of crime
2. RELATED WORK
Data mining in the study and analysis of criminology can be
categorized into two main areas: crime control and crime
suppression. De Bruin et al. [1] introduced a framework for
crime trends using a new distance measure for comparing all
individuals based on their profiles and then clustering them
accordingly. Manish Gupta et al. [2] highlight the existing
systems used by Indian police as e-governance initiatives and
propose an interactive query-based interface as a crime
analysis tool to assist police in their activities. The
proposed interface is used to extract useful information from
the vast crime database maintained by the National Crime
Records Bureau (NCRB) and to find crime hot spots using crime
data mining techniques such as clustering. Its effectiveness
has been illustrated on Indian crime records. Nazlena Mohamad
Ali et al. [3] discuss the development of the Visual
Interactive Malaysia Crime News Retrieval System (i-JEN) and
describe the approach, the user studies, the system
architecture and future plans. Their main objectives were to
construct crime-based events; investigate the use of
crime-based events in improving classification and clustering;
develop an interactive crime news retrieval system; visualize
crime news in an effective and interactive way; integrate
these into a usable and robust system; and evaluate usability
and system performance. The study contributes to a better
understanding of crime data consumption in the Malaysian
context, and the developed system, with its visualization
features, addresses crime data with the eventual goal of
combating crime. Sutapat Thiprungsri [4] examines the
application of cluster analysis in the accounting domain,
particularly discrepancy detection in audit. The purpose of
that study is to examine the use of clustering technology to
automate fraud
filtering during an audit. Cluster analysis was used to help
auditors focus their efforts when evaluating group life
insurance claims. A. Malathi et al. [5] look at the use of
missing-value handling and a clustering algorithm in a data
mining approach to help predict crime patterns and speed up
the process of solving crime. Malathi A. et al. [6] used a
clustering/classification-based model to anticipate crime
trends. Data mining techniques are used to analyze city crime
data from the Police Department, and the results could
potentially be used to lessen and even prevent crime in the
forthcoming years. The research of Dr. S. Santhosh Baboo and
Malathi A. [7] focused on developing a crime analysis tool for
the Indian scenario using different data mining techniques
that can help law enforcement departments handle crime
investigation efficiently. The proposed tool enables agencies
to easily and economically clean, characterize and analyze
crime data to identify actionable patterns and trends. Kadhim
B. Swadi Al-Janabi [8] presents a framework for crime and
criminal data analysis and detection using decision tree
algorithms for data classification and the Simple K-Means
algorithm for data clustering. The paper aims to help
specialists in discovering patterns and trends, making
forecasts, finding relationships and possible explanations,
mapping criminal networks and identifying possible suspects.
Aravindan Mahendiran et al. [9] apply a myriad of tools to
crime datasets to mine for information that is hidden from
human perception. With the help of state-of-the-art
visualization techniques, they present the patterns discovered
through their algorithms in a neat and intuitive way that
enables law enforcement departments to channel their resources
accordingly. Sutapat Thiprungsri [10] examines the possibility
of using clustering technology for auditing. Automating fraud
filtering can be of great value to continuous audits. The
objective of that study is to examine the use of cluster
analysis as an alternative and innovative anomaly detection
technique in the wire transfer system. K. Zakir Hussain et al.
[11] tried to capture years of human experience in computer
models via data mining and by designing a simulation model.
3. PROPOSED SYSTEM ARCHITECTURE
After the literature review, there is a need for an open source
data mining tool that is easy to set up and makes analysis
easy. Here, crime analysis is therefore done on a crime dataset
by applying the k-means clustering algorithm using the
RapidMiner tool.
The procedure is given below:
1. Take the crime dataset.
2. Filter the dataset according to requirement and create a
new dataset that has the attributes needed for the analysis
to be done.
3. Open the RapidMiner tool, read the Excel file of the crime
dataset, apply the “Replace Missing Values” operator to it
and execute the operation.
4. Apply the “Normalize” operator to the resultant dataset and
execute the operation.
5. Perform k-means clustering on the dataset obtained after
normalization and execute the operation.
6. From the plot view of the result, plot the data between
crimes and get the required clusters.
7. Perform the analysis on the clusters formed.
Fig 1: Flow chart of crime analysis
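The preprocessing steps of the procedure (missing-value replacement followed by normalization) can be sketched outside RapidMiner as follows. This is only an illustrative sketch, not the RapidMiner operators themselves: it assumes missing values are replaced by the column mean and that normalization is a z-score transformation with the population standard deviation, and the function names replace_missing and z_normalize are hypothetical. The two columns are taken from the sample rows of Table 1, with one value blanked out purely to exercise the missing-value step.

```python
from statistics import mean, pstdev


def replace_missing(column):
    """Replace None entries with the mean of the observed values (assumed strategy)."""
    observed = [v for v in column if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in column]


def z_normalize(column):
    """Scale a column to zero mean and unit population standard deviation."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(v - mu) / sigma for v in column]


# Columns from the sample rows of Table 1. The None in attempted_murder
# is an assumption for illustration only (the source value is 8).
homicide = [10, 6, 6, 6, 10]
attempted_murder = [19, 10, None, 2, 5]

prepared = [z_normalize(replace_missing(col)) for col in (homicide, attempted_murder)]
```

Each list in prepared then has mean 0 and standard deviation 1, so that no single attribute dominates the distance computation in the clustering step.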
4. EXPERIMENTAL SETUP AND RESULTS
4.1 Approach Used
4.1.1 k-means algorithm
K-means clustering is a method of cluster analysis which aims
to partition n observations into k clusters in which each
observation belongs to the cluster with the nearest mean.
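In the standard formulation of k-means (stated here for reference; the notation is an assumption and is not taken from this paper), assigning each observation to the nearest mean corresponds to minimizing the within-cluster sum of squared distances, where x_i are the observations, C_j are the clusters and \mu_j is the mean of cluster C_j:

```latex
\min_{C_1,\dots,C_k} \; \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^2,
\qquad
\mu_j = \frac{1}{|C_j|} \sum_{x_i \in C_j} x_i
```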
Process
1. Initially, the number of clusters must be known; let it be k.
2. The first step is to choose a set of k instances as the
centres of the clusters.
3. Next, the algorithm considers each instance and assigns
it to the cluster whose centre is closest.
4. The cluster centroids are recalculated either after a whole
cycle of re-assignment or after each instance assignment.
5. This process is iterated.
The complexity of the k-means algorithm is O(tkn), where n is
the number of instances, k is the number of clusters and t is
the number of iterations, which makes it relatively efficient.
It often terminates at a local optimum. Its disadvantages are
that it is applicable only when the mean is defined and that k,
the number of clusters, must be specified in advance. It is
also unable to handle noisy data and outliers and is not
suitable for discovering clusters with non-convex shapes.
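The process steps above can be sketched in plain Python as shown below. This is a minimal illustration rather than the RapidMiner implementation: it assumes squared Euclidean distance, random selection of the initial centres, and recalculation of centroids after each whole cycle of re-assignment; the names k_means and squared_distance and the parameters max_iter and seed are hypothetical.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, k, max_iter=100, seed=0):
    """Return (centroids, labels) for a list of numeric tuples."""
    rng = random.Random(seed)
    # Step 2: choose k instances as the initial cluster centres.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Step 3: assign each instance to the closest centre.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Step 5: stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 4: recalculate centroids after the whole re-assignment cycle.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centre if a cluster becomes empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, labels
```

Each iteration compares every one of the n instances with all k centres, which is where the O(tkn) cost over t iterations comes from. Because the result depends on the initial centres, different seeds may lead to different local optima.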
[Fig 1 flow chart steps: take crime dataset; filter dataset according to requirement; open RapidMiner tool and read Excel file of crime dataset; apply Replace Missing Value operator and execute; perform Normalization operator on resultant dataset and execute; perform k-means clustering on resultant dataset and execute; perform plot view and get cluster; perform crime analysis on cluster formed]
4.2 Dataset Used
The crime dataset used for the analysis consists of offences
recorded by the police in England and Wales, by offence and
police force area, from 1990 to 2011-12 [12]. A sample of the
crime dataset is shown in Table 1.
Table 1. Crime dataset
Year  Homicide  Attempted murder  Child destruction  Causing death by careless driving
1990  10        19                0                  7
1990  6         10                0                  5
1990  6         8                 0                  9
1990  6         2                 0                  15
1990  10        5                 0                  1
4.3 Tool Used
Many open source data mining suites are available, such as R,
Tanagra, WEKA, KNIME, Orange and RapidMiner. Here crime
analysis is performed using the RapidMiner tool for the
following reasons:
1. It is a solid and complete package with flexible and
affordable support options.
2. It offers enterprise-ready performance and scalability for
big data analytics, along with innovative analyst support.
3. Programs can be built by piping components together in
graphical ETL workflows.
In addition, if an illegal workflow is set up, RapidMiner
suggests Quick Fixes to make it legal.
4.4 K-means cluster analysis
This involves tracking crime rate changes from one year to the
next and using data mining to project those changes into the
future. Here we consider the crime of homicide, plot it against
year, and analyze the variation in the graph for each cluster
formed.
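The per-cluster reading of the plots (finding the years with the lowest and highest homicide counts in each cluster) can be sketched as below. The aggregation rule is an assumption: the paper does not state how counts are combined per year, so this sketch simply sums the counts of all records that fall in the same cluster and year. The function name yearly_extremes is hypothetical, and the records are invented placeholders, not values from the dataset.

```python
from collections import defaultdict


def yearly_extremes(records):
    """Map each cluster to (years with minimum total, years with maximum total).

    records is an iterable of (cluster, year, homicide_count) tuples.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for cluster, year, count in records:
        totals[cluster][year] += count  # assumed aggregation: sum per year
    result = {}
    for cluster, by_year in totals.items():
        lowest, highest = min(by_year.values()), max(by_year.values())
        result[cluster] = (
            sorted(y for y, v in by_year.items() if v == lowest),
            sorted(y for y, v in by_year.items() if v == highest),
        )
    return result


# Invented placeholder records (cluster, year, homicide count).
sample_records = [
    (0, 2000, 12), (0, 2004, 3), (0, 2008, 12),
    (1, 1990, 40), (1, 2008, 5),
]
extremes = yearly_extremes(sample_records)
```

Returning lists of years rather than a single year keeps ties visible, as in a cluster where two years share the same maximum.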
1. Homicide
Cluster 0
Fig 2: Homicide is minimum in 2004 and maximum, with the same
value, in 2000 & 2008
From Fig 2 it can be seen that the number of homicides
committed is lowest in 2004 and highest in 2008.
Cluster 1
Fig 3: Homicide is minimum in 2008 and maximum in
1990 & 2004.
From Fig 3 it can be seen that the number of homicides
committed is lowest in 2008 and highest in 1990 and 2000.
Cluster 2
Fig 4: Homicide is minimum in 1992 and maximum in
2002
From Fig 4 it can be seen that the number of homicides
committed is lowest in 1992 and highest in 2002.
Cluster 3
Fig 5: Homicide is minimum in 2011 and maximum in
2003
[Plots for Figs 2 to 5: homicide series; x-axis: year; y-axis: no. of crime]
From Fig 5 it can be seen that the number of homicides
committed is lowest in 2011 and highest in 2003.
Cluster 4
Fig 6: Homicide is minimum in 1990 & 1993 and
maximum in 2007
From Fig 6 it can be seen that the number of homicides
committed is lowest in 1990 and 1993 and highest in 2007.
5. CONCLUSION
This project focuses on crime analysis by applying a
clustering algorithm to a crime dataset using the RapidMiner
tool. The analysis considers the crime of homicide, plots it
with respect to year, and concludes that homicide decreased
from 1990 to 2011. From the clustered results it is easy to
identify the crime trend over the years, which can be used to
design precautionary measures for the future.
6. FUTURE SCOPE
From these encouraging results, we believe that crime data
mining has a promising future for increasing the effectiveness
and efficiency of criminal and intelligence analysis. Visual
and intuitive criminal and intelligence investigation
techniques can be developed for crime patterns. Having applied
the clustering technique of data mining to crime analysis, we
can also apply other data mining techniques such as
classification. The analysis can also be performed on various
other datasets, such as enterprise survey, poverty and aid
effectiveness datasets.
7. REFERENCES
[1] De Bruin, J.S., Cocx, T.K., Kosters, W.A., Laros, J. and
Kok, J.N., 2006. Data mining approaches to criminal
career analysis. In Proceedings of the Sixth International
Conference on Data Mining (ICDM'06), pp. 171-177.
[2] Manish Gupta, B. Chandra and M. P. Gupta, 2007.
Crime Data Mining for Indian Police Information System.
[3] Nazlena Mohamad Ali, Masnizah Mohd, Hyowon
Lee, Alan F. Smeaton, Fabio Crestani and Shahrul
Azman Mohd Noah, 2010. Visual Interactive Malaysia
Crime News Retrieval System.
[4] Sutapat Thiprungsri, Rutgers University, USA, 2011.
Cluster Analysis of Anomaly Detection in Accounting
Data: An Audit Approach.
[5] A. Malathi and S. Santhosh Baboo, D.G. Vaishnav
College, Chennai, 2011. Algorithmic Crime Prediction
Model Based on the Analysis of Crime Clusters.
[6] A. Malathi, S. Santhosh Baboo and A. Anbarasi
(Department of Computer Science, Govt Arts College,
Coimbatore, India; Department of Computer Science,
D.G. Vaishnav College, Chennai, India), 2011. An
Intelligent Analysis of a City Crime Data Using Data Mining.
[7] A. Malathi and S. Santhosh Baboo, 2011. An Enhanced
Algorithm to Predict a Future Crime using Data Mining.
[8] Kadhim B. Swadi Al-Janabi, Department of Computer
Science, Faculty of Mathematics and Computer Science,
University of Kufa, Iraq, 2011. A Proposed Framework
for Analyzing Crime Data Set using Decision Tree and
Simple K-means Mining Algorithms.
[9] Aravindan Mahendiran, Michael Shuffett, Sathappan
Muthiah, Rimy Malla and Gaoqiang Zhang, 2011.
Forecasting Crime Incidents using Cluster Analysis and
Bayesian Belief Networks.
[10] Sutapat Thiprungsri, 2012. Cluster Analysis for Anomaly
Detection in Accounting Data: An Audit Approach.
[11] K. Zakir Hussain, M. Durairaj and G. Rabia Jahani
Farzana, 2012. Application of Data Mining Techniques
for Analyzing Violent Criminal Behavior by Simulation
Model.
[12] https://www.gov.uk/government/publications/offences-recorded-by-the-police-in-england-and-wales-by-offence-and-police-force-area-1990-to-2011-12
[Plot for Fig 6: homicide series; x-axis: year; y-axis: no. of crime]
IJCATM : www.ijcaonline.org
... Although LULCC analysis depicts the relationship between the human and physical environment, the use of diverse datasets (e.g., satellite data and socioeconomic data) with clustering algorithms can produce detailed information and facts. For example, hierarchical clustering [29], K-means [30][31][32], and Gaussian mixture model [33] are a few benchmark clustering techniques for change analysis. The hierarchical clustering method is preferable with larger datasets, but k-means clustering performs better both with the large and medium datasets [34]. ...
... For water quality analysis, Zou et al. (2015) utilized the k-means classification technique and took the Heihe River in China as a study area [32]. Agarwal et al. (2013) intended to specify the crime trends of England and Wales and used the k-means method for crime analysis [31]. ...
... For water quality analysis, Zou et al. (2015) utilized the k-means classification technique and took the Heihe River in China as a study area [32]. Agarwal et al. (2013) intended to specify the crime trends of England and Wales and used the k-means method for crime analysis [31]. ...
Article
Full-text available
The vegetative cover in and surrounding the Rohingya refugee camps in Ukhiya-Teknaf is highly vulnerable since millions of refugees moved into the area, which led to severe environmental degradation. In this research, we used a supervised image classification technique to quantify the vegetative cover changes both in Ukhiya-Teknaf and thirty-four refugee camps in three time-steps: one pre-refugee crisis (January 2017), and two post-refugee crisis (March 2018, and February 2019), in order to identify the factors behind the decline in vegetative cover. The vegetative cover vulnerability of the thirty-four refugee camps was assessed using the Per Capita Greening Area (PCGA) datasets and K-means classification techniques. The satellite-based monitoring result affirms a massive loss of vegetative cover, approximately 5482.2 hectares (14%), in Ukhiya-Teknaf and 1502.56 hectares (79.57%) among the thirty-four refugee camps, between 2017 and 2019. K-means classification revealed that the vegetative cover in about 82% of the refugee camps is highly vulnerable. In the end, a recommendation as to establishing the studied region as an ecological park is proposed and some guidelines discussed. This could protect and reserve forests from further deforestation in the area, and foster future discussion among policymakers and researchers.
... It later gained its application in scientific and industrial software such as data mining (Berkhin, 2006;Pham et al., 2007), feature learning (Coates & Ng, 2012), and image and signal processing (Little & Jones, 2011). Noticeable literature on its recent application in crime analysis includes Agarwal et al. (2013) and Joshi et al. (2017). The software package, Rapid Miner Tool, which implements the K-means algorithm has been applied to identify trends and patterns of homicide rates in England and Wales (Agarwal et al., 2013), while similar application of the algorithm was used to detect clusters of various crime types in three cities in New South Wales, Australia (Joshi et al., 2017). ...
... Noticeable literature on its recent application in crime analysis includes Agarwal et al. (2013) and Joshi et al. (2017). The software package, Rapid Miner Tool, which implements the K-means algorithm has been applied to identify trends and patterns of homicide rates in England and Wales (Agarwal et al., 2013), while similar application of the algorithm was used to detect clusters of various crime types in three cities in New South Wales, Australia (Joshi et al., 2017). The K-means method suffers from limitations such as specification of the number of clusters by the user and sensitivity to outlier data and toward initial cluster centroids and consequently the possibility of converging into a local minimum (Celebi et al., 2013). ...
Article
Full-text available
Hotspot analysis of spatial attributes is a persistent research field in data mining, and applying a model-based clustering procedure is increasingly becoming popular in identifying trends and patterns in datasets on crime events occurring in space. The distributions of potential crime hotspots are parameterized as arising from Gaussian multivariate distributions, whose parameters are estimated by the expectation–maximization (E-M) algorithm, an iterative process with convergence very sensitive to initializations. In this study, a model-based clustering algorithm is explored from the E-M algorithm, initialized by K-means clustering using geodesic distance classification to estimate the model parameters and compared with the classical E-M algorithm, initialized with hierarchical clustering, to identify the distributional patterns of incidence of criminal activities. These model-based clustering algorithms were demonstrated on an open-source large dataset of violent crime activities, which occurred in West Midlands County. Training the data as a Gaussian process, the study identified 12 hotspots of Gaussian mixed models as clusters of an ellipsoidal distribution varying in shape, volume, and orientation, which are mostly found in central parts of boroughs of the study area. The proposed model-based clustering of the E-M algorithm combined with K-means clustering algorithm proved efficient as being fast and stable in convergence with low probability of uncertainty by classifications, producing same classification in some cases when compared to that of the classical E-M and K-means algorithms. The combined model-based clustering techniques applied in the hotspot analysis of criminal activities in space will not only provide insight into crime prediction and resource allocation in combating strategies but also guide researchers to adopt mechanisms for modeling large spatial attributes in data mining.
... Deployment organizes the patterns for desired outcome [1]. According to [16], [19], [15], crime data analysis provides summary statistics, general and specific crime trends to LEs in timely manner to enable understanding on crime and criminal behaviour. This assists LEAs to be proactive in crime detection and prevention while managing their limited resources effectively. ...
... In [19], k-means clustering algorithm was used to perform data analysis to assist LEs in crime reduction. The goal was to extract useful information from crime dataset to enable LEs to identify and analyze crime patterns for effective crime control and prevention. ...
... Moreover, the work were done through statistical methodologies as implemented where actual crime statistical data for the state of Mississippi was implemented the using linear regression, additive regression, and decision stump algorithms using the same finite set of features, on the communities and crime dataset by the authors (McClendon & Meghanathan, 2015).Overall, the linear regression algorithm performed the best among the three selected algorithms. Similarly the tools like the rapid miner tool was used for analyzing the crime rates and anticipation of the crime rate using different data mining techniques (Agarwal et al., 2013). Their work is done for crime analysis using the K-Means Clustering algorithm. ...
Article
Full-text available
Nepal is one of the most peace-loving nations on the globe and renowned for its second to none culture in hospitality. But it is quite surprising to know the statistics released by the Nepal Police which show that the country recorded 31,462 incidents of crime in the fiscal year 2015-16, compared to 28,070 in the previous fiscal year, with an annual increase of 10.15 percent. The records include cases of rape in the majority, along with accidental homicides, drug paddling and murder. This has been a topic of prime concern for the whole nation. Recent studies indicate that walking alone at night has a moderate risk in Nepal which has contributed to a significant rise in criminal activities. If we can minimize a certain percentage of this risk, it will be possible to contribute to reduction of the criminal activities. ICT has always been addressing different social problems and we cannot deny its use in controlling criminal activities. This paper discusses an application which would be generated with data collected from secure authorship to help notify users about the density of crime in different regions. This will help the users to choose safe routes with a low crime density. In addition, it will also inform the users about the statistics related to crimes such as rape and robbery in their area. It can assist in improving the status of social security in society to some extent. The research will draw a conclusion based on data collected from various sources, including police stations and victims, to analyze the frequency and concentration of crime in a specific region. This research aims to avoid misfortune for anyone and curtails uninvited risks that occur in crime hot-spots. In addition, users will be able to locate nearby hospitals or police stations during emergencies. The listed contacts can also be notified during times of possible dangers like suspicious people following the user persistently. 
The paper provides aid to social security by efficient research along with the use of applicable technology.
... Agarwal et al. [40] and Tayal et al. [41] used the Kmeans clustering-based model to show the crime patterns based on year. It also has the same problem as Spatio based system. ...
Article
Full-text available
Forecasting crime is complex since several complicated aspects contribute to a crime.Predicting crime becomes more challenging because of the enormous number of everyday crime episodesin varied places. Though there are many established machine learning and deep learning techniques, lawenforcement officers face challenges preventing crime from occurring promptly. An efficient way of lawenforcement is required to lower the crime rates. This paper proposed an effective multi-module methodfor predicting crime using deep learning techniques. Our proposed method has two modules: Feature LevelFusion and Decision Level Fusion. The first module employs temporal-based Attention LSTM, Spatio-Temporal based Stacked Bidirectional LSTM, and Fusion model. The Fusion model leverages the prior twomodel’s training data. The temporal-based model is the source model for the transfer learning techniqueon the dataset of different cities. By applying this technique, the training time of the model is reduced.In the second module, the Spatio-Temporal based Attention-LSTM, Stacked Bidirectional LSTM, and theresult of feature-level fusion module are used to get the final prediction. The proposed architecture predictsthe next hour based on the data from the past twenty-four hours. The estimated number of crimes in anycategory for a particular location can be obtained as the output of our suggested model. It also enables lawenforcement to get insight into future crime occurrences based on category, time, and location. This workconcentrated mainly on the USA’s San Francisco and Chicago cities for the experimental analysis. For theSan Francisco and Chicago datasets, our model has the Mean Absolute Error of 0.008, 0.02, the Coefficientof Determination of 0.95 and 0.94, and the Symmetric Mean Absolute Percentage Error of 1.03% and 0.6%,respectively. The proposed model outperforms numerous other well-known models.
... For Beijing, instead of crime data, only the locations of crime hotspots' centers [37] are available. We identify the crime hotspots of Chicago using k-means clustering [38] and create hotspots of similar sizes around the Beijing hotspots' centers. We generate daily crimes around the hotspots of Beijing following the distribution of Chicago. ...
Preprint
Full-text available
Ensuring travelers' safety on roads has become a research challenge in recent years. We introduce a novel safe route planning problem and develop an efficient solution to ensure the travelers' safety on roads. Though few research attempts have been made in this regard, all of them assume that people share their sensitive travel experiences with a centralized entity for finding the safest routes, which is not ideal in practice for privacy reasons. Furthermore, existing works formulate safe route planning in ways that do not meet a traveler's need for safe travel on roads. Our approach finds the safest routes within a user-specified distance threshold based on the personalized travel experience of the knowledgeable crowd without involving any centralized computation. We develop a privacy-preserving model to quantify the travel experience of a user into personalized safety scores. Our algorithms for finding the safest route further enhance user privacy by minimizing the exposure of personalized safety scores with others. Our safe route planner can find the safest routes for individuals and groups by considering both a fixed and a set of flexible destination locations. Extensive experiments using real datasets show that our approach finds the safest route in seconds with 47% less exposure of personalized safety scores.
Article
Using qualitative interview data, this article examines the role of crime analysts in producing knowledge, as well as the challenges they face. Through the collection and organization of data outlining pertinent information about specific districts, analysts aid in the implementation of policing practices. As such, analysts regard themselves as possessing a specialized form of knowledge, which they incorporate and draw on in the outputs they produce. We conclude that analysts do not always employ rigorous, scientific methodologies, while producing their intelligence outputs, suggesting rather that they rely on their familiarity and specialized knowledge of offenders and crimes in their district. Our findings are important to evaluate and understand how ‘data-driven’ policing is occurring and identifying ways to improve and utilize crime analysis approaches within policing.
Chapter
Enforcing law and order has always been a challenge for law enforcement agencies. Exponential growth and unconditional access to Internet and mobile computing have added to this. It is in this context that police forces around the world are migrating to predictive policing. This move will aid them in informed decision-making leading to better enforcement of law and order. Using machine learning algorithms, law enforcement agencies can now extract specific patterns of crime and criminal behaviour from spatio-temporal data to determine whether a freshly committed crime relates to an existing pattern or not. With this knowledge, the administrators would be better equipped to map crime to criminals. Further, they can use the data for forecasting, which, in turn, will enable them to deploy resources judiciously. This can even aid in prevention of crimes or help efficient tackling of crimes. In this chapter, the authors investigate popular algorithms for finding patterns from large datasets and thus elucidating various machine learning approaches in crime analysis.
Article
Crime is a major issue to which our government has given top priority. Criminology is the scientific study of crime, criminal behavior, and law enforcement, and it aims to identify crime characteristics. It is one of the most important fields where the application of data mining techniques can produce important results. Crime analysis, a part of criminology, is a task that includes exploring and detecting crimes and their relationships with criminals. The high volume of crime datasets and the complexity of the relationships among these kinds of data have made criminology an appropriate field for applying data mining techniques. Identifying crime characteristics is the first step in developing further analysis. The knowledge gained from data mining approaches is a very useful tool that can help in identifying violent criminal behavior. The idea here is to capture years of human experience in computer models via data mining and by designing a simulation model.
Article
This study examines the application of cluster analysis in the accounting domain, particularly discrepancy detection in audit. Cluster analysis groups data so that points within a single group or cluster are similar to one another and distinct from points in other clusters. Clustering has been shown to be a good candidate for anomaly detection. The purpose of this study is to examine the use of clustering technology to automate fraud filtering during an audit. We use cluster analysis to help auditors focus their efforts when evaluating group life insurance claims. Claims with similar characteristics have been grouped together and small-population clusters have been flagged for further investigation. Some dominant characteristics of those clusters which have been flagged are large beneficiary payment, large interest payment amounts, and long lag between submission and payment.
Conference Paper
Narrative reports and criminal records are stored digitally across individual police departments, enabling the collection of this data to compile a nation-wide database of criminals and the crimes they committed. The compilation of this data through the last years presents new possibilities of analyzing criminal activity through time. Augmenting the traditional, more socially oriented, approach of behavioral study of these criminals and traditional statistics, data mining methods like clustering and prediction enable police forces to get a clearer picture of criminal careers. This allows officers to recognize crucial spots in changing criminal behaviour and deploy resources to prevent these careers from unfolding. Four important factors play a role in the analysis of criminal careers: crime nature, frequency, duration and severity. We describe a tool that extracts these from the database and creates digital profiles for all offenders. It compares all individuals on these profiles by a new distance measure and clusters them accordingly. This method yields a visual clustering of these criminal careers and enables the identification of classes of criminals. The proposed method allows for several user-defined parameters.
Article
about national security has increased after the 26/11 Mumbai attack. In this paper we look at the use of missing-value and clustering algorithms in a data mining approach to help predict crime patterns and speed up the process of solving crimes. We concentrate on the MV algorithm and the Apriori algorithm, with some enhancements to aid in filling missing values and identifying crime patterns. We applied these techniques to real crime data. We also use a semi-supervised learning technique for knowledge discovery from the crime records and to help increase the predictive accuracy.
Article
Cluster analysis is a useful technique for grouping data points such that points within a single group or cluster are similar, while points in different groups are distinct. Clustering, as an unsupervised learning technique, is a good candidate for fraud and anomaly detection. The purpose of this study is to examine the possibility of using clustering technology for continuous auditing. Automating fraud filtering can be of great value to preventive continuous audits. In this paper, cluster-based outliers help auditors focus their efforts when evaluating group life insurance claims. Claims with similar characteristics have been grouped together, and clusters with small populations have been flagged for further investigation. Some dominant characteristics of those clusters are, for example, a large beneficiary payment, a large interest amount, and a long lag between submission and payment. This study examines the application of cluster analysis in the accounting domain. The results provide a guideline and evidence for the potential application of this technique in the field of audit.
A. Malathi, S. Santhosh Baboo and A. Anbarasi. Department of Computer Science, Govt Arts College, Coimbatore, India; Department of Computer Science, D.G. Vaishnav College, Chennai, India, 2011. An Intelligent Analysis of a City Crime Data Using Data Mining.
K. Zakir Hussain, M. Durairaj and G. Rabia Jahani Farzana, 2012. Application of Data Mining Techniques for Analyzing Violent Criminal Behavior by Simulation Model.
[12] https://www.gov.uk/government/publications/offencesrecorded-by-the-police-in-england-and-wales-byoffence-and-police-force-area-1990-to-2011-12
A. Malathi and S. Santhosh Baboo. D.G. Vaishnav College, Chennai, 2011. Algorithmic Crime Prediction Model Based on the Analysis of Crime Clusters.
Kadhim B. Swadi Al-Janabi. Department of Computer Science, Faculty of Mathematics and Computer Science, University of Kufa, Iraq, 2011. A Proposed Framework for Analyzing Crime DataSet using Decision Tree and Simple K-means Mining Algorithms.
Aravindan Mahendiran, Michael Shuffett, Sathappan Muthiah, Rimy Malla and Gaoqiang Zhang, 2011. Forecasting Crime Incidents using Cluster Analysis and Bayesian Belief Networks.