http://www.iaeme.com/IJARET/index.asp 1450 editor@iaeme.com
International Journal of Advanced Research in Engineering and Technology (IJARET)
Volume 11, Issue 12, December 2020, pp. 1450-1470, Article ID: IJARET_11_12_136
Available online at http://www.iaeme.com/ijaret/issues.asp?JType=IJARET&VType=11&IType=12
ISSN Print: 0976-6480 and ISSN Online: 0976-6499
DOI: 10.34218/IJARET.11.12.2020.136
© IAEME Publication Scopus Indexed
DEVELOPMENT OF CRIME AND FRAUD
PREDICTION USING DATA MINING
APPROACHES
T. Chandrakala
Research Scholar, Department of Computer Applications, Dr. M.G.R. Educational &
Research Institute, Chennai, Tamilnadu, India
S. Nirmala Sugirtha Rajini
Professor, Department of Computer Applications, Dr. M.G.R. Educational & Research
Institute, Chennai, Tamilnadu, India
K. Dharmarajan
Associate Professor, Department of Information Technology, Vels Institute of Science
Technology & Advanced Studies, Pallavaram, Chennai, Tamilnadu, India
K. Selvam
Professor, Department of Computer Applications, Dr. M.G.R. Educational & Research
Institute, Chennai, Tamilnadu, India
ABSTRACT
Crime continues to be a serious threat to all communities and peoples throughout the
world, and the increasingly sophisticated technologies and procedures now being exploited
enable extremely complex criminal acts. Data mining, the process of revealing hidden
information in Big Data, is now an essential tool for investigating, reducing, and
preventing crime and is used by both government and private institutions across the globe.
The data mining methods themselves are briefly introduced to the reader; they include
social network analysis, neural networks, the naive Bayes rule, support vector machines,
decision trees, association rule mining, clustering, and entity extraction, amongst others.
The main objective of this article is to offer a concise review of data mining applications
in crime. Finally, the article reviews applications of data mining in crime, covering a
considerable amount of the research to date, presented in chronological order with a
summary table of many important data mining applications in the crime domain as a
reference directory.
Keywords: Data Mining, Regression, Naive Bayes rule, Support Vector Machine, Big
Data, Neural Network.
Cite this Article: T. Chandrakala, S. Nirmala Sugirtha Rajini, K. Dharmarajan,
K. Selvam, Development of Crime and Fraud Prediction using Data Mining
Approaches. International Journal of Advanced Research in Engineering and
Technology, 11(12), 2020, pp. 1450-1470.
http://www.iaeme.com/IJARET/issues.asp?JType=IJARET&VType=11&IType=12
1. INTRODUCTION
Crime has evolved rapidly over time, with criminals now exploiting the latest technology
not only to commit crimes but also to avoid being caught. Crime is no longer confined to
the streets and back alleys of our communities. The emergence of 'Big Data', which demands
novel approaches to the effective and accurate analysis of growing volumes of crime data,
has been a significant challenge for all law enforcement and intelligence-gathering
organizations [1]. The Internet, which connects the whole world, is also a thriving
playground for the more sophisticated criminals of the digital age. With acts of terror
such as the 9/11 terrorist attacks and the use of technology to hack into the most secure
defense databases, the need for sophisticated and effective methods of crime prevention is
increasingly important [2].
It is against this backdrop that data mining (DM) is described as a powerful tool with
great potential to help criminal investigators focus on the most important information
hidden within the 'Big Data' on crime [3]. Data mining as a tool for crime investigation is
recognized as a relatively new and highly sought-after area of research [4]. This is not
surprising, as DM itself is a relatively new and rapidly growing field; readers interested
in the historical and current definitions of DM are referred to [5]. DM is not concerned
with estimating and testing prespecified models but with discovering models through an
algorithmic search process that explores linear and nonlinear models, explicit or not.
Figure 1 Phases/Processes detected in data mining [20]
Computer data analysts have begun helping law enforcement officers and detectives speed up
the process of solving crimes [6] and predict crimes in advance. The popularity of many DM
(data mining) techniques is also driven by the increasing availability of Big Data and by
their ease of use for people who lack data analysis skills and statistical knowledge [7].
As noted by many authors, access to data plays an important role in the effectiveness of
DM in crime; however, problems arise when access is hindered by privacy concerns [2,8].
This article aims to provide a concise review of the data mining applications used for
identifying and detecting crime over the years. Naturally, this informative review will
help introduce data mining techniques to crime researchers and investigators, in addition
to supporting and encouraging future research into developing data mining for crime
analysis. To enable such use, the review has been organized so that interested parties can
easily refer to this article alone to inform themselves of the research that has been
conducted to date and the results that have been achieved [9-12]. The main contribution of
this article is twofold: it not only captures a majority of the important DM (data mining)
applications in crime by classifying them according to the type of technique used, but also
presents a brief introduction to each of the relevant DM techniques that have been
exploited for mining crime data. In addition, the review includes, in tabular format, a
summary of DM applications in crime that can act as a quick reference guide for
researchers.
The principal crime-related DM (data mining) techniques are classification, clustering,
association rule mining, and sequential pattern mining. Our research revealed that the
following DM techniques [13-15] are most commonly adopted for crime analysis: social
network analysis, neural networks, the naive Bayes rule, support vector machines,
association rule mining, decision trees (DT), clustering, and entity extraction.
2. DATA MINING APPLICATIONS IN CRIME
In this section, we offer a summary of the data mining applications used to detect and
prevent crime. In contrast to traditional DM (data mining) techniques, advanced techniques
focus on both structured and unstructured data to identify patterns [3,11]. With the rise
of Big Data, most existing systems use a combination of DM techniques to obtain more
accurate and precise extractions.
Crime analysis, as a broad research field, can involve a wide range of criminal activities,
from simple violations of civic duties to internationally organized crimes [5]. Complex
conspiracies are often difficult to unravel because information on suspects can be
geographically dispersed [16] and span long periods; detecting cybercrime can likewise be
difficult because busy network traffic and frequent online transactions generate large
amounts of data, of which only a small portion relates to illegal activities.
Table 1 is provided as a reference directory to give a clear overview of DM (data mining)
applications in crime. The table summarizes information by the DM technique used and gives
information on the software, regions, and purposes of the underlying applications. A recent
review of data mining techniques used for financial accounting fraud detection up until
2011 can be found in [17], and these are therefore not reproduced here.
Figure 2: Framework for Crime data mining
2.1. Entity Extraction
Entity extraction can be defined as the process of extracting metadata from unstructured
text documents. In 2002, a neural network-based entity extractor [18] employed named-entity
extraction techniques to identify useful entities in police narrative reports. The authors
highlighted the importance of valuable information stored as text objects in
criminal-justice data (i.e., free-text police narrative reports), which are regarded as
unstructured data. Unlike structured data, this unstructured data cannot be easily accessed
and used by investigators or analysts. Four major named-entity extraction approaches [8]
are briefly summarized and listed accordingly: rule-based [98], lexical lookup [19],
statistics-based [20], and machine learning [21-25]. Like most prevailing information
extraction systems, the entity extractor [19] combines more than one of these approaches.
More specifically, it combines lexical lookup, machine learning, and minimal hand-crafted
rules. Applying the neural network-based entity extractor to 46 reports randomly selected
from the Phoenix Police Department database for narcotic-related crimes gave precision
rates of 85.4% and 74.1% for extracting person and narcotic drug entities, with recall
rates of 77.9% and 73.4%, respectively [18].
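To make the combination of lexical lookup and hand-crafted rules, and the precision and recall figures quoted above, more concrete, the following is a minimal sketch only. It is not the extractor of [18] or [19]: the lexicon entries, the title-based person rule, the sample report, and the gold annotations are all hypothetical, and the machine learning component is omitted.

```python
import re

# Hypothetical lexicon for the lexical-lookup step (illustrative entries only).
NARCOTIC_LEXICON = {"cocaine", "heroin", "methamphetamine"}

# Hypothetical hand-crafted rule: a title followed by a capitalized surname.
PERSON_RULE = re.compile(r"\b(?:Mr|Ms|Mrs|Officer)\.? ([A-Z][a-z]+)")


def extract_entities(text):
    """Return a set of (entity_type, surface_form) pairs found in the text."""
    entities = set()
    # Lexical lookup: match every alphabetic token against the lexicon.
    for token in re.findall(r"[A-Za-z]+", text):
        if token.lower() in NARCOTIC_LEXICON:
            entities.add(("narcotic", token.lower()))
    # Rule-based step: apply the hand-crafted person pattern.
    for match in PERSON_RULE.finditer(text):
        entities.add(("person", match.group(1)))
    return entities


def precision_recall(predicted, gold):
    """Precision = TP / |predicted|, recall = TP / |gold| (0.0 if undefined)."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical police narrative and manually annotated (gold) entities.
report = "Officer Diaz stopped Mr. Lee, who was carrying heroin. Smith fled."
gold = {("person", "Diaz"), ("person", "Lee"), ("person", "Smith"),
        ("narcotic", "heroin")}

predicted = extract_entities(report)
precision, recall = precision_recall(predicted, gold)
print(sorted(predicted), precision, recall)
```

In this toy case every extracted entity is correct, but the untitled name "Smith" is missed, so precision exceeds recall, which is the same pattern as in the rates reported for [18].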
To identify which crimes may or may not have been committed by the same group of
individuals [26], a distance measure is proposed, together with a four-step paradigm and an
adaptation of the probability density function, to extract entities from a collection of
documents. It is then used to transform a high-dimensional vector table into input for a
police-operable tool. Applying the proposed distance measure, the authors used the SPSS
LexiQuest text mining tool [103] to form a table of all the entities included in each
investigation. The transformation step then compared the investigations on common entities
using the adapted distance measure and the probability density function of the normal
distribution; finally, a two-dimensional representation of the distances between all
possible pairs of investigations was obtained to help crime analysts and investigators gain
a clear overview of all ongoing investigations.
Table 1 Outline of Data Mining applications in crime.
Data Mining Methods | Function and Purpose | Key Methods or Software
Entity Extraction [99] | Extract valuable information: Suspect Descriptions, Narcotic Drug, Organization, Personal Property, Crime Type, Gender and Race, Phone, Nationality, Vehicle, Time | Investigative, Natural Language Processing, SPSS LexiQuest, Named Entity Extraction (hand-crafted rules, machine learning, rule-based, and lexical lookup)
Cluster Analysis [100] | Distinguish hot spots of a criminal offense; automatically detect associations from current criminal offense information and weight interactions to identify the most powerful association amongst all the potential pairs of crime-associated entities | Hierarchical Clustering Technique, Self-Organizing Map, Geographic Information System (GIS)
Association Rule [99,100] | Connect crime occurrences, narrow down any potential defendants, offer informative association between criminal items or entities, find crime patterns | Outlier Score Function, Dynamically Adjusted Weights, Transformed Categorical Similarities, Distributed Association Rule Mining, Apriori Algorithm
Classification Methods [101,105] | Efficient detection of specific criminal activities among large-sized data sets; categorize crime data; predict crime hot spots | Deception Theory, Hunt's Algorithm, CART, STAGE Algorithm, C4.5 Algorithm, ID3 Algorithm
Social Network [106] | Offer analyses of structures and functions | Core/periphery ratio, k-core
An Online Crime Reporting System was developed [105] to extract relevant crime information
from witness narratives and to generate additional questions based on the extracted
information. The proposed system combined natural language processing and an investigative
interview technique (based on cognitive interview principles), adopting the General
Architecture for Text Engineering (GATE) system as the information extraction tool.
Evaluation of the suspect description module showed an overall recall rate of 70% and a
precision of 100%.
The goal of applying information extraction techniques [27-30] in crime analysis is to help
investigators extract crime-related information quickly and effectively. An online
reporting system was developed that relies on an information extraction technique and
combines natural language processing with insights from the cognitive interview approach to
obtain more information from witnesses and victims. The proposed system [33] combines
information extraction and the principles of the cognitive interview with the aim of
efficiently gathering more relevant information from victims and witnesses who are too
afraid or embarrassed to report crime incidents. As the system also supports the use of
natural language, it not only makes the process of reporting crime easier but also enables
more information to be gathered. A large lexicon combined with a rule-based system is
developed to extract crime-related entities, which triggers the system to ask questions in
accordance with the principles of the cognitive interview. The proposed system shows
considerably high precision rates (94% for police narratives and 96% for witness
narratives) and recall rates (85% for police narratives and 90% for witness narratives). A
semantic inference-based natural language processing model is used [31-35] to extract crime
information from unstructured text. Moreover, this framework specifically provided for
adaptation to collaborative environments on the Web. The framework was evaluated on 100
crime-related texts on the Web; an accuracy of 87% was achieved for extracting the crime
scene, along with a 72% accuracy for extracting the type of crime.
An extended entity phrase is defined [36] as the key component for extracting an entity.
Effective performance is achieved by combining part-of-speech-based pattern matching with
ontology-driven natural language processing. Preliminary results from applying the proposed
approach to free-text law enforcement data outperformed the named-entity extraction
technique, with precision and recall rates of around 80% on average.
In addition, the entity extraction technique has been combined with part-of-speech (POS)
tagging [37] for criminal information analysis and relationship visualization. The authors
reported their approach as an efficient and effective term-relationship mining technique,
which showed strong performance even in the highly complex case of Chinese POS tagging when
applied to criminal data from Taiwan.
A rule-based Arabic named entity recognition (NER) system [38] is presented to identify and
classify named entities in Arabic crime text. It comprises three modules in the
pre-processing stage: sentence parsing, tokenization, and part-of-speech tagging; syntactic
rules, patterns, and a gazetteer are also used during the named-entity identification
stage. The proposed system achieved an overall precision of 91% and a recall of 89% when
tested on a corpus of Arabic crime reports from newspapers. More recently, named entity
recognition was combined with gazetteers and rule-based extraction [39] to extract
information about nationalities from crime data in Malaysia. NER [20] was also used with a
conditional random field (a machine learning approach used to classify a crime-location
sentence in an article) to extract crime information from online news articles. Comparisons
between two newspapers in New Zealand and comparisons of crime-location extraction across
countries (India and Australia) were conducted, with an overall accuracy of 80-90% for the
New Zealand newspapers and mostly about 75% for the cross-country cases.
3. CLASSIFICATION OF DATA MINING METHODS AND
APPLICATIONS
Each of the six DM (data mining) application classes is supported by a set of algorithmic
approaches for extracting the relevant relationships in the data [73]. These approaches
differ in the classes of problems that they can solve [41]. The classes are as follows.
Classification. Classification builds and uses a model to predict the categorical labels of
unknown objects in order to distinguish between objects of different classes. These
categorical labels are predefined, discrete, and unordered [33,71]. Classification and
prediction are the process of identifying a set of common features [108] and models that
describe and distinguish data concepts or classes. Common classification techniques include
neural networks, the naive Bayes technique, support vector machines, and decision trees
(DT). Such classification tasks are used in the detection of credit card, healthcare and
automobile insurance, and corporate fraud, among other types of fraud, and classification
is one of the most common learning models in the application of DM (data mining) to
financial fraud detection (FFD).
Clustering. Clustering is used to divide objects into conceptually meaningful groups
(clusters), with the objects in a group being similar to one another but very dissimilar to
the objects in other groups. Clustering is also known as data segmentation or partitioning
and is regarded as a variant of unsupervised classification [30,72]. According to [86],
"cluster analysis concerns the problem of decomposing or partitioning a (usually
multivariate) data set into groups so that the points in one group are similar to each
other and are as different as possible from the points in other groups." Further, each
cluster is a collection of data objects that are similar to one another within the same
cluster but dissimilar to those in other clusters. The most common clustering techniques
are the naive Bayes technique, the K-nearest neighbor, and self-organizing map techniques
[88].
Prediction. Prediction estimates numeric and ordered future values based on the patterns of
a data set [35-38]. The attribute for which the values are being predicted [39] is
continuous-valued (ordered) rather than categorical (discrete-valued and unordered). This
attribute can be referred to simply as the predicted attribute. Neural networks and
logistic model prediction are the most commonly used prediction techniques.
Outlier Detection. Outlier detection is used to measure the "distance" between data objects
in order to detect those objects that are grossly different from or inconsistent with the
rest of the data set [39-41]: "Data that appear to have different characteristics than the
rest of the population are called outliers" [43]. The problem of outlier/anomaly detection
is one of the most fundamental issues in DM (data mining) [82]. A commonly used technique
in outlier detection is the discounting learning algorithm.
Regression. Regression is a statistical method used to reveal the relationship between one
or more independent variables and a dependent variable (that is continuous-valued) [45].
Many empirical studies have used logistic regression as a benchmark [1,28,62,75,121].
Regression is typically undertaken using such mathematical methods as logistic regression
and linear regression, and it is used in the detection of credit card, crop and automobile
insurance, and corporate fraud.
Visualization. Visualization refers to the easily understandable presentation of data and
to a methodology that converts complicated data characteristics into clear patterns,
allowing users to view the complex patterns or relationships uncovered in the DM (data
mining) process [63,73]. Researchers at Bell and AT&T Laboratories [119] have exploited the
pattern detection capabilities of the human visual system by building a suite of tools and
applications that flexibly encode data using color, position, size, and other visual
characteristics. Visualization is best used to deliver complex patterns through the clear
presentation of data or functions.
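As an illustration of the clustering class described in this section, the sketch below partitions a handful of incident locations into two groups. It uses k-means, which is swapped in here as a simple partitioning method and is not one of the techniques named in the text; the coordinates, the choice of k = 2, and the deterministic initialisation from the first k points are all assumptions made for the example.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans(points, k, iterations=10):
    """Plain k-means; centroids start at the first k points (deterministic)."""
    centroids = list(points[:k])
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return centroids, clusters


# Hypothetical incident coordinates forming two well-separated groups.
incidents = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0),
             (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]

centroids, clusters = kmeans(incidents, k=2)
for centre, members in zip(centroids, clusters):
    print(centre, members)
```

Points within each resulting group are close to one another and far from the other group, which is the property the definition of clustering above asks for.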
4. CLASSIFICATION METHODS
Classification techniques are among the most important and significant DM (data mining)
techniques and are used to classify observations based on significant rules/properties
discovered in the database. When classification techniques are applied in crime data
mining, many implementations combine more than one specific type of classification
technique. The review that follows is therefore organized by technique in chronological
order, and, depending on circumstances, studies that combine several techniques are not
repeated [45-48].
4.1. Decision Trees
Decision trees (DT), a classification technique, were used [47] for detecting suspicious
emails, with a reported accuracy of over 95% in correctly classifying emails in a large
dataset. A deception theory model is applied to the email dataset, and the decision tree is
generated using the ID3 algorithm. The performance of different classification techniques
was examined to help auditors identify firms that issue fraudulent financial statements
[48]. Three models (DT, neural networks, and Bayesian belief networks) are evaluated in
identifying factors associated with fraudulent financial statements on a dataset of 76
Greek manufacturing firms. DT, neural networks, support vector machines, and a random
boosting approach are used for hotspot detection in an urban development project dataset,
which contains 1.4 million cases, 14 predictors, and a binary response variable. DT are
adopted for predicting crime reporting in ref. [50] by identifying the factors that
influence whether a crime is reported, using survey data obtained from the US Bureau of
Justice Statistics. In [51], classification techniques such as DT, neural networks, support
vector machines, and naive Bayes are applied and compared for predicting crime hotspots and
forecasting crime. An attribute reduction method is combined with the DT algorithm [51,52]
for analyzing criminal behavior. Naive Bayes, C4.5 DT, and rule-based classification are
used [52-54] for detecting automobile insurance fraud. The DT technique is specifically
compared with other data mining techniques for automobile insurance fraud detection
[53,54]. A decision tree-based classification model is used for discovering crime patterns
and predicting future trends [55]. Classification algorithms are evaluated [56] for crime
prediction, where DT is found to outperform a naive Bayes algorithm in accurately
predicting the crime category for different states in the USA. An improved decision tree
algorithm based on the Maclaurin-Priority Value First method is used [56-58] for computer
crime forensics, and this improved algorithm outperformed the commonly used ID3 algorithm
in both efficiency and accuracy. More recently, a variety of classification techniques were
evaluated for detecting insurance fraud [58], where an algorithm for determining the
relationship between classification models and different types of insurance fraud data was
proposed.
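Several of the studies above build trees with the ID3 algorithm, which chooses the attribute to split on by information gain (the reduction in label entropy after the split). The sketch below shows only that selection step, not a full tree builder, and it is not the implementation used in [47]; the email attributes and labels are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy of the labels minus the weighted entropy after splitting."""
    total = len(rows)
    remainder = 0.0
    for value in set(row[attribute] for row in rows):
        subset = [label for row, label in zip(rows, labels)
                  if row[attribute] == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """ID3 split criterion: pick the attribute with the highest gain."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))


# Hypothetical email records; attribute names are illustrative only.
rows = [
    {"has_threat_word": "yes", "sent_at_night": "yes"},
    {"has_threat_word": "yes", "sent_at_night": "no"},
    {"has_threat_word": "no", "sent_at_night": "yes"},
    {"has_threat_word": "no", "sent_at_night": "no"},
]
labels = ["suspicious", "suspicious", "normal", "normal"]

for name in ("has_threat_word", "sent_at_night"):
    print(name, information_gain(rows, labels, name))
print(best_attribute(rows, labels, ["has_threat_word", "sent_at_night"]))
```

In this toy data the first attribute separates the classes perfectly, so it would become the root of the tree; ID3 then repeats the same selection on each resulting subset.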
4.2. Neural Networks (NN)
DT (decision trees), artificial neural networks (ANN), and logistic regression are used
[59] for detecting lies in 371 statements about different types of crimes. ANNs are applied
to detect smuggling containers, and the findings show that the ANN achieves a higher
accuracy than logistic regression [60].
Support vector machines, decision trees, multilayer perceptron (MLP), probabilistic neural
networks, and genetic programming (GP) are used [61] for detecting phishing emails in a
dataset of 2500 emails. In [62], DT, ANNs, and support vector machines were compared for
distinguishing between those not charged and those charged with initial juvenile offending.
Logistic regression, boosting, SVM, neural networks, K-nearest neighbor, and a variety of
other data mining techniques are evaluated for predicting recidivism [63].
4.3. Support Vector Machines (SVM)
SVM classification has been used to identify the sources of email spamming based on the
sender's linguistic patterns and structural features [10,64]. An SVM model is successfully
used [64] to support authorship identification forensics through mining email content. SVM
is used for crime hotspot prediction [65], where it is found to outperform neural networks
and spatial auto-regression based approaches. SVM is used [66] for crime scene
classification using a database containing 400 crime scenes; the authors find that
SVM-based approaches perform better than multilayer perceptron neural networks. SVM was
used to help identify digital evidence relating to computer crimes through outlier
detection [67]. SVM was used to help crime investigators produce a set of possible
interpretations for domain-relevant concepts through kernel-based relation extraction [68].
The SVM technique was used for predicting criminal recidivism, and the results were
compared with those obtained from logistic regression and neural networks [69]. The
performance of SVM for detecting identity theft was examined using the Schonlau dataset
[70]; while one-class SVM models were found to be practical for accurately detecting
identity fraud in the general case, the results were not satisfactory where specific user
profiles fit the average user profile. Advanced credit card fraud activities are detected
using both SVM and random forests (RF) [71], where SVM is found to outperform RF. In
addition, an SVM model is used alongside logistic regression and RF for detecting credit
card fraud [72], with performance evaluated on real transaction data from international
financial institutions. The SVM technique is combined with the AdaBoost algorithm [73] for
cybercrime detection and prevention based on a Facebook dataset. It is noteworthy that RF
was successfully used [114] to describe the precursors of murder by accounting for the
relative costs of forecasting errors; the RF algorithms are also explained in [114]. A
hybrid support vector regression (SVR) model combined with ARIMA and particle swarm
optimization (PSO) was used for forecasting property crime rates in the USA and was found
to outperform forecasts from the individual models. In [75], SVR was used for forecasting
property crime rates alongside grey relational analysis. The behavior of criminals is
analyzed using SVM [120] for Malaysia, where the ratio of police to population is very low,
at 3.6 to 1000. More recently, it has been shown [112] that an SVM model can be used to
provide live predictions of crime in urban areas based on Twitter data relating to an urban
subsection within the city of San Francisco.
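For readers unfamiliar with how an SVM separates two classes, the following is a minimal sketch of a linear SVM trained by subgradient descent on the regularised hinge loss. It is not the model of any study cited above, which typically rely on kernel SVMs from established libraries; the two-dimensional feature vectors, the labels (+1 for flagged, -1 for normal), and the hyperparameters are hypothetical.

```python
def train_linear_svm(samples, labels, lam=0.01, lr=0.1, epochs=100):
    """Minimise lam/2 * |w|^2 + hinge loss by per-sample subgradient steps."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Inside the margin or misclassified: hinge term is active.
                w = [wi - lr * (lam * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Correct with enough margin: only the regulariser acts.
                w = [wi - lr * lam * wi for wi in w]
    return w, b


def predict(w, b, x):
    """Return +1 or -1 according to the side of the separating hyperplane."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Hypothetical, linearly separable feature vectors.
samples = [(2.0, 2.0), (3.0, 3.0), (2.0, 3.0),
           (-2.0, -2.0), (-3.0, -1.0), (-1.0, -3.0)]
labels = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(samples, labels)
print(w, b, [predict(w, b, x) for x in samples])
```

The regularisation weight `lam` trades margin width against training errors; on data that are not linearly separable, a kernel or a different model would be needed.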
4.4. SNA (Social Network Analysis)
Social network analysis (SNA) is a technique based on analyzing the social structure of
observations to identify significant information. Existing network concepts were reviewed
and discussed with respect to applications in the crime analysis domain, with comparisons
and a detailed explanation [78]. SNA was applied with two network structure measures,
k-core and core/periphery ratio [79], to distinguish online auction sellers with inflated
reputations from ordinary accounts. The results showed that SNA could act as an effective
indicator for identifying criminal accounts and could potentially prevent and reduce risky
transactions and online auction fraud. The structure of the Global Salafi Jihad network was
analyzed [80] with Web structural mining and SNA. Results showed that the proposed approach
can be an effective tool for identifying key members of a terrorist network and can
therefore help investigators develop efficient and effective disruptive strategies and
measures.
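The k-core measure mentioned above can be computed by repeatedly removing nodes with fewer than k neighbours until none remain. The sketch below shows that standard pruning procedure on a small undirected graph; the account names and trading links are hypothetical, and this is not the specific pipeline of [79].

```python
def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) edges."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def k_core(adjacency, k):
    """Return the nodes of the k-core: every member keeps >= k neighbours."""
    remaining = {node: set(neigh) for node, neigh in adjacency.items()}
    changed = True
    while changed:
        changed = False
        for node in list(remaining):
            if len(remaining[node]) < k:
                # Remove the node and detach it from its neighbours.
                for neighbour in remaining[node]:
                    remaining[neighbour].discard(node)
                del remaining[node]
                changed = True
    return set(remaining)


# Hypothetical auction accounts: A-D trade heavily with one another,
# while E and F are only loosely attached.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"),
         ("C", "D"), ("A", "E"), ("E", "F")]
graph = build_adjacency(edges)

print(sorted(k_core(graph, 3)))
```

A tightly interlinked group such as A-D survives pruning at k = 3, which is the kind of dense sub-structure that could be inspected further when looking for accounts that inflate each other's reputation.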
The SNA technique was used to study organized criminal groups by applying it to an outlaw
motorcycle gang operating in Canada [81]. SNA is presented as a new kind of intelligence
for homeland security in the USA, which can provide valuable knowledge on the unique
characteristics of terrorist organizations, help in understanding terrorist networks, and
form the basis for more effective countermeasures to netwar [82]. SNA was applied to the
Enron corporation's email archive [83] and proved able to analyze and catalog patterns of
communication between entities in an email collection in order to extract social standing.
The authors stated that this technique makes it possible to view a snapshot of a corporate
network and effectively determine the real relationships and connections among individuals.
A web mining and network analysis approach was used to study hate groups in online blogs
[84]. The proposed approach successfully identified and analyzed a selected set of hate
groups (which included 820 bloggers) on Xanga. SNA was performed alongside fuzzy theory with
the aim of modeling multi-modal social networks. A new fuzzy binary operation was proposed to
meet the requirements of a fuzzy aggregation operator [85]. An improved shortest-path
algorithm [86] was combined with SNA to mine the core member of a terrorist group. Two SNA
indicators, the k-core and center weights algorithms, were used to build a recommendation
system that can indicate the risk of collusion associated with an account for users of online
auction sites [87]. The results are promising, with 76% detection accuracy on real-world
'blacklist' data and sound evidence that the approach can give effective warnings several
months ahead of the official release of blacklists. Another proposed algorithm can compute
the metrics and carry out the analysis without knowing the identifiable information, with
strong security guarantees [88]. The importance of analyzing the emergence of cyber
communities in blogs with a combination of web mining and SNA techniques has also been
highlighted [89]. The authors of [90] designed a simulated email system based on personality
trait dimensions to model the traffic behavior of email account users and used this with SNA
to identify key members of a criminal group. The relevance of SNA for homeland security DM
(data mining) was illustrated with case studies based on gang/narcotics networks, US
extremist networks, Al-Qaeda member networks, and international Jihadist websites and forum
networks [91].
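To make the idea of a graph indicator concrete, the following is a minimal sketch of k-core decomposition, a measure commonly used in SNA work of this kind. It is not the implementation from any of the cited studies: the function name `core_numbers`, the account labels, and the trade list are all hypothetical, and the graph is assumed to be undirected and unweighted.

```python
from collections import defaultdict


def core_numbers(edges):
    """Return {node: core number} for an undirected graph given as edge pairs."""
    adjacency = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adjacency[u].add(v)
            adjacency[v].add(u)

    degree = {node: len(nbrs) for node, nbrs in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0
    while remaining:
        # Peel off the node with the smallest remaining degree.
        node = min(remaining, key=lambda n: (degree[n], str(n)))
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for nbr in adjacency[node]:
            if nbr in remaining:
                degree[nbr] -= 1
    return core


# Hypothetical auction accounts: A-D trade heavily with one another,
# while E and F are only loosely attached.
trades = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"),
          ("C", "D"), ("A", "E"), ("E", "F")]
cores = core_numbers(trades)
dense_group = sorted(n for n, c in cores.items() if c == max(cores.values()))
```

In this toy graph the accounts with the highest core number form the tightly connected group, which is the kind of structure a collusion-oriented indicator would surface for closer inspection.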
A framework for automated network data analysis was examined, in which the networks are
derived from different social networks by converting a transaction dataset and applying
association mining and statistical techniques. The proposed method differs from previous
work in that it incorporates the game theory concept in a multi-agent model to build P2P
applications that help the police force identify links among criminals and narrow down
potential suspects [92]. An SNA-based model was proposed [93] for targeting criminal
networks. It is based on Borgatti's key player approach, with modifications that incorporate
the relative strength of actors as well as the strength of the relationships binding network
actors. SNA [94] was used for detecting online auction fraud. The authors performed
experiments on data from the Yahoo Auctions site, with analyses of various types of online
auction accounts, and achieved promising results in terms of prediction accuracy.
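The association mining step applied to a transaction dataset can be sketched as follows. This is a minimal illustration restricted to rules between pairs of items, not the method of the cited work; the function name `pair_rules`, the `min_support` threshold, and the incident records (sets of hypothetical person identifiers appearing in the same incident) are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.5):
    """Return (antecedent, consequent, support, confidence) for frequent pairs."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of transactions containing both items
        if support >= min_support:
            # confidence(a -> b) = count(a and b) / count(a)
            rules.append((a, b, support, count / item_counts[a]))
            rules.append((b, a, support, count / item_counts[b]))
    return rules


# Hypothetical incident records: each lists the persons linked to one incident.
incidents = [["P1", "P2"], ["P1", "P2", "P3"], ["P2", "P3"], ["P1", "P2"]]
rules = pair_rules(incidents, min_support=0.5)
```

A rule with high confidence, such as P1 implying P2 here, suggests a link between two individuals that an analyst may then represent as an edge in the network.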
Carrington classified the applications of SNA in the crime domain into three areas [95] (the
influence of the personal network on ego's delinquency or crime, the influence of
neighborhood networks on crime in the neighborhood, and the organization of criminal groups
and activities) and provided detailed explanations along with significant theories and
literature. The SNA approach was used as a tool for helping crime analysts and detectives in
police forces develop interdiction strategies [96], with a model case study focusing on the
Richmond City Police Department. Detailed examples of implementations are given to show how
the proposed approach can help police in understanding the complex behavioral motivations of
offenders, strategically hot-spotting persons of interest, and developing stronger
inter-jurisdictional working relationships [96]. More recently, SNA was used to detect [97]
and analyze hidden activities in social networks. Both Min Cut-based and regression-based
algorithms are adopted and compared on datasets from several sources.
Table 2 Research on data mining methods in FFD (financial fraud detection).

| Illegal actions | Data mining application | Data mining methods | References |
|---|---|---|---|
| Credit card fraud | Classification | Bayesian Belief Network, RIPPER, CART, decision trees, Ada boost algorithm | [20] |
| Network fraud | Clustering | Discriminant analysis, Neural networks | [26] |
| | | Decision trees, neural networks, Naïve Bayes, discriminant analysis, logistic model, K-nearest neighbor | [83] |
| | | Evolutionary algorithms, Support vector machine | [21] |
| | | Self-organizing map | [66] |
| | | Hidden Markov Model | [59,76] |
| Money laundering | Classification | Network analysis | [35] |
| Crop insurance fraud | Regression | Probit model, Logistic model | [7] |
| | | Yield-switching model | [42] |
| Healthcare insurance fraud | Classification | Association rule | [82] |
| Automobile insurance fraud | | Polymorphous (M-of-N) logic | [50] |
| | | Self-organizing map | [42] |
| | Visualization | Visualization | [65] |
| | Outlier detection | Discounting learning algorithm | [86] |
| | Classification | Logistic model | [16] |
| | | Naïve Bayes | [74] |
| | | Self-organizing map | [15] |
| | | Logistic model, Bayesian belief network | [75] |
| | | Logistic model | [77] |
| | | Fuzzy logic | [76] |
| | | Bayesian belief network, Naïve Bayes, K-nearest neighbor | [55] |
| | | SVM, NN, DT, Logistic model | [5,6] |
| | | Logistic model | [10] |
| | | PRIDIT (Principal component analysis of RIDIT) | [14] |
| | | Neural networks | [75] |
| | Prediction | Logistic model | [67] |
| | | Evolutionary algorithms | [85] |
| | Regression | Logistic model | [9,60] |
| | | Probit model | [22,79] |
| Corporate fraud | Classification | UTADIS (UTilités Additives DIScriminantes) and MCDA (Multicriteria decision aid) | [44] |
| | | Bayesian belief network, decision trees, Neural networks | [63] |
| | Regression | Logistic model | [29,75,86] |
| | Prediction | Neural networks | [19] |
| | Clustering | Naïve Bayes | [41] |
Table 3 Statistics of manuscripts about the data mining methods and financial fraud

Fraud categories (column headings): Other related financial fraud; Corporate fraud;
Insurance fraud (Automobile insurance fraud, Healthcare insurance fraud, Crop insurance
fraud); Bank fraud (Credit card fraud).

| Methods | Number of manuscripts (non-empty cells only) |
|---|---|
| Yield-switching model | 1 |
| Visualization | 1 |
| UTilités Additives DIScriminantes (UTADIS) | 1 |
| Stacking variant methodology | 1 |
| Principal component analysis of RIDIT (PRIDIT) | 1 |
| Polymorphous (M-of-N) logic | 1 |
| Network analysis | |
| Multicriteria decision aid (MCDA) | 1 |
| Hidden Markov Model | 1 |
| Discounting learning algorithm | 1 |
| Association rule | 1 |
| Ada boost algorithm | 1 |
| RIPPER | 1, 1 |
| Fuzzy logic | 1, 1 |
| Discriminant analysis | 2 |
| CART | 1, 1 |
| Support vector machine | 1, 1, 1 |
| Self-organizing map | 1, 1, 1 |
| Probit model | 2, 1 |
| K-nearest neighbor | 1, 1, 1 |
| Evolutionary algorithms | 1, 1, 1 |
| Naïve Bayes | 1, 2, 1 |
| Decision trees | 2, 1, 2 |
| Bayesian belief network | 2, 2, 1 |
| Neural networks | 6, 2, 2 |
| Logistic model | 5, 9, 1, 1 |
| Total | 25, 24, 5, 3, 17 |
5. CYBERCRIME DETECTION
Another application of DM (data mining) techniques is in cybercrime, which ranges from denial
of service to gaining unauthorized access to data. The authors of [97] focused on detecting
Denial of Service (DoS) attacks using pattern recognition techniques. The researchers used
system log files to examine patterns and find outliers using clustering. When an outlier is
confirmed to be a DoS attack, a system administrator is notified. The researchers did not
elaborate on the success rate of their method.
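A rough sketch of flagging outliers in log data is given below. Because the clustering algorithm used in the cited work is not specified here, the sketch swaps in a simpler robust rule, a modified z-score based on the median absolute deviation, applied to per-source request counts. The log format, the function name `flag_suspicious_sources`, and the threshold of 3.5 are assumptions made for illustration.

```python
from collections import Counter
from statistics import median


def flag_suspicious_sources(log_lines, threshold=3.5):
    """Flag sources whose request count is an extreme high outlier."""
    # Assumed log format: "<timestamp> <source_ip> <request...>"
    counts = Counter(line.split()[1] for line in log_lines if line.strip())
    values = list(counts.values())
    med = median(values)
    mad = median([abs(v - med) for v in values])  # median absolute deviation
    if mad == 0:
        return []  # no spread in the data, so nothing can be called an outlier
    return sorted(ip for ip, count in counts.items()
                  if 0.6745 * (count - med) / mad > threshold)


# Hypothetical log: five ordinary clients and one source sending 200 requests.
log = []
for ip, n in [("10.0.0.1", 3), ("10.0.0.2", 4), ("10.0.0.3", 5),
              ("10.0.0.4", 4), ("10.0.0.5", 3), ("10.0.0.9", 200)]:
    log += [f"2020-12-01T10:00:00 {ip} GET /"] * n
suspects = flag_suspicious_sources(log)
```

In a workflow like the one described, a flagged source would still need to be confirmed as a DoS attack before the administrator is alerted.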
In an attempt to predict cybercrimes in the banking sector, such as altering data, illegal
access, or causing the system to malfunction, the authors of [98] implemented a new DM (data
mining) procedure. The researchers had to collect data from multiple sources and store it in
a crime database. The data was then preprocessed to make it ready for DM (data mining). The
researchers applied the following methods to build a prediction model:
J48,
Influenced Association Classification,
Classification,
Clustering,
Association Rule Mining.
The researchers' main contribution is the use of the J48 algorithm, which, according to
their findings, gives high accuracy on the training data.
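J48 is the Weka implementation of the C4.5 decision tree algorithm, which chooses split attributes by gain ratio. The sketch below shows only that split criterion for categorical attributes, not a full tree learner and not the cited authors' pipeline; the record fields (`channel`, `hour`, `fraud`) and their values are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def gain_ratio(rows, attribute, target):
    """Gain ratio of splitting `rows` on a categorical `attribute`."""
    n = len(rows)
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    # Expected entropy of the target after the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy([row[target] for row in rows]) - remainder
    split_info = entropy([row[attribute] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical banking records labelled as fraudulent or not.
records = [
    {"channel": "remote", "hour": "night", "fraud": "yes"},
    {"channel": "remote", "hour": "day", "fraud": "yes"},
    {"channel": "local", "hour": "night", "fraud": "no"},
    {"channel": "local", "hour": "day", "fraud": "no"},
]
best = max(["channel", "hour"], key=lambda a: gain_ratio(records, a, "fraud"))
```

Note that high accuracy on the training data, as reported above, does not by itself show how well such a tree generalizes; a held-out test set is needed for that.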
To address the lack of protection that enables cybercrime, empowering security through
real-time DM (data mining) was proposed [99]. The authors note that devices are not
intelligent enough to dynamically update their defenses in the face of new threats. They
therefore proposed a real-time pattern recognition model that connects to security devices
and allows them to upgrade their defenses when threats are present. According to the
researchers, it relies on a centralized real-time DM (data mining) engine (RTDME).
6. FRAUD DETECTION
According to the Association of Certified Fraud Examiners [100], "fraud includes any
intentional or deliberate act to deprive another of property or money by guile, deception,
or other unfair means." The financial motivation to prevent fraud is driving research toward
more sophisticated methods of detecting fraud using DM (data mining) and AI. In their
empirical study, the authors aim to contribute to health insurance fraud detection
[101-104]. The researchers narrowed the methods of medical fraud down to duplicate billing,
misusing billing codes, billing for costly tests (upcoding), and charging for services that
were not provided when submitting the claim (unbundling). Because duplicate bills and misused
codes are straightforward to spot, the researchers focused on unbundling and upcoding.
To find fraudulent claims, the researchers proposed a method for finding anomalies within
the existing data by employing statistical decision rules and k-means clustering [113]. The
researchers showed that by flagging outliers, whether in the length of stay for a particular
disease, the payment, or the speed of claim approval, the process of fraud detection can be
greatly improved.
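The combination of k-means clustering with a simple decision rule can be sketched as follows. This is an illustrative toy, not the procedure from the cited study: the claim features (length of stay in days, payment in thousands), the deterministic initialization from the first k points, and the distance threshold `max_distance` are all assumptions.

```python
from math import dist


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; returns the final centroids."""
    # Deterministic initialization for this sketch: the first k points.
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its members (keep it if empty).
        centroids = [
            tuple(sum(coord) / len(members) for coord in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids


def flag_outliers(points, centroids, max_distance):
    """Decision rule: flag points far from every cluster centroid."""
    return [p for p in points
            if min(dist(p, c) for c in centroids) > max_distance]


# Hypothetical claims: (length of stay in days, payment in thousands).
claims = [(2, 1.0), (10, 8.0), (3, 1.2), (2, 0.9), (11, 8.5), (9, 7.8), (3, 9.0)]
centroids = kmeans(claims, k=2)
flagged = flag_outliers(claims, centroids, max_distance=3.0)
```

Here the flagged claim pairs a short stay with a large payment, which fits neither the "short and cheap" nor the "long and expensive" cluster. In practice the features would be scaled first and the threshold chosen from the data.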
A more common field of fraud detection is financial fraud. DM (data mining) techniques have
increasingly been used [118-122] to detect financial fraud within an organization. However,
the researchers also saw the need for a method that detects financial statement fraud within
an entire business group. The researchers developed a big data-based fraud detection
approach for the financial statements of business groups. They used the clustering model
QGA-SVM with training data from groups known to have committed fraud from 2000 to 2014, as
well as groups that did not commit fraud [123-125].
The model's accuracy was then compared with that of other models, such as neural networks
and decision trees. The results showed that the model used by the researchers has the
highest accuracy rate.
Table 4 Various techniques used in the research reviewed

| Method | Used Methods | Reference |
|---|---|---|
| Classification | SVM, NN, DT | [101-103] |
| Association Rule | Text Mining | [104-106] |
| Cluster Analysis | K-Means Clustering, Partitioning Clustering | [107-111] |
| Social Network Analysis | - | [114-116] |
| Unclear | - | [117-119] |
7. CONCLUSION
This article is a literature review of DM (data mining) and AI applications in the field of
criminology. Four different areas of crime of varying severity were examined through the
work of researchers in the area and their use of DM (data mining) to help lower the crime
rate by detecting, reducing, or preventing crime. Various DM (data mining) methods were also
used at different stages of data collection, analysis, and model building. The use of
historical data to enable prediction in crime prevention is an area that needs more research
because of its potential for saving lives and preventing disasters.
REFERENCES
[1] Yadav, M. Timbadia, A. Yadav, R. Vishwakarma and N. Yadav, "Crime pattern detection,
analysis & prediction", 2017 International conference of Electronics, Communication and
Aerospace Technology (ICECA), 2017.
[2] X. Zhao and J. Tang, "Exploring Transfer Learning for Crime Prediction", 2017 IEEE
International Conference on Data Mining Workshops (ICDMW), 2017.
[3] H. Kang and H. Kang, "Prediction of crime occurrence from multi-modal data using deep
learning", PLOS ONE, vol. 12, no. 4, p. e0176244, 2017.
[4] S. Nath, "Crime Pattern Detection Using Data Mining", 2006 IEEE/WIC/ACM International
Conference on Web Intelligence and Intelligent Agent Technology Workshops, 2006.
[5] L. Thota, M. Alalyan, A. Khalid, F. Fathima, S. Changalasetty and M. Shiblee, "Cluster based
zoning of crime info", 2017 2nd International Conference on Anti-Cyber Crimes (ICACC),
2017.
[6] V. Ingilevich and S. Ivanov, "Crime rate prediction in the urban environment using social
factors", Procedia Computer Science, vol. 136, pp. 472-478, 2018.
[7] G. Weir, E. Dos Santos, B. Cartwright and R. Frank, "Positing the problem: enhancing
classification of extremist web content through textual analysis", 2016 IEEE International
Conference on Cybercrime and Computer Forensic (ICCCF), 2016.
[8] T. Anand, S. Padmapriya and E. Kirubakaran, "Terror tracking using advanced web mining
perspective", 2009 International Conference on Intelligent Agent & Multi-Agent Systems,
2009.
[9] M. Khan, S. Pradhan and H. Fatima, "Applying Data Mining techniques in Cyber Crimes",
2017 2nd International Conference on Anti-Cyber Crimes (ICACC), 2017.
[10] K. Lekha and S. Prakasam, "Data mining techniques in detecting and predicting cyber crimes
in banking sector", 2017 International Conference on Energy, Communication, Data
Analytics and Soft Computing (ICECDS), 2017.
[11] B. Bhatti and N. Sami, "Building adaptive defense against cybercrimes using real-time data
mining", 2015 First International Conference on Anti-Cybercrime (ICACC), 2015.
[12] "Association of Certified Fraud Examiners - Fraud 101", Acfe.com, 2018. [Online].
Available: http://www.acfe.com/fraud-101.aspx. [Accessed: 18-Oct-2018].
[13] Verma, A. Taneja and A. Arora, "Fraud detection and frequent pattern matching in insurance
claims using data mining techniques", 2017 Tenth International Conference on Contemporary
Computing (IC3), 2017.
[14] Y. Chen and C. Wu, "On Big Data-Based Fraud Detection Method for Financial Statements of
Business Groups", 2017 6th IIAI International Congress on Advanced Applied Informatics
(IIAI-AAI), 2017.
[15] S. V. Nath, Crime pattern detection using data mining, In Proceedings of the International
Conference on Web Intelligence and Intelligent Agent Technology Workshops, Hong Kong,
2006, 41 – 44.
[16] U. Fayyad and R. Uthurusamy, Evolving data into mining solutions for insights, Commun
ACM 45 (8) (2002), 28 – 31.
[17] M. Chau, J. J. Xu, and H. Chen, Extracting meaningful entities from police narrative reports,
In Proceedings of the 2002 Annual National Conference on Digital Government Research, Los
Angeles, CA, 2002, 1 – 5.
[18] J. Hosseinkhani, M. Koochakzaei, S. Keikhaee, and Y. H. Amin, Detecting suspicion
information on the web using crime Data Mining techniques, Int J Adv Comput Sci Inform
Technol 3 (1) (2014), 32 – 41.
[19] H. Chen, W. Chung, J. J. Xu, G. Wang, Y. Qin, and M. Chau, Crime data mining: a general
framework and some examples, Computer 37 (4) (2004), 50 – 56.
[20] Sharma, and P. K. Panigrahi, A review of financial accounting fraud detection based on data
mining techniques, Int J Comput Appl 39 (1) (2012), 37 – 47.
[21] T. K. Cocx, and W. A. Kosters, A distance measure for determining similarity between
criminal investigations, Adv Data Mining Appl Med Web Mining Marketing Image Signal
Mining 4065 (2006), 511 – 525.
[22] C. H. Ku, A. Iriberri, and G. Leroy, Crime information extraction from police and witness
narrative reports, In Proceedings of the IEEE Conference on Technologies for Homeland
Security, 12 – 13 May, Waltham, MA, 2008, 193 – 198
[23] C. H. Ku, A. Iriberri, and G. Leroy, Natural language processing and e-government: crime
information extraction from heterogeneous data sources, In Proceedings of the 9th Annual
International Digital Government Research Conference, Montreal, Canada, 2008, 18 – 21.
[24] V. Pinheiro, V. Furtado, T. Pequeno, and D. Nogueira, Natural language processing based on
semantic inferentialism for extracting crime information from text, In Proceedings of the IEEE
International Conference on Intelligence and Security Informatics (ISI), Vancouver, BC, 2010,
23 – 26.
[25] J. Johnson, A. Miller, L. Khan, B. Thuraisingham, and M. Kantarcioglu, Extraction of
expanded entity phrases, In Proceedings of the IEEE International Conference on Intelligence
and Security Informatics (ISI), Beijing, China, 2011, 10 – 12.
[26] K.-S. Yang, C.-C. Chen, Y.-H. Tseng, and Z.-P. Ho, Name entity extraction based on POS
tagging for criminal information analysis and relation visualization. In Proceedings of the 6th
International Conference on New Trends in Information Science and Service Science and Data
Mining (ISSDM), 23 – 25 October, Taipei, Taiwan, 2012, 785 – 789.
[27] M. Asharef, N. Omar, and M. Albared, Arabic named entity recognition in crime documents, J
Theor Appl Inform Technol 44 (1) (2012), 1 – 6.
[28] Alkaff, and M. Mohd, Extraction of nationality from crime news, J Theor Appl Inform
Technol 54 (2) (2013), 304 – 312.
[29] R. Arulanandam, B. T. R. Savarimuthu, and M. A. Purvis, Extracting crime information from
online newspaper articles, In Proceedings of the Second Australasian Web Conference, 20 –
23 January, Auckland, New Zealand, 2014, 31 – 38.
[30] T. Pang-Ning, M. Steinbach, and V. Kumar, Introduction to Data Mining, (1st ed.), London,
Pearson, 2014.
[31] T. G. Dietterich, S. Becker, and Z. Ghahramani, Advances in neural information processing
systems 14, In Proceedings of the Annual Conference on Neural Information Processing
Systems, MIT Press, 2002.
[32] M. R. Anderberg, Cluster analysis for applications (No. OAS-TR-73-9). Office of the
Assistant for Study Support Kirtland Afb N Mex, 1973.
[33] R. T. Ng and J. Han, Efficient and effective clustering methods for spatial data mining, In
Proceedings of The 20th International Conference on Very Large Data Bases, September 12 –
15, Chile, Santiago de Chile, 1994, 144 – 155.
[34] T. Grubesic, and A. Murray, Detecting Hot-spots using Cluster Analysis and GIS, In
Proceedings of the 5th Annual International Crime Mapping Research Conference, Dallas,
2001.
[35] R. V. Hauck, H. Atabakhsb, P. Ongvasith, H. Gupta, and H. Chen, Using Coplink to analyze
criminal-justice data, Computer 35 (3) (2002), 30 – 37.
[36] J. S. De Bruin, T. K. Cocx, W. A. Kosters, J. F. J. Laros, and J. N. Kok, Data Mining
approaches to criminal career analysis, In Proceedings of the 6th International Conference on
Data Mining, Hong Kong, 2006, 18 – 22.
[37] R. Adderley, M. Townsley, and J. Bond, Use of data mining techniques to model crime scene
investigator performance, Knowl Based Syst 20 (2) (2007), 170 – 176.
[38] R. Lombardo, and M. Falcone, 2011. Crime and Economic Performance. A cluster analysis of
panel data on Italy’s NUTS 3 regions. Working Paper, University of Calabria, 1 – 33.
[39] Y. Bello, and S. A. Yelwa, Complementing GIS with Cluster Analysis in Assessing Property
Crime in Katsina State, Nigeria, Am Int J Contemp Res 2 (7) 2012, 190 – 198.
[40] M. R. Keyvanpour, M. Javideh, and M. R. Ebrahimi, Detecting and investigating crime by
means of data mining: a general crime matching framework, Procedia Comput Sci 3 (2011),
872 – 880.
[41] M. Sukanya, T. Kalaikumaran, and S. Karthik, Criminals and crime hotspot detection using
data mining algorithms: clustering and classification, Int J Adv Res Comput Eng Technol 1
(10) (2012), 225 – 227.
[42] J. Agarwal, R. Nagpal, and R. Sehgal, Crime analysis using K-means clustering, Int J Comput
Appl 83 (4) (2013), 1 – 4.
[43] M. Vijayakumar, S. Karthick, and N. Prakash, The day-to-day crime forecasting analysis of
using spatial-temporal clustering simulation, Int J Sci Eng Res 4 (1) (2013), 1 – 6.
[44] R. Adderley, and P. B. Musgrove, Data mining case study: modeling the behavior of offenders
who commit serious sexual assaults, In Proceedings of the 7th International Conference on
Knowledge Discovery and Data Mining, San Francisco, CA, 2001, 26 – 29.
[45] H. Chen, J. Schroeder, R. V. Hauck, L. Ridgeway, H. Atabakhsh, H. Gupta, C. Boarman,
K. Rasmussen, and A. W. Clements, COPLINK connect: information and knowledge
management for law enforcement, Decis Support Syst 34 (3) (2002), 271 – 285.
[46] H. Chen, D. Zeng, H. Atabakhsh, W. Wyzga, and J. Schroeder, COPLINK: managing law
enforcement data and knowledge, Commun ACM 46 (1) (2003), 28 – 34.
[47] J. Schroeder, J. Xu, H. Chen, and M. Chau, Automated criminal link analysis based on domain
knowledge, J Am Soc Inform Sci Technol 58 (6) (2007), 842 – 855.
[48] M. Gupta, B. Chandra, and M. P. Gupta, Crime data mining for Indian police information
system. Computer Society of India, 2008, 389 – 397.
[49] D. E. Brown, and S. Hagen, Data association methods with applications to law enforcement,
Decis Support Syst 34 (4) (2003), 369 – 378.
[50] S. Lin, and D. E. Brown, An outlier-based data association method for linking criminal
incidents, Decis Support Syst 41 (3) (2006), 604 – 615.
[51] S. Appavu, M. Pandian, and R. Rajaram, Association rule mining for suspicious email
detection: a data mining approach, In Proceedings of the IEEE Intelligence and Security
Informatics, Brunswick, NJ, 2007, 23 – 24.
[52] V. Ng, S. Chan, D. Lau, and C. M. Ying, Incremental mining for temporal Association Rules
for crime pattern discoveries, In Proceedings of the 18th Conference on Australasian
Database, Ballarat, Victoria, 2007, 29 January- 2 February, 123 – 132.
[53] A. L. Buczak, and C. M. Gifford, Fuzzy association rule mining for community crime
pattern discovery, In Proceedings of the ACM SIGKDD Workshop on Intelligence and
Security Informatics, Washington, DC, 2010, 25 – 28.
[54] D. Usha, and K. Rameshkumar, A complete survey on application of frequent pattern mining
and association rule mining on crime pattern mining, Int J Adv Comput Sci Technol 3 (4)
(2014), 264 – 275.
[55] M. J. Zaki, Parallel and distributed association mining: a survey, IEEE Concurrency 7 (4)
(1999), 14 – 25.
[56] S. Appavu, and R. Rajaram, Suspicious e-mail detection via decision tree: a data mining
approach, J Comput Inform Technol 15 (2) (2007), 161 – 169.
[57] E. Kirkos, C. Spathis, and Y. Manolopoulos, Data Mining techniques for the detection of
fraudulent financial statements, Expert Syst Appl 32 (4) (2007), 995 – 1003.
[58] C. Wang, and P.-S. Liu, Data mining and hotspot detection in an urban development project. J
Data Sci 6 (2008), 389 – 414.
[59] J. Gutierrez, Using decision trees to predict crime reporting, In Advanced Principles for
Improving Database Design, Systems Modeling, and Software Development, K. Siau C.-H.
Yu, M. W. Ward, M. Morabito, and W. Ding, Crime forecasting using data mining techniques.
In Proceedings of the 11th International Conference on Data Mining Workshops, 11
December, Vancouver, BC, 2011, 779 – 786.
[60] W. Hui, W. Jing, and Z. Tao, Analysis of decision tree classification algorithm based on
attribute reduction and application in criminal behavior. In Proceedings of the 3rd
International Conference on Computer Research and Development, 11 – 13 March, Shanghai,
2011, 27 – 30.
[61] R. Bhowmik, Detecting auto insurance fraud by data mining techniques, J Emerg Trends
Comput Inform Sci 2 (4) (2011), 156 – 162.
[62] Gepp, J. H. Wilson, K. Kumar, and S. Bhattacharya, Comparative analysis of decision trees
vis-à-vis other computational data mining techniques in automotive insurance fraud
detection, J Data Sci 10 (2012), 537 – 561.
[63] Nasridinov, S.-Y. Ihm, and Y.-H. Park, A decision tree-based classification model for crime
prediction, Inform Technol Convergence Lect Notes Electr Eng 253 (2013), 531 – 538.
[64] R. Iqbal, M. A. A. Murad, A. Mustapha, P. H. S. Panahy, and N. Khanahmadliravi, An
experimental study of classification algorithms for crime prediction, Ind J Sci Technol 6 (3)
(2013), 4219 – 4225.
[65] Y. Wang, X. Peng, and J. Bian, Computer crime forensics based on improved decision tree
algorithm, J Netw 9 (4) (2014), 1005 – 1011.
[66] S. A. Muhammad, Fraud: the affinity of classification techniques to insurance fraud detection,
Int J Innov Technol Explor Eng 3 (11) (2014), 62 – 66.
[67] M. Fuller, D. P. Biros, and D. Delen, An investigation of data and text mining methods for
real world deception detection, Expert Syst Appl 38 (7) (2011), 8392 – 8398.
[68] C.-H. Wen, P.-Y. Hsu, C-y Wang, and T. L. Wu, Identifying smuggling vessels with artificial
neural network and logistics regression in criminal intelligence using vessels smuggling case
data, Intell Inform Database Syst Lect Notes Comput Sci 7197 (2012), 539 – 548.
[69] M. Pandey, and V. Ravi, Detecting phishing e-mails using text and data mining, In
Proceedings of the IEEE International Conference on Computational Intelligence &
Computing Research, Coimbatore, India, 2012, 18 – 20.
[70] R. P. Ang, and D. H. Goh, Predicting juvenile offending: a comparison of data mining
methods, Int J Offender Ther Comp Criminol 57 (2) (2013), 191 – 207.
[71] N. Tollenaar, and P. G. M. van der Heijden, Which method predicts recidivism best?: a
comparison of statistical, machine learning and data mining predictive models, J R Stat Soc A
176 (2) (2013), 565 – 584.
[72] O. De-Vel, A. Anderson, M. Corney, and G. Mohay, Mining e-mail content for author
identification forensics, ACM SIGMOD Rec 30 (4) (2001), 55 – 64.
[73] K. Kianmehr, and R. Alhajj, Effectiveness of support vector machine for crime hot-spots
prediction, J Appl Artif Intell 22 (5) (2008), 433 – 458.
[74] R. Abu Hana, C. Freitas, L. S. Oliveira, and F. Bortolozzi, Crime Scene Classification. In
Proceedings of the 23rd Annual ACM Symposium in Applied Computing, 16 – 20 March,
Fortaleza, Ceará, Brazil, 2008, 419 – 423.
[75] Z. Liu, D. Lin, and F. Guo, A method for locating digital evidences with outlier detection
using support vector machine, Int J Netw Secur 6 (3) (2008), 301 – 308.
[76] R. Basili, C. Giannone, C. D. Vescovo, A. Moschitti, and P. Naggar, Kernel-based relation
extraction for crime investigation. In AI*IA 2009: Emergent Perspectives in Artificial
Intelligence, 2009, 161 – 171.
[77] P. Wang, R. Mathieu, J. Ke, and H. J. Cai, Predicting criminal recidivism with support vector
machine. In Proceedings of the International Conference on Management and Service
Science, 24 – 26 August, Wuhan, 2010, 1 – 9.
[78] M. Salem, and S. Stolfo, Detecting masqueraders: a comparison of one-class bag-of-words
user behavior modeling techniques, J Wireless Mob Netw Ubiquitous Comput Dependable
Appl 1 (1) (2010), 3 – 13.
[79] Modupe, O. O. Olugbara, and S. O. Ojo, Exploring support vector machines and random
forests to detect advanced fee fraud activities on internet, In Proceedings of the IEEE 11th
Conference on Data Mining Workshops, 11 December, Vancouver, BC, 2011, 331 – 335.
[80] S. Bhattacharyya, S. Jha, K. Tharakunnel, and J. C. Westland, Data mining for credit card
fraud: a comparative study, Decis Support Syst 50 (3) (2011), 602 – 613.
[81] H. M. Deylami, and Y. P. Singh, Adaboost and SVM based cybercrime detection and
prevention model, Artif Intell Res 1 (2) (2012), 117 – 130.
[82] R. Alwee, S. M. H. Shamsuddin, and R. Sallehuddin, 2013. Hybrid support vector regression
and autoregressive integrated moving average models improved by particle swarm
optimization for property crime states forecasting with economic indicators, Sci World J, 1 –
11.
[83] R. Alwee, S. M. H. Shamsuddin, and R. Sallehuddin, 2013. Economic indicators selection for
property crime rates using Grey Relational Analysis and Support Vector Regression. In
Proceedings of the International Conference on Systems, Control, Signal Processing and
Informatics, 16 – 19 July, Rhodes Island, 178 – 185.
[84] K. Junoh, M. N. Mansor, A. M. Yaacob, F. A. Adnan, S. A. Saad, and N. M. Yazid, Crime
Detection with DCT and artificial intelligent approach, Adv Mat Res (2013), 816 – 817. 610 –
615.
[85] J. Bendler, T. Brandt, S. Wagner, and D. Neumann, Investigating crime-to-twitter
relationships in urban environments – facilitating a virtual neighborhood watch, In
Proceedings of the 22nd European Conference on Information Systems, Tel Aviv, 2014, 9 – 11.
[86] M. K. Sparrow, The application of network analysis to criminal intelligence: an assessment of
the prospects, Social Netw 13 (3) (1991), 251 – 274.
[87] J. C. Wang, and C. Q. Chiu, Detecting online auction inflated-reputation behaviors using
social network analysis, In Proceedings of the Annual Conference of the North American
Association for Computational Social and Organizational Science, Notre Dame, 2005, 26 – 28.
[88] S. Ghosh, D.L. Reilly, Credit card fraud detection with a neural-network, 27th Annual Hawaii International Conference on System Science 3 (1994) 621–630.
[89] P. Green, J.H. Choi, Assessing the risk of management fraud through neural network technology, Auditing: A Journal of Practice & Theory 16 (1) (1997) 14–28.
[90] J. Han, M. Kamber, Data Mining: Concepts and Techniques, Second ed, Morgan Kaufmann Publishers, 2006, pp. 285–464.
[91] M. Haskett, An Introduction to Data Mining, Part 2, Analyzing the Tools and Techniques, Enterprise System Journal, 2000.
[92] H. He, J. Wang, W. Graco, S. Hawkins, Application of neural networks to detection of medical fraud, Expert Systems with Applications 13 (4) (1997) 329–336.
[93] Holton, Identifying disgruntled employee systems fraud risk through text mining: a simple solution for a multi-billion dollar problem, Decision Support Systems 46 (4) (2009) 853–864.
[94] Y. Jin, R.M. Rejesus, B.B. Little, Binary choice models for rare events data: a crop insurance fraud application, Applied Economics 37 (7) (2005) 841–848.
[95] J.L. Kaminski, Insurance Fraud, OLR Research Report, http://www.cga.ct.gov/2005/rpt/2005-R-0025.htm. 2004
[96] Kirkos, C. Spathis, Y. Manolopoulos, Data mining techniques for the detection of fraudulent financial statements, Expert Systems with Applications 32 (4) (2007) 995–1003.
[97] S. Kotsiantis, E. Koumanakos, D. Tzelepis, V. Tampakas, Forecasting fraudulent financial statements using data mining, International Journal of Computational Intelligence 3 (2) (2006) 104–110.
[98]
Y. Kou, C. Lu, S. Sirwongwattana, Y. Huang, Survey of fraud detection
techniques, IEEE International Conference on Networking, Sensing & Control
(2004) 749–754.
[99]
W. Lee, S. Stolfo, Data Mining Approaches for Intrusion Detection, 7th
USENIX Security Symposium, San Antonio, TX, 1998.
[100]
J. Li, K. Huang, J. Jin, J. Shi, A survey on statistical methods for health care
fraud detection, Health Care Management Science 11 (3) (2008) 275–287.
[101]
J.W. Lin, M.I. Hwang, J.D. Becker, A fuzzy neural network for assessing the
risk of fraudulent financial reporting, Managerial Auditing Journal 18 (8)
(2003) 657–665.
[102]
J.A. Major, D.R. Riedinger, EFD: a hybrid knowledge/statistical-based system for
the detection of fraud, The Journal of Risk and Insurance 69 (3) (2002) 309–324.
[103]
S. Mitra, S.K. Pal, P. Mitra, Data mining in soft computing framework: a survey,
IEEE Transactions on Neural Networks 13 (1) (2002) 3–14.
[104]
E.W.T. Ngai, L. Xiu, D.C.K. Chau, Application of data mining techniques in
customer relationship management: a literature review and classification, Expert
Systems with Applications 36 (2) (2009) 2592–2602.
[105]
S. Owusu-Ansah, G.D. Moyes, P.B. Oyelere, P. Hay, An empirical analysis of
the likelihood of detecting fraud in New Zealand, Managerial Auditing Journal
17 (4) (2002) 192–204.
[106]
Oxford Concise English Dictionary, Tenth ed, Publisher, 1999.
[107]
J. Pathak, N. Vidyarthi, S.L. Summers, A fuzzy-based algorithm for auditors to
detect elements of fraud in settled insurance claims, Managerial Auditing Journal
20 (6) (2005) 632–644.
[108]
J. Pearl, Probabilistic Reasoning in Intelligent Systems, Morgan Kaufmann, 1988.
[109]
C. Phua, V. Lee, K. Smith, R. Gayler, A comprehensive survey of data mining-
based fraud detection research, Artificial Intelligence Review (2005) 1–14.
[110]
J. Pinquet, M. Ayuso, M. Guillén, Selection bias and auditing policies for
insurance claims, The Journal of Risk and Insurance 74 (2) (2007) 425–440.
[111]
J.T.S. Quah, M. Sriganesh, Real-time credit card fraud detection using computational
intelligence, Expert Systems with Applications 35 (4) (2008) 1721–1732.
[112]
D. Sánchez, M.A. Vila, L. Cerda, J.M. Serrano, Association rules applied to credit
card fraud detection, Expert Systems with Applications 36 (2) (2009) 3630–3640.
[113]
S. Sharma, Applied Multivariate Techniques, Wiley, New York, 1996.
[114]
M.J. Shaw, C. Subramaniam, G.W. Tan, M.E. Welge, Knowledge management
and data mining for marketing, Decision Support Systems 31 (1) (2001) 127–137.
[115]
L. Sokol, B. Garcia, J. Rodriguez, M. West, K. Johnson, Using data mining to
find fraud in HCFA health care claims, Topics in Health Information
Management 22 (1) (2001) 1–13.
[116]
C.T. Spathis, Detecting false financial statements using published data: some
evidence from Greece, Managerial Auditing Journal 17 (4) (2002) 179–191.
[117]
C.T. Spathis, M. Doumpos, C. Zopounidis, Detecting falsified financial statements:
a comparative study using multicriteria analysis and multivariate statistical
techniques, The European Accounting Review 11 (3) (2002) 509–535.
[118]
A. Srivastava, A. Kundu, S. Sural, A.K. Majumdar, Credit card fraud detection using
hidden Markov model, IEEE Transactions on Dependable and Secure Computing 5
(1) (2008) 37–48.
[119]
M. Sternberg, R.G. Reynolds, Using cultural algorithms to support
re-engineering of rule-based expert systems in dynamic performance
environments: a case study in fraud detection, IEEE Transactions on
Evolutionary Computation 1 (4) (1997) 225–243.
[120]
M. Syeda, Y. Zhang, Y. Pan, Parallel granular neural networks for fast credit card
fraud detection, IEEE International Conference on Fuzzy Systems 1 (2002)
572–577.
[121]
P. Tan, M. Steinbach, V. Kumar, Introduction to Data Mining, First ed.,
Addison-Wesley Longman Publishing Co., Inc., 2005.
[122]
S. Tennyson, P. Salsas-Forn, Claims auditing in automobile insurance: fraud
detection and deterrence objectives, The Journal of Risk and Insurance 69 (3) (2002)
289–308.
[123]
E. Turban, J.E. Aronson, T.P. Liang, R. Sharda, Decision Support and Business
Intelligence Systems, Eighth ed, Pearson Education, 2007.
[124]
S. Viaene, M. Ayuso, M. Guillén, D. Van Gheel, G. Dedene, Strategies for
detecting fraudulent claims in the automobile insurance industry, European
Journal of Operational Research 176 (1) (2007) 565–583.
[125]
S. Viaene, G. Dedene, R.A. Derrig, Auto claim fraud detection using Bayesian
learning neural networks, Expert Systems with Applications 29 (3) (2005) 653–666.
[126]
S. Viaene, R.A. Derrig, B. Baesens, G. Dedene, A comparison of state-of-the-art
classification techniques for expert automobile insurance claim fraud detection,
The Journal of Risk and Insurance 69 (3) (2002) 373–421.
[127]
S. Viaene, R.A. Derrig, G. Dedene, A case study of applying boosting naive
Bayes to claim fraud diagnosis, IEEE Transactions on Knowledge and Data
Engineering 16 (5) (2004) 612–620.
[128]
J. Wang, Y. Liao, T. Tsai, G. Hung, Technology-based financial frauds in
Taiwan: issue and approaches, IEEE Conference on Systems, Man and
Cybernetics, Oct. (2006) 1120–1124.