
Development of Crime and Fraud Prediction using Data Mining Approaches

http://www.iaeme.com/IJARET/index.asp 1450 editor@iaeme.com
International Journal of Advanced Research in Engineering and Technology (IJARET)
Volume 11, Issue 12, December 2020, pp. 1450-1470, Article ID: IJARET_11_12_136
Available online at http://www.iaeme.com/ijaret/issues.asp?JType=IJARET&VType=11&IType=12
ISSN Print: 0976-6480 and ISSN Online: 0976-6499
DOI: 10.34218/IJARET.11.12.2020.136
© IAEME Publication Scopus Indexed
DEVELOPMENT OF CRIME AND FRAUD
PREDICTION USING DATA MINING
APPROACHES
T. Chandrakala
Research Scholar, Department of Computer Applications, Dr. M.G.R. Educational &
Research Institute, Chennai, Tamilnadu, India
S. Nirmala Sugirtha Rajini
Professor, Department of Computer Applications, Dr. M.G.R. Educational & Research
Institute, Chennai, Tamilnadu, India
K. Dharmarajan
Associate Professor, Department of Information Technology, Vels Institute of Science
Technology & Advanced Studies, Pallavaram, Chennai, Tamilnadu, India
K. Selvam
Professor, Department of Computer Applications, Dr. M.G.R. Educational & Research
Institute, Chennai, Tamilnadu, India
ABSTRACT
Crime continues to be a serious threat to all communities and peoples throughout the world,
compounded by the increasingly sophisticated technology and procedures that are exploited to
enable highly complex criminal acts. Data mining, the process of revealing hidden information
from Big Data, is now an essential tool for investigating, reducing, and preventing crime and
is used by both government and private institutions across the globe. The data mining methods
themselves are briefly introduced to the reader; these include social network analysis,
neural networks, the naive Bayes rule, support vector machines, decision trees, association
rule mining, clustering, and entity extraction, amongst others. The main objective of this
article is to offer a concise review of data mining applications in crime. Finally, the
article reviews applications of data mining in crime, covering a considerable quantity of the
research to date, presented in chronological order together with a summary table of many
important data mining applications in the crime domain as a reference directory.
Keywords: Data Mining, Regression, Naive Bayes rule, Support Vector Machine, Big
Data, Neural Network.
Cite this Article: T. Chandrakala, S. Nirmala Sugirtha Rajini, K. Dharmarajan,
K. Selvam, Development of Crime and Fraud Prediction using Data Mining
Approaches. International Journal of Advanced Research in Engineering and
Technology, 11(12), 2020, pp. 1450-1470.
http://www.iaeme.com/IJARET/issues.asp?JType=IJARET&VType=11&IType=12
1. INTRODUCTION
Crime has evolved rapidly over time, with criminals now exploiting the latest technology not
only to commit crimes but also to avoid being caught. Crime is no longer restricted to the
streets and back alleys of our communities. The emergence of 'Big Data', which demands novel
approaches to the effective and accurate analysis of growing volumes of crime data, has been
a significant challenge for all law enforcement and intelligence-gathering organizations [1].
The Internet, which connects the whole world, is likewise a thriving playground for the more
sophisticated criminals of the digital age. In the wake of terrorist acts such as the 9/11
attacks and the use of technology to hack into the most secure defense databases, the need
for sophisticated and effective methods of crime prevention is increasingly critical [2].
It is against this backdrop that DM (data mining) is described as a powerful tool with great
potential to help criminal investigators focus on the most important information hidden
within the 'Big Data' on crime [3]. Data mining as a tool for crime investigation is
recognized as a relatively new and highly sought-after area of research [4]. This is not
surprising, as DM is itself a relatively new and rapidly developing field; readers
interested in the historical and current definitions of DM are referred to [5]. DM is not
concerned with the estimation and testing of prespecified models but with discovering models
through an algorithmic search process that explores linear and nonlinear models, explicit or
not.
Figure 1 Phases/Processes detected in data mining [20]
Computer data analysts have begun helping law enforcement officers and detectives speed up
the process of solving crimes [6] and predict crimes in advance. The popularity of many DM
techniques is also driven by the increasing availability of Big Data and by their ease of use
for people who lack data analysis skills and statistical knowledge [7]. As identified by many
authors, access to data plays an important role in the effectiveness of DM in crime; however,
problems arise when access is hindered by privacy concerns [2,8].
This article aims to provide a concise review of the data mining applications used for
detecting and identifying crime over the years. This informative review will help introduce
data mining techniques to crime researchers and investigators, in addition to supporting and
encouraging future research into developing data mining for crime analysis. To enable such
use, the review has been organized so that interested parties can easily refer to this
article alone to inform themselves of the research that has been conducted to date and the
results that have been achieved [9-12]. The main contribution of this article is two-fold: it
not only captures a majority of the significant DM applications in crime, classifying them by
type of technique, but also presents a brief introduction to each of the relevant DM
techniques that have been exploited for mining crime data. In addition, the review includes,
in tabular form, a summary of DM applications in crime that can act as a quick reference
guide for researchers.
The principal crime-related DM techniques are classification, clustering, association rule
mining, and sequential pattern mining. Our research revealed that the following DM techniques
[13-15] are most commonly adopted for crime analysis: social network analysis, neural
networks, the naive Bayes rule, support vector machines, association rule mining, decision
trees (DT), clustering, and entity extraction.
2. DATA MINING APPLICATIONS IN CRIME
In this section, we offer a summary of the data mining applications used to detect and
prevent crime. In contrast to traditional DM techniques, the advanced techniques focus on
both structured and unstructured data to identify patterns [3,11]. With the rise of Big Data,
most existing systems use a combination of DM techniques to obtain more accurate and precise
extractions.
As a broad research field, crime analysis can cover a wide range of criminal events, from
simple violations of civic duties to internationally organized crimes [5]. Complex
conspiracies are often hard to unravel because information on suspects can be geographically
diffuse [16] and span long periods; detecting cybercrime can likewise be difficult because
busy network traffic and frequent online transactions generate large amounts of data, of
which only a small portion relates to illegal activities.
Table 1 is provided as a reference directory to give a clear overview of DM applications in
crime. The table summarizes information by the DM technique used and gives information
relating to the software, regions, and purposes of the key applications. A recent review of
data mining techniques used for financial accounting fraud detection up until 2011 can be
found in [17], and these are therefore not duplicated here.
Figure 2: Framework for Crime data mining
2.1. Entity Extraction
Entity extraction can be defined as the process of extracting metadata from unstructured text
documents. A neural network-based entity extractor [18], which employed named-entity
extraction techniques to identify useful entities in police narrative reports, was presented
in 2002. The authors highlighted the importance of valuable information stored as text
objects in criminal-justice data (i.e., free-text police narrative reports), which is
regarded as unstructured data. In contrast to structured data, this unstructured data cannot
be easily accessed and used by investigators or analysts. Four major named-entity extraction
approaches [8] are briefly summarized and listed accordingly: rule-based [98], lexical lookup
[19], statistics-based [20], and machine learning [21-25]. Like most prevailing information
extraction systems, the entity extractor [19] combines more than one of these approaches.
More specifically, it combines lexical lookup, machine learning, and minimal hand-crafted
rules. When the neural network-based entity extractor was applied to 46 reports randomly
selected from the Phoenix Police Department database for narcotic-related crimes, the
precision rates for extracting person names and narcotic drugs were 85.4% and 74.1%, with
recall rates of 77.9% and 73.4%, respectively [18].
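As a minimal illustration of how precision and recall figures such as those above are defined, the following Python sketch compares a set of extracted entities with a hand-labeled reference set. The entity strings and the function name are hypothetical and are not taken from the study in [18].

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for extracted entities against a reference set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Precision: share of extracted entities that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of reference entities that were found.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy data: entities labeled by hand vs. entities returned by an extractor.
gold_entities = {"John Smith", "Mary Jones", "cocaine", "heroin"}
extracted_entities = {"John Smith", "cocaine", "Phoenix"}

precision, recall = precision_recall(extracted_entities, gold_entities)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Here two of the three extracted entities are correct, and two of the four reference entities are found, so precision is higher than recall.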
To identify which crimes may have been committed by the same group of individuals [26], a
distance measure is proposed together with a four-step paradigm and an adaptation of the
probability density function to extract entities from a collection of documents. It is then
used to transform a high-dimensional vector table into input for a police-operable tool. The
authors used the SPSS LexiQuest text mining tool [103] and applied the proposed distance
measure to form a table of all the entities contained in each investigation. The
transformation stage then compared the investigations on common entities using a variation of
the distance measure and the probability density function of the normal distribution;
finally, a two-dimensional representation of the distances between all possible pairs of
investigations was obtained to help crime analysts and investigators gain a clear overview of
all ongoing investigations.
Table 1 Outline of Data Mining applications in crime.
| Data Mining Methods | Function and Purpose | Key Methods or Software |
|---|---|---|
| Entity Extraction [99] | Extract valuable information: suspect descriptions, narcotic drug, organization, personal property, crime type, gender and race, phone, nationality, vehicle, time. | Investigative, Natural Language Processing, SPSS LexiQuest, Named Entity Extraction (hand-crafted rules, machine learning, rule-based, and lexical lookup) |
| Cluster Analysis [100] | Distinguish hot spots of a criminal offense; automatically detect associations from current criminal offense information and weight interactions to identify the most powerful association amongst all the potential pairs of crime-associated entities. | Hierarchical Clustering Technique, Self-Organizing Map, Geographic Information System (GIS) |
| Association Rule [99,100] | Connect crime occurrences, narrow down any potential defendants, offer informative association between criminal items or entities, find crime patterns. | Outlier Score Function, Dynamically Adjusted Weights, Transformed Categorical Similarities, Distributed Association Rule Mining, Apriori Algorithm |
| Classification Methods [101,105] | Efficient detection of specific criminal activities among large-sized data sets; categorize crime data; predict crime hot spots. | Deceptive Theory, Hunt's Algorithm, CART, STAGE Algorithm, C4.5 Algorithm, ID3 Algorithm |
| Social Network [106] | Offer analyses of structures and the functions | Ratio of Periphery/Core, K-core |
An Online Crime Reporting System was developed [105] to extract relevant crime information
from witness narratives and to generate additional questions based on the extracted
information. The proposed system combined natural language processing and an investigative
interview technique (based on cognitive interview principles), adopting the General
Architecture for Text Engineering system as the data mining tool. An evaluation of the
suspect description module gave an overall recall rate of 70% and a precision of 100%.
The goal of applying information extraction techniques [27-30] in crime analysis is to help
investigators extract crime-related information quickly and effectively. A web-based
reporting system was developed that relies on an information extraction technique and
combines natural language processing with insights from the cognitive interview approach to
obtain more information from witnesses and victims. This proposed system [33] combines
information extraction and the principles of the cognitive interview with the aim of
efficiently gathering more relevant information from victims and witnesses who are too scared
or embarrassed to report crime incidents. As this system also supports the use of natural
language, it not only makes the process of reporting crime easier but also enables the
gathering of more information. A large lexicon combined with a rule-based system is developed
to extract crime-related entities, which triggers the system to ask questions according to
the principles of the cognitive interview. The proposed system shows notably high precision
rates (94% for police narratives and 96% for witness narratives) and recall rates (85% for
police narratives and 90% for witness narratives). A semantic inference-based natural
language processing model is used [31-35] to extract crime information from unstructured
text. Moreover, this framework specifically provided for adaptation to collaborative
environments on the Web. The framework was evaluated on 100 crime-related texts on the Web,
and an accuracy rate of 87% was achieved for extracting the crime scene, alongside a 72%
accuracy rate for extracting the type of crime.
An extended entity phrase is defined [36] as the key component for extracting an entity.
Effective performance is achieved by combining part-of-speech-based template matching with
ontology-driven natural language processing. The preliminary results from applying this
proposed approach to free-text law enforcement data outperformed the named-entity extraction
technique and showed precision and recall rates of around 80% on average.
In addition, an entity extraction technique was combined with part-of-speech (POS) tagging
[37] for criminal data analysis and relationship visualization. The authors reported their
method as an efficient and effective term-relationship mining technique, which showed strong
performance even in the most complex case of Chinese POS tagging when applied to criminal
data from Taiwan.
A rule-based Arabic named entity recognition (NER) system [38] is presented to identify and
classify named entities in Arabic crime text. It includes three modules in the pre-processing
stage: sentence parsing, tokenization, and part-of-speech tagging; syntactic rules, patterns,
and gazetteers are also considered during the named entity identification stage. The proposed
system achieved an overall 91% precision rate and 89% recall rate when tested on a corpus of
Arabic crime reports from newspapers. More recently, named entity recognition was used with
gazetteers and rule-based extraction [39] to extract information concerning nationalities
from crime data in Malaysia. NER [20] was also used with a conditional random field (a
machine learning approach, used to classify the crime location sentence in an article) to
extract crime information from online news articles. Comparisons of two newspapers in New
Zealand and comparisons of crime location extraction across countries (India and Australia)
were conducted, with an overall accuracy of 80-90% for the New Zealand newspapers and mostly
about 75% for the cross-country cases.
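The combination of gazetteers (lexical lookup) with hand-crafted rules described in this section can be sketched in a few lines of Python. This is only an illustrative sketch: the gazetteer contents, the entity labels, the phone-number rule, and the sample report are all assumptions made for the example and do not reproduce any of the cited systems.

```python
import re

# Hypothetical gazetteer: entity type -> known surface forms (lexical lookup).
GAZETTEER = {
    "DRUG": {"cocaine", "heroin"},
    "VEHICLE": {"sedan", "pickup truck"},
}

# Hypothetical hand-crafted rule: phone numbers such as 555-0147 or 602-555-0147.
PHONE_RULE = re.compile(r"\b(?:\d{3}-)?\d{3}-\d{4}\b")


def extract_entities(text):
    """Return (label, surface form) pairs in order of appearance in the text."""
    found = []
    lowered = text.lower()
    for label, terms in GAZETTEER.items():
        for term in sorted(terms):
            pattern = r"\b" + re.escape(term) + r"\b"
            for match in re.finditer(pattern, lowered):
                found.append((match.start(), label, text[match.start():match.end()]))
    for match in PHONE_RULE.finditer(text):
        found.append((match.start(), "PHONE", match.group()))
    found.sort()
    return [(label, surface) for _, label, surface in found]


report = ("Suspect fled in a white sedan; officers recovered heroin "
          "and a note with 602-555-0147.")
entities = extract_entities(report)
print(entities)
```

A lookup of this kind is easy to audit but only finds terms already in the gazetteer, which is one reason the systems reviewed above combine it with machine learning.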
3. CLASSIFICATION OF DATA MINING METHODS AND
APPLICATIONS
Each of the six DM application classes is supported by a set of algorithmic approaches for
extracting the relevant relationships in the data [73]. These approaches differ in the
classes of problems that they can solve [41]. The classes are as follows.
Classification. Classification builds and uses a model to predict the categorical labels of
unknown objects in order to distinguish between objects of different classes. These
categorical labels are predefined, discrete, and unordered [33,71]. Classification and
prediction is the process of identifying a set of common features [108] and models that
describe and distinguish data classes or concepts. Common classification techniques include
neural networks, the naive Bayes technique, support vector machines, and decision trees (DT).
Such classification tasks are used in the detection of credit card, healthcare and automobile
insurance, and corporate fraud, among other types of fraud, and classification is one of the
most common learning models in the application of DM in financial fraud detection (FFD).
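To make the idea of predicting a categorical label concrete, the sketch below implements a small categorical naive Bayes classifier in plain Python, with Laplace smoothing. The transaction attributes, labels, and function names are hypothetical and chosen only for illustration; a real fraud detection study would use far richer features and data.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-feature value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    feature_values = defaultdict(set)    # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            feature_values[i].add(value)
    return class_counts, value_counts, feature_values


def predict(model, row):
    """Return the label with the highest posterior log-probability."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = math.log(count / total)
        for i, value in enumerate(row):
            # Laplace (add-one) smoothing avoids zero probabilities.
            numerator = value_counts[(i, label)][value] + 1
            denominator = count + len(feature_values[i])
            score += math.log(numerator / denominator)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical transactions: (amount band, channel) with a known label.
rows = [("high", "online"), ("high", "online"), ("low", "in_store"),
        ("low", "online"), ("low", "in_store"), ("high", "in_store")]
labels = ["fraud", "fraud", "legit", "legit", "legit", "legit"]

model = train_naive_bayes(rows, labels)
print(predict(model, ("high", "online")))
print(predict(model, ("low", "in_store")))
```

On this toy data the first transaction is labeled as fraud and the second as legitimate, because high-value online transactions appear only among the fraud examples.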
Clustering. Clustering is used to divide objects into conceptually meaningful groups
(clusters), with the objects in a group being similar to one another but very dissimilar to
the objects in other groups. Clustering is also known as data segmentation or partitioning
and is regarded as a variant of unsupervised classification [30,72]. "Cluster analysis
concerns the problem of decomposing or partitioning a data set (usually multivariate) into
groups so that the points in one group are similar to each other and are as different as
possible from the points in other groups" [86]. Further, each cluster is a collection of data
objects that are similar to one another within the same cluster but dissimilar to the objects
in other clusters. The most common clustering techniques are the naive Bayes technique, the
K-nearest neighbor, and self-organizing map techniques [88].
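The partitioning idea can be illustrated with a short Python sketch. Note that it uses k-means, which is swapped in here purely because it is a compact and familiar partitioning method; it is not one of the techniques named in the paragraph above. The incident coordinates and initial centroids are hypothetical.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain k-means with caller-supplied initial centroids."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for old, members in zip(centroids, clusters):
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return centroids, clusters


# Hypothetical incident locations forming two visibly separate groups.
incidents = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(incidents, [(0, 0), (10, 10)])
print(centroids)
print(clusters)
```

Each returned cluster could be read as a candidate hot spot, with its centroid as the representative location.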
Prediction. Prediction estimates numeric and ordered future values based on the patterns of a
data set [35-38]. The attribute for which the values are being predicted [39] is
continuous-valued (ordered) rather than categorical (discrete-valued and unordered). This
attribute can be referred to simply as the predicted attribute. Neural networks and logistic
model prediction are the most commonly used prediction techniques.
Outlier Detection. Outlier detection is used to measure the "distance" between data objects
in order to detect those objects that are grossly different from or inconsistent with the
remainder of the data set [39-41]: "Data that appear to have different characteristics than
the rest of the population are called outliers" [43]. The problem of outlier/anomaly
detection is one of the most fundamental issues in DM [82]. A commonly used technique in
outlier detection is the discounting learning algorithm.
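The notion of measuring the "distance" between data objects can be sketched as follows. This example scores each point by its mean distance to its nearest neighbours, which is a simple distance-based heuristic chosen for illustration and not the specific algorithm named in the paragraph above. The data points and the choice of k are assumptions.

```python
import math


def knn_outlier_scores(points, k=2):
    """Score each point by its mean distance to its k nearest neighbours."""
    scores = []
    for i, point in enumerate(points):
        distances = sorted(math.dist(point, other)
                           for j, other in enumerate(points) if j != i)
        scores.append(sum(distances[:k]) / k)
    return scores


# Hypothetical records: four similar observations and one that stands apart.
records = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2), (6.0, 6.0)]
scores = knn_outlier_scores(records)
most_unusual = records[scores.index(max(scores))]
print(most_unusual)
```

The record with the largest score is the one farthest from its neighbours, here the point that lies well away from the other four.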
Regression. Regression is a statistical method used to reveal the relationship between one or
more independent variables and a dependent variable (that is continuous-valued) [45]. Many
empirical studies have used logistic regression as a benchmark [1,28,62,75,121]. Regression
is typically undertaken using such mathematical methods as logistic regression and linear
regression, and it is used in the detection of credit card, crop and automobile insurance,
and corporate fraud.
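For the simplest case of one independent variable and a continuous-valued dependent variable, the relationship can be estimated by ordinary least squares. The sketch below is a minimal illustration with made-up numbers; the function name and data are assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data lying exactly on the line y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

Logistic regression, the benchmark mentioned above, follows the same idea but models the probability of a binary outcome (for example, fraudulent or not) instead of a continuous value.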
Visualization. Visualization refers to the easily understandable presentation of data and to
a methodology that converts complicated data characteristics into clear patterns to allow
users to view the complex patterns or relationships uncovered in the DM process [63,73].
Researchers at Bell and AT&T Laboratories [119] have exploited the pattern detection
capabilities of the human visual system by building a suite of tools and applications that
flexibly encode data using color, position, size, and other visual characteristics.
Visualization is best used to convey complex patterns through the clear presentation of data
or functions.
4. CLASSIFICATION METHODS
Classification techniques are used for classifying observations based on significant
rules/properties discovered from the database, and they are among the most important and
significant DM techniques. In applying classification techniques to crime DM, many
implementations combined more than one specific type of classification technique. The review
that follows is therefore organized by technique in chronological order and, depending on
circumstances, those complex combination cases are not repeated [45-48].
4.1. Decision Trees
The classification technique known as decision trees (DT) is used [47] for detecting
suspicious emails, with a reported accuracy of over 95% in correctly classifying emails in a
large dataset. The model of deception theory is applied to the dataset of emails, and the
decision tree is generated using the ID3 algorithm. The performance of various classification
techniques was investigated for auditors in detecting firms that issue fraudulent financial
statements [48]. Three models (decision trees, neural networks, and Bayesian belief networks)
are evaluated in identifying factors associated with fraudulent financial statements on
datasets from 76 Greek manufacturing firms. Decision trees, neural networks, support vector
machines, and random boosting are used for hotspot detection in an urban development project
dataset, which contains 1.4 million cases, 14 predictors, and a binary response variable.
Decision trees are adopted for predicting crime reporting in [50] by identifying the factors
that influence whether a crime is reported, using survey data obtained from the Bureau of
Justice Statistics of the USA. In [51], classification techniques such as decision trees,
neural networks, support vector machines, and naive Bayes are applied and compared for
predicting crime hotspots and forecasting crime. An attribute reduction technique is combined
with the decision tree algorithm [51,52] for analyzing criminal behavior. Naive Bayes, C4.5
decision trees, and rule-based classification are exploited [52-54] for detecting automobile
insurance fraud. The decision tree technique is specifically compared with other data mining
techniques for automobile insurance fraud detection [53,54]. A decision tree-based
classification model is used for discovering crime patterns and predicting future trends
[55]. A classification algorithm is evaluated [56] for crime prediction, where decision trees
are found to outperform a naive Bayes algorithm in accurately predicting the crime category
for different states in the USA. An improved decision tree algorithm based on the
Maclaurin-Priority Value First method is used [56-58] for computer crime forensics, and this
improved algorithm outperformed the commonly used ID3 algorithm in terms of both efficiency
and accuracy. More recently, a variety of classification techniques were evaluated for
detecting insurance fraud [58], where an algorithm for determining the relationship between
classification models and different types of insurance fraud data was proposed.
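The ID3 algorithm mentioned above chooses, at each node of the tree, the attribute with the highest information gain. The following Python sketch shows only that selection step, not a full tree builder, and the email attributes and labels are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Reduction in entropy obtained by splitting the rows on one attribute."""
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in partitions.values())
    return entropy(labels) - remainder


# Hypothetical email records with two categorical attributes.
emails = [
    {"urgent": "yes", "known_sender": "no"},
    {"urgent": "yes", "known_sender": "no"},
    {"urgent": "no", "known_sender": "no"},
    {"urgent": "no", "known_sender": "yes"},
    {"urgent": "yes", "known_sender": "yes"},
    {"urgent": "no", "known_sender": "yes"},
]
labels = ["suspicious", "suspicious", "suspicious", "normal", "normal", "normal"]

gains = {name: information_gain(emails, labels, name)
         for name in ("urgent", "known_sender")}
best_attribute = max(gains, key=gains.get)
print(best_attribute, gains)
```

In this toy data the `known_sender` attribute separates the two classes perfectly, so ID3 would split on it first.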
4.2. Neural Networks (NN)
Decision trees, artificial neural networks (ANN), and logistic regression are used [59] for
detecting lies in 371 statements about different types of crimes. ANNs are applied to
identify smuggling containers, and the findings show that the ANN achieves higher accuracy
than logistic regression [60].
Support vector machines, decision trees, multilayer perceptrons (MLP), probabilistic neural
networks, and genetic programming (GP) are used [61] for detecting phishing emails in a
dataset of 2500 emails. In [62], decision trees, ANNs, and support vector machines were
compared for distinguishing between those charged and not charged with initial juvenile
offending. Logistic regression, boosting, SVM, neural networks, K-nearest neighbor, and a
variety of other data mining techniques are evaluated for predicting recidivism [63].
4.3. Support Vector Machines (SVM)
SVM classification has been used to identify the sources of email spam based on the sender's
linguistic patterns and structural features [10,64]. An SVM model is successfully used [64]
to support authorship identification forensics through mining email content. SVM is used for
crime hotspot prediction [65], where it is found to outperform neural networks and spatial
auto-regression based approaches. SVM is exploited [66] for crime scene classification using
a database containing 400 crime scenes. The authors find that SVM-based approaches perform
better than multilayer perceptron neural networks. SVM is used to help identify digital
evidence relating to computer crimes through outlier detection [67]. SVM was used to help
crime investigators produce a set of possible interpretations for domain-relevant concepts
through kernel-based relation extraction [68]. The SVM technique was used for predicting
criminal recidivism, and the results were compared with those obtained from logistic
regression and neural networks [69]. The performance of SVM was evaluated for detecting fraud
using the Schonlau dataset [70]; while one-class SVM models were found to be practical for
accurately detecting data fraud in the general case, the results were not satisfactory where
specific user profiles matched the average user profile. Advance fee fraud activities are
detected using both SVM and random forests (RF) [71], where SVM is found to outperform RF. It
is worth noting that RF was successfully used [114] in describing the antecedents of murder
by accounting for the relative costs of forecasting errors; the RF algorithm is also
explained in [114]. In addition, an SVM model is used alongside logistic regression and RF
for detecting credit card fraud [72], where performance is evaluated on real transaction data
from international financial institutions. The SVM technique was combined with the AdaBoost
algorithm [73] for cybercrime detection and prevention based on a Facebook dataset. A hybrid
SVR (support vector regression) model in combination with ARIMA and PSO was used for
forecasting property crime rates in the USA and was found to outperform forecasts from the
individual models. In [75], SVR was used for forecasting property crime rates alongside grey
relational analysis. The behavior of criminals is studied using SVM [120] for Malaysia, where
the ratio of police to population is very low, at 3.6 to 1000. More recently, it has been
shown [112] that an SVM model can be used to provide live predictions of crime in urban areas
based on Twitter data relating to an urban subsection within the city of San Francisco.
4.4. SNA (Social Network Analysis)
Social network analysis (SNA) is a technique based on the analysis of the social structure of
observations for identifying significant information. Existing network concepts were reviewed
and examined for their applications in the crime analysis domain, with comparisons and a
detailed explanation [78]. SNA was applied using two network structure measures, k-core and
core/periphery ratio [79], to distinguish online auction sellers with inflated reputations
from ordinary accounts. The results showed that SNA could act as an effective indicator for
identifying criminal accounts and could potentially prevent and reduce risky transactions and
online auction fraud. The structure of the Global Salafi Jihad network was analyzed [80] with
Web structural mining and SNA. Results showed that the proposed technique can be an effective
tool for identifying key members in a terrorist network and can thereby help investigators
develop efficient and effective disruptive strategies and measures.
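The k-core measure referred to above (and listed as "K-core" in Table 1) can be computed by repeatedly removing nodes that have fewer than k neighbours. The Python sketch below shows this on a small hypothetical account network; the node names and edges are assumptions, and the adjacency mapping is assumed to be symmetric with every node present as a key.

```python
def k_core(adjacency, k):
    """Return the nodes of the k-core of an undirected graph."""
    graph = {node: set(neighbours) for node, neighbours in adjacency.items()}
    changed = True
    while changed:
        changed = False
        for node in list(graph):
            if len(graph[node]) < k:
                # Remove the node and detach it from its remaining neighbours.
                for neighbour in graph[node]:
                    graph[neighbour].discard(node)
                del graph[node]
                changed = True
    return set(graph)


# Hypothetical network: accounts A-D all trade with one another,
# E trades only with A and F, and F trades only with E.
network = {
    "A": {"B", "C", "D", "E"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"A", "B", "C"},
    "E": {"A", "F"},
    "F": {"E"},
}
core = k_core(network, 3)
print(sorted(core))
```

Tightly interconnected groups survive the pruning while loosely attached accounts are removed, which is the structural signal that the studies above relate to collusive behaviour.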
The SNA procedure was utilized to examine composed criminal gatherings by applying it
to a fugitive cruiser posse working in Canada [81]. SNA is introduced as another sort of
insight for country security in the USA, which can give significant information on the one of
a kind qualities of psychological militant associations, comprehend fear-based oppressor
systems and structure the reason for a more compelling countermeasure to netwar [82]. SNA
was utilized on the Enron company's email file [83], and it ends up being ready to examine
and index examples of communications between substances in an email assortment to
separate social standing. The creators expressed that this procedure becomes conceivable to
see a preview of a corporate network and successfully decide the genuine connections and
associations among people.
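The k-core measure mentioned above can be illustrated with a short sketch. It is a minimal example under stated assumptions, not the procedure of [79]: accounts with fewer than k trading partners are peeled away repeatedly until every remaining account has at least k partners inside the group. The toy graph, the account names, and the simple definition of the core/periphery ratio (core size divided by periphery size) are all hypothetical.

```python
from collections import defaultdict


def k_core(edges, k):
    """Return the set of nodes in the k-core of an undirected graph.

    The k-core is the largest subgraph in which every node has at
    least k neighbours inside the subgraph.
    """
    neighbours = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbours[u].add(v)
            neighbours[v].add(u)

    alive = set(neighbours)
    changed = True
    while changed:  # keep peeling until no node falls below degree k
        changed = False
        for node in list(alive):
            if len(neighbours[node] & alive) < k:
                alive.remove(node)
                changed = True
    return alive


# Hypothetical auction accounts: a, b, c and d trade heavily among themselves
# (a possible reputation-inflating ring), while e and f are peripheral.
toy_edges = [
    ("a", "b"), ("a", "c"), ("a", "d"),
    ("b", "c"), ("b", "d"), ("c", "d"),
    ("d", "e"), ("e", "f"),
]

core = k_core(toy_edges, 3)
all_nodes = {n for edge in toy_edges for n in edge}
periphery = all_nodes - core
# Assumed definition: number of core accounts per peripheral account.
core_periphery_ratio = len(core) / len(periphery)
```

In this toy graph the tightly connected ring survives the peeling while the two loosely attached accounts do not, which is the kind of structural signal the reviewed study relies on.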
Web mining and network analysis techniques were used to study hate groups in online
blogs [84]. The proposed approach successfully identified and analyzed a selected set
of hate groups (comprising 820 bloggers) on Xanga. SNA was performed alongside
fuzzy theory with the aim of modeling multi-modal social networks.
A new fuzzy binary operation was proposed to satisfy the requirements of a fuzzy aggregation
operator [85]. An improved shortest-path algorithm [86] was combined with SNA to mine the
core member of a terrorist group. Two SNA indicators, the k-core and center weights
algorithms, were used to build a recommendation system that can indicate to users of online
auction sites the risk of collusion associated with an account [87]. The results
are promising, with 76% detection accuracy on real-world blacklist data and
sound evidence that the approach can give effective warnings some time ahead of the
official release of blacklists. A proposed algorithm can compute network metrics and
perform the analysis without access to identifying information, offering strong
security guarantees [88]. The importance of analyzing the emergence of cyber communities in
blogs with a combination of web mining and SNA techniques was highlighted in [89]. The
authors of [90] designed a simulated email system based on personality trait dimensions to
model the traffic behavior of email account users and used it with SNA to identify key
members of a criminal group. The relevance of SNA to homeland security data mining (DM)
was shown through case studies based on gang/narcotics networks, US extremist
networks, Al-Qaeda member networks, and international Jihadist websites and forum networks [91].
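Several of the studies above rely on shortest paths to locate the central member of a group. As a hedged illustration, the sketch below uses standard closeness centrality over breadth-first-search distances; it is not the improved algorithm of [86], and the network, member names, and links are invented for the example.

```python
from collections import deque


def bfs_distances(graph, source):
    """Shortest hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def closeness(graph):
    """Closeness centrality: reachable nodes divided by total distance to them."""
    scores = {}
    for node in graph:
        dist = bfs_distances(graph, node)
        total = sum(dist.values())
        scores[node] = (len(dist) - 1) / total if total else 0.0
    return scores


# Hypothetical undirected communication network stored as adjacency lists.
network = {
    "p1": ["p2"],
    "p2": ["p1", "p3"],
    "p3": ["p2", "p4", "p5"],
    "p4": ["p3"],
    "p5": ["p3"],
}

scores = closeness(network)
key_member = max(scores, key=scores.get)  # member with the shortest average reach
```

Closeness is only one of several centrality measures that could be swapped in here; betweenness or degree would rank members differently on larger networks.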
A framework for automated network analysis was examined, in which multiple social
networks are derived by converting a transaction dataset and applying association mining
and statistical techniques. The proposed method differs from
previous work by incorporating game theory into a multi-agent model to build
P2P applications that help the police identify links among criminals and
narrow down potential suspects [92]. An SNA-based model was proposed [93] for
targeting criminal networks. It builds on Borgatti's key player approach, modified to
incorporate the relative strength of actors as well as the strength of the ties
binding network actors. SNA was also used to detect online auction
fraud [94]. The authors performed experiments on data from the Yahoo Auctions site,
analyzing different types of online auction accounts, and achieved promising
prediction accuracy.
Carrington classified the applications of SNA in the crime domain into three areas [95]: the
influence of the personal network on ego's delinquency or crime, the influence of
neighborhood networks on crime in the neighborhood, and the organization of criminal groups
and activities. Detailed explanations were provided along with significant theories and
literature. The SNA approach was used as a tool to help crime analysts and investigators
in police forces develop interdiction strategies [96], with a case study focusing on
the Richmond City Police Department. Detailed implementation examples are given to show
how the proposed approach can help police understand the complex behavioral motivations of
offenders, strategically hot-spot persons of interest, and build stronger
inter-jurisdictional working relationships [96]. More recently, SNA was used to
detect and analyze hidden activities in social networks [97]. Both min-cut-based and
regression-based algorithms were adopted and compared on datasets from several
sources.
Table 2 Research on data mining methods in financial fraud detection (FFD)

| Data mining application | Data mining methods | References |
|---|---|---|
| Classification | Bayesian belief network, RIPPER, CART, decision trees, AdaBoost algorithm | [20] |
| Clustering | Discriminant analysis, neural networks | [26] |
| | Decision trees, neural networks, Naïve Bayes, discriminant analysis, logistic model, K-nearest neighbor | [83] |
| | Evolutionary algorithms, support vector machine | [21] |
| | Self-organizing map | [66] |
| | Hidden Markov model | [59,76] |
| Classification | Network analysis | [35] |
| Regression | Probit model, logistic model | [7] |
| | Yield-switching model | [42] |
| Classification | Association rule | [82] |
| | Polymorphous (M-of-N) logic | [50] |
| | Self-organizing map | [42] |
| Visualization | Visualization | [65] |
| Outlier detection | Discounting learning algorithm | [86] |
| Classification | Logistic model | [16] |
| | Naïve Bayes | [74] |
| | Self-organizing map | [15] |
| | Logistic model, Bayesian belief network | [75] |
| | Logistic model | [77] |
| | Fuzzy logic | [76] |
| | Bayesian belief network, Naïve Bayes, K-nearest neighbor | [55] |
| | SVM, NN, DT, logistic model | [5,6] |
| | Logistic model | [10] |
| | PRIDIT (Principal component analysis of RIDIT) | [14] |
| | Neural networks | [75] |
| Prediction | Logistic model | [67] |
| | Evolutionary algorithms | [85] |
| Regression | Logistic model | [9,60] |
| | Probit model | [22,79] |
| Classification | UTADIS (UTilités Additives DIScriminantes) and MCDA (Multicriteria decision aid) | [44] |
| | Bayesian belief network, decision trees, neural networks | [63] |
| Regression | Logistic model | [29,75,86] |
| Prediction | Neural networks | [19] |
| Clustering | Naïve Bayes | [41] |
Table 3 Statistics of manuscripts about the data mining methods and financial fraud

Fraud categories (column headings): other related financial fraud, corporate fraud,
automobile insurance fraud, healthcare insurance fraud, insurance fraud, crop insurance
fraud, bank fraud, credit card fraud.

| Method | Manuscript counts (non-empty cells, in row order) |
|---|---|
| Yield-switching model | 1 |
| Visualization | 1 |
| UTilités Additives DIScriminantes (UTADIS) | 1 |
| Stacking variant methodology | 1 |
| Principal component analysis of RIDIT (PRIDIT) | 1 |
| Polymorphous (M-of-N) logic | 1 |
| Network analysis | |
| Multicriteria decision aid (MCDA) | 1 |
| Hidden Markov model | 1 |
| Discounting learning algorithm | 1 |
| Association rule | 1 |
| AdaBoost algorithm | 1 |
| RIPPER | 1, 1 |
| Fuzzy logic | 1, 1 |
| Discriminant analysis | 2 |
| CART | 1, 1 |
| Support vector machine | 1, 1, 1 |
| Self-organizing map | 1, 1, 1 |
| Probit model | 2, 1 |
| K-nearest neighbor | 1, 1, 1 |
| Evolutionary algorithms | 1, 1, 1 |
| Naïve Bayes | 1, 2, 1 |
| Decision trees | 2, 1, 2 |
| Bayesian belief network | 2, 2, 1 |
| Neural networks | 6, 2, 2 |
| Logistic model | 5, 9, 1, 1 |
| Total | 25, 24, 5, 3, 17 |
5. CYBERCRIME DETECTION
Another application of data mining (DM) techniques is in cybercrime, which ranges from
denial of service to gaining unauthorized access to data. One study focused on
detecting Denial of Service (DoS) attacks using pattern recognition techniques [97]. The
researchers used system log files to examine patterns and find outliers
using clustering. When an outlier is confirmed to be a DoS attack, a
system administrator is notified. The researchers did not report the success rate of their
method.
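The log-based approach described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of [97]: it runs a one-dimensional, two-cluster k-means over requests-per-minute counts and treats a small, high-volume cluster as candidate outliers for an administrator to review. The counts, the 20% size threshold, and the function names are hypothetical.

```python
def two_means(values, iterations=20):
    """1-D k-means with k=2, seeded at the minimum and maximum value."""
    low, high = float(min(values)), float(max(values))
    for _ in range(iterations):
        low_cluster = [v for v in values if abs(v - low) <= abs(v - high)]
        high_cluster = [v for v in values if abs(v - low) > abs(v - high)]
        if not high_cluster:  # all values identical, nothing to separate
            break
        new_low = sum(low_cluster) / len(low_cluster)
        new_high = sum(high_cluster) / len(high_cluster)
        if (new_low, new_high) == (low, high):  # centroids have converged
            break
        low, high = new_low, new_high
    return low, high


def suspicious_minutes(counts, max_share=0.2):
    """Indices of minutes in the high cluster, if that cluster is small."""
    low, high = two_means(counts)
    flagged = [i for i, c in enumerate(counts) if abs(c - low) > abs(c - high)]
    return flagged if 0 < len(flagged) <= max_share * len(counts) else []


# Hypothetical requests-per-minute counts parsed from a server log.
requests_per_minute = [52, 48, 50, 47, 55, 51, 49, 950, 53, 50, 1020, 46]
alerts = suspicious_minutes(requests_per_minute)
```

A flagged minute is only a candidate: as in the reviewed study, confirmation that it is a DoS attack is a separate step before the administrator is notified.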
In an attempt to predict cybercrimes in the banking sector, such as modifying
data, illegal access, or causing the system to malfunction, researchers implemented a new
data mining (DM) procedure [98]. They collected data from multiple
sources and stored it in a crime database. The data was then preprocessed to be
ready for data mining. The researchers applied the following techniques to build
a prediction rule:
J48,
Influenced Association Classification,
Classification,
Clustering,
Association Rule Mining.
The researchers' main contribution is the use of the J48 algorithm, which according
to their findings achieves high accuracy on the training data.
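J48 is the Weka implementation of the C4.5 decision tree learner, which chooses each split by gain ratio (information gain divided by split information). The sketch below shows only that selection criterion, as a hedged illustration rather than the pipeline of [98]; the records, attribute names, and values are invented.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(records, attribute, target):
    """Information gain of `attribute` divided by its split information."""
    labels = [r[target] for r in records]
    groups = {}
    for r in records:
        groups.setdefault(r[attribute], []).append(r[target])
    weights = [len(g) / len(records) for g in groups.values()]
    remainder = sum(w * entropy(g) for w, g in zip(weights, groups.values()))
    split_info = -sum(w * log2(w) for w in weights)
    gain = entropy(labels) - remainder
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical banking events; attribute names and values are invented.
events = [
    {"access": "remote", "hours": "night", "crime": "yes"},
    {"access": "remote", "hours": "day", "crime": "yes"},
    {"access": "local", "hours": "night", "crime": "no"},
    {"access": "local", "hours": "day", "crime": "no"},
]

ratios = {a: gain_ratio(events, a, "crime") for a in ("access", "hours")}
best_split = max(ratios, key=ratios.get)  # attribute a C4.5-style tree would test first
```

High accuracy on training data, as reported in the study, does not by itself show how such a tree performs on unseen cases; a held-out test set would be needed for that.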
To address the lack of protection that enables cybercrime, improved
security through real-time data mining (DM) was proposed [99]. The authors note that devices
are not intelligent enough to dynamically update their protection in the face of new threats.
They therefore proposed a real-time pattern recognition model that communicates with
security devices and allows them to upgrade their defenses when threats are present.
According to the researchers, it relies on a centralized real-time data mining engine (RTDME).
6. FRAUD DETECTION
According to the Association of Certified Fraud Examiners [100], "fraud includes
any intentional or deliberate act to deprive another of property or money by guile, deception,
or other unfair means." The financial incentive to prevent fraud is driving
research toward more sophisticated methods of detecting fraud using data mining (DM)
and AI. In their empirical study, the authors aim to contribute to health
insurance fraud detection [101-104]. The researchers narrowed the methods of medical
fraud to duplicate billing, misuse of billing codes, billing for more expensive tests
(upcoding), and charging for services that were not provided as presented in the claim
(unbundling). Because of the straightforward nature of duplicate billing and code misuse, the
researchers focused on unbundling and upcoding.
To find fraudulent claims, the researchers proposed a method for finding anomalies
within existing data by employing statistical decision rules and k-means
clustering [113]. They showed that flagging outliers, whether in the
length of stay for a particular disease, the payment amount, or the speed of claim approval,
can greatly improve the fraud detection process.
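As an illustration of the statistical decision rule side of this idea, the sketch below flags claims whose length of stay is unusually long for their diagnosis. The rule used here (Tukey's fence, Q3 + 1.5 × IQR) is an assumption, since the source does not state which rule [113] applied; the k-means step is not shown, and the claim records and field names are hypothetical.

```python
from collections import defaultdict
from statistics import quantiles


def flag_long_stays(claims, multiplier=1.5):
    """Flag claims whose length of stay exceeds Q3 + multiplier * IQR
    among claims with the same diagnosis."""
    stays_by_diagnosis = defaultdict(list)
    for claim in claims:
        stays_by_diagnosis[claim["diagnosis"]].append(claim["stay_days"])

    fences = {}
    for diagnosis, stays in stays_by_diagnosis.items():
        if len(stays) < 4:  # too few claims for a meaningful quartile
            continue
        q1, _, q3 = quantiles(stays, n=4)
        fences[diagnosis] = q3 + multiplier * (q3 - q1)

    return [
        c["claim_id"]
        for c in claims
        if c["diagnosis"] in fences and c["stay_days"] > fences[c["diagnosis"]]
    ]


# Hypothetical claims: (claim_id, diagnosis, stay_days).
raw = [
    (1, "flu", 2), (2, "flu", 3), (3, "flu", 3), (4, "flu", 4), (5, "flu", 3),
    (6, "flu", 2), (7, "flu", 4), (8, "flu", 3), (9, "flu", 21),
    (10, "fracture", 10), (11, "fracture", 12), (12, "fracture", 11),
    (13, "fracture", 13), (14, "fracture", 12),
]
claims = [{"claim_id": i, "diagnosis": d, "stay_days": s} for i, d, s in raw]
flagged = flag_long_stays(claims)
```

The same per-group fence could be applied to payment amount or approval speed, the other two signals the study mentions; a flagged claim is a lead for review, not proof of fraud.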
A more common area of fraud detection is financial
fraud. The growing use of data mining (DM) techniques to detect financial fraud within
an organization has been noted [118-122]. However, the researchers
also saw the need for a method that can detect financial statement
fraud within an entire business group. They developed a big
data-based fraud detection approach for the financial statements of business groups.
They used the QGA-SVM model, trained on data from
groups known to have committed fraud between 2000 and 2014, together with
groups that did not commit fraud [123-125].
The classification accuracy was then compared with that of other models, such as
neural networks and decision trees. The results showed that the model used by the
researchers had the highest accuracy.
Table 4 Various techniques used by the research reviewed

| Method | Used Methods | Reference |
|---|---|---|
| Classification | SVM, NN, DT | [101-103] |
| Association Rule | Text Mining | [104-106] |
| Cluster Analysis | K-Means Clustering, Partitioning Clustering | [107-111] |
| Social Network Analysis | - | [114-116] |
| Unclear | - | [117-119] |
7. CONCLUSION
This article is a literature review of data mining (DM) and AI applications in the field of
criminology. Four distinct areas of crime of varying severity were examined
through the work of researchers in the field and their use of data mining to help
lower crime rates by detecting, reducing, or preventing crime.
Different data mining methods were also used at different stages of
data collection, analysis, and model building. The use of historical
data to enable prediction in crime prevention is an area that needs more
research because of its potential to save lives and prevent tragedies.
REFERENCES
[1] Yadav, M. Timbadia, A. Yadav, R. Vishwakarma and N. Yadav, "Crime pattern detection,
analysis & prediction", 2017 International conference of Electronics, Communication and
Aerospace Technology (ICECA), 2017.
[2] X. Zhao and J. Tang, "Exploring Transfer Learning for Crime Prediction", 2017 IEEE
International Conference on Data Mining Workshops (ICDMW), 2017.
[3] H. Kang and H. Kang, "Prediction of crime occurrence from multi-modal data using deep
learning", PLOS ONE, vol. 12, no. 4, p. e0176244, 2017.
[4] S. Nath, "Crime Pattern Detection Using Data Mining", 2006 IEEE/WIC/ACM International
Conference on Web Intelligence and Intelligent Agent Technology Workshops, 2006.
[5] L. Thota, M. Alalyan, A. Khalid, F. Fathima, S. Changalasetty and M. Shiblee, "Cluster based
zoning of crime info", 2017 2nd International Conference on Anti-Cyber Crimes (ICACC),
2017.
[6] V. Ingilevich and S. Ivanov, "Crime rate prediction in the urban environment using social
factors", Procedia Computer Science, vol. 136, pp. 472-478, 2018.
[7] G. Weir, E. Dos Santos, B. Cartwright and R. Frank, "Positing the problem: enhancing
classification of extremist web content through textual analysis", 2016 IEEE International
Conference on Cybercrime and Computer Forensic (ICCCF), 2016.
[8] T. Anand, S. Padmapriya and E. Kirubakaran, "Terror tracking using advanced web mining
perspective", 2009 International Conference on Intelligent Agent & Multi-Agent Systems,
2009.
[9] M. Khan, S. Pradhan and H. Fatima, "Applying Data Mining techniques in Cyber Crimes",
2017 2nd International Conference on Anti-Cyber Crimes (ICACC), 2017.
[10] K. Lekha and S. Prakasam, "Data mining techniques in detecting and predicting cyber crimes
in banking sector", 2017 International Conference on Energy, Communication, Data
Analytics and Soft Computing (ICECDS), 2017.
[11] B. Bhatti and N. Sami, "Building adaptive defense against cybercrimes using real-time data
mining", 2015 First International Conference on Anti-Cybercrime (ICACC), 2015.
[12] "Association of Certified Fraud Examiners - Fraud 101", Acfe.com, 2018. [Online].
Available: http://www.acfe.com/fraud-101.aspx. [Accessed: 18-Oct-2018].
[13] Verma, A. Taneja and A. Arora, "Fraud detection and frequent pattern matching in insurance
claims using data mining techniques", 2017 Tenth International Conference on Contemporary
Computing (IC3), 2017.
[14] Y. Chen and C. Wu, "On Big Data-Based Fraud Detection Method for Financial Statements of
Business Groups", 2017 6th IIAI International Congress on Advanced Applied Informatics
(IIAI-AAI), 2017.
[15] S. V. Nath, Crime pattern detection using data mining, In Proceedings of the International
Conference on Web Intelligence and Intelligent Agent Technology Workshops, Hong Kong,
2006, 41-44.
[16] U. Fayyad and R. Uthurusamy, Evolving data into mining solutions for insights, Commun
ACM 45 (8) (2002), 28-31.
[17] M. Chau, J. J. Xu, and H. Chen, Extracting meaningful entities from police narrative reports,
In Proceedings of the 2002 Annual National Conference on Digital Government Research, Los
Angeles, CA, 2002, 1-5.
[18] J. Hosseinkhani, M. Koochakzaei, S. Keikhaee, and Y. H. Amin, Detecting suspicion
information on the web using crime Data Mining techniques, Int J Adv Comput Sci Inform
Technol 3 (1) (2014), 32-41.
[19] H. Chen, W. Chung, J. J. Xu, G. Wang, Y. Qin, and M. Chau, Crime data mining: a general
framework and some examples, Computer 37 (4) (2004), 50-56.
[20] Sharma, and P. K. Panigrahi, A review of financial accounting fraud detection based on data
mining techniques, Int J Comput Appl 39 (1) (2012), 37-47.
[21] T. K. Cocx, and W. A. Kosters, A distance measure for determining similarity between
criminal investigations, Adv Data Mining Appl Med Web Mining Marketing Image Signal
Mining 4065 (2006), 511-525.
[22] C. H. Ku, A. Iriberri, and G. Leroy, Crime information extraction from police and witness
narrative reports, In Proceedings of the IEEE Conference on Technologies for Homeland
Security, 12-13 May, Waltham, MA, 2008, 193-198.
[23] C. H. Ku, A. Iriberri, and G. Leroy, Natural language processing and e-government: crime
information extraction from heterogeneous data sources, In Proceedings of the 9th Annual
International Digital Government Research Conference, Montreal, Canada, 2008, 18-21.
[24] V. Pinheiro, V. Furtado, T. Pequeno, and D. Nogueira, Natural language processing based on
semantic inferentialism for extracting crime information from text, In Proceedings of the IEEE
International Conference on Intelligence and Security Informatics (ISI), Vancouver, BC, 2010,
23-26.
[25] J. Johnson, A. Miller, L. Khan, B. Thuraisingham, and M. Kantarcioglu, Extraction of
expanded entity phrases, In Proceedings of the IEEE International Conference on Intelligence
and Security Informatics (ISI), Beijing, China, 2011, 10-12.
[26] K.-S. Yang, C.-C. Chen, Y.-H. Tseng, and Z.-P. Ho, Name entity extraction based on POS
tagging for criminal information analysis and relation visualization. In Proceedings of the 6th
International Conference on New Trends in Information Science and Service Science and Data
Mining (ISSDM), 23-25 October, Taipei, Taiwan, 2012, 785-789.
[27] M. Asharef, N. Omar, and M. Albared, Arabic named entity recognition in crime documents, J
Theor Appl Inform Technol 44 (1) (2012), 1-6.
[28] Alkaff, and M. Mohd, Extraction of nationality from crime news, J Theor Appl Inform
Technol 54 (2) (2013), 304-312.
[29] R. Arulanandam, B. T. R. Savarimuthu, and M. A. Purvis, Extracting crime information from
online newspaper articles, In Proceedings of the Second Australasian Web Conference,
20-23 January, Auckland, New Zealand, 2014, 31-38.
[30] T. Pang-Ning, M. Steinbach, and V. Kumar, Introduction to Data Mining, (1st ed.), London,
Pearson, 2014.
[31] T. G. Dietterich, S. Becker, and Z. Ghahramani, Advances in neural information processing
systems 14, In Proceedings of the Annual Conference on Neural Information Processing
Systems, MIT Press, 2002.
[32] M. R. Anderberg, Cluster analysis for applications (No. OAS-TR-73-9). Office of the
Assistant for Study Support Kirtland AFB N Mex, 1973.
[33] R. T. Ng and J. Han, Efficient and effective clustering methods for spatial data mining, In
Proceedings of the 20th International Conference on Very Large Data Bases, September
12-15, Santiago de Chile, Chile, 1994, 144-155.
[34] T. Grubesic, and A. Murray, Detecting Hot-spots using Cluster Analysis and GIS, In
Proceedings of the 5th Annual International Crime Mapping Research Conference, Dallas,
2001.
[35] R. V. Hauck, H. Atabakhsh, P. Ongvasith, H. Gupta, and H. Chen, Using Coplink to analyze
criminal-justice data, Computer 35 (3) (2002), 30-37.
[36] J. S. De Bruin, T. K. Cocx, W. A. Kosters, J. F. J. Laros, and J. N. Kok, Data Mining
approaches to criminal career analysis, In Proceedings of the 6th International Conference on
Data Mining, Hong Kong, 2006, 18-22.
[37] R. Adderley, M. Townsley, and J. Bond, Use of data mining techniques to model crime scene
investigator performance, Knowl Based Syst 20 (2) (2007), 170-176.
[38] R. Lombardo, and M. Falcone, 2011. Crime and Economic Performance. A cluster analysis of
panel data on Italy’s NUTS 3 regions. Working Paper, University of Calabria, 1-33.
[39] Y. Bello, and S. A. Yelwa, Complementing GIS with Cluster Analysis in Assessing Property
Crime in Katsina State, Nigeria, Am Int J Contemp Res 2 (7) (2012), 190-198.
[40] M. R. Keyvanpour, M. Javideh, and M. R. Ebrahimi, Detecting and investigating crime by
means of data mining: a general crime matching framework, Procedia Comput Sci 3 (2011),
872-880.
[41] M. Sukanya, T. Kalaikumaran, and S. Karthik, Criminals and crime hotspot detection using
data mining algorithms: clustering and classification, Int J Adv Res Comput Eng Technol 1
(10) (2012), 225-227.
[42] J. Agarwal, R. Nagpal, and R. Sehgal, Crime analysis using K-means clustering, Int J Comput
Appl 83 (4) (2013), 1-4.
[43] M. Vijayakumar, S. Karthick, and N. Prakash, The day-to-day crime forecasting analysis of
using spatial-temporal clustering simulation, Int J Sci Eng Res 4 (1) (2013), 1-6.
[44] R. Adderley, and P. B. Musgrove, Data mining case study: modeling the behavior of offenders
who commit serious sexual assaults, In Proceedings of the 7th International Conference on
Knowledge Discovery and Data Mining, San Francisco, CA, 2001, 26-29.
[45] H. Chen, J. Schroeder, R. V. Hauck, L. Ridgeway, H. Atabakhsh, H. Gupta, C. Boarman,
K. Rasmussen, and A. W. Clements, COPLINK connect: information and knowledge
management for law enforcement, Decis Support Syst 34 (3) (2002), 271-285.
[46] H. Chen, D. Zeng, H. Atabakhsh, W. Wyzga, and J. Schroeder, COPLINK: managing law
enforcement data and knowledge, Commun ACM 46 (1) (2003), 28-34.
[47] J. Schroeder, J. Xu, H. Chen, and M. Chau, Automated criminal link analysis based on domain
knowledge, J Am Soc Inform Sci Technol 58 (6) (2007), 842-855.
[48] M. Gupta, B. Chandra, and M. P. Gupta, Crime data mining for Indian police information
system. Computer Society of India, 2008, 389-397.
[49] D. E. Brown, and S. Hagen, Data association methods with applications to law enforcement,
Decis Support Syst 34 (4) (2003), 369-378.
[50] S. Lin, and D. E. Brown, An outlier-based data association method for linking criminal
incidents, Decis Support Syst 41 (3) (2006), 604-615.
[51] S. Appavu, M. Pandian, and R. Rajaram, Association rule mining for suspicious email
detection: a data mining approach, In Proceedings of the IEEE Intelligence and Security
Informatics, Brunswick, NJ, 2007, 23-24.
[52] V. Ng, S. Chan, D. Lau, and C. M. Ying, Incremental mining for temporal Association Rules
for crime pattern discoveries, In Proceedings of the 18th Conference on Australasian
Database, Ballarat, Victoria, 2007, 29 January-2 February, 123-132.
[53] A. L. Buczak, and C. M. Gifford, Fuzzy association rule mining for community crime
pattern discovery, In Proceedings of the ACM SIGKDD Workshop on Intelligence and
Security Informatics, Washington, DC, 2010, 25-28.
[54] D. Usha, and K. Rameshkumar, A complete survey on application of frequent pattern mining
and association rule mining on crime pattern mining, Int J Adv Comput Sci Technol 3 (4)
(2014), 264-275.
[55] M. J. Zaki, Parallel and distributed association mining: a survey, IEEE Concurrency 7 (4)
(1999), 14-25.
[56] S. Appavu, and R. Rajaram, Suspicious e-mail detection via decision tree: a data mining
approach, J Comput Inform Technol 15 (2) (2007), 161-169.
[57] E. Kirkos, C. Spathis, and Y. Manolopoulos, Data Mining techniques for the detection of
fraudulent financial statements, Expert Syst Appl 32 (4) (2007), 995-1003.
[58] C. Wang, and P.-S. Liu, Data mining and hotspot detection in an urban development project. J
Data Sci 6 (2008), 389-414.
[59] J. Gutierrez, Using decision trees to predict crime reporting, In Advanced Principles for
Improving Database Design, Systems Modeling, and Software Development, K. Siau C.-H.
Yu, M. W. Ward, M. Morabito, and W. Ding, Crime forecasting using data mining techniques.
In Proceedings of the 11th International Conference on Data Mining Workshops, 11
December, Vancouver, BC, 2011, 779-786.
[60] W. Hui, W. Jing, and Z. Tao, Analysis of decision tree classification algorithm based on
attribute reduction and application in criminal behavior. In Proceedings of the 3rd
International Conference on Computer Research and Development, 11-13 March, Shanghai,
2011, 27-30.
[61] R. Bhowmik, Detecting auto insurance fraud by data mining techniques, J Emerg Trends
Comput Inform Sci 2 (4) (2011), 156-162.
[62] Gepp, J. H. Wilson, K. Kumar, and S. Bhattacharya, Comparative analysis of decision trees
vis-à-vis other computational data mining techniques in automotive insurance fraud
detection, J Data Sci 10 (2012), 537-561.
[63] Nasridinov, S.-Y. Ihm, and Y.-H. Park, A decision tree-based classification model for crime
prediction, Inform Technol Convergence Lect Notes Electr Eng 253 (2013), 531-538.
[64] R. Iqbal, M. A. A. Murad, A. Mustapha, P. H. S. Panahy, and N. Khanahmadliravi, An
experimental study of classification algorithms for crime prediction, Ind J Sci Technol 6 (3)
(2013), 4219-4225.
[65] Y. Wang, X. Peng, and J. Bian, Computer crime forensics based on improved decision tree
algorithm, J Netw 9 (4) (2014), 1005-1011.
[66] S. A. Muhammad, Fraud: the affinity of classification techniques to insurance fraud detection,
Int J Innov Technol Explor Eng 3 (11) (2014), 62-66.
[67] M. Fuller, D. P. Biros, and D. Delen, An investigation of data and text mining methods for
real world deception detection, Expert Syst Appl 38 (7) (2011), 8392-8398.
[68] C.-H. Wen, P.-Y. Hsu, C.-Y. Wang, and T. L. Wu, Identifying smuggling vessels with artificial
neural network and logistics regression in criminal intelligence using vessels smuggling case
data, Intell Inform Database Syst Lect Notes Comput Sci 7197 (2012), 539-548.
[69] M. Pandey, and V. Ravi, Detecting phishing e-mails using text and data mining, In
Proceedings of the IEEE International Conference on Computational Intelligence &
Computing Research, Coimbatore, India, 2012, 18-20.
[70] R. P. Ang, and D. H. Goh, Predicting juvenile offending: a comparison of data mining
methods, Int J Offender Ther Comp Criminol 57 (2) (2013), 191-207.
[71] N. Tollenaar, and P. G. M. van der Heijden, Which method predicts recidivism best?: a
comparison of statistical, machine learning and data mining predictive models, J R Stat Soc A
176 (2) (2013), 565-584.
[72] O. De-Vel, A. Anderson, M. Corney, and G. Mohay, Mining e-mail content for author
identification forensics, ACM SIGMOD Rec 30 (4) (2001), 55-64.
[73] K. Kianmehr, and R. Alhajj, Effectiveness of support vector machine for crime hot-spots
prediction, J Appl Artif Intell 22 (5) (2008), 433-458.
[74] R. Abu Hana, C. Freitas, L. S. Oliveira, and F. Bortolozzi, Crime Scene Classification. In
Proceedings of the 23rd Annual ACM Symposium in Applied Computing, 16-20 March,
Fortaleza, Ceará, Brazil, 2008, 419-423.
[75] Z. Liu, D. Lin, and F. Guo, A method for locating digital evidences with outlier detection
using support vector machine, Int J Netw Secur 6 (3) (2008), 301-308.
[76] R. Basili, C. Giannone, C. D. Vescovo, A. Moschitti, and P. Naggar, Kernel-based relation
extraction for crime investigation. In AI*IA 2009: Emergent Perspectives in Artificial
Intelligence, 2009, 161-171.
[77] P. Wang, R. Mathieu, J. Ke, and H. J. Cai, Predicting criminal recidivism with support vector
machine. In Proceedings of the International Conference on Management and Service
Science, 24-26 August, Wuhan, 2010, 1-9.
[78] M. Salem, and S. Stolfo, Detecting masqueraders: a comparison of one-class bag-of-words
user behavior modeling techniques, J Wireless Mob Netw Ubiquitous Comput Dependable
Appl 1 (1) (2010), 3-13.
[79] Modupe, O. O. Olugbara, and S. O. Ojo, Exploring support vector machines and random
forests to detect advanced fee fraud activities on internet, In Proceedings of the IEEE 11th
Conference on Data Mining Workshops, 11 December, Vancouver, BC, 2011, 331-335.
[80] S. Bhattacharyya, S. Jha, K. Tharakunnel, and J. C. Westland, Data mining for credit card
fraud: a comparative study, Decis Support Syst 50 (3) (2011), 602-613.
[81] H. M. Deylami, and Y. P. Singh, Adaboost and SVM based cybercrime detection and
prevention model, Artif Intell Res 1 (2) (2012), 117-130.
[82] R. Alwee, S. M. H. Shamsuddin, and R. Sallehuddin, 2013. Hybrid support vector regression
and autoregressive integrated moving average models improved by particle swarm
optimization for property crime rates forecasting with economic indicators, Sci World J,
1-11.
[83] R. Alwee, S. M. H. Shamsuddin, and R. Sallehuddin, 2013. Economic indicators selection for
property crime rates using Grey Relational Analysis and Support Vector Regression. In
Proceedings of the International Conference on Systems, Control, Signal Processing and
Informatics, 16-19 July, Rhodes Island, 178-185.
[84] K. Junoh, M. N. Mansor, A. M. Yaacob, F. A. Adnan, S. A. Saad, and N. M. Yazid, Crime
Detection with DCT and artificial intelligent approach, Adv Mat Res 816-817 (2013),
610-615.
[85] J. Bendler, T. Brandt, S. Wagner, and D. Neumann, Investigating crime-to-twitter
relationships in urban environments facilitating a virtual neighborhood watch, In
Proceedings of the 22nd European Conference on Information Systems, Tel Aviv, 2014, 9-11.
[86] M. K. Sparrow, The application of network analysis to criminal intelligence: an assessment of
the prospects, Social Netw 13 (3) (1991), 251-274.
[87] J. C. Wang, and C. Q. Chiu, Detecting online auction inflated-reputation behaviors using
social network analysis, In Proceedings of the Annual Conference of the North American
Association for Computational Social and Organizational Science, Notre Dame, 2005,
26-28.
[88] S. Ghosh, D.L. Reilly, Credit card fraud detection with a neural-network, 27th
Annual Hawaii International Conference on System Science 3 (1994) 621-630.
[89] P. Green, J.H. Choi, Assessing the risk of management fraud through neural
network technology, Auditing: A Journal of Practice & Theory 16 (1) (1997) 14-28.
[90] J. Han, M. Kamber, Data Mining: Concepts and Techniques, Second ed, Morgan
Kaufmann Publishers, 2006, pp. 285-464.
[91] M. Haskett, An Introduction to Data Mining, Part 2, Analyzing the Tools and
Techniques, Enterprise System Journal, 2000.
[92] H. He, J. Wang, W. Graco, S. Hawkins, Application of neural networks to
detection of medical fraud, Expert Systems with Applications 13 (4) (1997)
329-336.
[93] Holton, Identifying disgruntled employee systems fraud risk through text mining:
a simple solution for a multi-billion dollar problem, Decision Support Systems
46 (4) (2009) 853-864.
[94] Y. Jin, R.M. Rejesus, B.B. Little, Binary choice models for rare events data: a
crop insurance fraud application, Applied Economics 37 (7) (2005) 841-848.
[95] J.L. Kaminski, Insurance Fraud, OLR Research Report,
http://www.cga.ct.gov/2005/rpt/2005-R-0025.htm, 2004.
[96] E. Kirkos, C. Spathis, Y. Manolopoulos, Data mining techniques for the detection of
fraudulent financial statements, Expert Systems with Applications 32 (4) (2007)
995-1003.
[97] S. Kotsiantis, E. Koumanakos, D. Tzelepis, V. Tampakas, Forecasting fraudulent
financial statements using data mining, International Journal of Computational
Intelligence 3 (2) (2006) 104-110.
[98]
Y. Kou, C. Lu, S. Sirwongwattana, Y. Huang, Survey of fraud detection
techniques, IEEE International Conference on Networking, Sensing & Control
(2004) 749754.
[99]
W. Lee, S. Stolfo, Data Mining Approaches for Intrusion Detection, 7th
USENIX Security Symposium, San Antonio, TX, 1998.
[100]
J. Li, K. Huang, J. Jin, J. Shi, A survey on statistical methods for health care
fraud detection, Health Care Management Science 11 (3) (2008) 275287.
[101]
J.W. Lin, M.I. Hwang, J.D. Becker, A fuzzy neural network for assessing the
risk of fraudulent financial reporting, Managerial Auditing Journal 18 (8)
(2003) 657–665.
[102]
J.A. Major, D.R. Riedinger, EFD: a hybrid knowledge/statistical-based system for
the detection of fraud, The Journal of Risk and Insurance 69 (3) (2002) 309–324.
[103]
S. Mitra, S.K. Pal, P. Mitra, Data mining in soft computing framework: a survey,
IEEE Transactions on Neural Networks 13 (1) (2002) 3–14.
[104]
E.W.T. Ngai, L. Xiu, D.C.K. Chau, Application of data mining techniques in
customer relationship management: a literature review and classification, Expert
Systems with Applications 36 (2) (2009) 2592–2602.
[105]
S. Owusu-Ansah, G.D. Moyes, P.B. Oyelere, P. Hay, An empirical analysis of
the likelihood of detecting fraud in New Zealand, Managerial Auditing Journal
17 (4) (2002) 192–204.
[106]
Oxford Concise English Dictionary, Tenth ed, Publisher, 1999.
[107]
J. Pathak, N. Vidyarthi, S.L. Summers, A fuzzy-based algorithm for auditors to
detect elements of fraud in settled insurance claims, Managerial Auditing Journal
20 (6) (2005) 632–644.
[108]
J. Pearl, Probabilistic Reasoning in Intelligent Systems, Morgan Kaufmann, 1988.
[109]
C. Phua, V. Lee, K. Smith, R. Gayler, A comprehensive survey of data
mining-based fraud detection research, Artificial Intelligence Review (2005) 1–14.
[110]
J. Pinquet, M. Ayuso, M. Guillén, Selection bias and auditing policies for
insurance claims, The Journal of Risk and Insurance 74 (2) (2007) 425–440.
[111]
J.T.S. Quah, M. Sriganesh, Real-time credit card fraud detection using
computational intelligence, Expert Systems with Applications 35 (4) (2008)
1721–1732.
[112]
Sánchez, M.A. Vila, L. Cerda, J.M. Serrano, Association rules applied to credit
card fraud detection, Expert Systems with Applications 36 (2) (2009) 3630–3640.
[113]
S. Sharma, Applied Multivariate Techniques, Wiley, New York, 1996.
[114]
M.J. Shaw, C. Subramaniam, G.W. Tan, M.E. Welge, Knowledge management
and data mining for marketing, Decision Support Systems 31 (1) (2001) 127–137.
[115]
L. Sokol, B. Garcia, J. Rodriguez, M. West, K. Johnson, Using data mining to
find fraud in HCFA health care claims, Topics in Health Information
Management 22 (1) (2001) 1–13.
[116]
C.T. Spathis, Detecting false financial statements using published data: some
evidence from Greece, Managerial Auditing Journal 17 (4) (2002) 179–191.
[117]
C.T. Spathis, M. Doumpos, C. Zopounidis, Detecting falsified financial statements:
a comparative study using multicriteria analysis and multivariate statistical
techniques, The European Accounting Review 11 (3) (2002) 509–535.
[118]
Srivastava, A. Kundu, S. Sural, A.K. Majumdar, Credit card fraud detection using
hidden Markov model, IEEE Transactions on Dependable and Secure Computing 5
(1) (2008) 37–48.
[119]
M. Sternberg, R.G. Reynolds, Using cultural algorithms to support re-
engineering of rule-based expert systems, in dynamic performance
environments: a case study in fraud detection, IEEE Transactions on
Evolutionary Computation 1 (4) (1997) 225–243.
[120]
M. Syeda, Y. Zhang, Y. Pan, Parallel granular neural networks for fast credit card
fraud detection, 2002, IEEE International Conference on Fuzzy Systems 1 (2002)
572–577.
[121]
P. Tan, M. Steinbach, V. Kumar, Introduction to Data Mining, First ed.,
Addison-Wesley Longman Publishing Co., Inc., 2005.
[122]
S. Tennyson, P. Salsas-Forn, Claims auditing in automobile insurance: fraud
detection and deterrence objectives, The Journal of Risk and Insurance 69 (3) (2002)
289–308.
[123]
Turban, J.E. Aronson, T.P. Liang, R. Sharda, Decision Support and Business
Intelligence Systems, Eighth ed, Pearson Education, 2007.
[124]
S. Viaene, M. Ayuso, M. Guillén, D. Van Gheel, G. Dedene, Strategies for
detecting fraudulent claims in the automobile insurance industry, European
Journal of Operational Research 176 (1) (2007) 565–583.
[125]
S. Viaene, G. Dedene, R.A. Derrig, Auto claim fraud detection using bayesian
learning neural networks, Expert Systems with Applications 29 (3) (2005)
653–666.
[126]
S. Viaene, R.A. Derrig, B. Baesens, G. Dedene, A comparison of state-of-the-art
classification techniques for expert automobile insurance claim fraud detection,
The Journal of Risk and Insurance 69 (3) (2002) 373–421.
[127]
S. Viaene, R.A. Derrig, G. Dedene, A case study of applying boosting naive
Bayes to claim fraud diagnosis, IEEE Transactions on Knowledge and Data
Engineering 16 (5) (2004) 612–620.
[128]
J. Wang, Y. Liao, T. Tsai, G. Hung, Technology-based financial frauds in
Taiwan: issue and approaches, IEEE Conference on: Systems, Man and
Cyberspace Oct (2006) 1120–1124.