All content in this area was uploaded by R. Woodley on Mar 26, 2015
Casino Fraud Data Mining
R. Woodley¹, W. Noll¹, and K. Shallenberger¹
¹21st Century Systems, Inc., 6825 Pine Street, Suite 141, Omaha, Nebraska, USA
Abstract - Average revenue per casino hotel resort per year
is $87,887,253 [1]. This much revenue attracts fraud and
criminals, leading to millions in lost revenue [2]. Recently,
casinos have begun to track patrons. Their vital statistics and
spending habits are all recorded in massive databases. This,
however, has led to the challenge of extracting the pertinent
information from these data sets and connecting actions
to fraud in causal chains. As data becomes more prevalent,
the need to link causal data into actionable information
becomes paramount. Analysts are faced with mountains of
data, and finding that piece of relevant information is the
proverbial needle in a haystack, only with dozens of
haystacks. Analysis tools that facilitate identifying causal
relationships across multiple data sets are sorely needed.
21st Century Systems, Inc. (21CSI) has initiated research
called Causal-View, a causal data-mining visualization tool,
to address this challenge. Causal-View provides causal
analysis tools to fill the gaps in the causal chain. We present
here the Causal-View concept, the initial research into data
mining tools that assist in forming the causal relationships,
and our initial findings.
Keywords: Causal data mining, Casino fraud, Causal data
relationships, Mahalanobis Taguchi System, Evidence
1 Introduction
21st Century Systems, Inc. (21CSI) has been working
with the Borgata Casino, Hotel and Spa in Atlantic City to
develop a software package called Kaimi. We are developing
Kaimi as a decision support tool to assist casino surveillance
teams with tracking the spending of patrons and “connecting the
dots” between disparate data sets in a uniform and integrated
fashion to reduce revenue loss from fraud. Combining data
sets and discovering hidden patterns decreases
investigative time, allowing casino surveillance
teams to further combat fraud at the casino. Kaimi was
developed utilizing both new enabling technologies and
previously researched algorithms. The system prototype was
first deployed 90 days after the initial requirements meetings.
The Kaimi capability assists casino personnel in answering
the question, "Who is sitting at my gaming table?" Kaimi
looks for players and employees who live near or with each
other, players with the same or similar attributes, players
who normally play together, betting patterns, and other
possible relationships. The data set exceeds two
million records and is updated multiple times throughout the day.
To take Kaimi to the next level, we are adding the causal
data reasoning capability of Causal-View.
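One of the relationship checks described above, players and employees who live with each other, can be sketched as a grouping on a normalized address. This is a minimal illustration only: the record layout (`id`, `role`, `address`) and the `normalize_address` helper are assumptions made for the example, not Kaimi's actual schema or matching logic.

```python
from collections import defaultdict

def normalize_address(address):
    """Crude normalization (assumption): lowercase and collapse whitespace."""
    return " ".join(address.lower().split())

def shared_addresses(people):
    """Return {address: [ids]} for addresses shared by a player and an employee."""
    by_address = defaultdict(list)
    for person in people:
        by_address[normalize_address(person["address"])].append(person)
    shared = {}
    for address, group in by_address.items():
        roles = {p["role"] for p in group}
        if {"player", "employee"} <= roles:
            shared[address] = sorted(p["id"] for p in group)
    return shared

# Hypothetical records for illustration only.
people = [
    {"id": "P1", "role": "player", "address": "12 Oak St,  Apt 3"},
    {"id": "E7", "role": "employee", "address": "12 oak st, apt 3"},
    {"id": "P2", "role": "player", "address": "99 Elm Ave"},
]
print(shared_addresses(people))
```

A production matcher would need fuzzy address comparison and geographic proximity ("live near"), which this exact-match sketch does not attempt.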
Figure 1: Causal-View data flow diagram.
Causal-View is a causal data-mining visualization tool
being developed for the U.S. Army. In Figure 1, we see a
conceptual illustration of Causal-View. Causal-View is built
on an agent-enabled framework. We use an agent structure
because much of the processing that Causal-View will do
runs in the background. When a user makes a
request for information, e.g., the betting history of a patron
for a particular table, Data Extraction Agents launch to
gather information. This initial search is a raw, Monte Carlo
type search designed to gather everything available that may
have relevance to the patron, the table, the dealers, and more.
This data is then processed by Data-Mining Agents. The
Data-Mining Agents are driven by user supplied feature
parameters. For example, if the analyst is looking to see if
the patron bets more or less for particular dealers, the
extraction agent can make a direct link. On the other hand, if
the analyst is trying to see if there is a pattern in the patron’s
bets, the mining agent can be instructed with the type and
relevance of the information fields to look at. The same data
is extracted from the database, but the Data-Mining Agents
customize the feature set in order to determine causal
relationships that the user is interested in. At this point, the
Hypothesis Generation and Data Reasoning Agents take over
to form conditional hypotheses about the data and pare the
data, respectively. The newly formed information is then
published to the agent communication backbone of Causal-
View to be displayed in the Kaimi user interface.
2 Data Extraction, Hypothesis
Generation, and Evidence Reasoning
As illustrated by Thearling [3], data mining as a science,
growing out of the data collection and warehousing of the
1960s–1980s, extends simple data querying
into guided search and information discovery. Causal Data
Mining (CDM) extends typical data mining even further, as
shown by Silverstein et al. [4]. Silverstein shows that mining
association rules is quite complex, particularly in
unstructured data. The association rule “X implies Y” can
often be misinterpreted from the raw data, and additional
data and analysis are needed to justify the rule. The Bayesian
techniques in [4] do a reasonable job when enough
information is known about the data to form the a priori
conditional probabilities that drive the Bayesian network. In
our concept, we will try to extend this work by adding
technology (evidential reasoning) that intrinsically handles
the uncertainty created by the data-driven association rule
without the need for calculating the conditional probabilities.
We also move past the Bayesian approach by utilizing the
Mahalanobis-Taguchi System (MTS) for data clustering.
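One common way an “X implies Y” rule can mislead is when Y is frequent regardless of X: the rule then has high confidence but no real association. The sketch below computes the standard support, confidence, and lift measures on hypothetical session data (the item names `dealer_A` and `large_bet` are invented for illustration); it is not the method of Silverstein et al., only a small example of why raw rule confidence needs further analysis.

```python
def rule_metrics(transactions, x, y):
    """Support, confidence, and lift for the rule x -> y over a list of item sets."""
    n = len(transactions)
    n_x = sum(1 for t in transactions if x in t)
    n_y = sum(1 for t in transactions if y in t)
    n_xy = sum(1 for t in transactions if x in t and y in t)
    support = n_xy / n            # P(X and Y)
    confidence = n_xy / n_x       # P(Y | X)
    lift = confidence / (n_y / n) # P(Y | X) / P(Y); 1.0 means no association
    return support, confidence, lift

# Hypothetical sessions: 'large_bet' is common regardless of the dealer.
sessions = (
    [{"dealer_A", "large_bet"}] * 4
    + [{"dealer_A"}] * 1
    + [{"large_bet"}] * 4
    + [set()] * 1
)
support, confidence, lift = rule_metrics(sessions, "dealer_A", "large_bet")
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

Here the rule “dealer_A implies large_bet” holds in 80% of dealer_A sessions, yet large bets occur in 80% of all sessions, so the lift is 1 and the dealer tells us nothing about bet size.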
2.1 Data Extraction
From Figure 1, the first
agent the data encounters is the extraction agent. The
extraction agent is the difference between a database query
and autonomous data extraction. We need to give the agent
the capability of discovering important pieces of data in both
structured and unstructured data. The data the agent finds
will trigger the reasoning engine to create a hypothesis. The
hypothesis, in turn, causes further data mining and trending
agents to find corroborating evidence until a consensus is
reached.
Information extraction techniques vary by domain and
source of the data. Statistical mechanisms and manual
annotations are commonly used on unstructured information,
while less linguistically intensive approaches have been
developed for the Internet using rule-based approaches that
are aware of a particular page’s content format. Statistical
techniques include Maximum Entropy [5], Support Vector
Machines [6], and Hidden Markov Models (HMMs) [7], while
information extraction techniques on the web generally deal
with the structured HTML/XHTML content that can be
reused on a site-to-site basis, or Resource Description
Framework (RDF) feeds.
We implemented a metasearch engine capable of
connecting to existing data sources and providing relevant
results from search queries. In addition to the metasearch
engine, a distributed search engine capable of crawling and
searching file systems, Intranet sites, etc. may be utilized.
2.2 Hypothesis Generation – Data Clustering
The primary research for the Causal-View project has
been in the Hypothesis Generation. Unlike most data
clustering, causal data mining rarely has ground truth by
which you may train a clustering algorithm. However, the
nature of the data allows us to make some assumptions that
we can use to create clusters. Primarily, given the enormous
volume of gambling action that occurs at the casino, it is
unlikely that, for a given period of time, any fraud is
occurring. This allows us to take a small subset as a normal
example and then compare against the larger data set where
an anomaly may or may not exist.
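The baseline idea above can be sketched with the Mahalanobis distance, the measure underlying MTS, which is discussed next. The code fits a mean, standard deviation, and correlation matrix on a small "normal" subset and then scores any session against it, dividing by the number of variables as is customary in MTS so that the baseline averages 1. The session values, the choice of variables (avgbet, hours), and all function names are assumptions for illustration, not our production code.

```python
from statistics import fmean, pstdev

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_baseline(rows):
    """Means, population std devs, and correlation matrix of the normal subset.
    Assumes no column is constant (a zero std dev would divide by zero)."""
    cols = list(zip(*rows))
    means = [fmean(c) for c in cols]
    stds = [pstdev(c) for c in cols]
    z = [[(x - mu) / s for x, mu, s in zip(r, means, stds)] for r in rows]
    k = len(cols)
    corr = [[fmean(zr[i] * zr[j] for zr in z) for j in range(k)] for i in range(k)]
    return means, stds, corr

def scaled_md(row, baseline):
    """Squared Mahalanobis distance from the baseline, divided by variable count."""
    means, stds, corr = baseline
    z = [(x - mu) / s for x, mu, s in zip(row, means, stds)]
    w = solve(corr, z)  # w = corr^-1 z
    return sum(zi * wi for zi, wi in zip(z, w)) / len(z)

# Hypothetical "normal" sessions: (avgbet, hours).
normal_sessions = [(25.0, 1.0), (30.0, 1.5), (20.0, 0.5),
                   (35.0, 2.0), (25.0, 1.5), (30.0, 1.0)]
baseline = fit_baseline(normal_sessions)
print(scaled_md((27.0, 1.2), baseline))   # near the baseline: small value
print(scaled_md((500.0, 6.0), baseline))  # far from the baseline: large value
```

Because the baseline is re-fit from whichever columns are passed in, the same routine can be reconfigured for different parameter sets, which is the property we were looking for.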
What we needed was a method that could be easily
reconfigured for different parameters of the data, use only a
small subset to “train,” and provide results that are readily
discernible. A candidate algorithm for the data clustering
challenge was the Mahalanobis-Taguchi System (MTS) [8]. MTS
is a fault detection, isolation, and prognostics scheme.
Currently, MTS fuses data from multiple sensors into a single
system-level performance metric using Mahalanobis Distance
(MD) and generates clusters based on MD values. MD
thresholds derived from the clustering analysis are used for
detection and isolation. We are investigating the extension of
the MTS scheme into causal mining. At present, the cluster
identification is performed manually off-line. We are
researching self-learning to generate the cluster heads for
MTS. A conceptual example of MTS for fault detection is
shown in Figure 2(a) whereby the MD (magnitude and angle)
can help detect that a fault is occurring and which type of
fault (root cause). Figure 2(b) shows the same concept with a
compound fault. In this case, either Fault 1 or Fault 2 may be
indicated by the MD. In particular, a change in parameters
would be needed to properly identify the fault. A
self-learning scheme could identify the proper faults and,
more importantly, which parameters to use to separate the
faults. This type of information can then be used to alert the
user that more information is needed.
Figure 3 shows a physical example of a compound fault that
is indistinguishable using only outlet pressure on a pump [9].
Figure 3: Example MD based fault clusters using only the
outlet pressure for a pump.
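The detection and isolation logic of Figure 2 can be sketched conceptually as follows. The sketch assumes observations have already been mapped into a normalized two-dimensional space in which distance from the origin plays the role of the MD magnitude; the cluster heads, the threshold, and the angular tolerance are all hypothetical values, and the function is an illustration rather than the MTS implementation itself.

```python
import math

def classify_fault(point, cluster_heads, md_threshold=3.0, angle_tol_deg=15.0):
    """Label a 2-D point as normal, a single fault, ambiguous, or unknown."""
    magnitude = math.hypot(point[0], point[1])
    if magnitude <= md_threshold:
        return "normal", []
    angle = math.degrees(math.atan2(point[1], point[0]))
    matches = []
    for name, head in cluster_heads.items():
        head_angle = math.degrees(math.atan2(head[1], head[0]))
        # Smallest absolute difference between the two angles, in [0, 180].
        diff = abs((angle - head_angle + 180.0) % 360.0 - 180.0)
        if diff <= angle_tol_deg:
            matches.append(name)
    if len(matches) == 1:
        return "fault", matches
    if matches:
        return "ambiguous", matches  # compound case: more parameters needed
    return "unknown", []

# Hypothetical cluster heads; fault_1 and fault_2 lie in nearly the same direction.
cluster_heads = {"fault_1": (5.0, 1.0), "fault_2": (5.0, 2.0), "fault_3": (0.0, 6.0)}
print(classify_fault((0.5, 0.5), cluster_heads))
print(classify_fault((0.0, 5.0), cluster_heads))
print(classify_fault((6.0, 1.8), cluster_heads))
```

The "ambiguous" result corresponds to the compound fault of Figure 2(b): the point is clearly abnormal, but the current parameters cannot separate the candidate faults, so the user should be told that more information is needed.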
2.3 Data Reasoning
The final component of the Causal-View system is
reasoning about the information clusters. We employ
technology called the Evidential Reasoning Network (ERN®)
to assist in the data reasoning. The goal of the ERN
component is to indicate the uncertainty about a particular
hypothesis when compared against other hypotheses for the
same data. By minimizing the uncertainty, we can provide a
clear indication of the believability (or, conversely, the
disbelief) about a hypothesis. This should then lead to a
causal chain of evidence, if such a chain exists. Typical
decision-support approaches will use either a simplistic
uncertainty tracking method or something along the lines of a
Bayesian probability approach. Simple uncertainty tracking
does not fully account for the propagation and combination of
uncertainty. Because it does not propagate the error, it may
allow potentially erroneous data to bias the results. Bayesian
approaches are better and account for the error propagation,
but have the basic need of a priori probability measures on
the uncertain elements. What is sometimes needed is a way to
incorporate various degrees of uncertainty ranging from
simple percent unknown up to probabilistic measures, where
available. The ERN technology is designed for this purpose.
The ERN technology uses a belief algebra structure for
providing a mathematically rigorous representation and
manipulation of uncertainty within the evidential reasoning
network. Since the introduction of the Dempster-Shafer
Theory of Evidence [10], new evidential reasoning methods
have been, and continue to be, developed, including fuzzy
logic [11] and Subjective Logic [12], [13]. An evidential
reasoning framework was needed to ensure that evidential
reasoning expressions are coherent, consistent, and
computationally tractable. ERN is a novel structure that
addresses these needs. The two prime belief algebra operators
required are consensus and discount. These operators allow
the propagation of belief values through the network amongst
various opinion generating authorities, i.e., the clustering
generated in MTS, which perform some sort of data analysis,
processing, and reasoning. The belief algebra structure is
capable of using probabilistic belief mass assignments
through the use of belief frames. The ERN Toolkit includes a
Subjective Logic and Dempster-Shafer belief algebra
Figure 2: MTS where the Mahalanobis distance around a fault cluster determines the variance from normal (lower
left corner) for simple fault conditions (a) and compound fault conditions (b).
The belief algebra equations direct ERN how to
combine information. A consensus operator is an additive
function that increases assenting opinion, whereas the discount
operator is multiplicative and acts to attenuate dissenting
opinion due to the normalized opinion values found within
the opinion-space used by ERN. The implementation of the
belief algebra is currently under development; results from
this research are expected shortly.
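Since the ERN belief algebra is still under development, the sketch below instead shows the consensus and discount operators as they are defined in Jøsang's Subjective Logic, one of the belief algebras named above. An opinion is a triple (belief, disbelief, uncertainty) that sums to 1. The class and function names and the numeric opinions are assumptions for illustration and should not be read as the ERN implementation.

```python
from typing import NamedTuple

class Opinion(NamedTuple):
    belief: float
    disbelief: float
    uncertainty: float

def discount(trust, opinion):
    """Attenuate a source's opinion by our trust in that source (multiplicative)."""
    return Opinion(
        trust.belief * opinion.belief,
        trust.belief * opinion.disbelief,
        trust.disbelief + trust.uncertainty + trust.belief * opinion.uncertainty,
    )

def consensus(a, b):
    """Fuse two independent opinions about the same hypothesis."""
    k = a.uncertainty + b.uncertainty - a.uncertainty * b.uncertainty
    if k == 0:
        raise ValueError("consensus is undefined when both opinions are certain")
    return Opinion(
        (a.belief * b.uncertainty + b.belief * a.uncertainty) / k,
        (a.disbelief * b.uncertainty + b.disbelief * a.uncertainty) / k,
        (a.uncertainty * b.uncertainty) / k,
    )

# Hypothetical opinions from two clustering runs about "this session is anomalous".
run_a = Opinion(0.6, 0.1, 0.3)
run_b = Opinion(0.5, 0.2, 0.3)
fused = consensus(run_a, run_b)
print(fused)  # uncertainty is lower than in either input

# Hypothetical trust in run_b's source, applied before fusing.
trust_in_b = Opinion(0.8, 0.1, 0.1)
print(discount(trust_in_b, run_b))
```

Agreeing opinions reinforce each other under consensus, so the fused uncertainty drops, while discounting moves mass from belief and disbelief into uncertainty in proportion to how little the source is trusted.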
Our plan for assigning opinion and the
subsequent belief algebra equations would be similar to a
nearest-neighbor approach, only with much more information
and sophistication in how the opinions are grouped. Using
the calculated MD value as the measure, we will assign a
belief value that corresponds to known results (e.g., that a
player is winning/losing within expected ranges). An
uncertainty value can then be calculated based on changes in
behavior. These values form the initial opinion concerning
the likelihood of fraud. As other events occur (e.g., additional
play by the patron, or wins/losses by neighboring players),
they likewise generate opinion. By then combining these
events within the opinion space we can determine if the
player is indeed acting normally, or if potential fraud may be
occurring. As more events occur in the same location in
either the MD space or the opinion space, a causal
relationship between action and result can be determined.
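How MD values will be mapped to opinions is not yet fixed. As one possible illustration, the sketch below uses the evidence mapping from Jøsang's Subjective Logic, in which r supporting and s opposing observations give belief r/(r+s+2), disbelief s/(r+s+2), and uncertainty 2/(r+s+2). Treating each session whose MD exceeds a threshold as one supporting observation of "anomalous play" is our assumption for this example, as are the threshold value and the function name.

```python
def opinion_from_md(md_values, threshold=3.0):
    """Map a patron's MD history to (belief, disbelief, uncertainty) that play is anomalous.

    Assumption: a session with MD above `threshold` counts as supporting evidence,
    any other session as opposing evidence.
    """
    r = sum(1 for md in md_values if md > threshold)  # supporting observations
    s = len(md_values) - r                            # opposing observations
    total = r + s + 2
    return (r / total, s / total, 2 / total)

# Hypothetical MD histories.
print(opinion_from_md([]))                              # no events: fully uncertain
print(opinion_from_md([0.8, 1.1, 4.2, 0.9, 5.0, 6.3]))  # mixed evidence
print(opinion_from_md([0.7] * 18))                      # long normal history
```

With no events the opinion is pure uncertainty, and uncertainty shrinks as more events accumulate, which mirrors the expectation that repeated events in the same region of the MD or opinion space make a relationship easier to establish.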
As mentioned previously, this is an active research
project in its preliminary stages. The results, thus far, show
the use of the MTS algorithm on the casino data. The data set
has over 2 million entries where a typical data subset is
shown in Table 1.
Table 1 contains the first twenty entries (of over 500 for
this player on this particular table) for a player’s average bet
(avgbet), the theoretical win/loss of the casino (theo – the
amount of money the casino should have won based on house
advantage of the game, pace of play, avgbet, and hours),
which work shift the player was at the game (shift), the
number of hours played at the game (hours), the total amount
bet (totalin), and the estimated win/loss (estwl – an estimate
of money going to the casino during the session). There are
many more columns in the full data set that indicate the
specific times played, the dealer during the time, and any
other piece of information that can be captured at the time the
player swipes his ID card at the game. The data shown is
currently sorted by average bet, but still shows some of the
volatility in the data. Part of our challenge with this data is
that we do not have the ground truth about whether any fraud
has occurred in the data.
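The exact formula behind the theo column is not given in the data set. A commonly used industry form multiplies the four quantities named above: average bet, hours played, pace of play (decisions per hour), and house advantage. The sketch below assumes that form; the pace and house-advantage values and the `Session` layout are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Session:
    avgbet: float  # average bet per decision
    hours: float   # hours played at the game
    estwl: float   # estimated casino win (+) or loss (-) for the session

def theoretical_win(session, decisions_per_hour, house_advantage):
    """Expected casino win: avgbet x hours x pace of play x house advantage (assumed form)."""
    return session.avgbet * session.hours * decisions_per_hour * house_advantage

# Hypothetical session: $25 average bet for 2 hours at 60 decisions per hour, 1.5% edge.
session = Session(avgbet=25.0, hours=2.0, estwl=300.0)
theo = theoretical_win(session, decisions_per_hour=60, house_advantage=0.015)
print(f"theo={theo:.2f} estwl-theo={session.estwl - theo:.2f}")
```

The gap between estwl and theo for a single session is dominated by ordinary luck, which is one reason the data in Table 1 is volatile even when sorted.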
As a first pass on the data, we were interested to see
how our sample player compared against other players at the
same table. We began by selecting only the player for that
table and calculating the MD values. The amount of variation
in even this small example is extreme. How the
data is sorted, which variables are used in the MD
calculations, and the weighting factor can all influence how the
data is clustered. For this initial pass, we double sorted the
data by avgbet and theo. You can see from Table 1 that there
is still the possibility of some large variance in the data even
though it is sorted. Figure 4 shows that the player has a fairly
consistent betting history with only a few points that fall
outside the main cluster at the origin (from the scatter plot).
The line graphs indicate that the player generally
prefers mid-size wagers slightly more than small
wagers (as indicated by the MD values near the center of
the green line graph) and strongly prefers them over large
wagers (very large MD values at the upper end).
We next ran the same configuration (i.e., same game
table, for the same overall time period as our example player)
to see how his betting history compared against all other
patrons. Figure 5 shows the results against all other players.
We see from this data that almost all other players are very
close to our example player in their betting preferences as
indicated by the tight clustering occurring at the origin.
However, there are many outliers that may indicate that
significant differences are present. At this time, we have no
ground-truth information to indicate that fraud is occurring,
but this gives the analyst a much more narrowed field of
entries to investigate. Future work will allow the analyst to
simply click on the outliers to get more information
concerning these events. Furthermore, we can now begin to
apply the ERN technology to compare the results. For
example, if a particular grouping of outliers has a set of
similar characteristics, we can form a belief space opinion
cluster. If further analysis shows that the cluster indicates
possible suspicious behavior, we can trigger an alert in the
monitoring software the next time a patron exhibits the same
behavior. The causal chain is now formed: if Patron
A exhibits Activity B, then possible fraud has probability
C. This gives security personnel the
information they need to catch the perpetrator.
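Narrowing the field for the analyst can be sketched as a threshold on the MD values followed by a ranking of patrons by how many of their sessions were flagged. The session records, the threshold, and the function names below are hypothetical; the MD values are assumed to have been computed already.

```python
from collections import Counter

def flag_outliers(sessions, threshold):
    """Sessions whose MD exceeds the threshold, largest MD first."""
    flagged = [s for s in sessions if s["md"] > threshold]
    return sorted(flagged, key=lambda s: s["md"], reverse=True)

def patrons_to_review(flagged):
    """Patron ids ordered by how many flagged sessions each has."""
    counts = Counter(s["patron"] for s in flagged)
    return [patron for patron, _ in counts.most_common()]

# Hypothetical sessions with precomputed MD values.
sessions = [
    {"patron": "A", "md": 0.7},
    {"patron": "B", "md": 9.4},
    {"patron": "C", "md": 1.2},
    {"patron": "B", "md": 12.1},
    {"patron": "D", "md": 6.0},
]
flagged = flag_outliers(sessions, threshold=4.0)
print([s["patron"] for s in flagged])
print(patrons_to_review(flagged))
```

A flagged session is only a candidate for investigation; without ground truth, the threshold trades analyst workload against the chance of missing an event.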
Table 1: Data subset of the casino data.
In this paper, we have presented the conceptual idea of
Causal-View. Causal-View is a causal data mining engine
that finds relationships hidden within extremely large data
sets. Furthermore, we present here the underlying technology
with a discussion of how the components work together to
form the causal chains. We use a Monte Carlo data gathering
scheme to pull data in from whatever sources are available.
We are experimenting with MTS to find potential
hypotheses. The hypotheses are then evaluated by 21CSI’s
ERN technology, leading to the final causal chain. While the
concept shows promise, the work is only in its preliminary
state. We present an initial example problem in which we
were able to find some relationships (and differences)
between a particular patron’s activity and all other patrons
on a particular game. We are confident that this technology
will continue to develop in a positive manner.
Our future work will include improvements in the algorithms
and the user interface. We will be giving the analyst the
ability to query the results of the clustering action. We are
researching methods to configure the MTS output to provide
multiple “views” of the data to help the reasoning engine
discover patterns. Finally, we are developing the belief
algebra equations that pull the hypotheses into a causal chain.
21st Century Systems, Inc. would like to thank the U.S. Army
for sponsoring this research (contract number: W15P7T-11-C-H217).
We also thank the Borgata Casino, Hotel and Spa in Atlantic
City for allowing us to analyze their
[1] G. Haussman, “Nevada Reaps $2.1 Billion in Casino Profit: Casino resorts throughout the state celebrate the New Year with record profits and strong indications of a robust 2007,” om/article.aspx?articleID=6854, Hotel Interactive.
[2] The Executive Office of the Governor, “Casinos in Florida: An analysis of the Economic and Social,” Office of Planning and Budgeting, The Capitol, Tallahassee, FL, 2007.
[3] K. Thearling, An Introduction to Data Mining.
[4] C. Silverstein, S. Brin, R. Motwani, and J. Ullman, “Scalable techniques for mining causal structures,” Data Mining and Knowledge Discovery, vol. 4, no. 2, pp. 163–192, 2000.
[5] A. E. Borthwick, “A maximum entropy approach to named entity recognition,” New York University, 1999.
[6] T. Joachims, “Text Categorization with Support Vector Machines: Learning with Many Relevant Features,” 1997.
[7] D. M. Bikel, S. Miller, R. Schwartz, and R. Weischedel, “Nymble: a High-Performance Learning Name-finder,” in Proceedings of the Fifth Conference on Applied Natural Language Processing, pp. 194–201, 1997.
[8] G. Taguchi, S. Chowdhury, and Y. Wu, The Mahalanobis-Taguchi System. McGraw-Hill.
[9] A. Soylemezoglu, “Sensor Data-Based Decision Making,” dissertation, Missouri University of Science and Technology, 2010.
[10] G. Shafer, A mathematical theory of evidence. Princeton NJ: Princeton University Press, 1976.
[11] P. Palacharla and P. Nelson, “Understanding relations between fuzzy logic and evidential reasoning methods,” in IEEE Proceedings of the Third IEEE Conference on World Congress on Computational Intelligence, 1994, pp. 1933–1938.
[12] A. Jøsang, “A Logic for Uncertain Probabilities,” International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, vol. 9, no. 3, pp. 279–311.
[13] A. Jøsang, “Subjective Evidential Reasoning,” International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems (IPMU 2002), pp. 1671–1678, Jul. 2002.
Figure 4: MD clustering for example user.
Figure 5: Comparison of player 1 versus all other patrons.