Chapter

Unifying Guilt-by-Association Approaches: Theorems and Fast Algorithms

08/2011; DOI:10.1007/978-3-642-23783-6_16 pp.245-260

ABSTRACT If several friends of Smith have committed petty thefts, what would you say about Smith? Most people would not be surprised if Smith is a hardened criminal. Guilt-by-association methods combine weak signals to derive stronger ones, and have been extensively used for anomaly detection and classification in numerous settings (e.g., accounting fraud, cyber-security, calling-card fraud).

The focus of this paper is to compare and contrast several very successful guilt-by-association methods: Random Walk with Restarts, Semi-Supervised Learning, and Belief Propagation (BP).

Our main contributions are two-fold: (a) theoretically, we prove that all the methods result in a similar matrix inversion problem; (b) for practical applications, we develop FaBP, a fast algorithm that yields a 2× speedup and accuracy equal to or higher than that of BP, and is guaranteed to converge. We demonstrate these benefits using synthetic and real datasets, including YahooWeb, one of the largest graphs ever studied with BP.
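To make the "matrix inversion problem" in contribution (a) concrete, here is a minimal sketch for one of the three methods, Random Walk with Restarts. The abstract does not give the equations, so the notation is an assumption: W is the column-normalized adjacency matrix, c is the probability of continuing the walk, and e is the indicator vector of the seed node. Under those assumptions the standard RWR fixed point r = c W r + (1 - c) e is the solution of the linear system (I - c W) r = (1 - c) e. The toy graph, function names, and the value c = 0.85 are illustrative only and are not taken from the paper.

```python
def column_normalize(adj):
    """Return W with W[i][j] = adj[i][j] / (sum of column j)."""
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    return [[adj[i][j] / col_sums[j] for j in range(n)] for i in range(n)]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def rwr_closed_form(adj, seed, c=0.85):
    """Solve (I - c W) r = (1 - c) e directly (c = continuation probability, assumed)."""
    n = len(adj)
    w = column_normalize(adj)
    system = [[(1.0 if i == j else 0.0) - c * w[i][j] for j in range(n)]
              for i in range(n)]
    rhs = [(1.0 - c) if i == seed else 0.0 for i in range(n)]
    return solve(system, rhs)


def rwr_iterative(adj, seed, c=0.85, iters=500):
    """Iterate r <- c W r + (1 - c) e, the random-walk view of the same problem."""
    n = len(adj)
    w = column_normalize(adj)
    e = [1.0 if i == seed else 0.0 for i in range(n)]
    r = e[:]
    for _ in range(iters):
        r = [c * sum(w[i][j] * r[j] for j in range(n)) + (1.0 - c) * e[i]
             for i in range(n)]
    return r


# Hypothetical toy data: the undirected path graph 0 - 1 - 2 - 3.
ADJ = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

direct = rwr_closed_form(ADJ, seed=0)
iterated = rwr_iterative(ADJ, seed=0)
```

Both routes should agree up to floating-point error on this toy graph, which is the sense in which the random-walk formulation reduces to a linear system. On graphs of the size the abstract mentions, a dense solve like this one would not be practical; it is shown only to illustrate the equivalence.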

Keywords: Belief Propagation, Random Walk with Restart, Semi-Supervised Learning, probabilistic graphical models, inference

