A Multiple Instance Learning Strategy for Combating Good Word Attacks on Spam Filters.

Journal of Machine Learning Research (Impact Factor: 3.42). 01/2008; 9:1115-1146. DOI: 10.1145/1390681.1390719
Source: DBLP

ABSTRACT Statistical spam filters are known to be vulnerable to adversarial attacks. One of the more common adversarial attacks, known as the good word attack, thwarts spam filters by appending to spam messages sets of "good" words, which are words that are common in legitimate email but rare in spam. We present a counterattack strategy that attempts to differentiate spam from legitimate email in the input space by transforming each email into a bag of multiple segments, and subsequently applying multiple instance logistic regression on the bags. We treat each segment in the bag as an instance. An email is classified as spam if at least one instance in the corresponding bag is spam, and as legitimate if all the instances in it are legitimate. We show that a classifier using our multiple instance counterattack strategy is more robust to good word attacks than its single instance counterpart and other single instance learners commonly used in the spam filtering domain.
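The bag-level decision rule described in the abstract can be illustrated with a minimal sketch. The names here (`split_into_segments`, `instance_is_spam`, `classify_bag`, `SPAM_WORDS`), the equal-size contiguous segmentation, and the word-fraction scorer are all hypothetical choices made for illustration. In particular, the toy scorer stands in for the multiple instance logistic regression that the paper applies; only the rule "spam if at least one instance is spam" comes from the abstract.

```python
from typing import Callable, List

# Hypothetical toy vocabulary; a real filter would learn weights from data.
SPAM_WORDS = {"free", "cash", "winner", "viagra"}


def split_into_segments(tokens: List[str], n_segments: int) -> List[List[str]]:
    """Turn one email (a token list) into a bag of contiguous segments.

    Assumption: near-equal contiguous chunks. The abstract does not say
    how the paper segments messages.
    """
    if n_segments < 1:
        raise ValueError("n_segments must be at least 1")
    size = max(1, -(-len(tokens) // n_segments))  # ceiling division
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


def instance_is_spam(segment: List[str], threshold: float = 0.5) -> bool:
    """Toy instance classifier: flag a segment whose fraction of
    spam-vocabulary words reaches the threshold."""
    if not segment:
        return False
    hits = sum(1 for word in segment if word in SPAM_WORDS)
    return hits / len(segment) >= threshold


def classify_bag(bag: List[List[str]],
                 is_spam: Callable[[List[str]], bool]) -> bool:
    """Bag rule from the abstract: spam if at least one instance is spam,
    legitimate only if every instance is legitimate."""
    return any(is_spam(segment) for segment in bag)


# A spam message padded with "good" words, as in a good word attack.
attacked = ["free", "cash", "winner", "now",
            "meeting", "agenda", "project", "report",
            "thanks", "regards", "schedule", "budget"]
legitimate = ["meeting", "agenda", "project", "report"]

single_instance_verdict = instance_is_spam(attacked)
bag_verdict = classify_bag(split_into_segments(attacked, 3), instance_is_spam)
legit_verdict = classify_bag(split_into_segments(legitimate, 3), instance_is_spam)
```

In this toy setup the appended good words dilute the whole-message score below the threshold, while the segment that still holds the original spam content is flagged on its own, so the bag is labelled spam. How well this carries over to a learned classifier depends on the segmentation and training details given in the paper.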

  • ABSTRACT: This paper presents an agent-based model to analyse behaviour produced under noisy and deceptive information conditions. A simple yet powerful simulation environment is developed where adaptive agents act and adapt to varying levels of information quality that they sense about their environment. The simulation environment consists of two types of agents moving in a bounded two-dimensional continuous plane: a neuro-evolutionary learning agent that adapts its manoeuvring strategies to escape a pre-programmed deceptive agent; and a pre-programmed agent, whose goal is to capture the adaptive agent, that acts on noisy information about the adaptive agent’s manoeuvres that it senses from the environment. The pre-programmed agent is also able to produce deceptive actions to confuse the adaptive agent. The behaviour is represented in terms of the manoeuvring strategies that the agents adopt as their actions in response to environmental changes. A behaviour analysis methodology is developed to compare agent actions under different information conditions, which elicits interesting relationships between behaviour and the studied information conditions. The framework is easily extendable to analyse human behaviour in similar environments by replacing the adaptive agent with an interactive human–machine interface.
    Adaptive Behavior 04/2013; 21(2):96-117. · 1.11 Impact Factor
  • ABSTRACT: In multiple-instance learning (MIL), an object is represented as a bag consisting of a set of feature vectors called instances. In the training set, the labels of bags are given, while the uncertainty comes from the unknown labels of instances in the bags. In this paper, we study MIL with the assumption that instances are drawn from a mixture distribution of the concept and the non-concept, which leads to a convenient way to solve MIL as a classifier combining problem. It is shown that instances can be classified with any standard supervised classifier by re-weighting the classification posteriors. Given the instance labels, the label of a bag can be obtained as a classifier combining problem. An optimal decision rule is derived that determines the threshold on the fraction of instances in a bag that is assigned to the concept class. We provide estimators for the two parameters in the model. The method is tested on a toy data set and various benchmark data sets, and shown to provide results comparable to state-of-the-art MIL methods.
    Pattern Recognition. 03/2013; 46(3):865–874.
  • ABSTRACT: An increasing number of machine learning applications involve detecting the malicious behavior of an attacker who wishes to avoid detection. In such domains, attackers modify their behavior to evade the classifier while accomplishing their goals as efficiently as possible. The attackers typically do not know the exact classifier parameters, but they may be able to evade it by observing the classifier's behavior on test instances that they construct. For example, spammers may learn the most effective ways to modify their spams by sending test emails to accounts they control. This problem setting has been formally analyzed for linear classifiers with discrete features and convex-inducing classifiers with continuous features, but never for non-linear classifiers with discrete features. In this paper, we extend previous ACRE learning results to convex polytopes representing unions or intersections of linear classifiers. We prove that exponentially many queries are required in the worst case, but that when the features used by the component classifiers are disjoint, previous attacks on linear classifiers can be adapted to efficiently attack them. In experiments, we further analyze the cost and number of queries required to attack different types of classifiers. These results move us closer to a comprehensive understanding of the relative vulnerability of different types of classifiers to malicious adversaries.
    Proceedings of the 2013 ACM workshop on Artificial intelligence and security; 11/2013

