Conference Paper

A Discriminative Classifier Learning Approach to Image Modeling and Spam Image Identification.

Conference: CEAS 2007 - The Fourth Conference on Email and Anti-Spam, 2-3 August 2007, Mountain View, California, USA
Source: DBLP


Available from: Steve Webb
  • Source
    • "Virus links from the spams could lead to personal or business loss and damage. There have been some studies on detecting spam emails [8] [15], spam messages [12], spam images [2], spam video [1], web spam [13], spammers [11] [9] [14], etc. However, one of the major challenges of spam detection in social media is that the spams are usually in the form of photos and text, and in the context of large scale dynamic social network. "
    ABSTRACT: We have entered the era of social media networks represented by Facebook, Twitter, YouTube and Flickr. Internet users now spend more time on social networks than search engines. Business entities or public figures set up social networking pages to enhance direct interactions with on-line users. Social media systems heavily depend on users for content contribution and sharing. Information is spread across social networks quickly and effectively. However, at the same time social media networks become susceptible to different types of unwanted and malicious spammer or hacker actions. There is a crucial need in the society and industry for security solution in social media. In this demo, we propose a scalable and online social media spam detection system for social network security. We employ our GAD clustering algorithm for large scale clustering and integrate it with the designed active learning algorithm to deal with the scalability and real-time detection challenges.
    Preview · Article · Aug 2011 · Proceedings of the VLDB Endowment
  • Source
    • "In particular, the rationale of the techniques in Hsia and Chen (2009); Zuo et al. (2009b,a) is to detect spam images identical or very similar to the ones in the training set, exploiting the fact that they are often sent in batches. In Byun et al. (2007) four properties were argued to be specific of image spam: colour characteristics (such as discontinuous distributions, high intensity, and dominant peaks), characterised with colour moments computed in the HSV space; colour heterogeneity (which is deemed to be more uniform in spam images), characterised by RMS differences between the original and the quantised images; "conspicuousness" (intended as the presence of highly contrasted colours aimed at making the spam message easily noticeable), characterised by colour saturation features; and self-similarity (based on the observation that different regions in the same spam image often exhibit similar characteristics, contrary to legitimate images), measured through a log-Gabor filter bank on predefined image blocks. A multi-class approach was proposed in this work, based on the rationale that several sub-classes of both spam and legitimate images exist, exhibiting large intra-class variations. "
    ABSTRACT: In their arms race against developers of spam filters, spammers have recently introduced the image spam trick to make the analysis of emails’ body text ineffective. It consists in embedding the spam message into an attached image, which is often randomly modified to evade signature-based detection, and obfuscated to prevent text recognition by OCR tools. Detecting image spam turns out to be an interesting instance of the problem of content-based filtering of multimedia data in adversarial environments, which is gaining increasing relevance in several applications and media. In this paper we give a comprehensive survey and categorisation of computer vision and pattern recognition techniques proposed so far against image spam, and make an experimental analysis and comparison of some of them on real, publicly available data sets.
    Full-text · Article · Jul 2011 · Pattern Recognition Letters
  • Source
    • "fingerprinting. There are relatively few works in image spam identification [4] [5]. All these works address the image spam filtering problem passively, and hence, it is essential to actively trace the origins of spam and bring down the botnets in order to stop spam. "
    ABSTRACT: In this paper, we investigate image spam with data mining techniques in order to reveal the common sources of unsolicited emails. To identify the origins, a two-stage clustering method groups visually similar spam images by exploring their visual features, including color feature, layout feature, text layout, and background textures. We test the proposed approach under different settings and combinations of features and measure the performance with a modified F-measure.
    Full-text · Conference Paper · Jan 2009