Conference Paper

FCFraud: Fighting Click-Fraud from the User Side


... In this form of attack, attackers infect and use the computers of legitimate Internet users to deceive advertisers. In 2016, Iqbal et al. [38] introduced a novel technique named FCFraud, which can be built into the operating system (OS) to combat click fraud on the user side. Namely, the authors of [38] believe that adequate protection at the operating system level can save billions of dollars for advertisers. FCFraud significantly protects innocent users by detecting the fraudulent processes that perform click fraud silently. ...
... It then uses heuristics to identify fraudulent ad clicks. In the final analysis presented in [38], the authors have examined their model using 25 popular websites (a total of 7708 HTTP requests), and the results have indicated that FCFraud could successfully detect all background processes involved in the execution of click fraud. Notably, FCFraud was 99.6% accurate in classifying ad requests from all user processes, and it showed 100% success in finding the fraudulent processes. ...
Article
Full-text available
Recent research has revealed an alarming prevalence of click fraud in online advertising systems. In this article, we present a comprehensive study on the usage and impact of bots in performing click fraud in the realm of digital advertising. Specifically, we first provide an in-depth investigation of different known categories of Web-bots along with their malicious activities and associated threats. We then ask a series of questions to distinguish between the important behavioral characteristics of bots versus humans in conducting click fraud within modern-day ad platforms. Subsequently, we provide an overview of the current detection and threat mitigation strategies pertaining to click fraud as discussed in the literature, and we categorize the surveyed techniques based on which specific actors within a digital advertising system are most likely to deploy them. We also offer insights into some of the best-known real-world click bots and their respective ad fraud campaigns observed to date. According to our knowledge, this paper is the most comprehensive research study of its kind, as it examines the problem of click fraud both from a theoretical as well as practical perspective.
... One such revenue model, most commonly seen in the industry, is the Pay-Per-Click (PPC) model. According to [2], the advertiser pays the publisher a fixed amount for each click made by a user. Click fraud deceives revenue models such as PPC by creating records of false click activity on online advertisements. ...
... If the ad fails to satisfy any of the policies, it is detected as fraudulent [2]. When the data contains multiple attributes, feature parallelism can be used to process them concurrently. ...
... Special JavaScript is included on the advertiser's website to collect important information about user behavior. JavaScript validation is performed in [2] to check clicks on HTTP requests and to detect whether malware executes the attack during a period of high traffic (busy period). FCFraud also works in case a user has a touchscreen monitor. ...
Article
Full-text available
Web services have become an integral part of our day-to-day life, including the various advertisements viewed on websites. Companies generate revenue through advertising by selling clicks (known as the Pay-Per-Click model). The company is paid for each click a user performs on a link published on the webpage. The clicked link redirects the user to the sponsoring company's content. Invalid clicks generated either by humans or by software as a malpractice for earning money are known as click fraud. Several different time features are combined into a time print. Machine learning is used to determine the extent to which unusual time prints occur in the data, in order to distinguish invalid clicks and identify click fraud. The results show that time prints are a useful tool for improving the quality of click fraud analysis and that they increase the overall accuracy of the observations.
... According to the Interactive Advertising Bureau trade group about 36% of all web traffic is considered fake [10]. In fact, a report issued by the digital security firm White Ops and the Association of National Advertisers estimates that advertisers could have lost $6.3 billion USD in 2015 due to click-fraud automated tools (a.k.a bots), which artificially increase the number of visits to a website [10]. Hence, bot traffic detection has become an important task for industry and academia. ...
... Another approach for bot detection combines web log files with features captured during a website visitor's session, such as the number of clicks and keystrokes [10,17,15]. Nonetheless, this approach is difficult to deploy because people are reluctant to be monitored by scripts that are a potential threat to their privacy. ...
Conference Paper
Nowadays, companies invest resources in detecting non-human accesses in their web traffic. Usually, non-human accesses are few compared with human accesses, which constitutes a class imbalance problem; as a consequence, classifiers bias their classification results toward the human accesses and overlook the non-human accesses. In some classification problems, such as non-human traffic detection, high accuracy is not the only desired quality: the model provided by the classifier should also be understandable by experts. For that reason, in this paper, we study the use of contrast pattern-based classifiers for building an understandable and accurate model for detecting non-human traffic in web log files. Our experiments over five databases show that the contrast pattern-based approach obtains significantly better AUC results than other state-of-the-art classifiers.
... Second, rather than in a controlled environment devoted to fraud detection, real situations should be used to identify click fraud. By using a controlled environment, MAdFraud [13] may examine the advertising default behaviour of several different apps at once; only one application is active at a time, with all HTTP requests being logged for further study. However, in our case, fraud detection should take place in real-world situations without the use of a third party app. ...
... As a supplement to AdSherlock, these server-side methods may detect real human click fraud. There is a close link between FCFraud [13] and our work, since it is the newest attempt in online advertising to identify click fraud. The tool recognises advertisement clicks and determines whether or not they are accompanied by real mouse actions. ...
Article
Full-text available
Mobile PR is an important component of the mobile app ecosystem. A major threat to this ecosystem’s long-term health is click fraud, in which ads are clicked by malware on an infected device or by an automated bot. The methods currently used to identify click fraud focus on examining server-side requests. However, these methods can produce large numbers of false negatives, because they are easily evaded when clicks are hidden behind proxies or distributed globally. In this work, we present AdSherlock, an efficient and deployable client-side (inside the app) click fraud detection system for mobile applications. AdSherlock separates the computationally expensive click request identification procedure into an offline phase and an online phase. In the offline phase, AdSherlock uses URL (Uniform Resource Locator) tokenization to create exact and probabilistic patterns. These patterns are used to identify click requests online, and an ad request tree model is then used to detect click fraud. We use real applications to develop and evaluate the AdSherlock prototype. It injects the online detector directly into an executable software package using binary instrumentation technology (BIT). The findings show that AdSherlock outperforms current state-of-the-art methods for detecting click fraud, with few false positives. Keywords: advertisement request identification, mobile advertising fraud detection.
... In the existing version of Android, users cannot block internet access. To show ZoneDroid's effectiveness against botnets, we create a simple number game which performs click-fraud [9]. We install the app on an Android TV which remains on 24/7. ...
... Their design is specific to the enterprise environment and they enforce policies per application. A number of similar solutions target governments and enterprises [9], [24], [34], [39]. ZoneDroid is designed for the end users who have multiple Android devices and want to control a group of apps easily on all devices. ...
Conference Paper
Research has shown that the android permission model was insufficient for providing protection against malicious behaviors of the untrusted third-party applications. To improve this scenario, Google modified the permission model in the recent Android version. However, in our analysis, it is still not an ideal option to enforce fine-grained access control. In this paper, we propose an extension and implementation of the Android permission model, ZoneDroid, to control a set of applications easily by creating multiple application zones (i.e., application groups). It is an approach to control application groups by modifying the Android permission model. All other previous approaches focused on restricting individual applications or creating separate user profiles. ZoneDroid minimizes security and privacy risks with a finer granularity of restrictions. Users can also control multiple devices using the cloud. Different zones (high privilege, trusted, new, restricted, etc.) have different runtime policies and enforce fine-grained access control. The ability to control application groups efficiently can be a valuable addition to the existing Android permission model. Experiments show that ZoneDroid is effective against information leak and it can protect the device from becoming a part of a botnet. ZoneDroid offers much less user action when controlling multiple applications and its performance overhead is negligible.
... In [26], the authors presented an efficient and deployable solution for detecting click fraud at the client side in mobile apps. Finally, in [27], the Fight Click-Fraud (FCFraud) method was proposed to detect click fraud from the user side; it can be incorporated into smartphone and computer operating systems. The proposed method classifies ad requests from all user processes with 99.6% accuracy and detects click bots with a 100% success rate on both mobile and desktop devices. ...
Article
Full-text available
With the rapid development of online advertising, click fraud has become a serious issue for the internet market. Click fraud is a dishonest attempt to improve a website’s profit or deplete an advertiser’s budget by clicking on pay-per-click advertisements. This illegal act has long been a threat to the industrial sectors. As a result, businesses hesitate to advertise their items on mobile apps and websites, as numerous groups attempt to take advantage of them. To advertise services and products safely online, a robust mechanism is needed for efficient click fraud detection. To tackle this issue, an ensemble architecture of machine learning and deep learning is proposed to detect click fraud in online advertisement campaigns. The proposed ensemble architecture consists of a Convolutional Neural Network (CNN) and a Bidirectional Long Short-Term Memory network (BiLSTM), which are used to extract hidden features, while a Random Forest (RF) is used for classification. The main objective of the proposed research study is to develop a hybrid DL model for automatic feature extraction from click data, whose output is then processed by an RF classifier into two classes: fraudulent and non-fraudulent clicks. Furthermore, a preprocessing module is developed to handle categorical attributes and imbalanced data, in order to enhance the reliability and consistency of the click data. In addition, different evaluation criteria are used to evaluate and compare the performance of the proposed CNN-BiLSTM-RF with the ensemble and standalone models. The experimental results indicate that our ensemble architecture achieved an accuracy of 99.19 ± 0.08%, precision of 99.89 ± 0.03%, sensitivity of 98.50 ± 0.11%, F1-score of 99.19 ± 0.08% and specificity of 99.89 ± 0.03%. Furthermore, our proposed architecture produced superior results compared to other developed ensemble and conventional models.
Moreover, our proposed ensemble architecture can be used as a safeguard against click fraud for pay-per-click advertising to facilitate industries for the safe and reliable promotion of their products.
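The five figures reported in this abstract (accuracy, precision, sensitivity, F1-score, specificity) are standard confusion-matrix metrics. As a minimal sketch of how they relate to one another, the Python function below computes them from raw counts. The function name and the example counts are hypothetical and chosen only to illustrate the arithmetic; they are not taken from the paper.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives, with "positive" meaning "fraudulent click".
    """
    total = tp + fp + tn + fn
    precision = tp / (tp + fp)      # share of flagged clicks that are truly fraudulent
    sensitivity = tp / (tp + fn)    # share of fraudulent clicks that are caught (recall)
    specificity = tn / (tn + fp)    # share of legitimate clicks that are left alone
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not results from any paper).
metrics = binary_metrics(tp=90, fp=10, tn=890, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because click datasets are typically imbalanced, accuracy alone can look high even when many fraudulent clicks are missed, which is one reason abstracts like this one report sensitivity and specificity alongside it.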
... Other works, which tackled ad click fraud from a ML perspective, are described in [26,33,54,55,59]. For example, the work of [26] proposed an algorithm for fraud identification in online pay-per-click advertising model by identifying repetitive clicks made by bots. ...
Article
Full-text available
Click fraud is a serious problem facing online advertising business. The malicious intent of clicking online ads either committed by humans or by non-humans, forced financial losses on advertisers utilizing pay-per-click advertising. Non-human traffic is usually designed to inflate web traffic for fraudulent purposes. In this paper, we demonstrate a hybrid approach consisting of two-level fingerprint applied in two phases to detect illegitimate non-human traffic. The first-level fingerprint is a pattern generated using immutable information about a user navigating a website’s pages. It will be used in the first traffic illegitimacy detection phase to infer rules about illegitimate non-human traffic from a developed ontology about web traffic legitimacy. The second-level fingerprint is generated using behavioral ad click patterns, which will be used in the second detection phase by applying a Machine-Learning (ML) algorithm. To test the proposed approach, a real commercial website for ads, called Waseet.com, was used. The access logs of the website server were utilized for the purpose of this research. The experiments show that our proposed hybrid approach using the ontology of web traffic illegitimacy and the ML k-NN classifier detects around (98.6%) of fake clicks.
... To defend against click fraud, both academia and industry have proposed a series of dynamic analysis based approaches to distinguish fraudulent clicks from the legitimate clicks. These approaches fall into the following two categories: user-side [8,9,19,20,28,41] and ad network-side approaches [11,14,32,47,49,50]. (1) The user-side approaches rely on installing an additional patch or ad SDK on the user's device. The legitimacy of ad clicks is determined by checking whether the click pattern meets a certain rule. ...
Preprint
Full-text available
Although the use of pay-per-click mechanisms stimulates the prosperity of the mobile advertisement network, fraudulent ad clicks result in huge financial losses for advertisers. Extensive studies identify click fraud according to click/traffic patterns based on dynamic analysis. However, in this study, we identify a novel click fraud, named humanoid attack, which can circumvent existing detection schemes by generating fraudulent clicks with similar patterns to normal clicks. We implement the first tool ClickScanner to detect humanoid attacks on Android apps based on static analysis and variational AutoEncoder (VAE) with limited knowledge of fraudulent examples. We define novel features to characterize the patterns of humanoid attacks in the apps' bytecode level. ClickScanner builds a data dependency graph (DDG) based on static analysis to extract these key features and form a feature vector. We then propose a classification model only trained on benign datasets to overcome the limited knowledge of humanoid attacks. We leverage ClickScanner to conduct the first large-scale measurement on app markets (i.e.,120,000 apps from Google Play and Huawei AppGallery) and reveal several unprecedented phenomena. First, even for the top-rated 20,000 apps, ClickScanner still identifies 157 apps as fraudulent, which shows the prevalence of humanoid attacks. Second, it is observed that the ad SDK-based attack (i.e., the fraudulent codes are in the third-party ad SDKs) is now a dominant attack approach. Third, the manner of attack is notably different across apps of various categories and popularities. Finally, we notice there are several existing variants of the humanoid attack. Additionally, our measurements demonstrate the proposed ClickScanner is accurate and time-efficient (i.e., the detection overhead is only 15.35% of those of existing schemes).
... Iqbal et al. developed an anti-virus tool that detects whether a click from a website is fraudulent by identifying HTTP requests that do not correspond to human activity such as mouse events. Although this user-side method of detection is applicable, it does not currently apply to mobile in-app advertising [23,24]. Gabryel introduced an algorithm for click fraud detection based primarily on norms received from the advertiser's web sites through the use of an exclusive JavaScript component. ...
Conference Paper
Full-text available
In today's scenario, web advertising plays a very important role in promoting businesses through keywords. As a result, the capacity of click fraud to impact business models based on online advertisements has emerged as a topic of discussion. Part of Google's financial success over the years was due to the rate-per-click online advertisement model; nowadays, however, this model is heavily threatened by fraudulent clicks. Invalid clicks on advertisements are clicks that do not correspond to a real interest in the advertisement and that divert funds to scammers. One of the prominently used advertising models, PPC (Pay Per Click), works by charging advertisers for each click on their advertisements or content as per the agreement. Among the styles of click fraud, botnets and click farms are the most extreme. Some solutions have been presented to address this issue. Many techniques are used for click-fraud analysis and detection; this paper is a survey that presents details of click fraud and of the domains working on its detection.
... Instead of operating on the server side, FCFraud [8] runs locally on the devices of individual users as a means of preventing them from being part of a BotNet. A BotNet is a group of infected devices which is used to commit click-fraud by generating fake reports without the user's knowledge. ...
Article
Full-text available
Service commissions, which are claimed by Ad-Networks and Publishers, are susceptible to forgery as non-human operators are able to artificially create fictitious traffic on digital platforms for the purpose of committing financial fraud. This places a significant strain on Advertisers who have no effective means of differentiating fabricated Ad-Reports from those which correspond to real consumer activity. To address this problem, we contribute an advert reporting system which utilizes opportunistic networking and a blockchain-inspired construction in order to identify authentic Ad-Reports by determining whether they were composed by honest or dishonest users. What constitutes a user’s honesty for our system is the manner in which they access adverts on their mobile device. Dishonest users submit multiple reports over a short period of time while honest users behave as consumers who view adverts at a balanced pace while engaging in typical social activities such as purchasing goods online, moving through space and interacting with other users. We argue that it is hard for dishonest users to fake honest behaviour and we exploit the behavioural patterns of users in order to classify Ad-Reports as real or fabricated. By determining the honesty of the user who submitted a particular report, our system offers a more secure reward-claiming model which protects against fraud while still preserving the user’s anonymity.
Article
Online advertising uses the Internet to deliver promotional marketing messages to consumers. It has grown in recent years to meet the increasing demands of electronic commerce. Advertisers bid and pay for an advertisement whenever potential customers click it. The Pay-Per-Click (PPC) model is vulnerable to malicious clicks that mimic real user behaviour to trick the platform into counting them as legitimate. This causes massive financial losses for advertisers and also significantly reduces the credibility of online advertising platforms. The common strategies for detecting fraud clicks dynamically tailor and interpret data based on a machine learning model. These algorithms treat multi-dimensional data as an individual feature vector or matrix, making it difficult to explore intrinsic relations among a sequence of data. Millions of daily fraud clicks of various types further disperse the focus of models and result in relatively low efficiency for current fraud prediction systems. To tackle the fraud click problem, we introduce a tensor-based mechanism to predict fraud clicks. This paper reconstructs the data into a high-rank tensor and applies tensor decomposition and transformation to explore the hidden information in each data item and the joint effect among a sequence of data. The proposed tensor transformation algorithm with locality-sensitive hashing (LSH) is tested by extensive experiments using real-world data. Compared with state-of-the-art machine learning algorithms, our model achieves significant performance in terms of accuracy and prediction-recall rate.
Article
As the mobile era matures, it is increasingly competitive to market mobile apps, forcing companies to invest heavily on mobile user acquisition campaigns. This has unfortunately given birth to a new form of Internet fraud, which we refer to as "app distribution fraud". This new fraud involves collusion between ISPs and fraudulent app distributors where app download is hijacked/redirected. In this paper, we have the unique opportunity to cooperate with a major e-commerce company (with about 0.2 billion active users per month) to take a first peek at this problem. Through the nationwide measurement results, we find that app distribution fraud is ubiquitous yet stealthy --- about 1.55% app downloads are hijacked/redirected, affecting more than 75% of the cities we tested and causing an estimated 7.46 billion U.S. dollars financial loss per year. We follow up with additional measurements on the technical mechanism of the fraud and the scope of the fraud (i.e., what other apps are also affected). Surprisingly, we find that sometimes the original app a user intends to download can be replaced with a completely different app, rendering the user's device at risks.
Chapter
A trailer is a short version of the movie which gives the viewer information about the movie such as its genre, the cast and what the audience needs to expect from the movie. You should never judge a book by its cover but you can always judge a movie by its trailer. So in this paper, we aim to classify actors in movie trailers by extracting key frames from trailers and obtaining features from it through convolutional neural network and then classifying it using the output function. A trailer emphasises the type of movie it is marketing. Actors are one of the most important aspects in a movie and play a decisive role when analysing the overall popularity of the movie. A trailer always gives a sneak-peak of the actors that are going to play a crucial role in the movie. Recognising actors in a movie trailer thus becomes an important task as much of the viewer’s pre-judge a movie based on the actors that glorify its cast.
Chapter
Full-text available
A number of apps are available in app stores for different types of online services, and their number is growing with the public's use of mobile devices. Advertising on mobile devices through apps has become a popular trend in online business. Ad fraud is a major concern for the advertising industry, and the most affected parties are the advertisers. The advertiser pays the ad network for each click on its ad. If ghost clicks are not captured as fraudulent or invalid clicks, the advertiser has to pay the ad network for them as well. Hence, the advertiser's return on investment (ROI) becomes very low and the expected goal is not achieved. Detecting fraudulent clicks on ads is therefore very important for achieving the advertiser's expected goal. The proposed methodology follows a heuristic approach to detect mobile ad-click fraud. A click fraud detection algorithm has been developed and implemented on a cloud server. The algorithm is triggered for each ad-click entry to determine whether the click is fraudulent or not. The obtained results have been verified with popular machine learning (ML) algorithms, namely support vector machine (SVM), random forest, and k-nearest neighbors (k-NN), with accuracies of 91%, 84% and 85%, respectively.
Article
Mobile advertising plays a vital role in the mobile app ecosystem. A major threat to the sustainability of this ecosystem is click fraud, i.e., ad clicks performed by malicious code or automatic bot problems. Existing click fraud detection approaches focus on analyzing the ad requests at the server side. However, such approaches may suffer from high false negatives since the detection can be easily circumvented, e.g., when the clicks are behind proxies or globally distributed. In this paper, we present AdSherlock, an efficient and deployable click fraud detection approach at the client side (inside the application) for mobile apps. AdSherlock splits the computation intensive operations of click request identification into an offline procedure and an online procedure. In the offline procedure, AdSherlock generates both exact patterns and probabilistic patterns based on URL tokenization. These patterns are used in the online procedure for click request identification and further used for click fraud detection together with an ad request tree model. We implement a prototype of AdSherlock and evaluate its performance using real apps. The online detector is injected into the app executable archive through binary instrumentation. Results show that AdSherlock achieves higher click fraud detection accuracy compared with state of the art, with negligible runtime overhead.
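The abstract above mentions URL tokenization and exact patterns without giving details. The Python sketch below shows one plausible reading, purely for illustration: a URL is split into host, path segments and query-parameter names, and an "exact pattern" is treated as a set of tokens that must all be present. The function names, the token scheme, the matching rule and the example URLs are assumptions, not AdSherlock's actual algorithm.

```python
from urllib.parse import urlsplit, parse_qsl


def tokenize_url(url: str) -> list:
    """Split a URL into tokens: host, path segments, and query-parameter names.

    This token scheme is an assumption made for illustration.
    """
    parts = urlsplit(url)
    tokens = [parts.netloc]
    tokens += [segment for segment in parts.path.split("/") if segment]
    tokens += [key for key, _ in parse_qsl(parts.query, keep_blank_values=True)]
    return tokens


def matches_exact_pattern(url: str, pattern: list) -> bool:
    """Return True if every token of the (hypothetical) pattern occurs in the URL."""
    tokens = set(tokenize_url(url))
    return all(token in tokens for token in pattern)


# Hypothetical pattern that an offline phase might have produced for click requests.
click_pattern = ["ads.example.com", "click", "adid"]

print(matches_exact_pattern("https://ads.example.com/click/track?adid=42&pub=7", click_pattern))
print(matches_exact_pattern("https://cdn.example.com/img/logo.png", click_pattern))
```

Matching on parameter names rather than values keeps the online check cheap, which fits the abstract's split between an expensive offline procedure and a lightweight online one; the probabilistic patterns it also mentions are not covered by this sketch.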
Chapter
An advertisement (ad) click fraud occurs when a user or a bot clicks on an ad with a malicious intent where advertisers need to pay for those fake clicks. Click-fraud is a serious problem for the online advertising industry. Our study demonstrates a hybrid approach using a two-level fingerprint to detect the illegitimate bots targeting ad click fraud. The approach consists of two detection phases: (1) a rule-based phase and (2) a machine learning-based phase. The first level of the fingerprint is used for rule-based detection phase. It is generated using immutable information about the user and traversing a website’s page. The second level of the fingerprint is generated using ad click behavioral patterns. It is used for machine learning-based detection phase. Different traditional classification algorithms were evaluated to be applied in the machine learning-based detection phase. To test our approach, we used a real commercial website for ads called Waseet where the access log of the website server was utilized as a dataset for our experiments. The results of our experiments show that our proposed hybrid approach entails promising results.
Article
Full-text available
Mobile ads are plagued with fraudulent clicks which is a major challenge for the advertising community. Although popular ad networks use many techniques to detect click fraud, they do not protect the client from possible collusion between publishers and ad networks. In addition, ad networks are not able to monitor the user’s activity for click fraud detection once they are redirected to the advertising site after clicking the ad. We propose a new crowdsource-based system called Click Fraud Crowdsourcing (CFC) that collaborates with both advertisers and ad networks in order to protect both parties from any possible click fraudulent acts. The system benefits from both a global view, where it gathers multiple ad requests corresponding to different ad network-publisher-advertiser combinations, and a local view, where it is able to track the users’ engagement in each advertising website. The results demonstrated that our approach offers a lower false positive rate (0.1) when detecting click fraud as opposed to proposed solutions in the literature, while maintaining a high true positive rate (0.9). Furthermore, we propose a new mobile ad charging model that benefits from our system to charge advertisers based on the duration spent in the advertiser’s website.
Article
Full-text available
Detecting non-human activity in social networks has become an area of great interest for both industry and academia. In this context, obtaining a high detection accuracy is not the only desired quality; experts in the application domain would also like having an understandable model, with which one may explain a decision. An explanatory decision model may help experts to consider, for example, taking legal action against an account that has displayed offensive behavior, or forewarning an account holder about suspicious activity. In this paper, we shall use a pattern-based classification mechanism to social bot detection, specifically for Twitter. Further, we shall introduce a new feature model for social bot detection, which extends (part of) an existing model with features out of Twitter account usage and tweet content sentiment analysis. From our experimental results, we shall see that our mechanism outperforms other, state-of-the-art classifiers, not based on patterns; and that our feature model yields better classification results than others reported on in the literature.
Article
Internet users are often victimized by malicious attackers. Some attackers infect and use innocent users' machines to launch large-scale attacks without the users' knowledge. One such attack is the click-fraud attack. Click-fraud happens in Pay-Per-Click (PPC) ad networks, where the ad network charges advertisers for every click on their ads. Click-fraud has proved to be a serious problem for the online advertisement industry. In a click-fraud attack, a user or automated software clicks on an ad with malicious intent, and advertisers need to pay for those valueless clicks. Among the many forms of click-fraud, botnets with automated clickers are the most severe. In this paper, we present a method for detecting automated clickers from the user side. The proposed method to Fight Click-Fraud, FCFraud, can be integrated into desktop and smart device operating systems. Since most modern operating systems already provide some kind of anti-malware service, our proposed method can be implemented as a part of that service. We believe that effective protection at the operating system level can save advertisers billions of dollars. Experiments show that FCFraud is 99.6% (98.2% in mobile ad library generated traffic) accurate in classifying ad requests from all user processes and that it is 100% successful in detecting clickbots on both desktop and mobile devices. We implement a cloud backend for the FCFraud service to save battery power on mobile devices. The overhead of executing FCFraud is also analyzed, and we show that it is reasonable for both platforms.
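The citing snippets on this page describe FCFraud as checking whether ad clicks are accompanied by real mouse actions. The Python sketch below illustrates that general idea under simplifying assumptions: ad-click requests and user input events are already available as timestamped records keyed by process ID, and a fixed time window links them. The record type, the function name, the 2-second window and the sample data are all hypothetical; this is not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class AdClickRequest:
    pid: int          # process that issued the HTTP request (hypothetical field)
    timestamp: float  # seconds since some common reference point


def find_suspicious_processes(ad_clicks, input_events, window=2.0):
    """Return PIDs that issued an ad-click request with no user input event
    (for example a mouse click) in the preceding `window` seconds.

    `input_events` maps a PID to the list of timestamps of input events
    delivered to that process. The window length is an assumed value.
    """
    suspicious = set()
    for request in ad_clicks:
        events = input_events.get(request.pid, [])
        has_recent_input = any(
            0.0 <= request.timestamp - t <= window for t in events
        )
        if not has_recent_input:
            suspicious.add(request.pid)
    return suspicious


# Hypothetical data: process 101 clicks shortly after a mouse event,
# process 202 issues ad clicks with no input events at all.
clicks = [
    AdClickRequest(pid=101, timestamp=10.5),
    AdClickRequest(pid=202, timestamp=42.0),
    AdClickRequest(pid=202, timestamp=43.0),
]
inputs = {101: [10.1], 202: []}

print(find_suspicious_processes(clicks, inputs))
```

A real detector would also need to decide which HTTP requests count as ad clicks in the first place, which is the classification step whose 99.6% accuracy the abstract reports; this sketch takes that step as given.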
Conference Paper
Click fraud - malicious clicks at the expense of pay-per-click advertisers - is posing a serious threat to the Internet economy. Although click fraud has attracted much attention from the security community, advertisers, as the direct victims of click fraud, still lack an effective defense to detect click fraud independently. In this paper, we propose a novel approach for advertisers to detect click fraud and evaluate the return on investment (ROI) of their ad campaigns without help from ad networks or publishers. Our key idea is to proactively test if visiting clients are full-fledged modern browsers and passively scrutinize user engagement. In particular, we introduce a new functionality test and develop an extensive characterization of user engagement. Our detection can significantly raise the bar for committing click fraud and is transparent to users. Moreover, our approach requires little effort to be deployed at the advertiser side. To validate the effectiveness of our approach, we implement a prototype and deploy it on a large production website, and then we run 10-day ad campaigns for the website on a major ad network. The experimental results show that our proposed defense is effective in identifying both clickbots and human clickers, while incurring negligible overhead at both the server and client sides.
Conference Paper
Internet-borne threats have evolved from easy-to-detect denial of service attacks to zero-day exploits used for targeted exfiltration of data. Current intrusion detection systems cannot always keep up with zero-day attacks, and it is often the case that valuable data have already been communicated to an external party over an encrypted or plain text connection before the intrusion is detected. In this paper, we present a scalable approach called Network Interrogator (NetGator) to detect network-based malware that attempts to exfiltrate data over open ports and protocols. NetGator operates as a transparent proxy using protocol analysis to first identify the declared client application using known network flow signatures. Then we craft packets that “challenge” the application by exercising functionality present in legitimate applications but too complex or intricate to be present in malware. When the application is unable to correctly solve and respond to the challenge, NetGator flags the flow as potential malware. Our approach is seamless and requires no interaction from the user and no changes to the commodity application software. NetGator introduces minimal traffic latency (0.35 seconds on average) to normal network communication while it can expose a wide range of existing malware threats.
Conference Paper
In pay-per-click online advertising systems like Google, Overture, or MSN, advertisers are charged for their ads only when a user clicks on the ad. While these systems have many advantages over other methods of selling online ads, they suffer from one major drawback. They are highly susceptible to a particular style of fraudulent attack called click fraud. Click fraud happens when an advertiser or service provider generates clicks on an ad with the sole intent of increasing the payment of the advertiser. Leaders in the pay-per-click marketplace have identified click fraud as the most significant threat to their business model. We demonstrate that a particular class of learning algorithms, called click-based algorithms, are resistant to click fraud in some sense. We focus on a simple situation in which there is just one ad slot, and show that fraudulent clicks cannot increase the expected payment per impression by more than o(1) in a click-based algorithm. Conversely, we show that other common learning algorithms are vulnerable to fraudulent attacks.
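The abstract above does not define click-based algorithms. As a rough illustration only, one estimator in that spirit fixes a number of clicks and counts how many recent impressions were needed to observe them. The function name `click_based_ctr`, the window rule, and the sample log below are assumptions made for this sketch and are not taken from the paper.

```python
def click_based_ctr(impressions, num_clicks):
    """Estimate click-through rate from the most recent `num_clicks` clicks.

    impressions: chronological list of booleans, True if that impression
                 was clicked (hypothetical input format).
    Returns num_clicks divided by the number of most recent impressions
    needed to observe that many clicks, or None if too few clicks exist.
    """
    if num_clicks <= 0:
        raise ValueError("num_clicks must be positive")
    clicks_seen = 0
    window = 0
    # Walk backward from the newest impression until enough clicks are seen.
    for clicked in reversed(impressions):
        window += 1
        if clicked:
            clicks_seen += 1
            if clicks_seen == num_clicks:
                return num_clicks / window
    return None


# Hypothetical impression log: True marks a clicked impression.
log = [False, True, False, False, True, False, True]
estimate = click_based_ctr(log, num_clicks=2)
```

In this sketch the estimate depends on how many impressions separate clicks rather than on a fixed impression window, which is the property the name "click-based" is meant to suggest here.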
Article
Click fraud is a substantial threat in the cyberworld. Here, the author examines the contexts, mechanisms, and processes associated with the click-fraud industry from an economics viewpoint. The nature of electronic channels, characterized by asymmetric hypermediation, provides a fertile ground for such fraud.
Article
More than twelve years have elapsed since the first public release of WEKA. In that time, the software has been rewritten entirely from scratch, evolved substantially and now accompanies a text on data mining [35]. These days, WEKA enjoys widespread acceptance in both academia and business, has an active community, and has been downloaded more than 1.4 million times since being placed on SourceForge in April 2000. This paper provides an introduction to the WEKA workbench, reviews the history of the project, and, in light of the recent 3.6 stable release, briefly discusses what has been added since the last stable version (Weka 3.4) released in 2003.
Article
Online advertising is currently the richest source of revenue for many Internet giants. The increased number of online businesses, specialized websites and modern profiling techniques have all contributed to an explosion of the income of ad brokers from online advertising. The single biggest threat to this growth, however, is click-fraud. Trained botnets and individuals are hired by click-fraud specialists in order to maximize the revenue of certain users from the ads they publish on their websites, or to launch an attack between competing businesses. In this note we wish to raise the awareness of the networking research community of potential research areas within the online advertising field. As an example strategy, we present Bluff ads: a class of ads that join forces in order to increase the effort level for click-fraud spammers. Bluff ads are either targeted ads with irrelevant display text, or highly relevant display text with irrelevant targeting information. They act as a litmus test for the legitimacy of the individual clicking on the ads. Together with standard threshold-based methods, fake ads help to decrease click-fraud levels.
Article
Many Android applications are distributed for free but are supported by advertisements. Ad libraries embedded in the app fetch content from the ad provider and display it on the app's user interface. The ad provider pays the developer for the ads displayed to the user and ads clicked by the user. A major threat to this ecosystem is ad fraud, where a miscreant's code fetches ads without displaying them to the user or "clicks" on ads automatically. Ad fraud has been extensively studied in the context of web advertising but has gone largely unstudied in the context of mobile advertising. We take the first step to study mobile ad fraud perpetrated by Android apps. We identify two fraudulent ad behaviors in apps: 1) requesting ads while the app is in the background, and 2) clicking on ads without user interaction. Based on these observations, we developed an analysis tool, MAdFraud, which automatically runs many apps simultaneously in emulators to trigger and expose ad fraud. Since the formats of ad impressions and clicks vary widely between different ad providers, we develop a novel approach for automatically identifying ad impressions and clicks in three steps: building HTTP request trees, identifying ad request pages using machine learning, and detecting clicks in HTTP request trees using heuristics. We apply our methodology and tool to two datasets: 1) 130,339 apps crawled from 19 Android markets including Play and many third-party markets, and 2) 35,087 apps that likely contain malware provided by a security company. From analyzing these datasets, we find that about 30% of apps with ads make ad requests while running in the background. In addition, we find 27 apps which generate clicks without user interaction. We find that the click fraud apps attempt to remain stealthy when fabricating ad traffic by only periodically sending clicks and changing which ad provider is being targeted between installations.
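The first of the three steps named in the abstract above, building HTTP request trees, can be sketched as follows. The abstract does not say how requests are linked, so this sketch assumes each request is attached to an earlier request whose URL equals its Referer header. The function name `build_request_trees`, the dictionary keys, and the example URLs are all hypothetical.

```python
from collections import defaultdict


def build_request_trees(requests):
    """Group HTTP requests into trees using the Referer header.

    requests: chronological list of dicts with a "url" key and an optional
              "referer" key (hypothetical input format).
    Returns (roots, children), where roots lists the URLs that start a tree
    and children maps a parent URL to the URLs requested from it.
    """
    children = defaultdict(list)
    seen = set()
    roots = []
    for req in requests:
        url = req["url"]
        referer = req.get("referer")
        # Assumed linking rule: attach to a previously seen referer,
        # otherwise treat the request as the root of a new tree.
        if referer is not None and referer in seen and referer != url:
            children[referer].append(url)
        else:
            roots.append(url)
        seen.add(url)
    return roots, dict(children)


# Hypothetical capture: a page loads an ad, and the ad leads to a click URL.
capture = [
    {"url": "http://app.example/page", "referer": None},
    {"url": "http://ads.example/ad?id=1", "referer": "http://app.example/page"},
    {"url": "http://ads.example/click?id=1", "referer": "http://ads.example/ad?id=1"},
    {"url": "http://other.example/", "referer": None},
]
roots, children = build_request_trees(capture)
```

With trees of this shape, a later heuristic could look for click-like requests that descend from ad requests; that classification step is not shown here.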
Article
Click-spam in online advertising, where unethical publishers use malware or trick users into clicking ads, siphons off hundreds of millions of advertiser dollars meant to support free websites and apps. Ad networks today, sadly, rely primarily on security through obscurity to defend against click-spam. In this paper, we present Viceroi, a principled approach to catching click-spam in search ad networks. It is designed based on the intuition that click-spam is a profit-making business that needs to deliver higher return on investment (ROI) for click-spammers than other (ethical) business models to offset the risk of getting caught. Viceroi operates at the ad network where it has visibility into all ad clicks. Working with a large real-world ad network, we find that the simple-yet-general Viceroi approach catches over six very different classes of click-spam attacks (e.g., malware-driven, search-hijacking, arbitrage) without any tuning knobs.
Article
Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges a.s. to a limit as the number of trees in the forest becomes large. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them. Using a random selection of features to split each node yields error rates that compare favorably to Adaboost (Y. Freund & R. Schapire, Machine Learning: Proceedings of the Thirteenth International conference, ***, 148–156), but are more robust with respect to noise. Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the splitting. Internal estimates are also used to measure variable importance. These ideas are also applicable to regression.
Article
This paper provides a detailed case study of the architecture of the Clickbot.A botnet that attempted a low-noise click fraud attack against syndicated search engines. The botnet of over 100,000 machines was controlled using an HTTP-based botmaster. Google identified all clicks on its ads exhibiting Clickbot.A-like patterns and marked them as invalid. We disclose the results of our investigation of this botnet to educate the security research community and provide information regarding the novelties of the attack.
Conference Paper
Discovering associations between elements occurring in a stream is applicable in numerous applications, including predictive caching and fraud detection. These applications require a new model of association between pairs of elements in streams. We develop an algorithm that allows for integration with current stream management systems, since it employs existing techniques for finding frequent elements. The presentation emphasizes the applicability of the algorithm to fraud detection in advertising networks. Such fraud instances have not been successfully detected by current techniques. Our experiments on synthetic data demonstrate scalability and efficiency. On real data, potential fraud was discovered.
The anatomy of Clickbot.A
  • N Daswani
  • M Stoppelman
N. Daswani and M. Stoppelman, "The anatomy of Clickbot.A," in Proceedings of the 1st Conference on Hot Topics in Understanding Botnets. USENIX Association, 2007, pp. 11-11.
MAdFraud: investigating ad fraud in Android applications
  • J Crussell
  • R Stevens
  • H Chen
J. Crussell, R. Stevens, and H. Chen, "MAdFraud: investigating ad fraud in Android applications," in Proceedings of the 12th Annual International Conference on Mobile Systems, Applications, and Services. ACM, 2014, pp. 123-134.