Conference Paper

What Ad Blockers Are (and Are Not) Doing

Authors: Craig E. Wills and Doruk C. Uzunoglu

Abstract

The Web has many types of third-party domains and a variety of available ad blockers. This work evaluates ad blocking tools for their effectiveness in blocking the retrieval of different categories of third-party content during the download of popular websites. The results demonstrate much variation in how effectively current ad blocking tools prevent requests to different types of third-party domains. Non-configurable tools such as Blur and Disconnect provide only modest blockage of third-party domains in most categories. The tool uBlock generally demonstrates the best default-configuration performance. By default, Ghostery provides no protection while Adblock Plus and Adguard provide minimal protection; these tools must be manually configured to obtain effective protection. The behavior of Adblock Plus is particularly notable, as usage data indicate it has an 85% share of the ad blocking tool market, and other results based on network traces suggest that approximately 80% of these Adblock Plus users employ its default configuration. Construction of a "composite" ad blocker reflecting current usage of ad blockers and their configurations shows that this composite ad blocker provides only a modest 13-34% reduction in the set of third-party domains retrieved in each category relative to not employing any ad blocker.
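The core measurement behind these numbers is a per-category comparison of the third-party domains retrieved with and without a blocker. The following is a minimal sketch of that computation, using hypothetical category names and domains rather than the paper's data:

```python
# Minimal sketch (hypothetical data, not the paper's dataset) of computing the
# per-category reduction in third-party domains retrieved with vs. without a blocker.

def reduction_by_category(no_blocker, with_blocker):
    """Each argument maps a category name to the set of third-party domains retrieved."""
    results = {}
    for category, baseline in no_blocker.items():
        remaining = with_blocker.get(category, set())
        blocked_fraction = 1 - len(remaining) / len(baseline) if baseline else 0.0
        results[category] = round(100 * blocked_fraction, 1)
    return results

no_blocker = {
    "advertising": {"ads.example", "track.example", "banner.example"},
    "analytics":   {"stats.example", "metrics.example"},
}
with_blocker = {
    "advertising": {"banner.example"},                    # two of three ad domains blocked
    "analytics":   {"stats.example", "metrics.example"},  # nothing blocked
}

print(reduction_by_category(no_blocker, with_blocker))
# {'advertising': 66.7, 'analytics': 0.0}
```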

... They perform this blanket filtering operation by referring to a filter list containing the addresses of all known ad servers along with their pattern-matching rules. However, they do not eliminate any internal promotions or ads [51], most of which are usually deceptive [36]. Internal promotions or ads contribute significantly to the income of many websites and form an integral part of their content. ...
... In addition to commercially available ad blockers, some academic works have proposed ad detection algorithms [52,53]. For instance, Lashkari et al. [54] developed CIC-AB, which is an algorithm that employs machine learning methodologies to identify advertisements and classify them as non-ads, normal ads, and malicious ads, thereby eliminating the need to regularly maintain a filter list (as with earlier rule-based approaches) [51]. CIC-AB was developed as an extension for common browsers (e.g., Firefox and Chrome). ...
... CIC-AB was developed as an extension for common browsers (e.g., Firefox and Chrome). Similarly, Bhagavatula et al. [55] developed an algorithm using machine learning for ad blocking with less human intervention, maintaining an accuracy similar to hand-crafted filters (e.g., [51]), while also blocking new ads that would otherwise necessitate further human intervention in the form of additional handmade filter rules. Nonetheless, increasing numbers of websites are now discouraging ad blocking due to the loss of associated ad revenue. ...
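The excerpts above describe learning-based ad detection in the spirit of CIC-AB [54]. As a rough illustration of that idea (the features, labels, and training data below are hypothetical, not those of the cited works), a request classifier might look like:

```python
# Illustrative sketch only: a three-way request classifier in the spirit of
# learning-based ad blockers such as CIC-AB [54]. Features and labels here are
# invented for the example; the cited works define their own feature sets and data.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction import DictVectorizer

train_requests = [
    {"url_length": 92,  "num_params": 6, "third_party": 1, "keyword_ad": 1},
    {"url_length": 30,  "num_params": 0, "third_party": 0, "keyword_ad": 0},
    {"url_length": 120, "num_params": 9, "third_party": 1, "keyword_ad": 1},
]
train_labels = ["normal_ad", "non_ad", "malicious_ad"]

vectorizer = DictVectorizer(sparse=False)
X = vectorizer.fit_transform(train_requests)
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, train_labels)

# Classify a new, unseen request without consulting any filter list.
new_request = {"url_length": 88, "num_params": 5, "third_party": 1, "keyword_ad": 1}
print(model.predict(vectorizer.transform([new_request]))[0])
```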
Article
Full-text available
Advertisements have become commonplace on modern websites. While ads are typically designed for visual consumption, it is unclear how they affect blind users who interact with the ads using a screen reader. Existing research studies on non-visual web interaction predominantly focus on general web browsing; the specific impact of extraneous ad content on blind users’ experience remains largely unexplored. To fill this gap, we conducted an interview study with 18 blind participants; we found that blind users are often deceived by ads that contextually blend in with the surrounding web page content. While ad blockers can address this problem via a blanket filtering operation, many websites are increasingly denying access if an ad blocker is active. Moreover, ad blockers often do not filter out internal ads injected by the websites themselves. Therefore, we devised an algorithm to automatically identify contextually deceptive ads on a web page. Specifically, we built a detection model that leverages a multi-modal combination of handcrafted and automatically extracted features to determine if a particular ad is contextually deceptive. Evaluations of the model on a representative test dataset and ‘in-the-wild’ random websites yielded F1 scores of 0.86 and 0.88, respectively.
... Motivated by analogous theoretical scenarios and more traditional privacy questions, there has recently been increasing interest in empirically examining dependencies between websites [5], [8], [10]. While new questions related to third-parties have emerged with social media [3], the empirical research in this domain has its roots in the long-standing questions related to web advertisements, user tracking, and the never-ending "ad-blocking wars" [11], [12], [13]. This is also the domain to which this paper contributes, with a focus on TCP connections and Finnish websites. ...
... Both lists were pre-processed by transforming the domain names into second-level domains. This commonly used manipulation (e.g., [13]) is both a reasonable simplification and a necessity due to the DNS-based rankings in the Cisco sample. For instance, google.com ...
... To provide a tentative answer to RQ3, the cross-domain connections initiated are compared against two so-called ad-blocking lists [25], [26], and two lists of the domains owned and used by Facebook and Google [27]. It should be remarked that the ad-blocking lists in particular have many problems, including manual maintenance [11] and variance between lists [12], [13]. The two lists used should still provide a reasonable approximation of the prevalence of advertisement servers among the servers to which new cross-domain TCP connections were initiated. ...
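The pre-processing and list comparison described in these excerpts can be illustrated with a small sketch; the ad-server list and hostnames below are placeholders, and a naive second-level-domain rule is used for brevity:

```python
# Simplified sketch of the pre-processing described above: reduce hostnames to
# second-level domains and check them against a (placeholder) ad-server list.
# Note: a naive "last two labels" rule mishandles suffixes like .co.uk; a public
# suffix list (e.g., the tldextract package) handles those cases properly.
def second_level_domain(hostname: str) -> str:
    labels = hostname.lower().rstrip(".").split(".")
    return ".".join(labels[-2:]) if len(labels) >= 2 else hostname

ad_list = {"doubleclick.net", "googlesyndication.com"}  # stand-in for a real list

connections = ["stats.g.doubleclick.net", "www.google.com", "cdn.example.org"]
for host in connections:
    sld = second_level_domain(host)
    print(f"{host:30} -> {sld:22} ad-listed: {sld in ad_list}")
```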
Preprint
The loading of resources from third-parties has evoked new security and privacy concerns about the current world wide web. Building on the concepts of forced and implicit trust, this paper examines cross-domain transmission control protocol (TCP) connections that are initiated to domains other than the domain queried with a web browser. The dataset covers nearly ten thousand domains and over three hundred thousand TCP connections initiated by querying popular Finnish websites and globally popular sites. According to the results, (i) cross-domain connections are extremely common in the current Web. (ii) Most of these transmit encrypted content, although mixed content delivery is relatively common; many of the cross-domain connections deliver unencrypted content at the same time. (iii) Many of the cross-domain connections are initiated to known web advertisement domains, but a much larger share traces to social media platforms and cloud infrastructures. Finally, (iv) the results differ slightly between the Finnish web sites sampled and the globally popular sites. With these results, the paper contributes to the ongoing work for better understanding cross-domain connections and dependencies in the world wide web.
... Moreover, having a browser-extension ad blocker such as AdBlock may pose fewer difficulties from the network administrator's perspective, since a growing number of client-owned devices such as laptops, tablets, and smartphones connect to the network wirelessly [9]. A study conducted by [15] evaluated the effectiveness of multiple ad-blocking tools in blocking online ads while downloading popular websites; however, that study was limited to the browser-extension technique and did not differentiate the traffic as is done in this study. Other research has studied the effectiveness of available ad blockers, for example by testing the most popular browser extension "AdBlock Plus" and the well-known hardware solution "AdTrap" [14]. ...
... To compare ad-blocking effectiveness, this study created three experimental network scenarios: Scenario 1, a network without ad blocking; Scenario 2, a network with a browser-extension ad blocker; and Scenario 3, a network with DNS-based ad blocking. According to [15] and [16], the best way to test ad-blocking effectiveness is to test against the most popular websites. In each test scenario, this study focuses on three areas of investigation: "HTTP Request == GET", "TCP Connection", and "Bandwidth Consumption". ...
... The framework used in this study is similar to that of previous research [15], which used it to investigate the amount of data generated by advertisements while browsing, as shown in Figure 2. In this experiment, network packet data are used to investigate the amount of data generated by advertisements while browsing. The experiment requires setting up a network environment that is tested under the three scenarios mentioned above. ...
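As a simplified illustration of two of the metrics named above (GET request counts and bandwidth), the sketch below tallies them from a browser-exported HAR capture; the cited study works on raw packet data, so the HAR input and the file names are assumptions made only for illustration:

```python
# Hedged illustration: a HAR export stands in for the packet capture used in the
# study above. The sketch tallies GET requests and response bytes so captures of
# the same site under different scenarios can be compared.
import json

def summarize_har(path: str):
    with open(path, encoding="utf-8") as f:
        entries = json.load(f)["log"]["entries"]
    gets = sum(1 for e in entries if e["request"]["method"] == "GET")
    body_bytes = sum(max(e["response"].get("bodySize", 0), 0) for e in entries)
    return {"get_requests": gets, "response_bytes": body_bytes}

# Usage (placeholder file names, one capture per scenario):
# print(summarize_har("scenario1_no_adblock.har"))
# print(summarize_har("scenario2_extension_adblock.har"))
```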
Article
Full-text available
This paper presents an evaluation of ad-blocking techniques for an enterprise network environment. The study shows the impact of web browsing activity, which is the most active traffic type and consumes the highest inbound bandwidth in an enterprise network. It concludes that DNS-based ad blocking is the best solution for blocking advertisements in an enterprise network compared to a browser-extension ad blocker. Ad blocking also reduces network data requests, as shown by comparing the front-end solution (a browser-extension ad blocker at the client web browser) with network-level ad blocking. Parameters such as HTTP requests, TCP connections, and network bandwidth are examined to measure the effectiveness of blocking online advertisements. Both techniques reduce traffic and bandwidth utilization, but the results show that DNS-based ad blocking is the most effective at blocking online advertisements under the examined parameters. DNS-based ad blocking can sustain web browsing activity on an enterprise network and generate substantial savings on several fronts. The study finds that web browsing accounts for about 50 percent of enterprise network traffic on average, a share that grows as industries move to cloud and web-based consumption. In sectors such as education, web browsing traffic is a form of connectivity that enterprise networks should invest in to support openness and heavy traffic from educational users.
... Most ad-blockers are able to block multiple types of adsincluding search ads appearing as sponsored search results on search engines and display ads appearing on other sites. Numerous researchers have investigated the technical performance of ad-blockers [81,95], and have demonstrated that ad-blockers are highly effective in eliminating online ads and limiting web tracking [5,28,50,72,74,75,109], and in reducing energy consumption on smartphones [20,79,92] and laptops [96]. As discussed in §1, users often deploy ad-blockers to counter privacy and security concerns. ...
... In summary, while a few studies have explored the privacy implications of online advertising tracking [114] or the economic impact of fraudulent ads on the companies' revenues [98], and have quantified ad-blockers' privacy implications [33,109], none have estimated the impact on ad-blocker users' economic welfare and satisfaction. To our knowledge, our study-investigating the impact of ad-blockers on actual Internet users' purchasing behavior, outcomes, and satisfaction-is the first to attempt to bridge the gaps in the existing research on ad-blockers' technical aspects of security, human factors, economic impact, and privacy implications. ...
Conference Paper
Full-text available
Ad-blocking applications have become increasingly popular among Internet users. Ad-blockers offer various privacy- and security-enhancing features: they can reduce personal data collection and exposure to malicious advertising, help safeguard users' decision-making autonomy, reduce users' costs (by increasing the speed of page loading), and improve the browsing experience (by reducing visual clutter). On the other hand, the online advertising industry has claimed that ads increase consumers' economic welfare by helping them find better, cheaper deals faster. If so, using ad-blockers would deprive consumers of these benefits. However, little is known about the actual economic impact of ad-blockers. We designed a lab experiment (N=212) with real economic incentives to understand the impact of ad-blockers on consumers' product searching and purchasing behavior, and the resulting consumer outcomes. We focus on the effects of blocking contextual ads (ads targeted to individual, potentially sensitive, contexts, such as search queries in a search engine or the content of web pages) on how participants searched for and purchased various products online, and the resulting consumer welfare. We find that blocking contextual ads did not have a statistically significant effect on the prices of products participants chose to purchase, the time they spent searching for them, or how satisfied they were with the chosen products, prices, and perceived quality. Hence we do not reject the null hypothesis that consumer behavior and outcomes stay constant when such ads are blocked or shown. We conclude that the use of ad-blockers does not seem to compromise consumer economic welfare (along the metrics captured in the experiment) in exchange for privacy and security benefits. We discuss the implications of this work in terms of end-users' privacy, the study's limitations, and future work to extend these results.
... Browser Extensions: Browser extensions vary in effectiveness and can be divided into (1) ad-blocking extensions that limit ads from being loaded, such as Adblock Plus, and (2) tracker-blocking extensions that focus on blocking trackers, such as Ghostery, Privacy Badger or Disconnect [51]. In their default settings, these extensions may not effectively block ads and trackers [58] and may need manual configuration for effective protection [80]. Although some extensions have improved their usability, their descriptions were in the past found to be filled with jargon, and it was not easy for users to change settings when a tool interfered with websites [43]. ...
... We note that the study participants may not be aware of the (in-)effectiveness, or the varied effectiveness across PETs, of the tracking protection methods that they use. In particular, some browser extensions may not effectively block ads and trackers [58] and may need manual configuration for effective protection [80]. In reality, malicious behaviour in browser extensions is a serious threat. ...
Article
Full-text available
Online tracking is complex and users find it challenging to protect themselves from it. While the academic community has extensively studied systems and users for tracking practices, the link between the data protection regulations, websites’ practices of presenting privacy-enhancing technologies (PETs), and how users learn about PETs and practice them is not clear. This paper takes a multidimensional approach to find such a link. We conduct a study to evaluate the 100 top EU websites, where we find that information about PETs is provided far beyond the cookie notice. We also find that opting-out from privacy settings is not as easy as opting-in and becomes even more difficult (if not impossible) when the user decides to opt-out of previously accepted privacy settings. In addition, we conduct an online survey with 614 participants across three countries (UK, France, Germany) to gain a broad understanding of users’ tracking protection practices. We find that users mostly learn about PETs for tracking protection via their own research or with the help of family and friends. We find a disparity between what websites offer as tracking protection and the ways individuals report to do so. Observing such a disparity sheds light on why current policies and practices are ineffective in supporting the use of PETs by users.
... While user privacy and security are crucial, even ads that are safe and not tracking users can have a significant performance impact that has cascading effects on user satisfaction and Internet costs. Some notable studies [30,36,50,52,56] lean on ad blockers to measure the performance cost of web ads. The key distinction between our approach and prior efforts is that we do not rely on ad blockers and content-blocking for performance analysis of ads for three main reasons: ...
... In other related studies [50,56], authors deploy ad blockers in the wild and then use passive measurements on the traces to characterize the network traffic. Both studies report 17-18% of the network requests to belong to adverts, which is close to our numbers. ...
Article
Full-text available
Monetizing websites and web apps through online advertising is widespread in the web ecosystem, creating a billion-dollar market. This has led to the emergence of a vast network of tertiary ad providers and ad syndication to facilitate this growing market. Nowadays, the online advertising ecosystem forces publishers to integrate ads from these third-party domains. On the one hand, this raises several privacy and security concerns that have been actively studied in recent years. On the other hand, the ability of today's browsers to load dynamic web pages with complex animations and Javascript has also transformed online advertising. This can have a significant impact on webpage performance. The latter is a critical metric for optimization since it ultimately impacts user satisfaction. Unfortunately, there are few studies on understanding the performance impact of online advertising, which we argue is as important as privacy and security. In this paper, we present an in-depth, first-of-its-kind performance evaluation of web ads. Unlike prior efforts that rely primarily on adblockers, we perform a fine-grained analysis of the web browser's page loading process to demystify the performance cost of web ads. We aim to characterize the cost of every component of an ad, so the publisher, ad syndicate, and advertiser can improve the ad's performance with detailed guidance. For this purpose, we develop a tool, adPerf, for the Chrome browser that classifies page loading workloads into ad-related and main-content at the granularity of browser activities. Our evaluations show that online advertising entails more than 15% of browser page loading workload and approximately 88% of that is spent on JavaScript. On smartphones, this additional cost of ads is 7% lower since mobile pages include fewer and better-optimized ads. We also track the sources and delivery chain of web ads and analyze performance considering the origin of the ad contents. We observe that two well-known third-party ad domains contribute 35% of the ads' performance cost and, surprisingly, top news websites implicitly include unknown third-party ads which in some cases account for more than 37% of the ads' performance cost.
... Ad-blocking works using a filter list, which lists the locations of known publications [19]. Each ad-blocker is driven by a filter list, which is a set of syntactic matching rules to block (or allow) the URL's loading by the browser [20]. Ad-blockers can embed filter lists as part of their implementation or enable users to load one or several public filter lists into the tool [20]. ...
... Each ad-blocker is driven by a filter list, which is a set of syntactic matching rules to block (or allow) the URL's loading by the browser [20]. Ad-blockers can embed filter lists as part of their implementation or enable users to load one or several public filter lists into the tool [20]. EasyList and EasyPrivacy are two of the most used filter lists for blocking advertisements and trackers [13]. ...
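A toy sketch of how such a list-based blocker consults its rules follows; real EasyList syntax is far richer (rule options, exception rules, element hiding), so this only handles a domain-anchor rule and a plain substring rule:

```python
# Minimal sketch of how a list-based blocker consults its filter rules. Real
# EasyList syntax is much richer; this toy matcher handles only the ||domain^
# anchor form and plain substring rules.
import re

def rule_to_regex(rule: str) -> re.Pattern:
    if rule.startswith("||"):
        # ||example.com^  ->  match the domain at a host boundary
        domain = re.escape(rule[2:].rstrip("^"))
        return re.compile(r"^https?://([^/]*\.)?" + domain + r"[/:]?")
    return re.compile(re.escape(rule))      # plain substring rule

filters = [rule_to_regex(r) for r in ["||doubleclick.net^", "/banner_ads/"]]

def should_block(url: str) -> bool:
    return any(f.search(url) for f in filters)

print(should_block("https://ad.doubleclick.net/ddm/adj/123"))   # True
print(should_block("https://example.com/banner_ads/img.png"))   # True
print(should_block("https://example.com/index.html"))           # False
```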
... Therefore, new trackers or other third-party content are not captured or blocked by these extensions. For example, in [73], the authors demonstrated that existing extensions could not catch requests that utilize a DNS CNAME alias to prevent detection. Our experiments also confirm this technical issue. ...
... Moreover, in [42], the authors provide insight into the landscape of tracker-blocking tools. Wills and Uzunoglu [73] explore the effectiveness of ad-blocking tools. In [31], Iqbal et al. study a graph-based machine learning approach to blocking ads and trackers. ...
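The CNAME-cloaking issue noted in [73] can be checked for with a short DNS lookup. The sketch below uses the dnspython package with a placeholder blocklist and hostname; it illustrates the technique only and is not the cited authors' tooling:

```python
# Sketch of the CNAME-cloaking issue noted above: a first-party subdomain aliases
# to a known tracking domain, which request-level blockers may miss. Uses the
# dnspython package; the blocklist and hostname here are placeholders.
import dns.resolver

TRACKER_DOMAINS = {"trackingprovider.example"}   # stand-in for a real blocklist

def cname_cloaked(hostname: str) -> bool:
    try:
        answers = dns.resolver.resolve(hostname, "CNAME")
    except (dns.resolver.NoAnswer, dns.resolver.NXDOMAIN):
        return False
    for rdata in answers:
        target = str(rdata.target).rstrip(".")
        if any(target == d or target.endswith("." + d) for d in TRACKER_DOMAINS):
            return True
    return False

# Example (placeholder hostname):
# print(cname_cloaked("metrics.some-news-site.example"))
```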
Article
Full-text available
We introduce a novel approach to protecting the privacy of web users. We propose to monitor the behaviors of JavaScript code within a web origin based on the source of the code, i.e., code origin, to detect and prevent malicious actions that would compromise users’ privacy. Our code-origin policy enforcement approach not only advances the conventional same-origin policy standard but also goes beyond the “all-or-nothing” contemporary ad-blockers and tracker-blockers. In particular, our monitoring mechanism does not rely on browsers’ network request interception and blocking as in existing blockers. In contrast, we monitor the code that reads or sends user data sent out of the browser to enforce fine-grained and context-aware policies based on the origin of the code. We implement a proof-of-concept prototype and perform practical evaluations to demonstrate the effectiveness of our approach. Our experimental results evidence that the proposed method can detect and prevent data leakage channels not captured by the leading tools such as Ghostery and uBlock Origin. We show that our prototype is compatible with major browsers and popular real-world websites with promising runtime performance. Although implemented as a browser extension, our approach is browser-agnostic and can be integrated into the core of a browser as it is based on standard JavaScript.
... Such requests, most often to thirdparties, are first classified according to the type of elements they realize; whether advertisements, analytics, beacons, social-media, or functional widgets. The largest proportion of such requests (40-50%) are made to the first group, on which this module focuses, which includes ad and ad-tracking services [43]. This module determines which requests to block and which to allow, and distinguishes, in the latter category, between those that yield visual elements and those used only for tracking. ...
... In order to categorize such requests, we leverage the capabilities of the open-source uBlock-Origin [17] project, a configurable, list-based "blocker" that is effective and efficient [43]. Like other blockers, uBlock allows users to specify publicly accessible lists of resources which contain syntactic matching rules for the retrieval of web resources. ...
Conference Paper
Full-text available
The strategy of obfuscation has been broadly applied—in search, location tracking, private communication, anonymity—and has thus been recognized as an important element of the privacy engineer’s toolbox. However there remains a need for clearly articulated case studies describing not only the engineering of obfuscation mechanisms but providing a critical appraisal of obfuscation’s fit for specific socio-technical applications. This is the aim of our paper, which presents our experiences designing, implementing, and distributing AdNauseam, an open-source browser extension that leverages obfuscation to frustrate tracking by online advertisers.
... In this paper, we present a comprehensive study of the advertisement and tracking ecosystem with special emphasis on the performance gains resulting from using content-blockers when surfing the Internet. Some previous works have already analyzed and compared the effectiveness of existing contentfiltering systems in terms of blocking performance [3], [4]. However, to the best of our knowledge, this is the first work to compare content-blockers in terms of bandwidth and latency improvements with a large and diverse set of websites. ...
... In [4], Wills et al. compare different ad-blockers and tracker-blockers, studying the success rate of their methods at detecting and blocking the content of different third-party trackers. Traverso et al. [3] study the behavior of seven different content-filtering plugins on a set of 100 specific websites classified into different popular categories. ...
Preprint
With the evolution of the online advertisement and tracking ecosystem, content-filtering has become the reference tool for improving security, privacy and the browsing experience when surfing the Internet. It is also commonly believed that using content-blockers to stop unsolicited content decreases the time needed for loading websites. In this work, we perform a large-scale study with the 100K most popular websites on the actual performance improvements of using content-blockers. We focus our study on the two metrics most relevant to user experience: bandwidth and latency. Our results show that using such tools results in small improvements in terms of bandwidth usage but, contrary to popular belief, has a negligible impact in terms of loading time. We also find that, in the case of small and fast-loading websites, the use of content-blockers can even result in increased browsing latency.
... They are very effective at reducing third-party tracking [36]. Third party tracking scripts are classified into such categories as: ad trackers, analytics, beacons, social, and widgets [37]. ...
... There are different reasons for blocking web advertisements. Research shows that users' preference for the privacy of their data and the confidentiality of their online activities is the most important factor [35]; for this reason, personalized advertisements are perceived as a threat, and advertising used in ways unknown to the user can carry a risk of additional costs (e.g., consuming data transmission packages), which is the main reason users simply block ads [36], [37]. Therefore, it is necessary to find a proper strategy, based on the investigated reasons, to stop ad blocking in order to make the development of e-business sustainable. ...
Article
Full-text available
The article shows the main factors of adblocking software usage. The study was based on data obtained by a web questionnaire. The research was focused on evaluation of adblocking software usage factors in five categories: (1) gender, age, and education; (2) use of advertising and sources of knowledge about advertising; (3) technical and social reasons for blocking online advertisements; (4) usage of an adblock-wall; and (5) type of online advertisement. An evaluation of adblock usage factors revealed four main technical reasons for adblock usage connected with website technology and web development problems – interruption, amount of ads, speed, and security; and one social reason for adblock usage, namely, the problem of privacy.
... Therefore, new trackers or other thirdparty content are not captured or blocked by these extensions. For example, in [64], the authors demonstrated that existing extensions could not catch requests that utilize a DNS CNAME alias in order to prevent detection. Our experiments also confirm this technical issue. ...
... Moreover, in [36], the authors provide insight into the landscape of tracker blocker tools. Wills and Uzunoglu [64] explore the effectiveness of ad-blocking tools. ...
Chapter
We introduce a novel approach to implementing a browser-based tool for web users to protect their privacy. We propose to monitor the behaviors of JavaScript code within a webpage, especially operations that can read data within a browser or send data from a browser to the outside. Our monitoring mechanism is designed to ensure that all potential information leakage channels are detected. The detected leakage is either automatically prevented by our context-aware policies or decided by the user if needed. Our method advances the conventional same-origin policy standard of the Web by enforcing different policies for each source of the code. Although we develop the tool as a browser extension, our approach is browser-agnostic as it is based on standard JavaScript. Our method also stands apart from existing proposals in industry and the literature; in particular, it does not rely on the network request interception and blocking mechanisms provided by browsers, which face various technical issues.
... One of the main goals is to provide a lighthouse for the research and industry communities regarding systems used by hundreds of millions. We note that a considerable amount of research depends on EasyList for its evaluation measurements [13,31,64,87,90]; a lack of understanding of EasyList's limitations may lead to inaccurate results. ...
... As to filter lists, Wills and Uzunoglu [87] have studied the third-party domains of filter lists and suggested ways to improve ad-blocking tools to prevent requests to different types of these domains. The whitelist of EasyList was studied by Walls et al. [82]. ...
Conference Paper
Full-text available
Ad-blocking systems such as Adblock Plus rely on crowdsourcing to build and maintain filter lists, which are the basis for determining which ads to block on web pages. In this work, we seek to advance our understanding of the ad-blocking community as well as the errors and pitfalls of the crowdsourcing process. To do so, we collected and analyzed a longitudinal dataset covering the dynamic changes of the popular filter list EasyList over nine years, along with the error reports submitted by the crowd in the same period. Our study yielded a number of significant findings regarding the characteristics of FP and FN errors and their causes. For instance, we found that false positive errors (i.e., incorrectly blocking legitimate content) still took a long time to be discovered (50% of them took more than a month) despite the community effort. Both EasyList editors and website owners were to blame for the false positives. In addition, we found that a great number of false negative errors (i.e., failing to block real advertisements) were either incorrectly reported or simply ignored by the editors. Furthermore, we analyzed evasion attacks from ad publishers against ad-blockers. In total, our analysis covers 15 types of attack methods, including 8 that have not been studied by the research community. We show how ad publishers have utilized them to circumvent ad-blockers and empirically measure the reactions of ad blockers. Through in-depth analysis, our findings are expected to help shed light on future work to evolve ad blocking and optimize crowdsourcing mechanisms.
... As noted previously, the strategies advertisers and publishers use to force ads on ad block users include using ad block walls, which deny users of ad blocking software access to a website (Wills and Uzunoglu 2016); disguising ads (i.e., native ads) to avoid detection by ad blocking software (Campbell and Evans 2018), which is comparable to content integrated advertising (De Haan, Wiesel, and Pauwels 2016); or paying a fee to the ad block developer to have ads appear (i.e., whitelisting) (Pujol, Hohlfeld, and Feldmann 2015). Whitelisting often also requires publishers and advertisers to meet specific criteria of having acceptable advertising; for example, "advertisements must be transparent about being ads, must be appropriate to the site they're being served on, and must not distort or disrupt the page content, among other criteria" (Geuss 2015). ...
Article
Full-text available
A growing group of consumers uses ad-blocking software, preventing advertisers from reaching them and resulting in a loss of ad revenue for publishers. Ways to resolve this issue include blocking these users, disguising ads, or paying the developer of the ad blocker so that ads will not be blocked. The question is to what extent these solutions are effective and desired. This study uses an experimental setup followed by an extensive survey to answer this question. The findings show that, when banner ads are forced on ad blocker users, these users (vs. ad blocker nonusers) spend 10%-20% less time on the web page, evaluate the website as worse, and pay less attention to the banners, while the ads are 190% more effective for ad blocker nonusers. Thus, ad blocking serves as a self-filtering mechanism that filters out consumers who are less responsive to advertising. Ad blockers thus help advertisers target the right consumers and increase the value of the remaining ad slots for publishers. Moreover, ad blocker users are more likely to pay for ad-free content, offering publishers an alternative business model for these consumers.
... Moreover, Wills and Uzunoglu provided in [7] a comprehensive study evaluating the effectiveness of existing anti-tracking methods in terms of detecting and blocking various types of third-party resources. They also described how third-party resources are identified and classified according to several defined categories. ...
Article
Full-text available
Personal data are strongly linked to web browsing history. By visiting a certain website, a user can share her favorite items, location, employment status, financial information, preferences, gender, medical status, news, etc. Therefore, web tracking is considered one of the most significant internet privacy threats and can have a serious impact on end-users. It is usually used by most websites to track visitors across the internet in order to enhance their services and improve search customization, and also to sell users' data to advertising companies without their permission. Although many research efforts have focused on third-party tracking to protect user privacy, there is still no comprehensive approach for developing an efficient and accessible privacy protection method, even as more attention is paid to the topic. The main goal of this paper is to conduct a literature review of the web-tracking domain and possible privacy-defending methods by presenting an overview of privacy issues, determining the tracking mechanisms that might be exploited, discussing the available privacy defense tools that could be utilized for improvement, and presenting the strengths and weaknesses of each method.
... According to PageFair, more than 700 million people were using ad blockers at the end of 2019 [36]. These tools generally leverage filter lists that are usually created by a community of users [47]. As trackers frequently change their behavior to evade detection and continuously develop new tracking techniques, the filter lists and rulesets used for detecting trackers and advertisers have to be updated regularly to remain effective. ...
Preprint
Full-text available
Websites use third-party ads and tracking services to deliver targeted ads and collect information about the users that visit them. These services put users' privacy at risk, and that is why users' demand for blocking these services is growing. Most blocking solutions rely on crowd-sourced filter lists that are built and maintained manually by a large community of users. In this work, we seek to simplify the update of these filter lists through automatic detection of hidden advertisements. Existing tracker detection approaches generally focus on each individual website's URL patterns, code structure and/or DOM structure. Our work differs from existing approaches by combining different websites through a large-scale graph connecting all resource requests made over a large set of sites. This graph is then used to train a machine learning model, through graph representation learning, to detect ads and tracking resources. As our approach combines different sources of information, it is more robust toward evasion techniques that use obfuscation or change usage patterns. We evaluate our work over the Alexa top-10K websites and find its accuracy to be 90.9%; it can also block new ads and tracking services that existing crowd-sourced filter lists would otherwise need to be updated to block. Moreover, the approach followed in this paper sheds light on the ecosystem of third-party tracking and advertising.
... Social media platforms themselves snatch a large share of online advertising because advertisers believe that social media outlets outperform online news media compared to the ad's cost (Adstartr, 2020;Smith, 2013). The growing use of ad-blocker tools discourages companies from banner advertising and puts downward pressure on what news organizations can charge for them (Wills & Uzunoglu, 2016). All this has led to Nepali online news media turning away from banner advertising as a reliable part of their business model. ...
Chapter
Full-text available
Internet-based news media, also known as online news media, have a relatively short but fascinating history. This newly evolved medium has brought multitudinous innovations and opportunities to the media and communication industry and has redefined how mass and public communication works. Despite these many exciting aspects, challenges to rising and thriving in this youngest medium have surfaced. The question of sustainability, more specifically of having an appropriate business model, is considered a major one. In this chapter, we review existing business models in online news media across the world and connect their relevance to the context of Nepal. We further analyze unique trends and practices followed by Nepalese online outlets and discuss them from the perspective of financial sustainability.
... Client-side tools in the form of browser extensions became popular mainly because of the difficulties of setting and preserving opt-out cookies [16] and the fact that the Do-Not-Track HTTP header is mostly ignored by websites [17]. Some of the most popular client-side tools exhibit high variation in effectiveness when preventing requests to third-party domains [18], others are not able to fully block fingerprinting services [19], and they lack effective protection on mobile devices. In addition, in [20] the authors showed significant usability problems in some popular privacy tools. ...
... An example of such a general database (and accompanying tools, e.g. web browser extensions) is the Adblock Plus tool, which allows for the management of multiple blacklisted (forbidden) and whitelisted (allowed) sources, including external community-created lists such as EasyList (Wills and Uzunoglu 2016). ...
Preprint
Full-text available
The move of propaganda and disinformation to the online environment has been made possible by the fact that, within the last decade, digital information channels radically increased in popularity as a news source. The main advantage of such media lies in the speed of information creation and dissemination. This, on the other hand, inevitably adds pressure, accelerating editorial work, fact-checking, and the scrutiny of source credibility. In this chapter, an overview of computer-supported approaches to detecting disinformation and manipulative techniques based on several criteria is presented. We concentrate on the technical aspects of automatic methods which support fact-checking, topic identification, text style analysis, or message filtering on social media channels. Most of the techniques employ artificial intelligence and machine learning with feature extraction combining available information resources. The following text first specifies the tasks related to computer detection of manipulation and disinformation spreading. The second section presents concrete methods for solving these analysis tasks, and the third section lists current verification and benchmarking datasets published and used in this area for evaluation and comparison.
... Pujol et al. [40] find that use of ad-blocking technology is widespread, although it seems to be rooted more in annoyance towards ads than privacy concerns. Wills and Uzunoglu [45] investigate differences among ad-blockers. Gomer et al. [24] analyze the exposure of users to tracking cookies specifically for search. ...
... Several ad-blocking tools, such as Ghostery [19], NoScript [37], Adblock [1], Adblock Plus [2], Disconnect [10], Privacy Badger [39], etc., have been developed. These tools generally leverage filter lists that are frequently created through crowd-sourcing by a community of users [51]. Ad trackers are detected through matching rules defined over features that can be observed in a browser. ...
Conference Paper
Full-text available
Websites use third-party ads and tracking services to deliver targeted ads and collect information about users that visit them. These services put users' privacy at risk, and that is why users' demand for blocking these services is growing. Most of the blocking solutions rely on crowd-sourced filter lists manually maintained by a large community of users. In this work, we seek to simplify the update of these filter lists by combining different websites through a large-scale graph connecting all resource requests made over a large set of sites. The features of this graph are extracted and used to train a machine learning algorithm with the aim of detecting ads and tracking resources. As our approach combines different information sources, it is more robust toward evasion techniques that use obfuscation or changing usage patterns. We evaluate our work over the Alexa top-10K websites and find its accuracy to be 96.1% biased and 90.9% unbiased, with high precision and recall. It can also block new ads and tracking services that existing crowd-sourced filter lists would otherwise need to be updated to block. Moreover, the approach followed in this paper sheds light on the ecosystem of third-party tracking and advertising.
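A rough sketch of the graph-based idea in this abstract, with illustrative nodes, features, and labels rather than the authors' actual feature set:

```python
# Hedged sketch of the graph-based idea described above: resource requests across
# many sites form one large graph, and simple structural features of each resource
# node feed a classifier. Node names, features, and labels here are illustrative.
import networkx as nx
from sklearn.linear_model import LogisticRegression

G = nx.DiGraph()
# Edges: (page, requested resource)
G.add_edges_from([
    ("news.example", "tracker.example/pixel"),
    ("shop.example", "tracker.example/pixel"),
    ("blog.example", "tracker.example/pixel"),
    ("news.example", "news.example/style.css"),
])

def node_features(resource):
    # A widely requested resource (high in-degree) is more likely to be a tracker.
    return [G.in_degree(resource), G.out_degree(resource)]

labeled = {"tracker.example/pixel": 1, "news.example/style.css": 0}
X = [node_features(r) for r in labeled]
y = list(labeled.values())
model = LogisticRegression().fit(X, y)

print(model.predict([node_features("tracker.example/pixel")]))  # [1] -> tracking resource
```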
... An example of such a general database (and accompanying tools, e.g. web browser extensions) is the Adblock Plus tool, which allows for the management of multiple blacklisted (forbidden) and whitelisted (allowed) sources, including external community created lists, such as EasyList (Wills and Uzunoglu 2016). Intensive efforts in source certification widen the simple address-based judgement with transparency rules for best practices, covering citation and references work, reporter expertise, or other trust indicators. ...
Book
Disinformation has recently become a salient issue, not just for researchers but for the media, politicians, and the general public as well. Changing circumstances are a challenge for system and societal resilience; disinformation is also a challenge for governments, civil society, and individuals. Thus, this book focuses on the post-truth era and the online environment, which has changed both the ways and forms in which disinformation is presented and spread. The volume is dedicated to the complex processes of understanding the mechanisms and effects of online propaganda and disinformation, its detection and reactions to it in the European context. It focuses on questions and dilemmas from political science, security studies, IT, and law disciplines with the aim to protect society and build resilience against online propaganda and disinformation in the post-truth era. Miloš Gregor is Assistant Professor in the Department of Political Science, Masaryk University, Czech Republic. He focuses on political marketing, communication, and analysis of propaganda and disinformation, specifically on the manipulative techniques and narratives deployed to persuade the audience. Petra Mlejnková is Assistant Professor in the Department of Political Science, Masaryk University, Czech Republic. She focuses on security studies, the Far Right, and the analysis of propaganda and disinformation, specifically from a security perspective—how these phenomena affect national and international security.
... Comprehensive studies [255,227,25,58,256] compare ad blocking tools such as Ghostery, Adblock Plus or EasyList, focusing only on the overall performance of blocking tracking elements. Other studies on various web tracking techniques like cookies, localStorage and Flash were performed earlier in [13,57,150]. ...
Thesis
Full-text available
Recording users' internet activity and linking it to personal data has become a key resource for many paid and free services on the Web. These services include web applications, such as Google's maps/navigation or web search, that are used free of charge every day, as well as websites that provide news or general information on various topics, usually free of charge. By visiting and using these web services, all information processed within the service is passed on to the service provider. This comprises not only profile data stored in the user account, such as name or address, but also the user's activity within the service, such as clicked links or time spent. Beyond that, countless third parties, mostly embedded invisibly in these web services, record and analyze user behavior across the user's entire web activity, spanning websites. Various techniques, usually hidden from the user, are employed to closely track users' online behavior and collect large amounts of sensitive data. This practice is known as web tracking and is used mainly by advertising companies. The collected data are often personal and a valuable resource for companies, for example to serve personalized advertising matched to the user's profile. The use of such personal data, however, also has broader consequences, reflected for instance in price adjustments for users with certain profile attributes, such as the use of expensive devices. The goal of this thesis is to improve users' privacy on the internet and to significantly reduce web tracking. Four challenges arise, each forming a research focus of this work: (1) a systematic analysis and classification of the tracking techniques in use, (2) an examination of existing protection mechanisms and their weaknesses, (3) the design of a reference architecture for protection against web tracking, and (4) the design of an automated test environment under real-world conditions to evaluate the reduction of web tracking achieved by the developed protection measures. Each of these research foci provides new contributions toward the overarching goal: the development of protection measures against the disclosure of sensitive user data on the internet. The first scientific contribution of this dissertation is a comprehensive evaluation of the web tracking techniques and methods in use, as well as their dangers, risks, and implications for the privacy of internet users. The evaluation additionally covers existing tracking protection mechanisms and their weaknesses. The insights gained are decisive for the new approaches developed in this work and improve the previously insufficient protection against web tracking. The second scientific contribution is the development of a robust classification of web tracking, the design of an efficient architecture for long-term studies of web tracking, and an interactive visualization of the occurrence of web tracking on the internet.
The new classification approach for identifying tracking is based on measuring the entropy of the information content of cookies. The results of the long-term web tracking studies include 1,209 identified tracking domains on the most visited websites in Germany; within the top 25 websites, an average of 45 tracking elements per site was found. The tracker with the highest potential for building a user profile was doubleclick.com, as it monitors 90% of the websites. The evaluation of the examined tracking network further provided detailed insight into tracking via redirect links. For this, we analyzed 1.2 million HTTP traces from months-long crawls of the 50,000 internationally most visited websites. The results show that 11.6% of these websites use HTTP redirects, hidden in page links, for tracking. This is used to route the user's navigation after a click through a chain of (tracking) servers, which are usually not visible, before the intended link target is loaded. In this scenario the tracker captures valuable connection metadata about the content, topic, or user interests of the website. We provide the visualization of the tracking ecosystem as an interactive open-source web tool. The third scientific contribution of this dissertation is the design of two novel protection mechanisms against web tracking and the construction of an automated simulation environment under real-world conditions to verify the effectiveness of the implementations. The focus is on the two most widely used tracking techniques: cookies (where a unique ID is stored on the user's device) and browser fingerprinting. The latter describes a method of collecting a multitude of device properties in order to uniquely (re-)identify the user without storing a unique ID on the device. To examine the effectiveness of the protection mechanisms against web tracking developed in this work, we implemented and evaluated the protection concepts directly in the Chromium browser. The result shows a successful reduction of web tracking by 44%. In addition, the "Site Isolation" concept developed in this work improves the privacy of the private browsing mode, allows setting a manual storage time limit for cookies, and protects the browser against various threats such as CSRF (cross-site request forgery) or CORS (cross-origin resource sharing). Site Isolation stores the state of each website in separate containers and can thereby prevent various tracking methods such as cookies, localStorage, or redirect tracking. In an evaluation of 1.6 million web pages, we showed that the tracker doubleclick.com has the highest potential to track the user and is present on 25% of the 40,000 internationally most visited websites. Finally, we demonstrate robust browser fingerprinting protection in our extended Chromium browser. Testing our prototype over 70,000 browser sessions shows that our browser protects the user against browser fingerprinting tracking. Compared with five other browser fingerprinting tools, our prototype achieved the best results and is the first protection mechanism against both Flash and canvas fingerprinting.
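The cookie-entropy classification mentioned above can be illustrated with a short Shannon-entropy check; the threshold and cookie values below are illustrative only, not those used in the thesis:

```python
# Minimal sketch of the entropy idea mentioned above: high-entropy cookie values
# look like unique identifiers, low-entropy values like simple settings. The 3.0
# threshold is illustrative, not the one used in the cited thesis.
import math
from collections import Counter

def shannon_entropy(value: str) -> float:
    counts = Counter(value)
    total = len(value)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

for cookie in ["lang=en", "uid=9f8a3c47d2e14b06a7", "theme=dark"]:
    name, _, val = cookie.partition("=")
    flag = "likely identifier" if shannon_entropy(val) > 3.0 else "likely benign"
    print(f"{name:6} entropy={shannon_entropy(val):.2f}  {flag}")
```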
... All other advertising is blocked. The effectiveness of ad-blockers varies considerably [29]. Other work found a set of web traffic features that identify services that intrude on privacy [30]. ...
Article
Full-text available
The evolution of digital advertising, which is aimed at a mass audience, to programmatic advertising, which is aimed at individual users depending on their profile, has raised concerns about the use of personal data and invasion of user privacy on the Internet. Concerned users install ad-blockers that prevent users from seeing ads and this has resulted in many companies using anti-ad-blockers. This study investigates the sociological variables that make users feel that advertising is annoying and then decide to use ad-blockers to avoid it. Our results provide useful information for companies to appropriately segment user profiles. To do this, data collected from Internet users (n = 19,973) about what makes online advertising annoying and why they decide to use ad-blockers are analyzed. First, the existing literature on the subject was reviewed and then the relevant sociological variables that influence users’ feelings about online advertising and the use of ad-blockers were investigated. This work contributes new information to the discussion about user privacy on the Internet. Some of the key findings suggest that Internet advertising can be very intrusive for many users and that all the variables investigated, except marital status and education, influence the users’ opinions. It was also found that all the variables in this study are important when a user decides to use an ad-blocker. A clear and inverse correlation between age and opinion about advertising as annoying could be seen, along with a clear difference of opinion due to gender. The results suggest that users without children use ad-blockers the least, while retirees and housewives use them the most.
... • 596 respondents (77.0%) use ad-blocking software:
  ○ 12.9% of users configure the program on their own by adding filter lists and turning on additional functions;
  ○ 87.1% of ad-block users actively turn it on and off (compare with Wills & Uzunoglu, 2016).
• 21.3% do not use an ad-blocking program, and 1.7% do not know about ad-blocking programs. ...
Article
Background: The aim of the paper is to diagnose the impact of the design thinking (DT) approach on the decision-making process. The paper is therefore a contribution to the sustainable development of the online advertising market.
Methods: The basic research problem is verification of the thesis that applying the DT methodology in the implementation of projects supports creativity, streamlines project decision-making, and allows better detection of the causes of observed problems.
Results: The result of the theoretical research is a decision model of the online advertising user. The result of the empirical research is the process of creating a prototype infographic devoted to the issue of ad blocking.
Conclusions: Implementation of various types of projects should be carried out using practical methods such as DT. They strengthen creativity and a creative approach to problem-solving, which results in the creation of solutions that are much better in terms of usefulness.
... Comprehensive studies [31,37,50,53,88] compare ad-blocking tools such as Ghostery, Adblock Plus or EasyList, focusing only on the overall performance of blocking tracking elements. Other studies on various web tracking techniques like cookies, localStorage, and Flash were performed earlier in [2,69,82]. ...
Article
Full-text available
In today’s web, gathering information on users’ online behavior plays a major role. Advertisers use different tracking techniques that invade users’ privacy by collecting data on their browsing activities and interests. To prevent this threat, various privacy tools are available that try to block third-party elements. However, there exist various tracking techniques that are not covered by those tools, such as redirect link tracking. Here, tracking is hidden in ordinary website links pointing to further content. By clicking those links, or through automatic URL redirects, the user is redirected through a chain of potential tracking servers not visible to the user. In this scenario, the tracker collects valuable data about the content, topic, or user interests of the website. Additionally, the tracker sets not only third-party but also first-party tracking cookies, which are far more difficult to block by browser settings and ad-block tools. Since the user is forced to follow the redirect, tracking is inevitable and a chain of (redirect) tracking servers gains more insight into the user’s behavior. In this work we present the first large-scale study of the threat of redirect link tracking. By crawling the Alexa top 50k websites and following up to 34 page links, we recorded traces of HTTP requests from 1.2 million individual website visits and analyzed 108,435 redirect chains originating from links clicked on those websites. We evaluate the derived redirect network for its tracking ability and demonstrate that top trackers are able to identify the user on the most visited websites. We also show that 11.6% of the scanned websites use one of the top 100 redirectors, which are able to store non-blocked first-party tracking cookies on users’ machines even when third-party cookies are disabled. Moreover, we present the effect of various browser cookie settings, resulting in a privacy loss even when using third-party blocking tools.
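Redirect chains like those studied here can be inspected with a few lines using the requests library; the URL below is a placeholder, and this is only a sketch of the inspection step, not the paper's crawler:

```python
# Simple sketch of inspecting a redirect chain: response.history holds each
# intermediate hop, so the domains a click is routed through can be listed.
import requests
from urllib.parse import urlparse

def redirect_chain(url: str):
    response = requests.get(url, allow_redirects=True, timeout=10)
    hops = [r.url for r in response.history] + [response.url]
    return [urlparse(h).netloc for h in hops]

# Example (placeholder URL): each listed host had a chance to set cookies.
# print(redirect_chain("https://link.example/click?id=123"))
```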
... According to PageFair, more than 700 million people were using ad blockers at the end of 2019 [36]. These tools generally leverage filter lists that are usually created by a community of users [47]. As trackers frequently change their behavior to evade detection and continuously develop new tracking techniques, the filter lists and rulesets used for detecting trackers and advertisers have to be updated regularly to remain effective. ...
Preprint
Full-text available
Websites use third-party ads and tracking services to deliver targeted ads and collect information about the users that visit them. These services put users' privacy at risk, which is why users' demand for blocking them is growing. Most blocking solutions rely on crowd-sourced filter lists manually maintained by a large community of users. In this work, we seek to simplify the update of these filter lists by combining different websites through a large-scale graph connecting all resource requests made over a large set of sites. Features of this graph are extracted and used to train a machine learning algorithm with the aim of detecting ads and tracking resources. As our approach combines different information sources, it is more robust toward evasion techniques that use obfuscation or change usage patterns. We evaluate our work over the Alexa top-10K websites and find its accuracy to be 96.1% in the biased setting and 90.9% in the unbiased setting, with high precision and recall. It can also block new ads and tracking services, which existing crowd-sourced filter lists would only block after further manual updates. Moreover, the approach followed in this paper sheds light on the ecosystem of third-party tracking and advertising.
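A toy sketch of the general approach (a cross-site request graph, simple structural features per third-party domain, and a supervised classifier) is shown below. The edge list, the two features, and the labels are illustrative assumptions, not the paper's actual graph construction or feature set.

# Illustrative sketch only: build a small cross-site request graph, derive a few
# structural features per requested domain, and train a classifier on them.
# The features and toy labels are placeholders, not the paper's actual design.
import networkx as nx
from sklearn.ensemble import RandomForestClassifier

# (site, requested_domain) edges as a crawler might record them.
observed_requests = [
    ("news.example", "cdn.example"),
    ("news.example", "tracker-a.example"),
    ("shop.example", "tracker-a.example"),
    ("shop.example", "cdn.example"),
    ("blog.example", "tracker-a.example"),
]

graph = nx.Graph()
graph.add_edges_from(observed_requests)
centrality = nx.degree_centrality(graph)
third_parties = sorted({dst for _, dst in observed_requests})

def features(domain: str):
    # How many distinct first-party sites request this domain, plus its degree
    # centrality in the graph: trackers tend to be embedded almost everywhere.
    sites = {src for src, dst in observed_requests if dst == domain}
    return [len(sites), centrality[domain]]

X = [features(d) for d in third_parties]
y = [0, 1]  # toy ground truth: cdn.example is benign, tracker-a.example tracks

model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
print(dict(zip(third_parties, model.predict(X))))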
... The development of information technology has also enabled tools that block advertisements appearing on the internet. Wills & Uzunoglu (2016) reported that the ad-blocking tool AdBlock Plus has become one of the favorite ad-blocking tools on the App Store, holding an 85% market share. This shows that internet users continually seek an easy way to enjoy a comfortable, secure browsing experience by blocking the ads that periodically appear on screen. ...
Article
Full-text available
ABSTRACT The act of avoiding advertisements, or the perceived inevitability of them, stems from simple behaviors ranging from leaving the room while an ad is displayed to changing the program channel. Internet users avoid advertisements for three factors: first, internet media is intended as a medium oriented toward specific goals other than entertainment; second, internet users feel that recurring advertisements slow down access speeds and download processes; third, advertising on the internet can at any time lead users to click links that appear, whether intentionally or not. This study aims to analyze the determinants of ad-avoidance attitudes on Youtube. The research uses quantitative methods with Structural Equation Modelling - Partial Least Squares (SEM-PLS) in SMART PLS 3.0. Involving 100 samples, it found that the act of avoiding ads is not directly affected by the perception that the user's goal of enjoying Youtube content is being obstructed, nor by user skepticism as a moderating effect; avoiding ads is influenced by advertising clutter while enjoying Youtube content and by previous negative experiences, both directly and through the moderation effect. As a recommendation of this study, the use of more specific sample characteristics (channel owners or devoted followers of favourite hobby channels) could contribute more diverse results. Keywords: inevitability; advertising; scepticism; experience; youtube.
... A number of studies have characterized and measured the ads and tracking ecosystem to protect user privacy and limit intrusive ads, resulting in numerous ad-blocking tools such as Ghostery [4], Adblock [1], Adblock Plus [2], Disconnect [7], and Privacy Badger [6] for the web [32] and mobile platforms [33]. Most of these tools are fueled by community-driven public blacklists (such as EasyPrivacy [16], EasyList [14], FanboyList [17], and hpHosts [18]) and are evaluated for their (in)effectiveness [44], [32]. Although the research work in this domain has been on-going for some time and ad-blocking tools have been developed, we believe these studies typically consist of short-term measurements of specific tracking techniques and the ad-blocking tools have not been comprehensively evaluated over time. ...
... A number of studies have characterized and measured the ads and tracking ecosystem to protect user privacy and limit intrusive ads, resulting in numerous ad-blocking tools such as Ghostery [14], Adblock [2], Adblock Plus [3], Disconnect [6], and Privacy Badger [22] for the web [35] and mobile platforms [36]. Most of these tools are fueled by community-driven public blacklists (such as EasyPrivacy [11], EasyList [9], FanboyList [13], and hpHosts [15]) and are evaluated for their (in)effectiveness [34,35,49]. ...
... Scholars in the technological field research how to develop either new ad-blocking or anti-ad-blocking solutions (Backes, Bugiel, Styp-Rekowsky, & Wißfeld, 2017; Ikram & Kaafar, 2017a, 2017b; Jalba, Olteanu, & Draghici, 2016; Lashkari, Seo, Gil, & Ghorbani, 2017; Mughees, Qian, & Shafiq, 2017; Mughees, Qian, Shafiq, Dash, & Hui, 2016; Nithyanand et al., 2016; Tramèr, Dupré, Rusak, Pellegrino, & Boneh, 2018; Wills & Uzunoglu, 2016). ...
... Client-side tools in the form of browser extensions became popular mainly because of the difficulties of setting and preserving opt-out cookies [16] and the fact that the Do-Not-Track HTTP header is mostly ignored by websites [17]. Some of the most popular client-side tools exhibit high variation in their effectiveness at preventing requests to third-party domains [18], others are not able to fully block fingerprinting services [19], and they lack effective protection on mobile devices. In addition, the authors of [20] showed significant usability problems in some popular privacy tools. ...
Article
A common practice for websites is to rely on services provided by third-party sites to track users and provide personalized experiences. Unfortunately, this practice has strong implications for both users and performance. On the one hand, the privacy of individuals is at risk given the use of valuable information for the reconstruction of personal profiles. On the other hand, many existing countermeasures to protect privacy, implemented in Web browsers, exhibit performance issues, mainly due to the use of huge (and difficult to keep up to date) lists of resources that have to be filtered out given their privacy intrusiveness. To overcome these limitations, we propose a hybrid mechanism exploiting blacklisting and machine learning for the automatic identification of privacy-intrusive services requested while browsing Web pages. The idea is to use the blacklisting technique (widely used by the majority of privacy tools) in combination with a machine learning model that distinguishes between malicious and functional resources and updates the blacklist accordingly. We found that machine learning models are able to classify JavaScript programs and HTTP requests with accuracy up to 91% and 97%, respectively. We provide a prototype implementation of this hybrid mechanism, named GuardOne, and we performed an exhaustive evaluation study to assess its effectiveness and performance. Results showed that GuardOne is able to filter out malicious resources from users' requests without performance degradation when compared with traditional systems that rely on static lists for filtering. Moreover, the effectiveness results show that our mechanism, with some small improvements, is able to efficiently filter out malicious requests and substantially reduce personal information leakage.
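The hybrid idea can be sketched as follows: consult a static blacklist first, fall back to a machine-learning model for unknown requests, and feed the model's positive verdicts back into the blacklist. The features, the stub model, and the class below are illustrative placeholders, not GuardOne's implementation.

# Hedged sketch of a blacklist + ML hybrid: static list first, classifier for
# unknown requests, positive verdicts appended to the list. Placeholder only.
from urllib.parse import urlparse

class HybridBlocker:
    def __init__(self, blacklist, model):
        self.blacklist = set(blacklist)
        self.model = model  # any object exposing predict([features]) -> [0/1]

    def _features(self, url: str):
        parsed = urlparse(url)
        # Toy features: URL length, query-string length, path depth.
        return [len(url), len(parsed.query), parsed.path.count("/")]

    def should_block(self, url: str) -> bool:
        host = urlparse(url).netloc
        if host in self.blacklist:
            return True
        if self.model.predict([self._features(url)])[0] == 1:
            self.blacklist.add(host)  # grow the list with the ML verdict
            return True
        return False

class AlwaysAllow:
    """Stand-in model that never flags anything; a trained classifier goes here."""
    def predict(self, rows):
        return [0 for _ in rows]

blocker = HybridBlocker({"tracker-a.example"}, AlwaysAllow())
print(blocker.should_block("https://tracker-a.example/pixel.gif"))  # True
print(blocker.should_block("https://news.example/article"))         # False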
... Ad-blocking. Limitations of filter lists are well-studied [54,91,92]. Many new ad-blocker designs (e.g., [14,36,43]) replace hardcoded rules with ML models trained on similar features. ...
Conference Paper
Perceptual ad-blocking is a novel approach that detects online advertisements based on their visual content. Compared to traditional filter lists, the use of perceptual signals is believed to be less prone to an arms race with web publishers and ad networks. We demonstrate that this may not be the case. We describe attacks on multiple perceptual ad-blocking techniques and unveil a new arms race that likely disfavors ad-blockers. Unexpectedly, perceptual ad-blocking can also introduce new vulnerabilities that let an attacker bypass web security boundaries and mount DDoS attacks. We first analyze the design space of perceptual ad-blockers and present a unified architecture that incorporates prior academic and commercial work. We then explore a variety of attacks on the ad-blocker's detection pipeline that enable publishers or ad networks to evade or detect ad-blocking, and at times even abuse its high privilege level to bypass web security boundaries. On the one hand, we show that perceptual ad-blocking must visually classify rendered web content to escape an arms race centered on obfuscation of page markup. On the other, we present a concrete set of attacks on visual ad-blockers by constructing adversarial examples in a real web page context. For seven ad detectors, we create perturbed ads, ad-disclosure logos, and native web content that mislead perceptual ad-blocking with 100% success rates. In one of our attacks, we demonstrate how a malicious user can upload adversarial content, such as a perturbed image in a Facebook post, that fools the ad-blocker into removing another user's non-ad content. Moving beyond the Web and the visual domain, we also build adversarial examples for AdblockRadio, an open-source radio client that uses machine learning to detect ads in raw audio streams.
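For intuition, the sketch below applies the basic fast-gradient-sign (FGSM) perturbation to a toy, untrained "ad vs. non-ad" classifier in PyTorch; the paper's attacks target real perceptual ad-blockers and are considerably more elaborate, so every model and parameter here is a placeholder.

# Generic FGSM sketch: perturb an input in the direction that increases the
# classifier's loss for the "ad" label, the basic mechanism behind evasion.
# The toy untrained model may or may not flip its decision; a trained detector
# would be pushed away from the "ad" class.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 2))  # placeholder classifier
model.eval()

image = torch.rand(1, 3, 32, 32, requires_grad=True)  # placeholder "ad" image
target = torch.tensor([1])                            # class 1 = "ad" (assumed)

loss = nn.functional.cross_entropy(model(image), target)
loss.backward()

epsilon = 0.03  # perturbation budget (assumed, not taken from the paper)
adversarial = (image + epsilon * image.grad.sign()).clamp(0.0, 1.0).detach()

with torch.no_grad():
    print("original prediction:   ", model(image).argmax(dim=1).item())
    print("adversarial prediction:", model(adversarial).argmax(dim=1).item())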
... There are different reasons connected with blocking web advertisements. Research shows that the privacy of users' data and the confidentiality of their online activities is the most important factor [16]; because of this, personalized advertisements are perceived as a threat, and unfamiliar advertisements can carry a risk of additional costs (through data transmission packages), which is the main reason users simply block them [17,18]. Therefore, it is necessary to find a proper strategy, based on the investigated reasons, to stop the blocking of advertisements in the sustainable development of e-business. ...
Conference Paper
Full-text available
The article focuses on the complex problem of aligning the expectations of recipients of web content (users) with the expectations of content creators (publishers), who make content available free of charge but at the same time supplement it with web advertisements. The article presents the results of the author's research into users' reasons for blocking web advertisements with dedicated ad-blocking programs, thus condemning publishers in e-business to failure. The results of the empirical research show that users do not object to advertising sensu stricto, but to the way in which it is delivered to internet users. The results will be useful for developers of solutions, models, or information systems for sustainable development in the area of e-business, in particular e-marketing.
... A number of studies have characterized and measured the ads and tracking ecosystem to protect user privacy and limit intrusive ads, resulting in numerous ad-blocking tools such as Ghostery [4], Adblock [1], Adblock Plus [2], Disconnect [7], and Privacy Badger [6] for the web [32] and mobile platforms [33]. Most of these tools are fueled by community-driven public blacklists (such as EasyPrivacy [16], EasyList [14], FanboyList [17], and hpHosts [18]) and are evaluated for their (in)effectiveness [44], [32]. Although the research work in this domain has been on-going for some time and ad-blocking tools have been developed, we believe these studies typically consist of short-term measurements of specific tracking techniques and the ad-blocking tools have not been comprehensively evaluated over time. ...
Preprint
Full-text available
Websites employ third-party ads and tracking services, leveraging cookies and JavaScript code, to deliver ads and track users' behavior, causing privacy concerns. To limit online tracking and block advertisements, several ad-blocking (black)lists have been curated, consisting of URLs and domains of well-known ads and tracking services. Using the Internet Archive's Wayback Machine, in this paper we collect a retrospective view of the Web to analyze the evolution of ads and tracking services and evaluate the effectiveness of ad-blocking blacklists. We propose metrics to capture the efficacy of ad-blocking blacklists in order to investigate whether these blacklists have been reactive or proactive in tackling online ad and tracking services. We introduce a stability metric to measure the temporal changes in ads and tracking domains blocked by ad-blocking blacklists, and a diversity metric to measure the ratio of newly detected ads and tracking domains. We observe that the ads and tracking domains in websites change over time, and among the ad-blocking blacklists we investigated, our analysis reveals that some blacklists were better informed about the existence of ads and tracking domains, but their rate of change was slower than that of other blacklists. Our analysis also shows that the Alexa top 5K websites in the US, Canada, and the UK have the largest number of ads and tracking domains per website and the highest proactive scores. This suggests that ad-blocking blacklists are updated by prioritizing the ads and tracking domains reported on popular websites from these countries.
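The paper's exact metric definitions are not reproduced in this abstract, so the sketch below uses plausible set-based stand-ins: stability as the overlap between consecutive snapshots of blocked domains, and diversity as the share of newly detected domains in the later snapshot.

# Plausible set-based stand-ins for the two metrics described above; treat the
# exact formulations as assumptions rather than the paper's definitions.

def stability(previous: set, current: set) -> float:
    """Jaccard overlap of blocked-domain sets across two snapshots."""
    if not previous and not current:
        return 1.0
    return len(previous & current) / len(previous | current)

def diversity(previous: set, current: set) -> float:
    """Fraction of the current snapshot that was not blocked previously."""
    if not current:
        return 0.0
    return len(current - previous) / len(current)

# Toy snapshots of domains blocked by one blacklist in two crawls.
snap_2018 = {"ads.example", "track.example", "pixel.example"}
snap_2019 = {"ads.example", "track.example", "beacon.example", "cdn-ads.example"}

print(f"stability: {stability(snap_2018, snap_2019):.2f}")
print(f"diversity: {diversity(snap_2018, snap_2019):.2f}")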
... We compare the domains of third party HTTP requests and cookies against EasyPrivacy to determine which of the requests and cookies set by a webpage are associated with web tracking. Importantly, this metric likely underestimates the amount of tracking, since EasyPrivacy is not exhaustive and may fail to flag some resources as tracking [27]. ...
Preprint
As Internet streaming of live content has gained on traditional cable TV viewership, we have also seen significant growth of free live streaming services which illegally provide free access to copyrighted content over the Internet. Some of these services draw millions of viewers each month. Moreover, this viewership has continued to increase, despite the consistent coupling of this free content with deceptive advertisements and user-hostile tracking. In this paper, we explore the ecosystem of free illegal live streaming services by collecting and examining the behavior of a large corpus of illegal sports streaming websites. We explore and quantify evidence of user tracking via third-party HTTP requests, cookies, and fingerprinting techniques on more than 27,303 unique video streams provided by 467 unique illegal live streaming domains. We compare the behavior of illegal live streaming services with legitimate services and find that the illegal services go to much greater lengths to track users than most legitimate services, and use more obscure tracking services. Similarly, we find that moderated sites that aggregate links to illegal live streaming content fail to moderate out sites that go to significant lengths to track users. In addition, we perform several case studies which highlight deceptive behavior and modern techniques used by some domains to avoid detection, monetize traffic, or otherwise exploit their viewers. Overall, we find that despite recent improvements in mechanisms for detecting malicious browser extensions, ad-blocking, and browser warnings, users of free illegal live streaming services are still exposed to deceptive ads, malicious browser extensions, scams, and extensive tracking. We conclude with insights into the ecosystem and recommendations for addressing the challenges highlighted by this study.
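The EasyPrivacy comparison mentioned in the citation context above boils down to checking whether each third-party request or cookie domain falls under a listed tracking domain. The sketch below reduces the filter list to bare domain suffixes, which is a simplification: real EasyPrivacy rules also carry path patterns, options, and exceptions.

# Minimal sketch: flag third-party request and cookie domains that fall under a
# tracking list. The two listed domains are an illustrative subset only.
TRACKING_DOMAINS = {"scorecardresearch.com", "doubleclick.net"}

def is_tracking_domain(host: str) -> bool:
    """True if host equals or is a subdomain of a listed tracking domain."""
    host = host.lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in TRACKING_DOMAINS)

observed_third_parties = [
    "sb.scorecardresearch.com",
    "cdn.jsdelivr.net",
    "stats.g.doubleclick.net",
]
flagged = [h for h in observed_third_parties if is_tracking_domain(h)]
print(flagged)  # domains counted as tracking; unlisted trackers go uncounted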
Conference Paper
This article addresses the impacts of the EU's data protection laws on digital marketing, emphasizing the phase-out of cookie-tracked programmatic advertising due to privacy concerns. The loss of third-party cookies affects advertisers, publishers, and users, prompting the exploration of alternatives such as first-party data, identity graphs, and new tools like Marketing Mix Models. These shifts, along with other novel strategies like contextual advertising and user cohort solutions, present diverse challenges, signifying a complex future for advertisers in the post-cookie landscape.
Preprint
Full-text available
Current content filtering and blocking methods are susceptible to various circumvention techniques and are relatively slow in dealing with new threats. This is due to these methods using shallow pattern recognition that is based on regular expression rules found in crowdsourced block lists. We propose a novel system that aims to remedy the aforementioned issues by examining deep textual patterns of network-oriented content relating to the domain being interacted with. Moreover, we propose to use federated learning that allows users to take advantage of each other's localized knowledge/experience regarding what should or should not be blocked on a network without compromising privacy. Our experiments show the promise of our proposed approach in real world settings. We also provide data-driven recommendations on how to best implement the proposed system.
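The federated part of the proposal amounts to clients training locally on their own block/allow decisions and sharing only model parameters. The sketch below shows a FedAvg-style aggregation round in NumPy with placeholder data and a toy logistic-regression update; it is not the system proposed in the preprint.

# FedAvg-style sketch: each client updates a model on its own data, and only the
# weights are averaged centrally. All shapes and data below are placeholders.
import numpy as np

def local_update(weights, X, y, lr=0.1):
    """One gradient-descent step of logistic regression on a client's data."""
    preds = 1.0 / (1.0 + np.exp(-X @ weights))
    grad = X.T @ (preds - y) / len(y)
    return weights - lr * grad

def federated_average(client_weights, client_sizes):
    """Weight each client's model by how much data that client holds."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

rng = np.random.default_rng(0)
global_w = np.zeros(4)
clients = [(rng.normal(size=(20, 4)), rng.integers(0, 2, size=20)) for _ in range(3)]

for _ in range(5):  # a few federated rounds
    updates = [local_update(global_w.copy(), X, y) for X, y in clients]
    global_w = federated_average(updates, [len(y) for _, y in clients])

print(global_w)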
Article
With the evolution of the online advertisement and tracking ecosystem, content-blockers have become the reference tool for improving the security, privacy and browsing experience when surfing the Internet. It is also commonly believed that using content-blockers to stop unsolicited content decreases the time needed for loading websites. In this work, we perform a large-scale study on the actual improvements of using content-blockers in terms of performance and quality of experience. For measuring it, we analyze the page size and loading times of the 100K most popular websites, as well as the most relevant QoE metrics, such as the Speed Index, Time to Interactive or the Cumulative Layout Shift, for the subset of the top 10K of them. Our experiments show that using content-blockers results in small improvements in terms of performance. However, contrary to popular belief, this has a negligible impact in terms of loading time and quality of experience. Moreover, in the case of small and lightweight websites, the overhead introduced by content-blockers can even result in decreased performance. Finally, we evaluate the improvement in terms of QoE based on the Mean Opinion Score (MOS) and find that two of the three studied content-blockers present an overall decrease between 3% and 5% instead of the expected improvement.
Article
Misinformation is a recurring problem that has experienced a significant growth in recent years due to the rapid development of the Internet. This development has driven the emergence of websites where their content is shared without control. This is even more dangerous in the health domain, given its specific nature and the increasing number of users searching for health-related information on the Internet. For these reasons, this information should be handled with special attention. In this paper, a novel system to detect misinformation in websites related to the health domain is presented. The proposed system uses text mining techniques and visual design features to estimate the trustworthiness of the website. It has been trained using human experts’ knowledge in the selected domain and their visual perception of the website design. Promising results have been obtained during the evaluation in the experimental stage.
Article
Consumption of digital content through media such as mobile phones, television sets, and laptops has increased dramatically over recent years. A possible driver behind this rise in consumption is the availability of the internet across different regions and the drastic increase in internet bandwidth available at nominal cost. Consequently, the production of digital content catering to various user entertainment profiles has increased. This growth in consumption opens new challenges for securing the content. Periodic visual analytics of digital content is on the rise as a way to keep a check on content security. The broadcast logo embedded in digital content is unique, and tracking this logo is one method of distinguishing original content from pirated content. In recent years there has been a significant increase in research on the problems associated with object detection and classification, bringing a significant improvement in recognition performance for objects of interest. This paper discusses the creation of a TV broadcast channel logo dataset and its enhancement using different data augmentation techniques to extend the logo corpus. The proposed system includes a TV broadcast/broadband content logo detection and classification pipeline that applies the state-of-the-art object detection algorithm Single Shot Detector (SSD) to a TV broadcast channel logo dataset that has undergone significant makeover to represent different logo conditions. Experimental results show the pipeline's potential to robustly recognize logos under makeover in the context of content piracy.
Chapter
Disseminating propaganda and disinformation in the online environment is possible thanks to the fact that, within the last decade, digital information channels have radically increased in popularity as a source of news. This has occurred because the main advantage of such media lies in the speed of creating and disseminating information. The price paid for this speed is fast editorial work (if any) and quick checking of facts and source credibility. In this chapter, an overview of computer-supported approaches for detecting disinformation and manipulative techniques based on several criteria is presented. We concentrate on the technical aspects of automatic methods which support fact checking, topic identification, text style analysis, or message filtering in social media channels. Most of the techniques employ artificial intelligence and machine learning with feature extraction and combine available information resources.
Chapter
We present the first detailed analysis of ad-blocking’s impact on user Web quality of experience (QoE). We use the most popular web-based ad-blocker to capture the impact of ad-blocking on QoE for the top Alexa 5,000 websites. We find that ad-blocking reduces the number of objects loaded by 15% in the median case, and that this reduction translates into a 12.5% improvement in page load time (PLT) and a slight worsening of time to first paint (TTFP) of 6.54%. We show the complex relationship between ad-blocking and quality of experience: despite the clear improvements to PLT in the average case, for the bottom 10th percentile this improvement comes at the cost of a slowdown in the initial responsiveness of websites, with a 19% increase in TTFP. To understand the relative importance of this trade-off for user experience, we run a large, crowd-sourced experiment with 1,000 users on Amazon Mechanical Turk. For this experiment, users were presented with websites for which ad-blocking results in both a reduction of PLT and a significant increase in TTFP. We find, surprisingly, that 71.5% of the time users show a clear preference for faster first paint over faster page load times, hinting at the importance of first impressions on web QoE.
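The PLT-versus-TTFP trade-off above can be measured directly from the browser's Navigation Timing and Paint Timing APIs. The sketch below is a minimal, assumed setup using Selenium with a local chromedriver and a placeholder URL; running it with and without an ad-blocking extension loaded would reproduce the kind of comparison described in the chapter.

# Minimal sketch (assumed setup): read page load time and first paint from the
# browser's performance APIs via Selenium. Requires a local chromedriver; the
# URL is a placeholder. Repeat with and without an ad-blocker to compare.
from selenium import webdriver

driver = webdriver.Chrome()  # assumes chromedriver is available on PATH
try:
    driver.get("https://example.com")  # placeholder page
    load_ms = driver.execute_script(
        "return performance.getEntriesByType('navigation')[0].loadEventEnd;")
    paints = driver.execute_script(
        "return performance.getEntriesByType('paint')"
        ".map(e => ({name: e.name, startTime: e.startTime}));")
    first_paint = next((p["startTime"] for p in paints
                        if p["name"] == "first-paint"), None)
    print(f"page load time: {load_ms:.0f} ms")
    if first_paint is not None:
        print(f"first paint   : {first_paint:.0f} ms")
    else:
        print("first paint   : not reported by this browser")
finally:
    driver.quit()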
Conference Paper
During a visit to any website, the average internaut may encounter scripts that upload personal information to so-called online trackers, invisible third-party services that collect information about users and profile them. This is no news, and many past works have tried to measure the extent of this phenomenon; all of them ran active measurement campaigns via crawlers. In this paper, we observe the phenomenon from a passive angle, to naturally factor in the diversity of the Internet and of its users. We analyze a large dataset of passively collected traffic summaries to observe how pervasive online tracking is. We see more than 400 tracking services being contacted by unaware users, of which the top 100 are regularly reached by more than 50% of internauts, with the top three being practically impossible to escape. Worse, more than 80% of users contact their first tracker within 1 second of starting to navigate. We also see many websites that host hundreds of tracking services. Conversely, the popular web extensions that may improve personal protection, e.g., DoNotTrackMe, are installed by only a handful of users (3.5%). The resulting picture shows how pervasive the phenomenon is and calls for increased sensitivity among people, researchers, and regulators toward privacy on the Internet.
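As an illustration of the kind of passive aggregation behind such figures, the short sketch below computes the share of users who contacted each tracking domain from (user, contacted domain) records; the record format and domains are placeholders, not the paper's dataset.

# Aggregation sketch: from per-user traffic summaries, compute what share of
# users reached each tracking domain. The records below are placeholders.
from collections import defaultdict

records = [                      # (user_id, third-party domain contacted)
    ("u1", "tracker-a.example"), ("u1", "tracker-b.example"),
    ("u2", "tracker-a.example"),
    ("u3", "tracker-a.example"), ("u3", "tracker-c.example"),
]

users_per_tracker = defaultdict(set)
for user, domain in records:
    users_per_tracker[domain].add(user)

total_users = len({user for user, _ in records})
for domain, users in sorted(users_per_tracker.items(),
                            key=lambda kv: -len(kv[1])):
    print(f"{domain}: {len(users) / total_users:.0%} of users")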
Conference Paper
In 2011, Adblock Plus---the most widely-used ad blocking software---began to permit some advertisements as part of their Acceptable Ads program. Under this program, some ad networks and content providers pay to have their advertisements shown to users. Such practices have been controversial among both users and publishers. In a step towards informing the discussion about these practices, we present the first comprehensive study of the Acceptable Ads program. Specifically, we characterize which advertisements are allowed and how the whitelisting has changed since its introduction in 2011. We show that the list of filters used to whitelist acceptable advertisements has been updated on average every 1.5 days and grew from 9 filters in 2011 to over 5,900 in the Spring of 2015. More broadly, the current whitelist triggers filters on 59% of the top 5,000 websites. Our measurements also show that the program allows advertisements on 2.6 million parked domains. Lastly, we take the lessons learned from our analysis and suggest ways to improve the transparency of the whitelisting process.
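One building block of such an analysis is parsing the exception ("whitelisting") rules out of a filter list. In Adblock Plus filter syntax, exception rules start with "@@"; the sketch below counts those rules in a locally saved list, with the filename as a placeholder.

# Sketch of one step of such an analysis: count the exception rules in a locally
# saved filter list. Exception rules start with "@@"; the path is a placeholder.
from pathlib import Path

def load_exception_rules(path: str):
    rules = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line.startswith("@@"):  # exception rule: never block this match
            rules.append(line)
    return rules

if __name__ == "__main__":
    exceptions = load_exception_rules("exceptionrules.txt")  # placeholder filename
    print(f"{len(exceptions)} exception rules")
    for rule in exceptions[:5]:
        print(" ", rule)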
Article
The Internet revolution has led to the rise of trackers: online tracking services that shadow users' browsing activity. Despite the pervasiveness of online tracking, few users install privacy-enhancing plug-ins.
Article
In the early days of the web, content was designed and hosted by a single person, group, or organization. No longer. Webpages are increasingly composed of content from myriad unrelated "third-party" websites in the business of advertising, analytics, social networking, and more. Third-party services have tremendous value: they support free content and facilitate web innovation. But third-party services come at a privacy cost: researchers, civil society organizations, and policymakers have increasingly called attention to how third parties can track a user's browsing activities across websites. This paper surveys the current policy debate surrounding third-party web tracking and explains the relevant technology. It also presents the FourthParty web measurement platform and studies we have conducted with it. Our aim is to inform researchers with essential background and tools for contributing to public understanding and policy debates about web tracking.
Conference Paper
As a follow-up to characterizing traffic deemed unwanted by Web clients, such as advertisements, we examine how information related to individual users is aggregated as a result of browsing seemingly unrelated Web sites. We examine privacy diffusion on the Internet, hidden transactions, and the potential for a few sites to construct a profile of individual users. We define and generate a privacy footprint that allows us to assess and compare the diffusion of privacy information across a wide variety of sites. We examine the effectiveness of existing and new techniques to reduce this diffusion. Our results show that the size of the privacy footprint is a legitimate cause for concern across the sets of sites that we study.
Conference Paper
For the last few years we have studied the diffusion of private information about users as they visit various Web sites, triggering data gathering and aggregation by third parties. This paper reports on our longitudinal study consisting of multiple snapshots of our examination of such diffusion over four years. We examine the various technical ways by which third-party aggregators acquire data and the depth of user-related information acquired. We study techniques for protecting against this privacy diffusion as well as limitations of such techniques. We introduce the concept of secondary privacy damage. Our results show increasing aggregation of user-related data by a steadily decreasing number of entities. A handful of companies are able to track users' movement across almost all of the popular Web sites. Virtually all the protection techniques have significant limitations, highlighting the seriousness of the problem and the need for alternate solutions.