
Is the Internet for Porn? An Insight Into the Online Adult Industry


Gilbert Wondracek, Thorsten Holz, Christian Platzer, Engin Kirda, and Christopher Kruegel
Secure Systems Lab, Technical University Vienna
Institute Eurecom, Sophia Antipolis
University of California, Santa Barbara
The online adult industry is among the most profitable business
branches on the Internet, and its web sites attract large
amounts of visitors and traffic. Nevertheless, no study has
yet characterized the industry’s economic and security-related
structure. As cyber-criminals are motivated by financial
incentives, a deeper understanding and identification
of the economic actors and interdependencies in the
online adult business is important for analyzing security-
related aspects of this industry.
In this paper, we provide a survey of the different eco-
nomic roles that adult web sites assume, and highlight their
economic and technical features. We provide insights into
security flaws and potential points of interest for cyber-
criminals. We achieve this by applying a combination of
automatic and manual analysis techniques to investigate the
economic structure of the online adult industry and its busi-
ness cases. Furthermore, we also performed several exper-
iments to gain a better understanding of the flow of visitors
to these sites and the related cash flow, and report on the
lessons learned while operating adult web sites on our own.
1 Introduction
“The Internet is for Porn” is the title of a satirical song
that has been viewed several million times on YouTube.
Its popularity indicates the common belief that consuming
pornographic content via the Internet is part of the modern
pop-culture. Compared to traditional media, the Internet
provides fast, easy, and anonymous access to the desired
content. That, in turn, results in a huge number of users
accessing pornographic content. According to the Internet
Pornography Statistics [14], 42.7% of all Internet users
view pages with pornographic content. Of the male users
among them, 20% admit to doing so while at work.
With a total worth of more than 97 billion USD in
2006 [14], the Internet porn industry yields more rev-
enue than the top technology companies Microsoft, Google,
Amazon, eBay, Yahoo!, and Apple combined. Interestingly,
however, to the best of our knowledge, no study has yet been
published that analyzes the economic and technological
structure of this industry from a security point of view. In
this work, we aim to answer the following questions:
Which economic roles exist in the online adult industry?
Our analysis shows that there is a broad array of economic
roles that web sites in this industry can assume. Apart from
the purpose of selling pornographic media over the Internet,
there are much less obvious and visible business models in
this industry, such as traffic trading web sites or cliques of
business competitors who cooperate to increase their rev-
enue. We identify, in this paper, the main economic roles of
the adult industry and show the associated revenue models,
organizational structures, technical features and interdepen-
dencies with other economic actors.
Is there a connection between the online adult indus-
try and cyber-crime? According to web statistics, adult
web sites regularly rank among the top 50 visited web
sites worldwide [2]. Anonymous and free access to porno-
graphic media appeals to a huge audience, and attracts large
amounts of Internet traffic. In this paper, we show that this
highly profitable business is an attractive target for cyber-
criminals, who are mainly motivated by financial incen-
tives [9, 13].
What specific threats target visitors of adult web sites?
Common belief suggests that adult web sites tend to be
more dangerous than other types of web sites, considering
well-known web-security issues such as malware or script-based
attacks. Our results confirm this assumption, and in
addition, we show that many adult web sites use aggres-
sive marketing and advertisement methods that range from
“shady” to outright malicious. They include techniques that
clearly aim at misleading web site visitors and deceiving
business partners. We describe the techniques we identified,
and their associated security risks.
Is there domain-specific malicious activity? To be
able to assess the abuse potential of adult web sites, we
describe how we created and operated two adult web
sites. This enabled us to identify potential attack points,
and participate in adult traffic trading. We conducted
several experiments and performed a security analysis of
data obtained from web site visitors, evaluating remote
vulnerabilities of visitors and possible attack vectors.
We also identified and experimentally verified scenarios
involving fraud and mass infection that could be abused
by adult site operators, showing that we could potentially
exploit more than 20,000 visitors while spending only about $160.
To summarize, we make the following contributions:
1. We provide a detailed overview of the individual ac-
tors and roles within the online adult industry. This
enables us to better understand the mechanisms with
which visitors are redirected between the individual
parties and how money flows between them.
2. We examine the security aspects of more than 250,000
adult pages and study, among other aspects, the preva-
lence of drive-by download attacks. In addition, we
present domain-specific security threats such as dis-
guised traffic redirection techniques, and survey the
hosting infrastructure of adult sites.
3. By operating two adult web sites, we obtain a deeper
understanding of the related abuse potential. We par-
ticipate in adult traffic trading, and provide a detailed
discussion of this unique aspect of adult web sites,
including insights into the economic implications,
and possible attack vectors that a malicious site oper-
ator could leverage. Furthermore, we experimentally
show that a malicious site operator could benefit from
domain-specific business practices that facilitate click-
fraud and mass exploitation.
Ethical and Legal Considerations
Studying the online adult industry and performing experiments
in this area is ethically sensitive. Clearly,
one question that arises is whether it is ethically acceptable and
justifiable to participate in adult traffic trading. Similar to
the experiments conducted by Jakobsson et al. in [15, 16],
we believe that realistic experiments are the only way to re-
liably estimate success rates of attacks in the real-world.
We also implemented several preventive measures to
limit ethical objections during our study. First, in the traffic
experiments we performed, we only collected user information
that is readily available to the web server we set up
(for example, the HTTP request headers) or information
that can be queried from the browser via standard interfaces
such as JavaScript or Flash. Second, we anonymized
the information and only stored the data for the offline anal-
ysis we performed after collecting the information. Third,
we did not withdraw any funds but forfeited our traffic trad-
ing accounts at the end of the experiments. Fourth, we made
sure that during our crawling experiments the number of
outgoing requests was so low that it could not influence the
performance of any website we accessed.
We also consulted the legal department of our university
(comparable to the IRB in the US), and we were informed
that our experiments were approved.
2 Analysis Techniques
In this section, we describe the experimental setup that
we used to perform the analysis that allowed us to gain in-
sights into the online adult industry. As part of this study,
we first manually examined about 700 pornographic web
sites. This allowed us to infer a basic model of the indus-
try’s economic system. In the second step, we created a
system that crawls adult web sites and extracts information
from them to automatically gather additional data.
2.1 Manual Inspection
Given the minimal amount of (academic) information
currently available for this very specific type of Internet
content, we basically had to start from scratch by project-
ing ourselves into a “consumer” role. By using traditional
search engines, we located 700 distinct web sites related to
adult content. This initial sample set provided the first in-
sights into the general structure of adult web pages. For ex-
ample, we observed that many web sites contain parts that
implement similar functionality, such as preview sections
and sign-up forms. In addition, we also looked for special-
ized services and web sites that appeal to “producers” of
pornographic web sites. We used information gained from
industry-specific business portals [29] to identify business-
to-business web sites, such as adult hosting providers and
web payment systems.
We identified several web site “archetypes” that repre-
sent the most important business roles present in the online
adult industry. The majority of web sites that we analyzed
fits into exactly one of these roles. The economic relation-
ships between these entities are shown in Figure 1. When-
ever suitable, we named the roles according to the indus-
try jargon. In the following section, we provide a detailed
overview of each role. Based on these observations, we then
created an automated crawling and analysis system to gain
a broader insight into the common characteristics of adult
web pages, operating on a large sample set of about 270,000
URLs (on more than 35,000 domains).
Figure 1: Observed traffic and money flows for different roles within the online adult industry.
(Diagram labels: domain redirector, traffic broker, search engines, link collections; site groups: no content provided, promotional content, original content providers.)
2.2 Identified Site Categories
Based on our observations, we can classify the market
participants in the following categories.
2.2.1 Paysites
This type of web site constitutes the economic core of the
online adult industry. These web sites typically act as “content
providers”, producing and distributing pornographic
media such as images and videos via their web pages and charging
money in return. Most users would consider
these sites to be representative of this genre.
2.2.2 Link Collections, TGP / MGP
Complementary to paysites, a large number of pornographic
web sites promise free content. These sites often call them-
selves link collections, thumbnail gallery posts (TGPs) or
movie gallery posts (MGPs), depending on the provided
form of pornographic media. We use the term free site to
denote these types of web sites.
Link collections typically consist of a series of hyper-
links (often adding textual descriptions of the underlying
media) to other web sites. TGP and MGP sites are struc-
turally similar, with the addition of displaying miniature
preview (still) images next to each link. It is characteristic of
free sites that they do not produce their own content. Our
evaluation shows that they receive media from other content
providers, as their main economic role is marketing for
paysites. A secondary role is traffic trading, as
explained in Section 2.2.6.
2.2.3 Search Engines
With the multitude of different providers, specialized search
engines evolved to fit the need of every potential customer.
Functionally similar to general purpose search engines such
as Google, adult search engines [10] allow users to search
for web sites that match certain criteria or keywords. Un-
like traditional search engines, adult search engines claim
to manually classify the web sites in their index, instead of
relying on heuristics or machine learning techniques. However,
this claim, which suggests that their results are more accurate
than those of other search engines, is highly questionable,
considering that pornographic pages account for 12% of
the total number of web pages on the Internet [14]. Search
engines generally generate revenue by displaying advertise-
ments and selling higher-ranked search result positions.
2.2.4 Domain Redirector Services
Interestingly, there are services that specialize in managing
adult domain portfolios. They are similar to commercial
domain parking services that display web pages with ad-
vertisements (which are often targeted towards the domain
name) in lieu of “real” content [28].
Adult domain redirector services such as Domain Play-
ers Club [6] not only allow their clients to simply park their
domains, but also reroute any web traffic from their clients’
domains to adult web sites. Adult sites that wish to receive
traffic from the redirector service have to pay a fee for being
registered as a possible redirection target. The exact desti-
nation of the redirections is typically based on the string edit
distance between the domain name of the web site partici-
pating in the redirector service, and the domain name of the
adult web sites which wish to receive traffic. For example,
a user might browse to such a domain, not knowing
that this site participates in a redirector service. The user
will then be redirected to an adult web site with a domain
name that has a low edit distance to this domain name. The
destination adult web site initially has to pay a fee for be-
ing considered by the redirection service, while the domain
owner is rewarded for any traffic that originates from his
domains. Technically, these redirector services work by us-
ing a layer of HTTP redirections, giving no indication to the
user that a redirection has occurred.
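The edit-distance matching described above can be illustrated with a minimal sketch. It assumes plain Levenshtein distance and a simple "closest name wins" rule; the text does not specify which edit-distance variant or tie-breaking the redirector services actually use, and the function names and domain names below are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with the classic two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or keep) the character
        prev = curr
    return prev[-1]


def pick_redirect_target(source_domain: str, paying_targets: list[str]) -> str:
    """Return the registered target whose name is closest to the source domain.

    Assumption: the first target with the minimal distance is chosen.
    """
    return min(paying_targets, key=lambda t: edit_distance(source_domain, t))


# Hypothetical domain names, used only for illustration.
targets = ["example-videos.test", "sample-pics.test"]
chosen = pick_redirect_target("exampel-videos.test", targets)
```

Under this rule, a mistyped or parked domain is silently mapped to whichever paying site has the most similar name, which is also why the mechanism lends itself to typo-squatting.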
From a miscreant’s point of view, these redirector ser-
vices appear to be an ideal tool for typo-squatting [28].
Typo-squatting is the practice of registering domain names
that are syntactically very close to the names of legitimate
web sites. The idea behind typo-squatting is to parasitize
web traffic from users that want to go to the legitimate site,
but make a typographical error while entering the URL.
2.2.5 Keyword-Based Redirectors
Several businesses offer a service that aims at increasing
the visibility and (traditional) search engine ranking of their
clients (adult web sites). To this end, keyword-based redirector
services operate web sites that have a large number
of subdomains. The names of these subdomains consist of
combinations of adult-related search engine keywords.
Similar to domain redirector services, these subdomains
are configured to redirect visitors to “matching” web sites,
i.e., the redirector’s clients. Clearly, this technique is an attempt
to exploit ranking algorithms to achieve higher search
result positions, effectively subverting the search engine’s
business model of selling search result positions. Further-
more, it is an efficient way to prepare a web site for spam
advertisement. Unsolicited bulk (spam) mails tend to yield
a higher penetration rate when embedded links differ from
mail to mail [23].
2.2.6 Traffic Brokers
This unique type of service provider allows its clients to di-
rectly trade adult web traffic for money, and vice versa (i.e.,
web traffic can be turned into real money through this kind of
provider). Prospective clients who want to buy traffic can
place orders (typically in multiples of 1,000 visitors) that
will then be directed to a URL of their choice. Usually,
the buyer can select the source of the web traffic accord-
ing to several criteria, such as interest in certain niches of
pornography or from specific countries. Available options
also include traffic that originates from other adult sites, e-
casinos, or from users who click on advertisements such as
pop-up or pop-under windows, or even links in YouTube
comments. Another option is traffic that is redirected from
recently expired domains, which have been re-registered by
the traffic broker.
On the other hand, clients who want to sell traffic can do
so by redirecting their visitors to URLs that are specified by
the traffic broker, receiving money in return. If the broker
has no active orders from buyers for the type of traffic that
is provided, the traffic is sent back to a link specified by the
client. However, if the broker has an active order, the traffic
is redirected to the site of the buyer’s choice and the seller is
credited a small amount of money. Figure 2 visualizes the
flow of visitors and money for both scenarios.
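The two scenarios can be summarized as a small routing decision. The following is a sketch under stated assumptions, not a description of any real broker: the class names, the per-visitor payout, and the matching on a single category string are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Order:
    buyer_url: str   # destination chosen by the traffic buyer
    category: str    # type of traffic ordered (hypothetical single criterion)
    remaining: int   # visitors still owed to the buyer


@dataclass
class Broker:
    payout_per_visitor: float                      # hypothetical seller credit per visitor
    orders: list = field(default_factory=list)
    seller_credit: dict = field(default_factory=dict)

    def route(self, seller: str, category: str, fallback_url: str) -> str:
        """Decide where a visitor sent by `seller` ends up."""
        for order in self.orders:
            if order.category == category and order.remaining > 0:
                # Scenario (a): an active order exists, so the buyer gets the
                # visitor and the seller is credited a small amount.
                order.remaining -= 1
                self.seller_credit[seller] = (
                    self.seller_credit.get(seller, 0.0) + self.payout_per_visitor
                )
                return order.buyer_url
        # Scenario (b): no matching order, so the visitor is sent back to the
        # link specified by the seller and nobody is credited.
        return fallback_url
```

A seller that forwards two visitors while only one is still owed on a matching order would see the first go to the buyer and the second returned to its own fallback link.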
Before a client can participate in traffic trading, brokers
typically claim that they check the source or destination site
of the traffic to prevent potential abuse. For example, many
traffic brokers state that they do not tolerate hidden frames
on target web sites. However, in our experiments with traf-
fic brokers, we found this claim to be false: We success-
fully managed to buy large quantities of traffic for a web
site that makes extensive use of hidden iframes and even
performs vulnerability checks on its visitors (see Section 4
for more details).
2.3 Experimental Setup
To acquire real-world data and to perform a large-scale
validation of the initial results from our manual analysis, we
created a web crawler system. Based on our observations,
we added several domain-specific features. Our system con-
sists of the following components.
2.3.1 Search Engine Mining
For our crawling system, it was necessary to acquire a set of
adult web sites that were suitable as initial input. To mimic
the way a consumer would look for adult web sites, we
made use of search engines. We manually compiled a set of
domain-specific search queries and automatically fed it as
input to a set of 13 search engines. This included three gen-
eral purpose search engines (Google, Yahoo, and Microsoft
Live) and ten adult search engines. We then automatically
extracted the URLs from the search results and stored them
in a database. The result set consisted of 95,423 URLs from
11,782 unique domains. These URLs were the seed used in
the crawling step.
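As a rough illustration of the seed-building step, the sketch below de-duplicates extracted URLs and counts distinct hosts. It is a simplification: it treats each host name as a "domain" and ignores registered-domain boundaries, and the function name is hypothetical.

```python
from urllib.parse import urlsplit


def build_seed(urls):
    """Return (unique_urls, unique_hosts) for a list of extracted result URLs."""
    seen, seed, hosts = set(), [], set()
    for url in urls:
        parts = urlsplit(url.strip())
        # Keep only well-formed HTTP(S) URLs.
        if parts.scheme not in ("http", "https") or not parts.hostname:
            continue
        # Drop the fragment so that links to the same page compare equal.
        normalized = parts._replace(fragment="").geturl()
        if normalized in seen:
            continue
        seen.add(normalized)
        seed.append(normalized)
        hosts.add(parts.hostname)  # urlsplit already lower-cases the host name
    return seed, hosts
```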
2.3.2 Crawling Component
The core component of our system is a custom web crawler
we implemented for this purpose. We configured it to follow
links up to a depth of three for each domain. For performance
reasons, we additionally limited the maximum
number of URLs for a single domain to 500. Starting
from the previously-mentioned seed, we crawled a total of
269,566 URLs belonging to 35,083 web sites. For each
crawled URL, we stored the web page source code, and the
embedded hyperlinks. This formed the data set for our subsequent
analysis. In addition to the crawling, we used the
following heuristics to further classify the content and detect
a number of features.
Figure 2: Schematic overview of traffic trading and the flow of visitors/money.
(a) Traffic buyer is interested in receiving traffic and pays for it (adult website as traffic seller, traffic broker, adult website as traffic buyer).
(b) No traffic buyer available, traffic broker returns visitor to a specific URL (adult website as traffic seller, traffic broker).
Enter Page Detection. A characteristic feature of many
adult web sites (unrelated to their economic role) is the “doorway”
web page that requires visitors to click on an Enter
link to access the main web site. These enter pages often
contain warnings, terms of use, or reminders of legal re-
quirements (for example, a required minimum age for ac-
cessing adult material).
In order to automatically detect enter pages, we used a
set of 16 manually compiled regular expressions to scan tex-
tual descriptions of links. Since some enter pages use but-
tons instead of text-only descriptions, we also checked the
HTML alternative text for images. For example, if a link
description matches .*enter here.* or .*over.*years.*,
we classify the page as an enter page.
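The detection step can be sketched as follows. The 16 expressions are not listed in the text, so the two patterns below are illustrative stand-ins based on the examples just mentioned, and the class and function names are hypothetical.

```python
import re
from html.parser import HTMLParser

# Illustrative stand-ins; the actual set of 16 expressions is not given in the text.
ENTER_PATTERNS = [
    re.compile(p, re.IGNORECASE) for p in (r"enter here", r"over.*years")
]


class LinkDescriptionCollector(HTMLParser):
    """Collects link text and the alt text of images that appear inside links."""

    def __init__(self):
        super().__init__()
        self.in_link = False
        self.descriptions = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.in_link = True
        elif tag == "img" and self.in_link:
            alt = dict(attrs).get("alt")
            if alt:
                self.descriptions.append(alt)

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_link = False

    def handle_data(self, data):
        if self.in_link and data.strip():
            self.descriptions.append(data.strip())


def is_enter_page(html: str) -> bool:
    """True if any link description matches one of the enter-page patterns."""
    collector = LinkDescriptionCollector()
    collector.feed(html)
    return any(p.search(d) for d in collector.descriptions for p in ENTER_PATTERNS)
```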
Adult Site Classifier. Since we wish to avoid crawling
non-adult web sites, and since not all outgoing links lead to
adult web sites, we created a simple, light-weight keyword-
based classifier to identify adult web sites. To this end,
we first check for the appearance of 45 manually selected,
domain-specific keywords in the web site’s HTML meta de-
scription tags. In case no matches are found, we also extend
our scan to the HTML body of the web page. If at least
two matches are encountered, we consider the web site to
contain pornographic content.
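A minimal sketch of this two-stage keyword check is shown below. The three keywords stand in for the 45 manually selected ones, which the text does not list, and counting distinct keywords (rather than occurrences) is an assumption.

```python
import re
from html.parser import HTMLParser

# Placeholder keyword list; the paper's 45 keywords are not given in the text.
KEYWORDS = ("porn", "xxx", "adult")


class PageText(HTMLParser):
    """Separates meta description content from body text."""

    def __init__(self):
        super().__init__()
        self.meta, self.body = [], []
        self._in_body = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "description":
            self.meta.append(attributes.get("content") or "")
        elif tag == "body":
            self._in_body = True

    def handle_endtag(self, tag):
        if tag == "body":
            self._in_body = False

    def handle_data(self, data):
        if self._in_body:
            self.body.append(data)


def count_keywords(text: str) -> int:
    """Number of distinct keywords that occur as whole words in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sum(1 for keyword in KEYWORDS if keyword in words)


def is_adult_site(html: str) -> bool:
    page = PageText()
    page.feed(html)
    matches = count_keywords(" ".join(page.meta))
    if matches == 0:
        # No match in the meta description: fall back to the page body.
        matches = count_keywords(" ".join(page.body))
    return matches >= 2
```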
According to our experience, this naïve classification
works surprisingly well, as porn sites usually promote their
content openly. To evaluate the true positive (TP) and false
positive (FP) rate of our classifier, we ran it on a hand-labeled
subset of 102 web sites that we chose randomly from our
manual-analysis test set. It achieved rates of 81.5% TP and
18.5% FP. Moreover, a limitation of our current implemen-
tation is that it currently only works with English-language
web sites. After excluding non-English web sites, the rate
improved to 90.1% TP and 9.9% FP. We are aware that far
more advanced classifiers for adult sites exist, for exam-
ple systems that include image recognition techniques [11].
However, these classifiers are typically aimed towards fil-
tering pornographic content and are not readily and freely
available, and our current heuristic yields sufficiently accu-
rate results for our purposes.
2.3.3 Client Honeypots
Malicious web sites are known to direct a multitude of dif-
ferent types of attacks against web surfers [21, 22, 27]. Ex-
amples include drive-by downloads, Flash-based browser
attacks, or malformed PDF documents that exploit third-
party software. To detect such attacks, we used two differ-
ent client honeypots to check the web sites that we crawled
in our study.
Capture-HPC. We used an adapted version of the
Capture-HPC [25] client honeypot. The tool detects and
records changes to the system’s filesystem and registry by
installing a special kernel driver. We set up Capture-HPC
in virtual machines (VMs) with a fully patched Windows
XP SP2, resembling a typical PC used for web browsing.
We then instrumented the VMs to open the URLs from our
crawling database using Internet Explorer 7 (including the
popular Flash and Adobe PDF viewer plugins). This al-
lowed us to detect malicious behavior triggered by (adult)
web sites. In our experimental setup, we ran eight instances
of the VMs in parallel, to achieve a higher throughput rate.
Wepawet. To complement the analysis performed by
Capture-HPC, we used another client honeypot, namely
Wepawet [18, 17], in parallel. The software features spe-
cial capabilities for detecting and analyzing Flash-based
exploits, and for handling obfuscated JavaScript, which is
commonly used to hide malicious code. Wepawet also
tries to match identified code signatures against a database
of known malware profiles, returning human-readable mal-
ware names.
2.3.4 Economic Classification
To decide if paysites are more or less secure (i.e., trustwor-
thy) than free sites, we created a heuristic for automatically
classifying each web site depending on its economic role.
Our classifier is limited to determining if a web site is either
a paysite or a free site; otherwise, the web site’s economic
role remains undefined.
Paysite Indicators. We identify paysites based on manual
observations and by using information we found on adult
business-to-business web sites: we compiled a list of 96
adult payment processors, i.e., companies appointed by a
web site operator to handle credit card transactions on the
operator’s behalf. If a web site links to a payment service provided
by one of these processors, we immediately mark it as
a paysite. In case no payment processor is found, we look
for additional features of paysites. To this end, we match
the web site source code against a set of regular expressions
to determine if it contains a “tour”, “member section”, or
membership sign-up form. We assume these structural features
to be indicative of paysites, as we did not find any
counter-examples in our manual observations.
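The paysite check can be sketched as a two-step test. The processor domains and the regular expressions below are hypothetical placeholders, since neither the list of 96 payment processors nor the actual expressions appear in the text.

```python
import re
from urllib.parse import urlsplit

# Hypothetical placeholders for the list of 96 payment processors.
PAYMENT_PROCESSOR_DOMAINS = {"billing.example", "pay.example"}

# Hypothetical patterns for "tour", "member section", and sign-up forms.
STRUCTURE_PATTERNS = [
    re.compile(p, re.IGNORECASE)
    for p in (r"\btour\b", r"\bmembers?\s+(section|area)\b", r"\bsign[\s-]?up\b")
]


def is_paysite(page_source: str, outgoing_links: list[str]) -> bool:
    # Step 1: a link to a known payment processor marks the site immediately.
    for link in outgoing_links:
        host = urlsplit(link).hostname or ""
        if any(host == d or host.endswith("." + d) for d in PAYMENT_PROCESSOR_DOMAINS):
            return True
    # Step 2: otherwise look for structural features typical of paysites.
    return any(p.search(page_source) for p in STRUCTURE_PATTERNS)
```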
Free Site Indicator. To identify free web sites, we exam-
ine their hyperlink topology. For this classification, we only
regard outgoing links as a reliable feature, as it is not feasi-
ble to recover (all) incoming links for a web site. We ana-
lyze the number of hyperlinks pointing to different domains
for each web site, and additionally compare the Whois en-
tries for both the source and destination domains. If a web
site exceeds a threshold t of links to “foreign” domains
(i.e., the Whois entries show different registrants), we label
it as a free site. To evaluate this classifier and instantiate
a value for t, we tested it on a hand-labeled set of 384
link collection web sites that we selected randomly from our
database. Based on this experiment, we chose t = 25 for
the evaluation.
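The threshold test can be written down as a short sketch. It assumes that Whois registrants have already been looked up into a dictionary, that distinct foreign domains (not individual links) are counted, and that domains with unknown registrants count as foreign; none of these details is specified in the text.

```python
from urllib.parse import urlsplit


def count_foreign_domains(site_domain, outgoing_links, registrant_of):
    """Count distinct linked domains whose Whois registrant differs from the site's.

    `registrant_of` is a precomputed mapping from domain to registrant (assumption).
    """
    own = registrant_of.get(site_domain)
    foreign = set()
    for link in outgoing_links:
        host = urlsplit(link).hostname
        if not host or host == site_domain:
            continue
        other = registrant_of.get(host)
        # Unknown registrants are treated as foreign (assumption).
        if own is None or other is None or other != own:
            foreign.add(host)
    return len(foreign)


def is_free_site(site_domain, outgoing_links, registrant_of, t=25):
    """Label a site as a free site if it exceeds the threshold t of foreign domains."""
    return count_foreign_domains(site_domain, outgoing_links, registrant_of) > t
```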
3 Observations and Insights
During our crawling experiments, we observed several
characteristics of adult sites. In this section, we provide an
overview of the most interesting findings, and discuss how
they are security-relevant.
3.1 Revenue Model
The ultimate goal for commercial web site operators is of
course to earn a maximum amount of money, and the slogan
“sex sells” is a clear testimony to this fact. In the following,
we analyze the revenue model of the major categories iden-
tified in Section 2.2.
3.1.1 Paysites
We found the revenue model of paysites to be centered
around selling memberships to customers. A membership
grants the customer access to an otherwise restricted mem-
ber area with username/password credentials. In the mem-
ber area, an archive of pornographic media can be browsed
or downloaded by the customer. Memberships typically
have to be renewed periodically, causing recurring fees for
the customer and, therefore, providing a steady cash-flow
for the paysite. To appeal to customers and to create a stim-
ulus for purchasing a membership, paysites rely heavily on
a number of marketing and advertising techniques, for example:
A “Tour” of the Web Site. Similar to traditional adver-
tising methods (for example cinematic trailers for movies),
preview media content is published for free on the paysites’
web pages, eventually directing the user to membership
sign-up forms.
Search Engines and Web Site Directories. Specialized
promotion services, such as adult search engines and web
site directories, allow users to submit hyperlinks to web
sites. These links are then categorized (depending on the na-
ture of the content), and made available on a web site where
they can be searched and browsed. While these services are
typically free of charge, higher ranked result positions can
be purchased for a fee.
Affiliate Programs. The main purpose of an affiliate pro-
gram is to attract more visitors to the paysite. The business
rationale is that more visitors translate into more sales. To
this end, paysites allow business partners to register as af-
filiates, thus giving them access to promotional media. This
media is designated for marketing the paysite. It consists of
hyperlinks pointing to the paysite and optionally includes
a set of pornographic media files. In return for directing
visitors to the paysite, affiliates are rewarded with a fraction of
the revenue that is generated by those customers that were
referred by the affiliate.
By using affiliate programs, paysites are effectively shift-
ing part of their marketing effort towards their affiliates.
Additionally, those sites that distribute the media files (in-
stead of just providing hyperlinks) can reduce their resource
consumption (such as bandwidth costs) as an additional
benefit. Many paysites even offer specialized services to
their affiliates, for example, by providing preview images
and textual descriptions of the content, or even creating ad-
ministrative shell scripts. Also, Internet traffic statistics are
made available to affiliates, so that they can optimize their
marketing efforts.
3.1.2 Free Sites
Free sites typically participate in multiple affiliate pro-
grams. We found examples of sites participating in more
than 100 different programs, generating revenue by direct-
ing visitors to paysites. To account for the origin of cus-
tomer traffic, paysites usually identify their affiliates by
unique tokens that are assigned on registration. These to-
kens are then used to associate traffic with affiliates, for ex-
ample, by incorporating them as HTTP parameters in hy-
perlinks pointing from the affiliate site to the paysite. The
same technique is used to identify links originating from
spam mails, providing the site with the means to evaluate a
spammer’s advertising impact.
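Token-based attribution of this kind can be sketched as follows. The parameter name `aff` and the URLs are hypothetical; real affiliate programs choose their own parameter names and token formats.

```python
from collections import Counter
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

TOKEN_PARAM = "aff"  # hypothetical HTTP parameter name


def affiliate_link(paysite_url: str, token: str) -> str:
    """Append the affiliate's token to a hyperlink pointing to the paysite."""
    parts = urlsplit(paysite_url)
    extra = urlencode({TOKEN_PARAM: token})
    query = f"{parts.query}&{extra}" if parts.query else extra
    return urlunsplit(parts._replace(query=query))


def attribute_visits(request_urls) -> Counter:
    """Count incoming requests per affiliate token (paysite side)."""
    counts = Counter()
    for url in request_urls:
        tokens = parse_qs(urlsplit(url).query).get(TOKEN_PARAM)
        counts[tokens[0] if tokens else "(no token)"] += 1
    return counts
```

The same counting works regardless of whether the tagged link was placed on a free site or embedded in a mail, which is what makes the token usable for measuring both.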
Often, affiliates can choose between two revenue system options:
Pay-per-sign-up (PPS): The affiliate receives a one-
time payment from the paysite for each paysite mem-
ber that was referred by the free site.
Recurring income: In contrast to PPS, the affiliate can
choose to receive a fraction of each periodic fee as long
as the membership lasts.
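The trade-off between the two options comes down to how long a referred member stays subscribed. The sketch below makes this explicit; all amounts and the revenue share are invented for illustration and do not come from the study.

```python
def pps_income(signups: int, payout_per_signup: float) -> float:
    """One-time payment per referred member (pay-per-sign-up)."""
    return signups * payout_per_signup


def recurring_income(signups: int, monthly_fee: float, share: float,
                     months_retained: float) -> float:
    """Fraction of each periodic fee for as long as the membership lasts."""
    return signups * monthly_fee * share * months_retained


def break_even_months(payout_per_signup: float, monthly_fee: float,
                      share: float) -> float:
    """Membership duration at which both options pay the same per member."""
    return payout_per_signup / (monthly_fee * share)


# Hypothetical numbers: 30 USD per sign-up versus 50% of a 25 USD monthly fee.
months = break_even_months(payout_per_signup=30.0, monthly_fee=25.0, share=0.5)
```

With these made-up values, recurring income only overtakes the one-time payment once a member stays longer than the break-even duration, so the better choice depends on the expected retention.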
We found that the payment systems that are used to trans-
fer money from paysites to affiliates offer a wide variety
of options, including wire transfers, cheques, and virtual
payment systems. In addition to affiliate programs, free
sites display advertisements to increase their revenue.
3.2 Organizational Structure
Paysites. We noticed that many paysites are organized in
paysite networks. Such networks act as umbrella organiza-
tions, where each paysite contains hyperlinks to other mem-
bers of its network. Additionally, networks often offer cus-
tomers special membership “passes” that grant collective
membership for multiple paysites.
Interestingly, however, upon inspection of the
Whois [20] entries for member sites within several
networks, we found the registration information to often
match (i.e., the sites belonged to the same owner).
Apparently, the individual network members prefer to
create the outward impression of representing different
enterprises, when they are in fact part of the same organi-
zation. This indicates that a diversification among paysites,
depending on the sexual specifics of the offered content, is
advantageous for the owners. These specialized sites are
called niche sites in the industry jargon.
Free Sites. Similar to paysite networks, we found free
sites to also be organized in networks. However, in contrast
to paysites, free sites also frequently link to each other even
if the site owners differ. This means that business competi-
tors are collaborating. This appears counter-intuitive at first.
However, one has to take into account that cross-linking
between free sites is a search engine optimization method.
Thus, the search engine ranking of all sites participating in
a “clique” of free sites improves, as the sites are artificially
increasing their “importance” by creating a large number of
hyperlinks pointing towards them.
3.3 Economic Roles
From a consumer perspective, paysites and free sites are
the most important types of adult web sites. To get an
overview of the distribution of paysites and free sites with
regard to the total population of adult web sites, we applied
our classification heuristic to the 35,083 adult web sites (do-
mains) in our data set.
Our classifier was able to determine the role of 87.7% of
these web sites. For the remaining 12.3%, whose roles remained
undefined, we found a high percentage of web sites
that either served empty pages, returned HTTP error codes
(for example, HTTP 403 “Forbidden”), or were parked do-
mains. We assume that many of these sites are either still
under construction or simply down for maintenance during
our crawling experiment.
Our results indicate that 8.1% of the classified sites are
paysites and 91.9% are free sites (link collections). This is
consistent with the intuition that we gained from our ini-
tial, manual analysis, showing that most adult site operators
make money by indirectly profiting from the content pro-
vided by paysites.
3.4 Security-Related Observations
For either economic role, we found a relatively large
number of web sites that use questionable methods and
techniques that can best be described as “shady.” Unlike
well-known web-based attacks and malicious activities
(such as drive-by downloads [21, 27]), these practices di-
rectly aim at manipulating and misleading a visitor to per-
form actions that result in an economic profit for the web
site operator. Overall, we found free sites to employ at least
one of these techniques more often (34.2%) when compared
to paysites (11.4%). In particular, we frequently found the
techniques listed below on adult web sites.
3.4.1 JavaScript Catchers
These client-side scripts “hijack” the user’s browser, pre-
venting him from leaving the web site. To this end, usu-
ally JavaScript code is attached to either the onunload or
onbeforeunload event handlers. Anytime the user tries
to leave the web site (e.g., by entering a new address, us-
ing the browser’s “Back” button, or closing the browser) a
confirmation dialogue is displayed. The user is then asked
to click on a button to leave the web site, while, at the same
time, advertisements are displayed or popup windows are
spawned. Apart from the obvious annoyance, this could
easily be used in a clickjacking attack scenario [12]. We
detected catcher scripts in 1.2% of the paysites and 3.9% of
free sites.
3.4.2 Blind Links
This technique uses client-side scripting via JavaScript to
obscure link destinations, effectively preventing the ad-
dresses from being displayed in the web browser’s sta-
tus bar. The most popular methods that we found in the
wild either work by overwriting the window.status or
parent.location.href variables. We scanned the
source code of the web sites for occurrences of these vari-
able names, and found 10.9% of paysites, and 26.2% of free
sites to use blind links.
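The source scan described above can be approximated with two regular expressions. The following is a minimal sketch under stated assumptions, not the crawler code used in the study: the function name `find_blind_link_markers` and the sample page fragment are hypothetical, and only the two variable names mentioned in the text are searched for.

```python
import re
from typing import Set

# Variable names that the text reports scanning for.
BLIND_LINK_PATTERNS = {
    "window.status": re.compile(r"\bwindow\.status\b"),
    "parent.location.href": re.compile(r"\bparent\.location\.href\b"),
}


def find_blind_link_markers(page_source: str) -> Set[str]:
    """Return which of the blind-link variable names occur in the page source."""
    return {
        name
        for name, pattern in BLIND_LINK_PATTERNS.items()
        if pattern.search(page_source)
    }


# Hypothetical page fragment that blanks the status bar on mouse-over.
sample_page = '<a href="#" onmouseover="window.status=\'\'; return true;">preview</a>'
print(find_blind_link_markers(sample_page))
```

A plain textual match of this kind only shows that the variable name occurs somewhere in the page; it cannot tell whether the script actually hides a link destination, so it is best read as a coarse indicator.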
While the destination addresses are still contained in the
web page source code, we believe it is fair to assume that
most users will be unable to extract them. This is problematic,
as it not only leaves the user unaware of the link’s destination
(leading to different web sites), but could also potentially
be used to mask malicious activities such as cross-site
scripting (XSS) or cross-site request forgery (CSRF) attacks.
3.4.3 Redirector Scripts
Redirector scripts make use of server-side scripting (for ex-
ample PHP scripts) to redirect users to different web sites.
In contrast to blind links, the link targets are determined at
the server at run-time, making it impossible for a client to
know in advance where a link really points to.
Typically, these redirector scripts are presented in com-
bination with pornographic media. For example, small
preview images usually have links to full-size versions at-
tached. Instead of this expected behavior, users are redi-
rected with a probability p to different web sites (so called
skimming rate). The rationale behind redirector scripts is
that users will know from experience that by keeping on
clicking on the preview image, the desired media will even-
tually be shown at some point. At the same time, they “gen-
erate” artificial outgoing traffic for the web site, even though
the user originally never intended to leave the site.
In our crawler implementation, we use a simple, yet ef-
fective technique to detect redirector scripts. Whenever
our system finds hyperlinks with a destination address that
contains a server-side script (currently
.php and
scripts), it resolves the link 10 times. If there is more than
one destination address, the script is regarded as a redirector
script, and the set of targets is added to our crawling queue.
We chose a value of 10, because in our initial tests, we ob-
served this as an upper bound for the number of redirection
targets. When tested on a sample of 100 redirector scripts,
none of them exceeded this threshold.
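The detection heuristic above (resolve a link 10 times and compare the destinations) can be sketched as follows. This is an illustration only: the names `probe_redirector` and `is_redirector` are hypothetical, and the network request is replaced by a caller-supplied `resolve` function so that the example runs without contacting any site.

```python
import itertools
from typing import Callable, Set

RESOLVE_ATTEMPTS = 10  # number of resolutions per link, as stated in the text


def probe_redirector(url: str, resolve: Callable[[str], str],
                     attempts: int = RESOLVE_ATTEMPTS) -> Set[str]:
    """Resolve `url` repeatedly and return the set of destinations observed."""
    return {resolve(url) for _ in range(attempts)}


def is_redirector(targets: Set[str]) -> bool:
    """More than one distinct destination marks the script as a redirector."""
    return len(targets) > 1


# Hypothetical stand-in for an HTTP client: cycles through three destinations.
_destinations = itertools.cycle([
    "http://site.example/full.jpg",
    "http://other-a.example/",
    "http://other-b.example/",
])


def fake_resolve(url: str) -> str:
    return next(_destinations)


targets = probe_redirector("http://site.example/out.php?id=1", fake_resolve)
print(is_redirector(targets), sorted(targets))
```

Because the redirection is probabilistic, a script with a low skimming rate p can return the same destination in all 10 attempts and go undetected, so counts obtained this way are best treated as a lower bound.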
We found examples of p ranging from 0 (no random redi-
rection) to 1 (the promised content is never shown). Also,
the number of possible target addresses n varied from 1 to 6
destinations. Interestingly, only 3.2% of paysites but 23.6%
of free sites contained redirector scripts. This implies that
free sites have an incentive for using this technique.
The most likely explanation for this phenomenon is traffic
brokers (see Section 2.2.6). These services have specialized
in (adult) traffic trading and allow visitor traffic to be
sold, a unique feature available only in this type of online
industry. This means that a miscreant could lure unsuspect-
ing visitors who click on pornographic media to click on
redirector links. The resulting traffic can then be sold to
such a traffic trading service, which redirects it to targets
of the buyer’s choice. The web site operator earns money
with every click, even if a single visitor clicks on one link
many times, something not possible in traditional online
3.4.4 Redirection Chains
If web sites which contain redirector scripts link to other
sites with redirector scripts, we call this a redirection chain.
This topology can be abused to further increase the revenue
from artificial traffic generation.
We observed that JavaScript catchers are frequently used
in conjunction with redirector chains, effectively “trapping”
the user in a network of redirections. In our evaluation, we
found 34.4% of those web sites that use redirector scripts
to be part of redirector chains. Potentially, this could easily
be abused for performing click-fraud or similar traffic-based
cyber-crime because it enables the redirection operators to
direct large amounts of “realistic” traffic to destinations of
their choice. We study this phenomenon in more detail in
Section 4.
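The notion of a redirection chain can be made concrete on a small link graph. The sketch below is one possible reading of the definition, stated here as an assumption: a site counts as part of a chain if it uses redirector scripts and is connected, as source or target, to another site that also uses them. The function name and the example domains are hypothetical.

```python
from typing import Dict, Set


def sites_in_redirection_chains(redirect_targets: Dict[str, Set[str]]) -> Set[str]:
    """Return the redirector sites that are linked to other redirector sites.

    `redirect_targets` maps each site that uses redirector scripts to the
    sites its scripts were observed to lead to.
    """
    redirector_sites = set(redirect_targets)
    chained: Set[str] = set()
    for site, targets in redirect_targets.items():
        for target in targets & redirector_sites:
            if target != site:  # ignore scripts that point back to their own site
                chained.update({site, target})
    return chained


# Hypothetical crawl result: a -> b -> c form a chain, d stands alone.
observed = {
    "a.example": {"b.example", "x.example"},
    "b.example": {"c.example"},
    "c.example": {"y.example"},
    "d.example": {"z.example"},
}
chained = sites_in_redirection_chains(observed)
print(sorted(chained), len(chained) / len(observed))
```

The ratio printed at the end corresponds to the kind of figure reported above (the share of redirector sites that take part in chains), here computed on made-up data.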
3.5 Malware
To find more “traditional” web-based attacks, we applied
our client honeypot analysis (see Section 2.3) to all 269,566
pages in our data set (which represents the adult web sites’
main pages, subdomain pages, and enter page targets). Of
these, 3.23% were found to trigger malicious behavior such
as code execution, registry changes, or executable down-
loads. This percentage is significantly higher than what we
expected based on related work [21], where slightly more
than 0.6% of adult web sites were detected as malicious.
We used Anubis [5], a behavior-based malware analy-
sis tool, to further analyze the malware samples that were
collected by the honeypots. Also, Wepawet could suc-
cessfully identify several families of exploit toolkits used
by the malicious sites. This gave us human-readable names
for the malware, showing that the most popular types of
malware that we found are spyware and Trojan downloaders
(e.g., rootkit.win32.tdss.gen or
Whenever iframes were used as infection vectors, we
extracted the hosting location of the injected code, finding
the malicious code to be mostly (98.2%) not stored on the
adult web sites themselves. We believe this is a clear indi-
cation that the web sites that distribute the malware were
originally exploited themselves, and are not intentionally
serving malware. This was also confirmed by results from
Wepawet, which automatically attributed several exploits to
the “LuckySploit” malware campaign [8].
4 Becoming an Adult Webmaster
The analysis methods and findings presented in the pre-
vious sections allow us to gain information from an external
observer’s point of view, enabling us to outline the online
adult industry’s business relationships and to study some
security-related aspects. However, we are also interested
in more technical, security relevant information that is only
available to adult web site operators themselves, for exam-
ple, data about the web site visitors or the mechanisms be-
hind traffic trading. One of the goals of our research is to
estimate the malicious potential of adult web sites, for ex-
ample, as a mass exploitation vector. Therefore, we also
need the internal point of view to understand this area of
the Internet in detail.
Unfortunately, we are not aware of any available real-
world data set that could be used for such an analysis.
Therefore, we took over the role of an adult webmaster and
created two adult web sites from scratch to conduct our experiments.
4.1 Preparation Steps
To be able to interact with the adult industry, we per-
formed the following operations to mimic an adult web site.
First, we created two relatively simple web sites. We de-
signed both sites’ layout to resemble existing, genuine adult
web sites, allowing us to blend in with the adult web site
landscape. We chose to mimic two popular types of free
sites, one “thumbnail gallery” web site and one link col-
lection web site. After registering domain names that are
indicative of adult web sites, we put the sites online on a
rented web hosting server.
Affiliate Programs. To receive promotional media, we
then registered as an adult web site operator at eight adult
affiliate programs. Surprisingly, the requirements for join-
ing affiliate programs appear to be very low. In our case,
only the web site URL, a contact name, and an email address
had to be provided. The contact identity information is not
verified, nor is a proof of ownership required for the web site.
Immediately before signing up to an affiliate program,
we created a snapshot of our web server access logs. As
soon as an affiliate program accepted our application, we
compared the current access logs to the snapshot. We found
that six of the eight affiliate programs accepted our
application, even though no access to our web sites happened
during the period between sign-up and acceptance.
This means that they blindly accepted our application,
performing no check of the web sites at all.
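The snapshot comparison used for this check can be sketched in a few lines. This is a simplified illustration, assuming an append-only access log that is not rotated between sign-up and acceptance; the function names and the log lines (which use documentation-only IP addresses) are hypothetical.

```python
from typing import List


def requests_since_snapshot(snapshot: List[str], current: List[str]) -> List[str]:
    """Return access-log lines that were appended after the snapshot was taken."""
    return current[len(snapshot):]


def accepted_blindly(snapshot: List[str], current: List[str]) -> bool:
    """True if no request reached the site between sign-up and acceptance."""
    return not requests_since_snapshot(snapshot, current)


# Hypothetical log state at sign-up time.
log_at_signup = ['192.0.2.10 - - "GET / HTTP/1.1" 200']

# Case 1: the program accepts and the log is unchanged.
print(accepted_blindly(log_at_signup, list(log_at_signup)))

# Case 2: a request arrived before acceptance, so the site may have been checked.
log_at_acceptance = log_at_signup + ['198.51.100.7 - - "GET / HTTP/1.1" 200']
print(accepted_blindly(log_at_signup, log_at_acceptance))
```

A new request in the interval does not prove that the program reviewed the site, since it could come from any visitor; only the absence of requests allows a firm conclusion.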
Traffic Brokers. Furthermore, we also registered our web
sites at four traffic brokers that we chose due to their pop-
ularity among adult site operators, allowing us to partici-
pate in traffic trading. The registration procedure was al-
most identical to affiliate programs, and again, most bro-
kers accepted our application without looking at the web
sites. Only one broker checked our site and subsequently
declined our application after detecting our analysis scripts
(see next section).
Payment System. To be able to buy traffic, we had to
send money to the traffic brokers. To this end, we used
the “ePassporte” electronic payment system, which is popular
among adult site operators, as it is widely accepted in
the adult industry. We spent slightly more than $160 for our
traffic trading experiments (including transaction fees).
4.2 Traffic Profiling
Our main goal in operating these web sites is to ac-
quire as much security relevant information about web traf-
fic coming to the sites as possible. To this end, we added
several features to the web sites that allow us to collect
additional information from each visitor. Since the col-
lected data may contain detailed information about a unique
visitor, and especially privacy related information, we im-
plemented several precautions to protect the user’s privacy
(e.g., anonymization of the collected raw log data). This in-
formation is then used in subsequent offline analysis steps,
for example to determine if a user is vulnerable to remote
exploits like arbitrary code execution or drive-by down-
loads. Specifically, we collect the following information
from each visitor:
Browser Profiling. First, we store general information for
each visitor that is available through the web server log files,
for example, the User-Agent string and the HTTP re-
quest headers that are sent by the user’s browser.
Additionally, we added several JavaScript functions to
the web site. These routines gather specific data about a
visitor’s web browser capabilities, for example, the sup-
ported data types or installed languages. We also collect
information about any installed browser plugins, including
their version numbers. This information is security relevant,
as browser plugins are frequently vulnerable to remote ex-
ploits, and we can infer from this data if the visitor is poten-
tially vulnerable to a drive-by download attack.
In particular, we are interested in the Flash browser-
plugin [1], which is typically used to embed videos in web
sites, as it is known for its bad security record [26]. Our
intuition is that visitors to multimedia-rich adult web sites
will most likely have Flash installed. Therefore, in addi-
tion to the plugin detection, we implemented a JavaScript-
independent Flash detection mechanism that uses a small
Flash script to check if the user has Flash installed. This
allows us to detect vulnerable clients, even if they have
JavaScript turned off (see Section 4.4). In addition to Flash,
we also check for vulnerable versions of browser plugins
for the Adobe PDF document viewer and Microsoft Office
as they are the most prevalent targets for malicious attack-
ers [4].
Outgoing Links. To be able to verify statistics provided
by affiliate program partners, we track all outgoing (i.e.,
leaving the web site) hyperlinks that a user has clicked. This
is implemented by scripts that operate similarly to redirector
scripts often employed on adult web sites (see Section 3.4.3
for details).
4.3 Traffic Buying Experiments
After having prepared the web sites with our profiling
tools, we placed orders for buying web site visitors at three
different traffic brokers. We tested different brokers to study
the differences in delivered traffic and to gain a better un-
derstanding of their intricacies. In total, we ordered al-
most 49,000 visitors at the three different traffic brokers
during a period of seven weeks. We spent a total of $161.84
on these traffic orders (average $3.30 per thousand visi-
tors). Surprisingly, each traffic broker redirected traffic to
our site (almost) instantly after placing an order. This sug-
gests that they have an automated traffic distribution system
in place, capable of flexibly rerouting traffic to customers,
and enough incoming traffic that they can handle orders in
a timely manner. Checking our web server logs confirmed
that we indeed received the correct number of visitors (i.e.,
clients with unique IP addresses) at the correct rate for all
In addition to the rate limit, we also chose the more ex-
pensive “high quality” option when buying traffic, which is
regarded by traffic brokers as synonymous with traffic com-
ing mostly from the US and Europe. To verify the geograph-
ical origin of traffic, we performed an IP to country lookup
for the bought traffic. We found that 98.22% of the traffic
really originates from the US and Europe, thus the origin is
correct for the vast majority of visitors.
4.4 Profiling Results
After having received the ordered amount of traffic, we
analyzed the output of the profiling steps outlined in Sec-
tion 4.2. An overview of the results of this analysis is shown
in Table 1. All brokers sent a similar type of visitors to our
site and there are no major differences between the brokers.
Therefore, we discuss the overall results in the following.
4.4.1 Browser Profiling
When a visitor accesses one of our web sites, we automati-
cally start to collect information about him (e.g., all request
headers and information about browser extensions). In cer-
tain cases, our system cannot obtain this profiling informa-
tion for a web site visitor. The reasons can be manifold, for
example a client can have JavaScript support disabled, it can
be an “exotic” web browser with reduced functionality, the
visitor might stay for only a few seconds on our web site, or
it might not be a human visitor but a bot. The most prevalent
case was visitors that did not correctly execute our
JavaScript-independent Flash detection: 18,794 (38.43%)
of our overall visitors behaved in this way. In contrast,
30,106 (61.57%) visitors correctly performed the test, and
of those 96.24% had Flash installed. Furthermore 10,214
visitors (about 20.89%) did not download any images, but
just requested the HTML source code of the site. While we
cannot coherently explain this behavior, we think that it is
caused by bots (e.g., click-bots [7]), since the browser of a
human visitor would start to download the complete content
of the site.
For about 47% of all visitors we were able to build
a complete browser profile, which includes all the infor-
mation we are interested in. For the remaining visitors
only certain types of information were collected (e.g., only
HTTP headers and no other information since the visitor
did not spend enough time on our site). We opted to analyze
only the cases in which we have collected the complete
browser profile to be conservative in our analysis.
During our analysis we also detected some noteworthy
anomalies that prohibit browser profiling. For example,
about 0.53% of the visitors used browser versions typically
found in mobile phones or video game consoles (such as
Nintendo Wii, Playstation Portable, or Sony Playstation).
These devices do not fully support JavaScript or have a lim-
ited set of features, preventing our profiling scripts from ex-
ecuting correctly. We also found that in about 0.14% of the
cases our profiling did not work since the HTTP headers
were purged, a fact that we could attribute to clients which
have the Symantec Personal Firewall installed.
4.4.2 Vulnerability Assessment
We determine if a client is vulnerable to known exploits
by matching the visitor’s browser properties (e.g., version
number of common plugins and add-ons) against a list of
common vulnerabilities we compiled manually. We fo-
cussed on only the most prevalent browser plugins such as
those related to Adobe Flash and PDF, and Microsoft Of-
fice. These three plugins had seven vulnerabilities in the re-
cent past, and an attacker can buy toolkits that exploit these
vulnerabilities to compromise a visitor [4]. Since realisti-
cally, additional exploits (even some that are not publicly
known yet) exist in the wild, this provides us with a lower
bound for the number of vulnerable systems among visitors
to our web sites. Using this heuristic, we found that more
than 20,000 visitors had at least one vulnerable component
installed and more than 5,700 visitors had multiple vulnerable
components. Figure 3a shows a Venn diagram depicting
the prevalence of different types of vulnerabilities.

                             Broker A         Broker B         Broker C          Total
Ordered Visitors             12,000           7,900            29,000            48,900
Performed Flash Detection    8,638 (71.98%)   5,010 (63.42%)   16,458 (56.75%)   30,106 (61.57%)
  → Flash Found              8,401 (97.26%)   4,876 (97.33%)   15,697 (95.38%)   28,974 (96.24%)
Complete Browser Profiles    6,183 (51.53%)   3,682 (46.60%)   13,176 (45.43%)   23,041 (47.12%)
  → Vulnerable               5,251 (84.93%)   3,242 (88.05%)   11,847 (89.91%)   20,340 (88.28%)
# Clicked Links              3,662            2,742            8,997             15,401

Table 1: Statistics about the visitors studied during our traffic buying experiments.
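The matching step of the vulnerability assessment can be illustrated with a version comparison. The sketch below is not the list used in the study: the version thresholds in `FIRST_FIXED_VERSION` are hypothetical placeholders, since the manually compiled vulnerability list is not reproduced in the text, and the function names are likewise assumptions.

```python
from typing import Dict, List, Tuple

Version = Tuple[int, ...]

# HYPOTHETICAL thresholds for illustration only; they do not correspond to
# real advisories or to the list compiled for the study.
FIRST_FIXED_VERSION: Dict[str, Version] = {
    "flash": (10, 0, 32),
    "pdf": (9, 1, 2),
    "office": (12, 0, 1),
}


def parse_version(text: str) -> Version:
    """Turn a dotted version string such as '9.0.124' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def vulnerable_components(profile: Dict[str, str]) -> List[str]:
    """Return the plugins in `profile` whose version is older than the fixed one."""
    return sorted(
        name
        for name, version in profile.items()
        if name in FIRST_FIXED_VERSION
        and parse_version(version) < FIRST_FIXED_VERSION[name]
    )


# Hypothetical browser profile collected from one visitor.
visitor = {"flash": "9.0.124", "pdf": "8.1.0", "office": "12.0.1"}
print(vulnerable_components(visitor))
```

A visitor with a non-empty result would be counted as having at least one vulnerable component, and one with two or more entries as having multiple vulnerable components.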
[Figure 3a: Venn diagram]
(a) Distribution of the three vulnerability types we examined. Note that
the display format is not proportional.

User-Agent         # Suspicious Clients   Type
FunWebProducts     260                    Adware
SIMBAR             136                    Adware
DesktopSmiley      93                     Spyware
JuicyAccess        85                     Nagware
Antivir XP 2008    52                     Fake AV
Other              289                    -
Total              915                    -

(b) Overview of suspicious User-Agent strings that we observed frequently,
indicating that these clients are presumably infected with some
kind of malware, e.g., scareware or adware.

Figure 3: Results for vulnerability assessment of clients
studied during traffic experiments.
A malicious site operator could take advantage of these
vulnerabilities and compromise the visitor’s browser with a
drive-by download [21]. Besides the opportunity to build a
botnet with only a small investment (e.g., we spent $160 and
could potentially infect more than 20,000 machines), an op-
erator could also earn money with the help of so called Pay-
Per-Install (PPI) affiliate programs. In a PPI program, the
“advertiser” pays the partner a commission for every install
of a specific program by a user. The exact amount of this
commission depends on the countries that the users come
from. For example, we registered at one PPI program (note
that we did not install any software to clients) and found the
rate for 1,000 installs to computers located in the US and
parts of Europe to be set to $130, while it would be as low
as $3 for most Asian countries. This is consistent with information
that we manually compiled from five other PPI
program web sites. Related work that focusses on PPI (for
example [24]) lists even higher prices per installation. Since
we only bought US and European traffic in our experiments,
we found a large fraction of traffic to fall into the highest
selling PPI category (more than 95%). While an in-depth
analysis of PPI programs is outside the scope of this work,
these figures clearly show that it would be highly profitable
for a malicious site operator to participate in PPI programs,
and covertly trigger installs of unwanted software at vulner-
able clients.
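The economics behind this argument can be made explicit with a short calculation based on the figures quoted above and in Table 1. It is a rough upper-bound sketch: it assumes that every visitor counted as vulnerable would result in one paid install at the highest PPI rate, which the text does not claim.

```python
# Figures taken from the text and Table 1.
spent_usd = 161.84            # total spent on traffic orders
visitors_ordered = 48_900     # visitors ordered across all three brokers
vulnerable_visitors = 20_340  # visitors with at least one vulnerable component
ppi_rate_per_1000 = 130.0     # PPI rate for the US and parts of Europe

# Average price paid per thousand visitors.
cost_per_1000 = spent_usd / visitors_ordered * 1000

# ASSUMPTION: every vulnerable visitor yields one paid install at the top rate.
ppi_revenue = vulnerable_visitors / 1000 * ppi_rate_per_1000

print(round(cost_per_1000, 2), round(ppi_revenue, 2), round(ppi_revenue / spent_usd, 1))
```

Under these assumptions the potential PPI commission is roughly $2,644, about 16 times the amount spent on traffic, which is the order of magnitude behind the statement that participation would be highly profitable.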
In addition to vulnerable browser versions and plug-
ins, we also analyzed the User-Agent strings obtained
from the visitor’s browser. This enables us to detect cer-
tain cases of clients that are already compromised: While
the User-Agent string can be arbitrarily set by a client,
it is still a good indicator for clients that are infected with
certain types of malware, which intentionally “mark” in-
fected clients to avoid re-infection or change the behavior
of web sites that act as an infection vector. We found 915
clients (1.87%) that contain known malware marker strings,
such as the adware “Simbar”, or scareware like
“Fake Antivirus 2008”. Figure 3b provides an overview of
the most common suspicious User-Agent strings we ob-
served in the visitor’s browser.
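Matching User-Agent strings against known marker strings can be sketched as a substring lookup. The markers and categories below are taken from Figure 3b; the function name, the case-insensitive comparison, and the sample User-Agent strings are assumptions made for this illustration.

```python
from typing import Dict, Optional

# Marker substrings and their categories as listed in Figure 3b.
MALWARE_MARKERS: Dict[str, str] = {
    "FunWebProducts": "Adware",
    "SIMBAR": "Adware",
    "DesktopSmiley": "Spyware",
    "JuicyAccess": "Nagware",
    "Antivir XP 2008": "Fake AV",
}


def classify_user_agent(user_agent: str) -> Optional[str]:
    """Return the category of the first known marker found, or None."""
    lowered = user_agent.lower()  # ASSUMPTION: markers are matched case-insensitively
    for marker, category in MALWARE_MARKERS.items():
        if marker.lower() in lowered:
            return category
    return None


# Hypothetical User-Agent strings.
infected = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; FunWebProducts)"
clean = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) Firefox/3.0"
print(classify_user_agent(infected), classify_user_agent(clean))
```

Since a client can set the User-Agent string arbitrarily, a match is an indicator rather than proof of infection, as noted above.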
4.5 Traffic Selling Experiments
Traffic brokers also allow their clients to sell web traffic,
paying them for visitors that are redirected to the broker’s
web site; from there the visitors are forwarded to traffic buy-
ers (see Section 2.2.6 for details). The commission a traffic
seller receives mainly depends on the niche that is attributed
to the traffic, and is influenced by the type of web site the
seller operates. To explore the security aspects of traffic
selling, we included traffic selling links on our web sites
and participated in this business.
4.5.1 Click Inflation Fraud Scenario
The first thing we noticed is the fact that traffic brokers do
not require traffic selling web sites to include any content
(for example a script) that is hosted by the traffic broker
or by a third party. This stands in contrast to other types
of web businesses that rely on partner web sites to pub-
lish information. For example, online advertisers such as
Google typically require the inclusion of JavaScript code
that is hosted by the advertiser himself on the publisher’s
web site. This code enables the content provider to acquire
information about the publishing web site that can be used
for abuse and fraud detection, for example by computing
the click-through-ratio (CTR) or by checking the cookie in-
formation of a user that clicks on a link.
Since traffic brokers do not use this technique, they can-
not implement these well-known techniques for fraud de-
tection and are thus subject to specific abuses. However,
we found that traffic brokers check the HTTP Referer
header of redirected traffic to see if it really originates from
the seller’s web site. If this is not the case, the traffic is ei-
ther rejected (redirected back to the seller), or only a very
low price is paid.
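The origin check described here can be pictured as a comparison between the host in the HTTP `Referer` header and the seller's registered domain. How the brokers actually implement it is not known; the function below is a hypothetical sketch of such a check, with made-up domain names.

```python
from typing import Optional
from urllib.parse import urlsplit


def referer_matches_seller(referer_header: Optional[str], seller_domain: str) -> bool:
    """True if the Referer host is the seller's domain or one of its subdomains."""
    if not referer_header:
        return False  # header missing or empty: origin cannot be confirmed
    host = (urlsplit(referer_header).hostname or "").lower()
    seller = seller_domain.lower()
    return host == seller or host.endswith("." + seller)


# Hypothetical examples.
print(referer_matches_seller("http://www.seller.example/gallery.html", "seller.example"))
print(referer_matches_seller("http://unrelated.example/", "seller.example"))
```

A check of this kind inspects only a value supplied by the visitor's browser, which is consistent with the assessment that follows regarding the low sophistication of the brokers' anti-fraud measures.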
These observations led us to the assumption that the level
of sophistication of anti-fraud techniques employed by traf-
fic brokers is rather low. To verify this assumption, we de-
vised a simple, yet effective fraud scenario to test the vul-
nerability of traffic brokers to click fraud. In this scenario
an attacker (legitimately) buys traffic from at least one traf-
fic broker, and then “resells” this traffic to n different traffic
brokers in parallel by forwarding the incoming traffic. Fig-
ure 4 illustrates the concept of our attack, which is a varia-
tion of click inflation attacks [3].
Figure 4: Overview of n-fold click inflation fraud scenario.
With the help of this n-fold click inflation, an attacker
can earn money if the total earnings from selling the traffic
n times exceed the amount of money she needs to spend for
buying the traffic. Furthermore, she could even earn more
money by abusing each visitor she bought, for example by
compromising vulnerable visitors with the help of a drive-
by download.
From a technical point of view, we found that simple
HTTP or JavaScript redirections to the traffic selling URLs
would not suffice, as many popular web browsers such as
Internet Explorer and Mozilla Firefox incorporate pop-up
blocking features that prevent opening new browser win-
dows without the user’s interaction. However, during our
traffic buying experiment, we noticed that a relatively high
number of visitors clicked on links on the web site: overall,
we had more than 15,400 clicks based on just 48,900
visitors (see Table 1). From those users that clicked on at
least one link, we received an average of 3.78 clicks. Based
on this observation and the fact that pop-up blockers do not
trigger if user interaction is involved in opening links, we
were able to attach JavaScript code to the onclick event
handler of hyperlinks. This allows us to perform n-fold traffic
selling every time a user clicks on a hyperlink on the web site.
We performed an experiment that shows that this attack
is effective against traffic brokers: We signed up as a traffic
seller at two different traffic brokers and bought visitors
from a third broker. Each click on our web site was redi-
rected to both brokers. We implemented only a 2-fold click
inflation attack to test the setup in practice, but higher val-
ues for n can be implemented without problems. In total,
we bought 17,000 visitors to our site, from which more than
1,800 visitors clicked at least one link and thus we could sell
them to both brokers. In total, the visitors generated about
4,100 clicks. During our experiment, we successfully ac-
cumulated funds of slightly less than $10 (on average, we
received $2.22 for 1,000 sold visitors). To prevent damage
to the traffic broker, we did not withdraw any funds, but
forfeited our traffic trading accounts.
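The profitability condition for n-fold click inflation (earnings from selling the traffic n times must exceed the cost of buying it) can be checked against the figures reported for this experiment. The calculation below rests on two assumptions that are not stated in the text: the bought traffic is priced at the $3.30 per thousand average from Section 4.3, and exactly 1,800 visitors are sold to each broker.

```python
import math

visitors_bought = 17_000
price_per_1000_bought = 3.30      # ASSUMPTION: average price from Section 4.3
visitors_sold_per_broker = 1_800  # ASSUMPTION: text says "more than 1,800"
payout_per_1000_sold = 2.22       # average payout reported in the text
n = 2                             # number of brokers the traffic was resold to

cost = visitors_bought / 1000 * price_per_1000_bought
earnings_per_broker = visitors_sold_per_broker / 1000 * payout_per_1000_sold
earnings = n * earnings_per_broker

# Smallest n for which n * earnings_per_broker covers the cost.
break_even_n = math.ceil(cost / earnings_per_broker)

print(round(cost, 2), round(earnings, 2), break_even_n)
```

With these inputs the 2-fold setup earns about $8, in line with the "slightly less than $10" reported, against a cost of about $56, and roughly 15 brokers would be needed to break even on resale alone. The experiment therefore demonstrates that the brokers accept resold traffic rather than that resale at n = 2 is profitable.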
Based on the insights gained from this experiment, we
think that other, even more powerful types of click fraud
(such as clickjacking [12]) would be equally easy to employ.
The success of this experiment also suggests that traffic bro-
kers do not share information among each other about traf-
fic sold, and that no advanced fraud detection systems are
in place.
5 Related Work
Little academic information is available about the online
adult industry, yet it consists of thousands of web sites that
generate a revenue of billions of dollars every year. Many
publications that analyze general web security issues have
been published in recent years. For example, Wang et al.
developed client honeypots to detect and capture web-based
malware samples [27].
Existing work on web-based threats often targets specific
types of malware. For example, Moshchuk et al. provide an
analysis of web-based spyware [19]. Provos et al. focus on
analyzing technical exploitation details [21]. Zhuge et al.
studied malicious aspects of the Chinese Web [30]. The authors
did mention the adult industry; however, the scope of
their work is limited to drive-by downloads found on adult
web sites and no other aspects were studied.
Several studies show parallels and draw connections be-
tween malicious Internet activity and the underground econ-
omy. For example, Provos et al. provide technical details
on how cyber-criminals use web-based malware to their ad-
vantage [22]. The aspect of an underground economy that
is fuelled by financially motivated cyber-criminals is high-
lighted by Franklin et al. [9]. In a recent paper, Holz et al.
study the structure and profits of keyloggers [13].
To the best of our knowledge, this study is the first that
combines an economic analysis of the online adult industry
with a security analysis from a technical and a cyber-crime
6 Conclusion
In this paper, we presented novel insights into the on-
line adult industry. We analyzed the economic structure of
this industry, and found that apart from the expected “core
business” of adult sites, more shady business models ex-
ist in parallel. Our evaluation shows that many adult web
sites try to mislead and manipulate their visitors, with the
intent of generating revenue. To this end, a wide range of
questionable techniques are employed, and openly offered
as business-to-business services. The tricks that these web
sites employ range from simple obfuscation techniques such
as relatively harmless blind links, over convenience services
for typo-squatters, to sophisticated redirector chains that are
used for traffic trading. Additionally, the used techniques
have the potential to be exploited in more harmful ways, for
example by facilitating CSRF attacks or click-fraud.
By becoming adult web site operators ourselves, we
gained additional insights on unique security aspects in this
domain. For example, we discovered that a malicious operator
could infect more than 20,000 machines with a minimal investment
of about $160. We conclude that many participants of
this industry have business models that are based on very
questionable practices that could very well be abused for
malicious activities and conducting cyber-crime. In fact, we
found evidence that this kind of abuse is already happening
in the wild.
References
[1] Adobe Systems Incorporated. Adobe Flash Player.
flashplayer/, 2009.
[2] Alexa. Top 500 Global Sites. http://www.alexa.
com/topsites, 2009.
[3] V. Anupam, A. Mayer, K. Nissim, B. Pinkas, and M. K. Re-
iter. On the Security of Pay-per-click and Other Web Adver-
tising Schemes. In Proceedings of the Eighth Conference on
World Wide Web (WWW), 1999.
[4] B. Stone-Gross and M. Cova and L. Cavallaro and B. Gilbert
and M. Szydlowski and R. Kemmerer and C. Kruegel and
G. Vigna. Your Botnet is My Botnet: Analysis of a Botnet
Takeover. In ACM Conference on Computer and Communi-
cations Security (CCS), 2009.
[5] U. Bayer, P. M. Comparetti, C. Hlauschek, C. Kruegel, and
E. Kirda. Scalable, Behavior-Based Malware Clustering.
In Symposium on Network and Distributed System Security
(NDSS), 2009.
[6] Beano Publishing. Domain Players Club. http://www., 2009.
[7] N. Daswani and M. Stoppelman. The Anatomy of Click-
bot.A. In First Workshop on Hot Topics in Understanding
Botnets (HotBots), 2007.
[8] Finjan Inc. LuckySploit Toolkit Exposed. http://www.,
[9] J. Franklin, V. Paxson, S. Savage, and A. Perrig. An inquiry
into the nature and causes of the wealth of internet miscre-
ants. In ACM Conference on Computer and Communica-
tions Security (CCS), 2007.
[10] Guywire, Inc. Booble. Adult Search Engine. http://, 2009.
[11] M. Hammami, Y. Chahir, and L. Chen. Webguard: A
web filtering engine combining textual, structural, and vi-
sual content-based analysis. IEEE Transactions on Knowl-
edge and Data Engineering, 18(2), 2006.
[12] R. Hansen and J. Grossman. Clickjacking. Technical
report, SecTheory
clickjacking.htm, 2008.
[13] T. Holz, M. Engelberth, and F. Freiling. Learning More
About the Underground Economy: A Case-Study of Key-
loggers and Dropzones. In European Symposium on Re-
search in Computer Security (ESORICS), 2009.
[14] Internet Filter. Internet Pornography Statistics. http:
html, 2006.
[15] M. Jakobsson, P. Finn, and N. Johnson. Why and How
to Perform Fraud Experiments. Security & Privacy, IEEE,
6(2):66–68, March-April 2008.
[16] M. Jakobsson and J. Ratkiewicz. Designing ethical phishing
experiments: a study of (ROT13) rOnl query features. In
15th International Conference on World Wide Web (WWW),
[17] M. Cova and C. Kruegel and G. Vigna. Detection and Analy-
sis of Drive-by Download Attacks and Malicious JavaScript
Code. In 19th International World Wide Web Confer-
ence (WWW2010), 2010. http://wepawet.iseclab.
[18] M. Cova and S. Ford. Wepawet: Detecting and Analyzing
Web-Based Malware. http://wepawet.iseclab.
org, 2009.
[19] A. Moshchuk, T. Bragin, S. D. Gribble, and H. M. Levy. A
crawler-based study of spyware on the web. In Symposium
on Network and Distributed System Security (NDSS), 2006.
[20] Network Working Group. WHOIS Protocol Specification., 2004.
[21] N. Provos, P. Mavrommatis, M. Abu Rajab, and F. Monrose.
All Your iFRAMEs Point to Us. In 17th Usenix Security
Symposium, 2008.
[22] N. Provos, D. McNamee, P. Mavrommatis, K. Wang, and
N. Modadugu. The Ghost In The Browser. In First Work-
shop on Hot Topics in Understanding Botnets (HotBots),
[23] Spam Assassin. List of performed Tests. http:
html, last accessed: 23.04.2009, 2009.
[24] Symantec Corporation. Misleading Applications.
[25] The Honeynet Project. Capture-HPC Client Hon-
capture-hpc, 2009.
[26] Trusteer, Inc. Flash Security Hole Advisory.
Security_Hole_Advisory.pdf, 2009.
[27] Y.-M. Wang, D. Beck, X. Jiang, R. Roussev, C. Verbowski,
S. Chen, and S. King. Automated Web Patrol with Strider
HoneyMonkeys. In Symposium on Network and Distributed
System Security (NDSS), 2006.
[28] Y.-M. Wang, D. Beck, J. Wang, C. Verbowski, and
B. Daniels. Strider Typo-Patrol: Discovery and Analysis
of Systematic Typo-Squatting. In 2nd Conference on Steps
to Reducing Unwanted Traffic on the Internet, 2006.
[29] XBIZ. The Adult Industry Source for Business News and
Information., 2009.
[30] J. Zhuge, T. Holz, C. Song, J. Guo, X. Han, and W. Zou.
Studying Malicious Websites and the Underground Econ-
omy on the Chinese Web . In Proceedings of 2008 Workshop
on the Economics of Information Security (WEIS’08), June