Article

Empirical analysis of Tor Hidden Services

Abstract

Tor hidden services allow someone to host a website or other transmission control protocol (TCP) service whilst remaining anonymous to visitors. The collection of all Tor hidden services is often referred to as the 'darknet'. In this study, the authors describe results from what they believe to be the largest study of Tor hidden services to date. By operating a large number of Tor servers for a period of 6 months, the authors were able to capture data from the Tor distributed hash table to collect the list of hidden services, classify their content and count the number of requests. Approximately 80,000 hidden services were observed in total, of which around 45,000 were present at any one point in time. Abuse and Botnet C&C servers were the most frequently requested hidden services, although there was a diverse range of services on offer.
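The gap between the total number of services observed and the number present at any one time can be illustrated with a minimal sketch. It assumes a hypothetical log of (service identifier, observation day) pairs; the record format, the function name `summarize` and the toy data are assumptions for illustration and are not taken from the study.

```python
from collections import defaultdict


def summarize(observations):
    """Return (total distinct services, {day: services seen that day}).

    `observations` is an iterable of (service_id, day) pairs. This record
    format is a hypothetical stand-in, not the study's actual data layout.
    """
    per_day = defaultdict(set)
    for service_id, day in observations:
        per_day[day].add(service_id)

    # Union over all days gives every service seen during the whole period.
    all_services = set()
    for ids in per_day.values():
        all_services |= ids

    daily_counts = {day: len(ids) for day, ids in per_day.items()}
    return len(all_services), daily_counts


# Toy data: four services overall, but only two are visible on any one day.
toy = [("a", 1), ("b", 1), ("b", 2), ("c", 2), ("d", 3), ("b", 3)]
total, daily = summarize(toy)
print(total, daily)
```

With short-lived services, the cumulative total grows over the measurement period while each daily count stays lower, which is the pattern the abstract reports.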

... Like the Internet, the dark web exists on a system which is decentralised in nature with no central servers or point of control, so it is difficult to shut down (Tanenbaum and Van Steen 2007). Organisations, such as human-rights activist groups or universities promoting open access, are known to actively donate bandwidth to this cause and run relays (Owen and Savage 2016). The Tor protocol uses an encrypted connection to each destination across multiple nodes (or relays) which enables the Tor browser to provide anonymity even when browsing to websites on the surface web. ...
... Malware-as-a-Service for criminal services (Tsakalidis and Vergidis 2017); Command-and-Control (C2) servers deployed as hidden services (Owen and Savage 2016); Terrorism Operations conducted in conjunction with other roles (Denic 2017;Weimann 2016a). ...
... Avoid censorship by circumventing blocks (Chertoff and Simon 2015;Denic 2017); Protection from persecution by local authorities due to browsing anonymity (Chertoff and Simon 2015;Denic 2017;Owen and Savage 2016 ...
Preprint
Full-text available
The internet can be broadly divided into three parts: surface, deep and dark. The dark web has become notorious in the media for being a hidden part of the web where all manner of illegal activities take place. This review investigates how the dark web is being utilised with an emphasis on cybercrime, and how law enforcement plays the role of its adversary. The review describes these hidden spaces, sheds light on their history, the activities that they harbour including cybercrime, the nature of attention they receive, and methodologies employed by law enforcement in an attempt to defeat their purpose. More importantly, it is argued that these spaces should be considered a phenomenon and not an isolated occurrence to be taken as merely a natural consequence of technology. This paper contributes to the area of dark web research by serving as a reference document and by proposing a research agenda.
... Silkroad [29] and Hansa-Market [13] are well-known Dark Web marketplaces that sell drugs, illegal weapons, and even malware. In addition, researchers have revealed that the Dark Web contained a considerable amount of harmful content [27,53], and their findings have been confirmed by government investigative agencies [6] as well. ...
... According to Biryukov et al. [27], most dark websites hosted content devoted to adult films (17%), drugs (15%), counterfeit items (8%), and weapons (4%) in 6,579 HTTP(S) Web services. Moreover, another study [53] has shown that the most popular Tor hidden services were botnet command and control (C&C) servers, and a large proportion of the content on the Dark Web is of questionable legality. ...
... On the customer side of dark markets, Van Hout et al. [63] and Barratt et al. [25] have provided notable insights into the participants of drug marketplaces. Biryukov et al. [27,28] and Owen et al. [53] have analyzed the content and popularity of the Tor hidden services in general. Recently, Sanchez-Rola et al. [57] measured the structural connection between the Dark Web and the Surface Web, and they also assessed the usage of tracking scripts on the Dark Web. ...
Conference Paper
Anonymous network services on the World Wide Web have emerged as a new web architecture, called the Dark Web. The Dark Web has been notorious for harboring cybercriminals abusing anonymity. At the same time, the Dark Web has been a last resort for people who seek freedom of the press as well as avoid censorship. This anonymous nature allows website operators to conceal their identity and thereby leads users to have difficulties in determining the authenticity of websites. Phishers abuse this perplexing authenticity to lure victims; however, only a little is known about the prevalence of phishing attacks on the Dark Web. We conducted an in-depth measurement study to demystify the prevalent phishing websites on the Dark Web. We analyzed the text content of 28,928 HTTP Tor hidden services hosting 21 million dark webpages and confirmed 901 phishing domains. We also discovered a trend on the Dark Web in which service providers perceive dark web domains as their service brands. This trend exacerbates the risk of phishing for their service users who remember only a partial Tor hidden service address. Our work facilitates a better understanding of the phishing risks on the Dark Web and encourages further research on establishing an authentic and reliable service on the Dark Web.
... (Table columns: topic; research papers; N. of papers)
Content classification; [23], [24], [25], [26], [9], [27], [28], [29], [30], [31], [32], [33], [34], [35], [36], [37], [38]; 17
Security; [39], [40], [41], [42], [43], [44], [45], [46], [47], [48], [49], [50], [51], [52], [53], [54], [55], [56], [57], [21], [58], [59], [60], [61], [62], [63], [64], [19]; 28
Performance and deficiencies; [65], [66], [67], [68], [69]; 5
Changes in the design; [69], [70], [71], [61], [72], [73]; 6
Discovery and measurement; [74], [75], [32], [33], [34], [37], [38], [68], [23], [30], [31], [19]; 12 ...
... Their results indicate that most of the hidden services display illegal or controversial content in their dataset. In contrast, Savage and Owen [32] return to manual study and classification, suggesting that given the variety and complex technical nature of some content, automatic classifiers would be insufficient, due to the difficulty of interpreting the context completely. Other documentation of a manual classification reviewing 3480 HS is by Faizan and Khan [28], but they claim that only 38% of the servers found offer illegal services. ...
Preprint
Full-text available
Anonymous communications networks were born to protect the privacy of our communications, preventing censorship and traffic analysis. The most famous anonymous communication network is Tor. This anonymous communication network provides some interesting features; among them, we can mention hiding the user’s IP location and Tor Hidden Services (THS), a mechanism to conceal the location of servers, mainly web servers. THS is an important research field in Tor. However, there is a lack of reviews that sum up the main findings and research challenges. In this article we present a systematic literature review that aims to offer a comprehensive view of the research made on Tor Hidden Services, presenting the state of the art and the different research challenges to be addressed. This review has been developed from a selection of 57 articles and presents the main findings and advances regarding Tor Hidden Services, limitations found, and future issues to be investigated.
... In the past there have been several other research efforts to learn more about how onion services are being used [6,7], but they all focused on V2 onion services. This is mainly caused by the fact that certain weaknesses in V2 onion services made it easier to collect and analyze data about them. ...
... Consequently, running a relay only to collect and probe onion addresses is considered malicious behavior by the Tor project (see https://community.torproject.org/relay/community-resources/bad-relays/) and such relays are actively removed from the consensus. These weaknesses were also used by researchers to collect the onion addresses of active V2 onion services [1,7] and establish connections to them in order to learn about the services they provide. Another aspect of the hidden service directory that made V2 onion services vulnerable was the fact that it assigned the responsibilities for descriptor space based on the relay's fingerprint. ...
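The idea of assigning descriptor space by relay fingerprint can be pictured as a lookup on a sorted ring, as in a generic distributed hash table. The sketch below is a simplified illustration and not Tor's implementation: the function name `responsible_relays`, the replica count of 3, and the short string identifiers are assumptions made for the example.

```python
import bisect


def responsible_relays(descriptor_id, fingerprints, count=3):
    """Pick the `count` relays that follow `descriptor_id` on a sorted ring.

    Identifiers are equal-length strings compared lexicographically. The
    default count of 3 is an illustrative assumption.
    """
    ring = sorted(fingerprints)
    if not ring:
        return []
    # Index of the first fingerprint strictly greater than the descriptor ID.
    start = bisect.bisect_right(ring, descriptor_id)
    # Walk clockwise, wrapping around the end of the ring.
    return [ring[(start + i) % len(ring)] for i in range(min(count, len(ring)))]


# Toy ring with two-character identifiers (hypothetical values).
relays = ["10", "40", "60", "90"]
chosen = responsible_relays("50", relays)
print(chosen)
```

Because the position on the ring depends only on the fingerprint, whoever chooses a relay's fingerprint also chooses which part of the descriptor space that relay serves, which is the weakness the passage describes.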
... The next interesting piece of information to know about V3 onion services is their average lifetime but, in contrast to previous studies on V2, we have no way of finding out if two blinded public keys belong to the same onion address. However, we do know that similar research on V2 onion services [7] found that most onion services did not live long enough to show up in their data multiple times. While we have no way of confirming these results for V3, we do know that every blinded public key is valid for 48 hours and is re-uploaded at least every 60-120 minutes [8]. ...
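As a rough worked example of the figures above, the number of uploads of one blinded key during its validity window follows from simple division. The sketch reads "every 60-120 minutes" as an upload interval somewhere between 60 and 120 minutes, which is an interpretation; the function name is hypothetical.

```python
VALIDITY_HOURS = 48  # validity of one blinded public key, as stated above


def uploads_per_validity(interval_minutes, validity_hours=VALIDITY_HOURS):
    """Whole number of upload intervals that fit in one validity window."""
    return (validity_hours * 60) // interval_minutes


fewest = uploads_per_validity(120)  # slowest re-upload interval
most = uploads_per_validity(60)     # fastest re-upload interval
print(fewest, most)
```

Under that reading, a single blinded key would be uploaded somewhere between 24 and 48 times while it is valid, so repeated sightings of the same key do not by themselves indicate a long-lived service.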
Conference Paper
Full-text available
Tor onion services are a challenging research topic because they were designed to reveal as little metadata as possible which makes it difficult to collect information about them. In order to improve and extend privacy protecting technologies, it is important to understand how they are used in real world scenarios. We discuss the difficulties associated with obtaining statistics about V3 onion services and present a way to monitor V3 onion services in the current Tor network that enables us to derive statistically significant information about them without compromising the privacy of individual Tor users. This allows us to estimate the number of currently deployed V3 onion services along with interesting conclusions on how and why onion services are used.
... Similarly, Biryukov et al. [25] suggested that although the list of top 10 HS included services related to BotNets, the most popular HS at that moment was Silk Road, which was shut down by the FBI in 2013. In the same way but with different results, Savage and Owen [34] found that the first place was given to HS related to abuse, although for ethical reasons they did not specify what types of abuse, nor make their .onion addresses public. ...
... Their results indicate that most of the hidden services display illegal or controversial content. In contrast, Savage and Owen [34] returned to manual study and classification, suggesting that given the variety and complex technical nature of some content, automatic classifiers would be insufficient, due to the difficulty of interpreting the context completely. Other documentation of a manual classification reviewing 3480 HS was prepared by Faizan and Khan [42], but they claimed that only 38% of the servers found offer illegal services. ...
... Savage and Owen [34] attempted to discover HS by operating 40 ORs over six months, each with a bandwidth of approximately 50 kB/s and left active continuously for 25 h, with the intention that their nodes or one of them would be eligible to obtain the HSDir indicator and be able to recover the maximum number of .onion addresses possible. ...
Article
Full-text available
Anonymous communications networks were created to protect the privacy of communications, preventing censorship and traffic analysis. The most famous anonymous communication network is Tor. This anonymous communication network provides some interesting features. Among them, we can mention that Tor can hide a user’s IP address when accessing a service such as the Web, and it also supports Tor hidden services (THS) (now named onion services) as a mechanism to conceal the server’s IP address, used mainly to provide anonymity to websites. THS is an important research field in Tor. However, there is a lack of reviews that sum up the main findings and research challenges. In this article, we present a systematic literature review that aims to offer a comprehensive overview of the research made on THS by presenting the state-of-the-art and the different research challenges to be addressed. This review has been developed from a selection of 57 articles and presents main findings and advances regarding Tor hidden services, limitations found, and future issues to be investigated.
... In addition, it is necessary to monitor Tor traffic in order to uncover illegal activities and to create related measures. For this purpose, numerous crawlers are being developed [3,4,5]. These crawlers send requests to the seed onion address given to them as a parameter and download the content of that address if a connection is established. ...
... The work of [4] used crawling software on the TOR network, running a large number of TOR servers, to collect lists of hidden services, to classify their contents, and to calculate the number of requests. The study of [9] reviewed the features and compared the performance of various open-source crawling software such as Scrapy, Apache Nutch, Heritrix, WebSphinix. ...
Article
Full-text available
TOR (The Onion Routing) is a network structure that has become popular in recent years because it provides anonymity to its users, and it is often preferred by hidden services. Because privacy is essential in this network, it attracts attention, and the amount of data stored increases day by day, making it difficult to scan and analyze the data. In addition, it is highly likely that the process performed during a scan of onion services will be considered a cyber-attack and that access to the relevant address will be blocked. Various crawler software has been developed in order to scan and access the services (onion web pages) in this network. However, crawling here is different from crawling pages on the surface network with extensions such as com, net, org. This is because the TOR network is located on the lower layers of the surface network, and the pages in the TOR network are accessed only through the TOR browser instead of traditional browsers (Chrome, Mozilla, etc.). The crawler software developed to date takes this situation into consideration and, in order to protect confidentiality, obtains data by selecting paths through different relays for the requests made to the addresses. In the TOR network, each request sent by a user reaches the target address by passing over different nodes, which slows down the network. In addition, the low performance of a browser that tries to retrieve information through TOR brings long periods of waiting. Therefore, working with crawler software with high crawling and information acquisition speed will improve the analysis process for researchers. Four different crawler software packages were evaluated according to various criteria in order to guide people who will conduct research in this field and to assess the strengths and weaknesses of the crawlers against each other.
The study provides an important point of view for choosing the right crawler as a starting point for researchers who want to analyze Tor web services.
... Academic studies and the Tor project themselves have long acknowledged the potential for misuse of the service (for example, see "Doesn't Tor enable criminals to do bad things?"; Minárik and Osula, 2015; Owen and Savage, 2016) and when combined with technologies such as untraceable cryptocurrencies, the possibility for criminals to hide their activities poses a real threat. ...
... Of the estimated 2.6 million users that use the Tor network daily, 21,718 requests (including 418 via a bridge) originated from an Australian Internet address (this does not equate to individuals, as a user may make multiple requests). Owen & Savage (2016) reported that only 2% (52,000) of the users access onion services. They also found that approximately 80% of this traffic to onion services (42,000) was directed towards services which offered unmoderated porn, including CSAM. ...
Technical Report
Full-text available
This submission considers the capability of Australia’s law enforcement agencies to tackle the growing scourge of child exploitation. In particular, concerns about the existence of dedicated CSAM onion services hosted on the Tor network were raised (Terms of Reference [a]). This also required an understanding of the tools used by offenders to access CSAM and the ability of law enforcement to detect CSAM and identify offenders (Terms of Reference [d]). We suggest specific support be provided to develop and evaluate online treatment programs for CSAM offenders using the anonymous format provided by Tor. Finally, we propose a study examining the links between offline and online/contact and non-contact offenders in the context of anonymity (Terms of Reference [f]).
... We use the PSC deployment described in §3.1 to safely capture the approximate number of v2 onion services observed by our HSDirs. Unlike existing work that also attempts to quantify the number of onion services [25,29,30,39], we avoid the need to store (even temporarily) onion addresses, since PSC uses oblivious counters. ...
... Their work focused on traffic analysis attacks and the popularity study of a single social networking onion service. Finally, Owen and Savage perform empirical measurements of Tor's onion services [39]. We apply similar techniques (operating Tor relays and observing HSDir lookups) but also protect user privacy by using differentially private techniques. ...
Conference Paper
The Tor anonymity network is difficult to measure because, if not done carefully, measurements could risk the privacy (and potentially the safety) of the network's users. Recent work has proposed the use of differential privacy and secure aggregation techniques to safely measure Tor, and preliminary proof-of-concept prototype tools have been developed in order to demonstrate the utility of these techniques. In this work, we significantly enhance two such tools, PrivCount and Private Set-Union Cardinality, in order to support the safe exploration of new types of Tor usage behavior that have never before been measured. Using the enhanced tools, we conduct a detailed measurement study of Tor covering three major aspects of Tor usage: how many users connect to Tor and from where do they connect, with which destinations do users most frequently communicate, and how many onion services exist and how are they used. Our findings include that Tor has ~8 million daily users, a factor of four more than previously believed. We also find that ~40% of the sites accessed over Tor have a torproject.org domain name, ~10% of the sites have an amazon.com domain name, and ~80% of the sites have a domain name that is included in the Alexa top 1 million sites list. Finally, we find that ~90% of lookups for onion addresses are invalid, and more than 90% of attempted connections to onion services fail.
... In particular, Biryukov et al. [1] managed to collect a large number of hidden service descriptors by exploiting a presently-fixed Tor vulnerability, and found that the most popular hidden services were related to botnets. Owen et al. [14] reported on hidden service persistence, contents, and popularity by operating 40 relays over a 6-month time frame. Their aim was to classify services based on their contents. ...
... Yet, it must be kept in mind that the analysis may be susceptible to fluctuations due to the order in which pages have been first visited (and, hence, not revisited thereafter) [23]. In the case of the Tor Web, the issue is exacerbated by the renowned volatility of Tor hidden services [1,13,14]. By executing three independent scraping attempts over five months, we aimed at making our analysis more robust and at telling apart "stable" and "temporary" features of the Tor Web. ...
Preprint
Full-text available
Tor is an anonymity network that allows offering and accessing various kinds of resources, known as hidden services, while guaranteeing sender and receiver anonymity. The Tor web is the set of web resources that exist on the Tor network, and Tor websites are part of the so-called dark web. Recent research works have evaluated Tor security, evolution over time, and thematic organization. Nevertheless, little information is available about the structure of the graph defined by the network of Tor websites. The limited number of Tor entry points that can be used to crawl the network renders the study of this graph far from being simple. In this paper we aim at better characterizing the Tor Web by analyzing three crawling datasets collected over a five-month time frame. On the one hand, we extensively study the global properties of the Tor Web, considering two different graph representations and verifying the impact of Tor's renowned volatility. We present an in-depth investigation of the key features of the Tor Web graph showing what makes it different from the surface Web graph. On the other hand, we assess the relationship between contents and structural features. We analyse the local properties of the Tor Web to better characterize the role different services play in the network and to understand to what extent topological features are related to the contents of a service.
... Owen et al. [12], by operating 40 relays over a 6-month time frame, reported on hidden service persistence, contents, and popularity. Their aim was to classify services based on their content. ...
... As shown by other studies [1,23,12], there is a huge variability in the persistence of Tor hidden services. This must be carefully taken into consideration in any attempt to characterize the topology of the Tor Web graph, because by scraping the Tor network we only obtain a snapshot of the hidden services that were active at the time the crawler issued a connection request. ...
Conference Paper
Tor hidden services allow offering and accessing various Internet resources while guaranteeing a high degree of provider and user anonymity. So far, most research work on the Tor network aimed at discovering protocol vulnerabilities to de-anonymize users and services. Other work aimed at estimating the number of available hidden services and classifying them. Something that still remains largely unknown is the structure of the graph defined by the network of Tor services. In this paper, we describe the topology of the Tor graph (aggregated at the hidden service level) measuring both global and local properties by means of well-known metrics. We consider three different snapshots obtained by extensively crawling Tor three times over a 5-month time frame. We separately study these three graphs and their shared “stable” core. In doing so, other than assessing the renowned volatility of Tor hidden services, we make it possible to distinguish time-dependent and structural aspects of the Tor graph. Our findings show that, among other things, the graph of Tor hidden services presents some of the characteristics of social and surface web graphs, along with a few unique peculiarities, such as a very high percentage of nodes having no outbound links.
... Despite several attacks on Tor such as statistical and confirmation attacks which are done by closely monitoring the timings of the packets at different nodes in Tor [5], [6], Tor remains a popular choice for botnets to hide their C&C servers [7] because it makes the C&C servers anonymous and setting up any botnet to use Tor is easy. The traditional methods for detecting botnets at the network through signatures, DNS analysis, anomalies, and traffic analysis are insufficient because Tor uses encryption and the previously known distinguishing features for botnets, like the type of protocol used, port numbers, IP addresses, and packet size, are the same as normal traffic features, making malicious traffic appear legitimate. ...
... The traditional methods for detecting botnets at the network through signatures, DNS analysis, anomalies, and traffic analysis are insufficient because Tor uses encryption and the previously known distinguishing features for botnets, like the type of protocol used, port numbers, IP addresses, and packet size, are the same as normal traffic features, making malicious traffic appear legitimate. Botnet C&C servers like Sefnit and a modified version of Zeus (called Skynet) have become problematic, receiving the highest number of hidden service requests in Tor [7]. A Skynet botnet C&C server analyzed was found to have between 12,000 and 30,000 infected hosts connecting to it [8]. ...
... Whilst the Tor Project and other stakeholders frequently describe hidden services as an example of privacy and anonymity for political dissidents, the academic literature paints a considerably different picture where the majority of hidden services facilitate criminal activity (e.g. child abuse and drugs) (Owen and Savage, 2016;Biryukov et al., 2013). Other authors have specifically examined the political uses of Tor hidden services and concluded that much of the discourse is banal and of little interest to anyone (Guitton, 2013). ...
... Next, we study the types of services offered by Tor onions. Given evidence in (Owen and Savage, 2016) of a large turnover in hidden services, we sought to understand more clearly the reasons for the turnover. Whilst long-lived services have been studied many times, we are not aware of any papers which have looked at those onions which are only up for a short period of time. ...
Article
The Tor Darknet is a pseudo-anonymous place to host content online, frequently used by criminals to sell narcotics and to distribute illicit material. Many studies have attempted to estimate the size of the darknet, but this paper will show that previous estimates of its size are inaccurate due to the hidden service lifecycle. The first examination of its kind will be presented on the differences between short-lived and long-lived hidden services. Finally, in light of a new Tor protocol for the darknet which will prevent the running of relays to learn darknet sites, an analysis is presented of the use of crawling and whether this is an effective mechanism to discover sites for law enforcement.
... We use the PSC deployment described in §3.1 to safely capture the approximate number of v2 onion services observed by our HSDirs. Unlike existing work that also attempts to quantify the number of onion services [26,30,31,39], we avoid the need to store (even temporarily) onion addresses, since PSC uses oblivious counters. The results of our onion service address measurements are summarized in Table 6. ...
... Their work focused on traffic analysis attacks and the popularity study of a single social networking onion service. Finally, Owen and Savage perform empirical measurements of Tor's onion services [39]. We apply similar techniques (operating Tor relays and observing HSDir lookups) but also protect user privacy by using differentially private techniques. ...
Preprint
The Tor anonymity network is difficult to measure because, if not done carefully, measurements could risk the privacy (and potentially the safety) of the network's users. Recent work has proposed the use of differential privacy and secure aggregation techniques to safely measure Tor, and preliminary proof-of-concept prototype tools have been developed in order to demonstrate the utility of these techniques. In this work, we significantly enhance two such tools, PrivCount and Private Set-Union Cardinality, in order to support the safe exploration of new types of Tor usage behavior that have never before been measured. Using the enhanced tools, we conduct a detailed measurement study of Tor covering three major aspects of Tor usage: how many users connect to Tor and from where do they connect, with which destinations do users most frequently communicate, and how many onion services exist and how are they used. Our findings include that Tor has ~8 million daily users (a factor of four more than previously believed) while Tor user IPs turn over almost twice in a 4 day period. We also find that ~40% of the sites accessed over Tor have a torproject.org domain name, ~10% of the sites have an amazon.com domain name, and ~80% of the sites have a domain name that is included in the Alexa top 1 million sites list. Finally, we find that ~90% of lookups for onion addresses are invalid, and more than 90% of attempted connections to onion services fail.
... Researchers primarily used technical skills, such as traffic analysis (Biryukov et al. 2014) and web-crawling (Dolliver and Kenney 2016;Moore and Rid 2016;Soska and Christin 2015) to understand the hidden websites. In the largest study related to Tor hidden services, at the time of the research, for example, Owen and Savage (2016) collected approximately 80,000 hidden services, using 40 onion relays over a period of 6 months. These approaches were used to identify the nature and characteristics of the websites and their users' activities on the Darknet. ...
Article
Full-text available
Accepted for publication on the 19th of July 2018. In recent years, the Darknet has become one of the most discussed topics in cyber security circles. Current academic studies and media reports tend to highlight how the anonymous nature of the Darknet is used to facilitate criminal activities. This paper reports on recent research in four Darknet forums that reveals a different aspect of the Darknet. Drawing on our qualitative findings, we suggest that many users of the Darknet might not perceive it as intrinsically criminogenic, despite their acknowledgement of various kinds of criminal activity in this network. Further, our research participants emphasized the achievement of constructive socio-political values through the use of the Darknet. This achievement is enabled by various characteristics that are rooted in the Darknet's technological structure, such as anonymity, privacy, and the use of cryptocurrencies. These characteristics provide a wide range of opportunities for good as well as for evil.
... Furthermore, offenders who traffic CSAM are often on the cutting edge of technology, utilizing virtual private networks (VPNs), encryption techniques in messaging apps, peer-to-peer sharing networks (P2P), and Tor (Dark Web) to conceal their online activity (Bursztein et al. 2019;Keller and Dance 2019). One research study into Tor hidden services found that 80% of total requests were for abuse sites, predominantly CSA (Owen and Savage 2016). The authors indicated that these abuse sites were "easily identifiable in the meta data, suggesting webmasters had confidence that Tor would provide robust anonymity" (Owen and Savage 2016, pp. ...
Article
Full-text available
With technological advances, the creation and distribution of child sexual abuse material (CSAM) has become one of the fastest growing illicit online industries in the United States. Perpetrators are becoming increasingly sophisticated and exploit cutting-edge technology, making it difficult for law enforcement to investigate and prosecute these crimes. There is limited research on best practices for investigating cases of CSAM. The aim of this research was to understand challenges and facilitators for investigating and prosecuting cases of CSAM as a foundation to develop best practices in this area. To meet these objectives, qualitative interviews and focus groups were conducted with participants throughout the western United States. Two major themes arose from this research: Theme 1: Challenges to investigating and prosecuting CSAM; and Theme 2: Facilitators to investigating and prosecuting CSAM. Within Theme 1, subthemes included technology and internet service providers, laws, lack of resources, and service provider mental health and well-being. Within Theme 2, subthemes included multidisciplinary teams and training. This research is a first step in understanding the experiences of law enforcement and prosecutors in addressing CSAM. Findings from this study can be used to support the development of best practices for those in the justice system investigating and prosecuting CSAM.
... Botnets are being controlled by C2 services hosted as Tor Hidden Services. (Owen and Savage, 2016). ...
... Command-and-control (C2) servers deployed as hidden services: botnets are controlled by C2 services hosted as Tor hidden services (Owen and Savage, 2016). ...
... Yet, it must be kept in mind that the analysis may be susceptible to fluctuations due to the order in which pages have been first visited (and, hence, not revisited thereafter) [26]. In the case of the Tor Web, the issue is exacerbated by the renowned volatility of Tor hidden services [7,8,32]. By executing three independent scraping attempts over five months, we aimed at making our analysis more robust and at telling apart "stable" and "temporary" features of the Tor Web. ...
Article
Full-text available
Tor is an open source software that allows accessing various kinds of resources, known as hidden services, while guaranteeing sender and receiver anonymity. Tor relies on a free, worldwide, overlay network, managed by volunteers, that works according to the principles of onion routing in which messages are encapsulated in layers of encryption, analogous to layers of an onion. The Tor Web is the set of web resources that exist on the Tor network, and Tor websites are part of the so-called dark web. Recent research works have evaluated Tor security, its evolution over time, and its thematic organization. Nevertheless, limited information is available about the structure of the graph defined by the network of Tor websites, not to be mistaken with the network of nodes that supports the onion routing. The limited number of entry points that can be used to crawl the network, makes the study of this graph far from being simple. In the present paper we analyze two graph representations of the Tor Web and the relationship between contents and structural features, considering three crawling datasets collected over a five-month time frame. Among other findings, we show that Tor consists of a tiny strongly connected component, in which link directories play a central role, and of a multitude of services that can (only) be reached from there. From this viewpoint, the graph appears inefficient. Nevertheless, if we only consider mutual connections, a more efficient subgraph emerges, that is, probably, the backbone of social interactions in Tor.
... Instead of monitoring their victims' network traffic, attackers just have to expend resources to become part of the hidden service directory and they can monitor a random share of onion services every day. Previous research [2,12,21] has demonstrated that this issue can be exploited to obtain significant information about onion services. From the Tor project perspective this issue is acceptable, as it does not compromise the anonymity of the communication partners, which is their primary focus. ...
Article
Full-text available
Digital identity documents provide several key benefits over physical ones. They can be created more easily, incur less costs, improve usability and can be updated if necessary. However, the deployment of digital identity systems does come with several challenges regarding both security and privacy of personal information. In this paper, we highlight one challenge that digital identity systems face if they are set up in a distributed fashion: Network Unlinkability. We discuss why network unlinkability is so critical for a distributed digital identity system that wants to protect the privacy of its users and present a specific definition of unlinkability for our use-case. Based on this definition, we propose a scheme that utilizes the Tor network to achieve the required level of unlinkability by dynamically creating onion services and evaluate the feasibility of our approach by measuring the deployment times of onion services.
... Clients still have a unique identifier (usually an onion address), which they must share with all their communication partners. This is an issue because Owen and Savage [7] have demonstrated that the hidden service directory leaks information on when and how often onion services are accessed. In their conclusion, they go so far as to claim that "it is straight forward to collect and monitor Tor HSs without detection". ...
Conference Paper
Full-text available
Tor onion services utilize the Tor network to enable incoming connections on a device without disclosing its network location. Decentralized systems with extended privacy requirements like metadata-avoiding messengers typically rely on onion services. However, a long-lived onion service address can itself be abused as identifying metadata. Replacing static onion services with dynamic short-lived onion services may be a way to avoid such metadata leakage. This work evaluates the feasibility of short-lived dynamically generated onion services in decentralized systems. We show, based on a detailed performance analysis of the onion service deployment process, that dynamic onion services are already feasible for peer-to-peer communication in certain scenarios.
... Despite Soghoian's and Loesing's work, a traffic logging approach was still later used by Ling et al. [35] to measure and classify malicious traffic on Tor using the snort intrusion detection system. More recently, Owen and Savage collected a list of unique hidden service addresses by running a large number of hidden service directories and recording hidden service lookups [42]. These direct logging studies are illuminating but are ethically questionable since they provide no privacy protections. ...
Conference Paper
Experimentation tools facilitate exploration of Tor performance and security research problems and allow researchers to safely and privately conduct Tor experiments without risking harm to real Tor users. However, researchers using these tools configure them to generate network traffic based on simplifying assumptions and outdated measurements and without understanding the efficacy of their configuration choices. In this work, we design a novel technique for dynamically learning Tor network traffic models using hidden Markov modeling and privacy-preserving measurement techniques. We conduct a safe but detailed measurement study of Tor using 17 relays (~2% of Tor bandwidth) over the course of 6 months, measuring general statistics and models that can be used to generate a sequence of streams and packets. We show how our measurement results and traffic models can be used to generate traffic flows in private Tor networks and how our models are more realistic than standard and alternative network traffic generation methods.
... Other work applied different approaches to data collection for classification. Owen et al. [30] measured the activity on each page of the darknet by operating a large number of Tor relays, a measure which is not possible with our methodology. Similarly to our work, Biryukov et al. [4] have classified darknet content with findings comparable to ours. ...
Preprint
Full-text available
In this paper, we analyze the topology and the content found on the "darknet", the set of websites accessible via Tor. We created a darknet spider and crawled the darknet starting from a bootstrap list by recursively following links. We explored the whole connected component of more than 34,000 hidden services, of which we found 10,000 to be online. Contrary to folklore belief, the visible part of the darknet is surprisingly well-connected through hub websites such as wikis and forums. We performed a comprehensive categorization of the content using supervised machine learning. We observe that about half of the visible dark web content is related to apparently licit activities based on our classifier. A significant amount of content pertains to software repositories, blogs, and activism-related websites. Among unlawful hidden services, most pertain to fraudulent websites, services selling counterfeit goods, and drug markets.
... While legal content exists in high proportions on the Tor network (Faizan and Khan, 2019;Moore and Rid, 2016), many of the Onion/Hidden services run cryptomarkets that provide global access to illicit substances, particularly drugs (Christin, 2013;Martin, 2014). Furthermore, drug websites turn out to be the most frequently visited Onion/Hidden services on the Tor network (Owen and Savage, 2016). Also, a subnational study showed that search interests on the Dark Web significantly predict the consumption of cannabis in the United States . ...
Article
The Onion Router (Tor) network is one of the most prominent technologies for accessing online resources while preserving anonymity. Effectively employing the technology is not a trivial process and involves the following steps: (1) motivated by needs, (2) becoming aware of and learning the technology, and (3) realizing desired purposes by usage. Using country-level panel data, this study examines the knowledge accumulation process through which motivated users eventually employ Tor. The results suggest that Tor is often searched in less free countries for censorship circumvention, while it is employed for Dark Web activities in more free countries. There is also an indirect relationship between being aware of the technology and its usage through how-to knowledge accumulation. This study is the first attempt to understand the role of knowledge accumulation in the global usage of Tor. The findings provide insights into the worldwide concerns of online privacy and Dark Web regulation.
... The researchers who are working on anonymous communication systems try to improve security features. But at the same time, they should cope with several security issues. When they improve a security feature, it could undermine the other one. ...
Analysis Tor Code [71]
Resilience analysis [51]
Latency and/or performance [121,131,114,132,55,134,137,64,72,76]
Congestion [134,212]
Path selection algorithm [81,131,132,140,137,35]
Bridge discovery [84,139,127]
Hidden services [89,123,97,36,126,77]
Attack detection mechanisms and different kind of attacks (deanonymization, sybil, cell-attack, application classification attack, etc) [81,83,88,104,85,89,90,91,94,95,106,36,98,101,102,69]
Improving anonymity [78,81,131,145,132,116,28,124,97,183,37]
Improving security [121,42,114,118,122,123,93,124,55,106,126,127,37,69]
Use of Tor in applications [40,148,125,134,174,213]
Article
Full-text available
Privacy is an important research topic due to its implications in society. Among the topics covered by privacy, we can highlight how to establish anonymous communications. In recent years we have seen important research in this field. In order to know what the state of the art in the research in anonymous communication systems (ACS) is, we have developed a systematic literature review (SLR). Namely, our SLR analyzes several issues: activity performed in the field, major research purposes, findings, which ACS are most studied, the limitations of current research, who is leading the research in this field and the most highly-cited articles. Our SLR provides an analysis of 203 papers found in conferences and journals focused on anonymous communication systems between 2011 and 2016. Thus, our SLR provides an updated view on the status of the research in the field and the different future topics to be addressed.
Method
Full-text available
Since the advent of darknet markets, or illicit cryptomarkets, there has been a sustained interest in studying their operations: the actors, products, payment methods, and so on. However, this research has been limited by a variety of obstacles, including the difficulty in obtaining reliable and representative data, which present challenges to undertaking a complete and systematic study. The Australian National University’s Cybercrime Observatory has developed tools that can be used to collect and analyse data obtained from darknet markets. This paper describes these tools in detail. While the proposed methods are not error-free, they provide a further step in providing a transparent and comprehensive solution for observing darknet markets tailored for data scientists, social scientists, criminologists and others interested in analysing trends from darknet markets.
Article
Full-text available
The networked nature of criminals using the dark web is poorly understood and infrequently studied, mostly due to a lack of data. Rarer still are studies on the topological effectiveness of police interventions. Between 2014 and 2016, the Brazilian Federal Police raided a child pornography ring acting inside the dark web. With these data, we build a topic-view network and compare network disruption strategies with the real police work. Only 7.4% of the forum users share relevant content, and the topological features of this core differ markedly from other clandestine networks. Approximately 60% of the core users need to be targeted to fully break the network connectivity, while the real effect of the arrests was similar to random failure. Despite this topological robustness, the overall “viewership network” was still well disrupted by the arrests, because only 10 users contributed to almost 1/3 of the total post views and 8 of these were apprehended. Moreover, the users who were arrested provided a total of 60% of the viewed content. These results indicate that for similar online systems, aiming at the users that concentrate the views may lead to more efficient police interventions than focusing on the overall connectivity.
Conference Paper
The strong anonymity and hard-to-track mechanisms of the dark web provide shelter for illegal activities. The illegal content on the dark web is diverse and frequently updated. Traditional dark web classification uses large-scale web pages for supervised training. However, the difficulty of collecting enough illegal dark web content and the time consumption of manually labeling web pages have become challenges of current research. In this paper, we propose a method that can effectively classify illegal activities on the dark web. Instead of relying on the massive dark web training set, we creatively select laws and regulations related to each type of illegal activities to train the machine learning classifiers and achieve a good classification performance. In the areas of pornography, drugs, weapons, hackers, and counterfeit credit cards, we select relevant legal documents from the United States Code for supervised training and conduct a classification experiment on the illegal content of the real dark web we collected. The results show that combined with TF-IDF feature extraction and Naive Bayes classifier, we achieved an accuracy of 0.935 in the experimental environment. Our approach allows researchers and the network law enforcement to check whether their dark web corpus contains such illegal activities based on the relevant laws of the illegal categories they care about in order to detect and monitor potential illegal websites in a timely manner. And because neither a large training set nor the seed keywords provided by experts are needed, this classification method provides another idea for the definition of illegal activities on the dark web. Moreover, it makes sense to help explore and discover new types of illegal activities on the dark web.
Conference Paper
Tor onion services can be accessed and hosted anonymously on the Tor network. We analyze the protocols, software types, popularity and uptime of these services by collecting a large amount of .onion addresses. Websites are crawled and clustered based on their respective language. In order to also determine the amount of unique websites a de-duplication approach is implemented. To achieve this, we introduce a modular system for the real-time detection and analysis of onion services. Address resolution of onion services is realized via descriptors that are published to and requested from servers on the Tor network that volunteer for this task. We place a set of 20 volunteer servers on the Tor network in order to collect .onion addresses. The analysis of the collected data and its comparison to previous research provides new insights into the current state of Tor onion services and their development. The service scans show a vast variety of protocols with a significant increase in the popularity of anonymous mail servers and Bitcoin clients since 2013. The popularity analysis shows that the majority of Tor client requests is performed only for a small subset of addresses. The overall data reveals further that a large amount of permanent services provide no actual content for Tor users. A significant part consists instead of bots, services offered via multiple domains, or duplicated websites for phishing attacks. The total amount of onion services is thus significantly smaller than current statistics suggest.
Chapter
Full-text available
The chapter discusses the extent to which the darknet holds future potential for digital communication. Given state power structures, it can be assumed that surveillance and control capabilities will in future be transferred to the digital sphere.
Chapter
Tor Hidden Service is a widely used tool designed to protect the anonymity of both client and server. In order to prevent the predecessor attacks, Tor introduces the guard selection algorithms. While the long-term binding relation between hidden service and guard relay increases the cost of existing predecessor attacks, it also gives us a new perspective to analyze the security of hidden services.
Article
The World Wide Web is the most widely used service on the Internet, although only a small part of it, the Surface Web, is indexed and accessible. The rest of the content, the Deep Web, is split between that unable to be indexed by usual search engines and content that needs to be accessed through specific methods and techniques. The latter is deployed in the so-called darknets, which have been the subject of much less study, where anonymity and privacy security services are preserved. Although there are several darknets, Tor is the most well-known and widely analyzed. Hence, the current work presents an analysis of web site connectivity, relationships and content of one of the less known and explored darknets: Freenet. Given the special features of this study, a new crawling tool, called c4darknet, was developed for the purpose of this work. This tool is, in turn, used in the experimentation that was carried out in a real distributed environment. Our results can be summarized as follows: there is great general availability of websites on Freenet; there are significant nodes within the network connectivity structure; and underage porn or child pornography is predominant among illegal content. Finally, the outcomes are compared against a similar study for the I2P darknet, showing special features and differences between both darknets.
Chapter
Abstract: This paper addresses anonymity and privacy in the context of the Internet. We conceive of anonymity as the differentiation of contexts of action through the concealment of identity markers. Autonomy is understood, following Rössler, as autonomous-authentic choice and its expression in social action. Privacy protects this autonomy. Anonymity relates to privacy technically, as compensatory protection of knowledge whose disclosure is desired only in particular contexts. The Internet is characterised as remote communication, social memory and an arena of technical observation. Anonymity and privacy are related to the communication situation on the Internet, highlighting how darknets relate to this complex. Finally, some remarks are made on the value of a "culture of anonymity".
Chapter
Abstract: The aim of this work is to construct an anonymity matrix. The focus lies in particular on combining the technical and psychological components of the analysis. The starting point is the use of a privacy-enhancing technology, specifically the Tor Browser. The goal is thus to study the Tor user group with regard to their online privacy literacy, usage patterns and degree of anonymity. To this end, an online survey (N = 120) and a guided interview with an expert from IT security research were conducted.
Article
Full-text available
Hidden services are a feature of Tor (The Onion Router) [1]. They provide anonymity for the service requester while maintaining the anonymity of the service provider. Since it is quite difficult to trace back and locate both parties in the communication, criminals use hidden service mechanisms to conduct various illegal activities in the darknet, which has brought adverse effects to society. In order to prevent the abuse of Tor hidden services, the discovery and analysis of hidden services are particularly important. The aim of this survey paper is to review and compare the literature of the past five years and to provide readers with methods for discovering Tor hidden services, along with the various content analysis methods developed and proposed over time. We explain their key ideas and show their interrelations.
Chapter
It has been argued that the anonymity the dark web offers has allowed criminals to use it to run a range of criminal enterprises, acting with impunity and beyond the reach of law enforcement. By designing a process that can identify sites based on their criminality, law enforcement officers can devote their resources to finding the people behind the sites, rather than having to spend time identifying the sites themselves. The scope of the study in this chapter is focused solely on Tor's hidden services. The research problem was to identify what percentage of hidden services are accessible and how many of these are connected to criminal/illicit activities. Additionally, our research also aims to determine if it is possible to automate a system to identify sites of interest for law enforcement by categorising them based on the prevalent crime type of the hidden service. In this chapter, we look at how hidden services are set up. To facilitate this, an experiment was conducted where a hidden service was set up and hosted on the Tor network. Our tool connected to the Tor network, obtained an un-attributable IP address, and identified over 12,800 .onion addresses, from which it scraped the HTML of the home page before checking this against a pre-determined list of keywords to identify illicit sites and categorise each of them depending on its type of criminality. Our approach successfully identified criminal sites without the need for human interaction, making it a very useful triage solution. Whilst further work is required before its categorisation process is sufficiently robust to provide an accurate, unquestionable strategic overview of hidden services, the tool, in essence, works very well in achieving its primary function: to identify criminal sites across the dark web.
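The keyword-based triage step described in the abstract above might look roughly like the following sketch. It assumes a simple count of keyword hits per category on the tag-stripped home page; the chapter's actual keyword list, categories, and scoring rule are not given here, so `KEYWORDS` and `categorise` are hypothetical placeholders.

```python
import re
from collections import Counter

# Hypothetical category -> keyword mapping; the real list is not published here.
KEYWORDS = {
    "drugs": {"cannabis", "cocaine"},
    "counterfeit": {"counterfeit", "replica"},
    "hacking": {"exploit", "botnet"},
}


def categorise(html):
    # Crude tag stripping, then lower-case word counts for the page text.
    text = re.sub(r"<[^>]+>", " ", html).lower()
    words = Counter(re.findall(r"[a-z]+", text))
    # Score each category by the total number of keyword occurrences.
    scores = {cat: sum(words[w] for w in kws) for cat, kws in KEYWORDS.items()}
    best = max(scores, key=scores.get)
    # Pages with no keyword hits are left uncategorised.
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    page = "<html><body><h1>Replica watches</h1><p>counterfeit notes</p></body></html>"
    print(categorise(page))
```

A plain hit count is easy to audit but prone to false positives, which is consistent with the abstract's caveat that the categorisation needs further work before it is robust.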
Chapter
The eleventh chapter examines cyberspace as a new dimension of economic warfare and of conflicts in the field of high technology. In it, attack and defence are hard to identify and often barely distinguishable, so that the regulatory framework of competition becomes blurred. The various weapon systems are presented, the significance of intellectual property rights is discussed, and the possibilities for covert operations are outlined. The chapter concludes with an account of the patent war between Samsung and Apple and of the Chinese-American high-technology conflict.
Conference Paper
Full-text available
We perform a comprehensive measurement analysis of Silk Road, an anonymous, international online marketplace that operates as a Tor hidden service and uses Bitcoin as its exchange currency. We gather and analyze data over eight months between the end of 2011 and 2012, including daily crawls of the marketplace for nearly six months in 2012. We obtain a detailed picture of the type of goods sold on Silk Road, and of the revenues made both by sellers and Silk Road operators. Through examining over 24,400 separate items sold on the site, we show that Silk Road is overwhelmingly used as a market for controlled substances and narcotics, and that most items sold are available for less than three weeks. The majority of sellers disappears within roughly three months of their arrival, but a core of 112 sellers has been present throughout our measurement interval. We evaluate the total revenue made by all sellers, from public listings, to slightly over USD 1.2 million per month; this corresponds to about USD 92,000 per month in commissions for the Silk Road operators. We further show that the marketplace has been operating steadily, with daily sales and number of sellers overall increasing over our measurement interval. We discuss economic and policy implications of our analysis and results, including ethical considerations for future research in this area.
Article
Full-text available
Not only the free web is victim to China's excessive censorship, but also the Tor anonymity network: the Great Firewall of China prevents thousands of potential Tor users from accessing the network. In this paper, we investigate how the blocking mechanism is implemented, we conjecture how China's Tor blocking infrastructure is designed and we propose countermeasures. Our work bolsters the understanding of China's censorship capabilities and thus paves the way towards more effective evasion techniques.
Conference Paper
Full-text available
We present the first analysis of the popular Tor anonymity network that indicates the security of typical users against reasonably realistic adversaries in the Tor network or in the underlying Internet. Our results show that Tor users are far more susceptible to compromise than indicated by prior work. Specific contributions of the paper include (1) a model of various typical kinds of users, (2) an adversary model that includes Tor network relays, autonomous systems (ASes), Internet exchange points (IXPs), and groups of IXPs drawn from empirical study, (3) metrics that indicate how secure users are over a period of time, (4) the most accurate topological model to date of ASes and IXPs as they relate to Tor usage and network configuration, (5) a novel realistic Tor path simulator (TorPS), and (6) analyses of security making use of all the above. To show that our approach is useful to explore alternatives and not just Tor as currently deployed, we also analyze a published alternative path selection algorithm, Congestion-Aware Tor. We create an empirical model of Tor congestion, identify novel attack vectors, and show that it too is more vulnerable than previously indicated.
Conference Paper
Full-text available
To date, there has yet to be a study that characterizes the usage of a real deployed anonymity service. We present observations and analysis obtained by participating in the Tor network. Our primary goals are to better understand Tor as it is deployed and through this understanding, propose improvements. In particular, we are interested in answering the following questions: (1) How is Tor being used? (2) How is Tor being mis-used? (3) Who is using Tor? To sample the results, we show that web traffic makes up the majority of the connections and bandwidth, but non-interactive protocols consume a disproportionately large amount of bandwidth when compared to interactive protocols. We provide a survey of how Tor is being misused, both by clients and by Tor router operators. In particular, we develop a method for detecting exit router logging (in certain cases). Finally, we present evidence that Tor is used throughout the world, but router participation is limited to only a few countries.
Conference Paper
Full-text available
Botnets, networks of malware-infected machines that are controlled by an adversary, are the root cause of a large number of security problems on the Internet. A particularly sophisticated and insidious type of bot is Torpig, a malware program that is designed to harvest sensitive information (such as bank account and credit card data) from its victims. In this paper, we report on our efforts to take control of the Torpig botnet and study its operations for a period of ten days. During this time, we observed more than 180 thousand infections and recorded almost 70 GB of data that the bots collected. While botnets have been "hijacked" and studied previously, the Torpig botnet exhibits certain properties that make the analysis of the data particularly interesting. First, it is possible (with reasonable accuracy) to identify unique bot infections and relate that number to the more than 1.2 million IP addresses that contacted our command and control server. Second, the Torpig botnet is large, targets a variety of applications, and gathers a rich and diverse set of data from the infected victims. This data provides a new understanding of the type and amount of personal information that is stolen by botnets.
Article
Full-text available
We present Tor, a circuit-based low-latency anonymous communication service. This second-generation Onion Routing system addresses limitations in the original design by adding perfect forward secrecy, congestion control, directory servers, integrity checking, configurable exit policies, and a practical design for location-hidden services via rendezvous points. Tor works on the real-world Internet, requires no special privileges or kernel modifications, requires little synchronization or coordination between nodes, and provides a reasonable tradeoff between anonymity, usability, and efficiency. We briefly describe our experiences with an international network of more than 30 nodes. We close with a list of open problems in anonymous communication.
Article
Efficiently determining the node that stores a data item in a distributed network is an important and challenging problem. This paper describes the motivation and design of the Chord system, a decentralized lookup service that stores key/value pairs for such networks. The Chord protocol takes as input an m-bit identifier (derived by hashing a higher-level application specific key), and returns the node that stores the value corresponding to that key. Each Chord node is identified by an m-bit identifier and each node stores the key identifiers in the system closest to the node's identifier. Each node maintains an m-entry routing table that allows it to look up keys efficiently. Results from theoretical analysis, simulations, and experiments show that Chord is incrementally scalable, with insertion and lookup costs scaling logarithmically with the number of Chord nodes.
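The lookup rule in this abstract can be illustrated with a minimal sketch. It assumes that "closest" means the successor on the identifier ring (the first node whose identifier is equal to or follows the key's identifier, wrapping around), and that entry k of the m-entry routing table points to the successor of node + 2^k. The names `chord_id`, `successor`, and `finger_table`, the choice of SHA-1, and the 8-bit ring are illustrative assumptions. The sketch uses a global sorted list of nodes, so it shows only which node owns a key, not the distributed hop-by-hop protocol.

```python
import hashlib
from bisect import bisect_left

M = 8  # identifier width in bits (assumed small, for illustration only)


def chord_id(key: str, m: int = M) -> int:
    """Hash an application-level key to an m-bit identifier."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def successor(ident: int, node_ids: list[int]) -> int:
    """Return the first node id >= ident, wrapping around the ring.

    node_ids must be sorted in ascending order and non-empty.
    """
    i = bisect_left(node_ids, ident)
    return node_ids[i % len(node_ids)]


def finger_table(node: int, node_ids: list[int], m: int = M) -> list[int]:
    """Build the m-entry routing table: entry k is successor(node + 2^k)."""
    return [successor((node + 2 ** k) % (2 ** m), node_ids) for k in range(m)]


if __name__ == "__main__":
    nodes = [10, 80, 200]
    key_id = chord_id("example-key")
    print("key id:", key_id, "stored on node:", successor(key_id, nodes))
    print("finger table of node 10:", finger_table(10, nodes))
```

Because each routing-table entry roughly halves the remaining distance to the target, a real Chord node can forward a lookup in a logarithmic number of hops; the sketch omits that forwarding step.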
Conference Paper
In August 2013, the Tor network experienced a sudden, drastic reduction in performance due to the Mevade/Sefnit botnet. This botnet ran its command and control server as a Tor hidden service, so that all infected nodes contacted the command and control through Tor. In this paper, we consider several protocol changes to protect Tor against future incidents of this nature, describing the research challenges that must be solved in order to evaluate and deploy each of these methods. In particular, we consider four technical approaches: resource-based throttling, guard node throttling, reuse of failed partial circuits, and hidden service circuit isolation.
Conference Paper
Tor is the most popular low-latency anonymity overlay network for the Internet, protecting the privacy of hundreds of thousands of people every day. To ensure a high level of security against certain attacks, Tor currently utilizes special nodes called entry guards as each client's long-term entry point into the anonymity network. While the use of entry guards provides clear and well-studied security benefits, it is unclear how well the current entry guard design achieves its security goals in practice. We design and implement Changing of the Guards (COGS), a simulation-based research framework to study Tor's entry guard design. Using COGS, we empirically demonstrate that natural, short-term entry guard churn and explicit time-based entry guard rotation contribute to clients using more entry guards than they should, and thus increase the likelihood of profiling attacks. This churn significantly degrades Tor clients' anonymity. To understand the security and performance implications of current and alternative entry guard selection algorithms, we simulate tens of thousands of Tor clients using COGS based on Tor's entry guard selection and rotation algorithms, with real entry guard data collected over the course of eight months from the live Tor network.
Conference Paper
Tor hidden services are commonly used to provide a TCP-based service to users without exposing the hidden server's IP address in order to achieve anonymity and anti-censorship. However, hidden services are currently abused in various ways. Illegal content such as child pornography has been discovered on various Tor hidden servers. In this paper, we propose a protocol-level hidden server discovery approach to locate the Tor hidden server that hosts the illegal website. We investigate the Tor hidden server protocol and develop a hidden server discovery system, which consists of a Tor client, a Tor rendezvous point, and several Tor entry onion routers. We manipulate Tor cells, the basic transmission unit over Tor, at the Tor rendezvous point to generate a protocol-level feature at the entry onion routers. Once our controlled entry onion routers detect such a feature, we can confirm the IP address of the hidden server. We conduct extensive analysis and experiments to demonstrate the feasibility and effectiveness of our approach.
Article
Decentralized systems, such as structured overlays, are subject to the Sybil attack, in which an adversary creates many false identities to increase its influence. This paper describes a one-hop distributed hash table which uses the social links between users to strongly resist the Sybil attack. The social network is assumed to be fast mixing, meaning that a random walk in the honest part of the network quickly approaches the uniform distribution. As in the related SybilLimit system [25], with a social network of n honest nodes and m honest edges, the protocol can tolerate up to o(n/log n) attack edges (social links from honest nodes to compromised nodes). The routing tables contain O(√m log m) entries per node and are constructed efficiently by a distributed protocol. This is the first sublinear solution to this problem. Preliminary simulation results are presented to demonstrate the approach's effectiveness.
Conference Paper
Existing low-latency anonymity networks are vulnerable to traffic analysis, so location diversity of nodes is essential to defend against attacks. Previous work has shown that simply ensuring geographical diversity of nodes does not resist, and in some cases exacerbates, the risk of traffic analysis by ISPs. Ensuring high autonomous-system (AS) diversity can resist this weakness. However, ISPs commonly connect to many other ISPs in a single location, known as an Internet eXchange (IX). This paper shows that IXes are a single point where traffic analysis can be performed. We examine to what extent this is true, through a case study of Tor nodes in the UK. Also, some IXes sample packets flowing through them for performance analysis reasons, and this data could be exploited to de-anonymize traffic. We then develop and evaluate Bayesian traffic analysis techniques capable of processing this sampled data.
Conference Paper
A focused crawler is a web crawler that traverses the web to explore only information related to a particular topic of interest. Generic web crawlers, on the other hand, try to search the entire web, which is impossible due to the size and complexity of the WWW. In this paper we survey some of the latest focused web crawling approaches, discussing each with its experimental results. We categorize them as focused crawling based on content analysis, focused crawling based on link analysis, and focused crawling based on both content and link analysis. We also give an insight into future research and draw overall conclusions.
Conference Paper
Users' anonymity and privacy are among the major concerns of today's Internet. Anonymizing networks are therefore poised to become an important service to support anonymous Internet communications and consequently enhance users' privacy protection. Indeed, Tor, an example of an anonymizing network based on the onion routing concept, attracts more and more volunteers, and is now popular among tens of thousands of Internet users. Surprisingly, very little research sheds light on such an anonymizing network. Beyond providing global statistics on the typical usage of Tor in the wild, we show that Tor is actually being misused, as most of the observed traffic belongs to P2P applications. In particular, we quantify the BitTorrent traffic and show that the load of the latter on the Tor network is underestimated because of encrypted BitTorrent traffic (that can go unnoticed). Furthermore, this paper provides a deep analysis of both the HTTP and BitTorrent protocols, giving a complete overview of their usage. We not only report such usage in terms of traffic size and number of connections but also depict how users behave on top of Tor. We also show that Tor usage is now diverted from the onion routing concept and that Tor exit nodes are frequently used as 1-hop SOCKS proxies, through a so-called tunneling technique. We provide an efficient method allowing an exit node to detect such abnormal usage. Finally, we report our experience in effectively crawling bridge nodes, supposedly revealed sparingly in Tor.
Conference Paper
Large-scale peer-to-peer systems face security threats from faulty or hostile remote computing elements. To resist these threats, many such systems employ redundancy. However, if a single faulty entity can present multiple identities, it can control a substantial fraction of the system, thereby undermining this redundancy. One approach to preventing these “Sybil attacks” is to have a trusted agency certify identities. This paper shows that, without a logically centralized authority, Sybil attacks are always possible except under extreme and unrealistic assumptions of resource parity and coordination among entities.
Conference Paper
Whanau is a novel routing protocol for distributed hash tables (DHTs) that is efficient and strongly resistant to the Sybil attack. Whanau uses the social connections between users to build routing tables that enable Sybil-resistant lookups. The number of Sybils in the social network does not affect the protocol's performance, but links between honest users and Sybils do. When there are n well-connected honest nodes, Whanau can tolerate up to O(n/log n) such "attack edges". This means that an adversary must convince a large fraction of the honest users to make a social connection with the adversary's Sybils before any lookups will fail. Whanau uses ideas from structured DHTs to build routing tables that contain O(√n log n) entries per node. It introduces the idea of layered identifiers to counter clustering attacks, a class of Sybil attacks challenging for previous DHTs to handle. Using the constructed tables, lookups provably take constant time. Simulation results, using social network graphs from LiveJournal, Flickr, YouTube, and DBLP, confirm the analytic results. Experimental results on PlanetLab confirm that the protocol can handle modest churn.
Conference Paper
The Tor network is one of the largest deployed anonymity networks, consisting of 1500+ volunteer-run relays and probably hundreds of thousands of clients connecting every day. Its large user-base has made it attractive for researchers to analyze usage of a real deployed anonymity network. The recent growth of the network has also led to performance problems, as well as attempts by some governments to block access to the Tor network. Investigating these performance problems and learning about network blocking is best done by measuring usage data of the Tor network. However, analyzing a live anonymity system must be performed with great care, so that the users’ privacy is not put at risk. In this paper we present a case study of measuring two different types of sensitive data in the Tor network: countries of connecting clients, and exiting traffic by port. Based on these examples we derive general guidelines for safely measuring potentially sensitive data, both in the Tor network and in other anonymity networks.
Detection measurement and deanonymisation
  • A Biryukov
  • I Pustogarov
  • R.-P Weinmann
Circuit fingerprinting attacks: passive deanonymization of tor hidden services
  • A Kwon
  • M Alsabah
  • D Lazar