Anonymity Loves Company:
Usability and the Network Effect
Roger Dingledine and Nick Mathewson
The Free Haven Project
{arma,nickm}@freehaven.net
Abstract. A growing field of literature is studying how usability impacts
security [4]. One class of security software is anonymizing networks—overlay
networks on the Internet that provide privacy by letting users transact (for
example, fetch a web page or send an email) without revealing their
communication partners.
In this position paper we focus on the network effects of usability on
privacy and security: usability is a factor as before, but the size of the
user base also becomes a factor. We show that in anonymizing networks,
even if you were smart enough and had enough time to use every system
perfectly, you would nevertheless be right to choose your system based
in part on its usability for other users.
1 Usability for others impacts your security
While security software is the product of developers, the security it provides is
a collaboration between developers and users. It’s not enough to make software
that can be used securely—software that is hard to use often suffers in its security
as a result.
For example, suppose there are two popular mail encryption programs: Heavy-
Crypto, which is more secure (when used correctly), and LightCrypto, which is
easier to use. Suppose you can use either one, or both. Which should you choose?
You might decide to use HeavyCrypto, since it protects your secrets better.
But if you do, it’s likelier that when your friends send you confidential email,
they’ll make a mistake and encrypt it badly or not at all. With LightCrypto,
you can at least be more certain that all your friends’ correspondence with you
will get some protection.
What if you used both programs? If your tech-savvy friends use HeavyCrypto,
and your less sophisticated friends use LightCrypto, then everybody will get as
much protection as they can. But can all your friends really judge how able they
are? If not, then by supporting a less usable option, you’ve made it likelier that
your non-savvy friends will shoot themselves in the foot.
The crucial insight here is that for email encryption, security is a collabora-
tion between multiple people: both the sender and the receiver of a secret email
must work together to protect its confidentiality. Thus, in order to protect your
own security, you need to make sure that the system you use is not only usable
by yourself, but by the other participants as well.
This observation doesn’t mean that it’s always better to choose usability over
security, of course: if a system doesn’t address your threat model, no amount
of usability can make it secure. But conversely, if the people who need to use a
system can’t or won’t use it correctly, its ideal security properties are irrelevant.
Hard-to-use programs and protocols can hurt security in many ways:

- Programs with insecure modes of operation are bound to be used unknowingly
  in those modes.
- Optional security, once disabled, is often never re-enabled. For example,
  many users who ordinarily disable browser cookies for privacy reasons wind
  up re-enabling them so they can access sites that require cookies, and later
  leaving cookies enabled for all sites.
- Badly labeled off switches for security are even worse: not only are they more
  prone to accidental selection, but they’re more vulnerable to social attackers
  who trick users into disabling their security. As an example, consider the
  page-long warning your browser provides when you go to a website with an
  expired or otherwise suspicious SSL certificate.
- Inconvenient security is often abandoned in the name of day-to-day efficiency:
  people often write down difficult passwords to keep from forgetting them, and
  share passwords in order to work together.
- Systems that provide a false sense of security prevent users from taking real
  measures to protect themselves: breakable encryption on ZIP archives, for
  example, can fool users into thinking that they don’t need to encrypt email
  containing ZIP archives.
- Systems that provide bad mental models for their security can trick users
  into believing they are more safe than they really are: for example, many
  users interpret the “lock” icon in their web browsers to mean “You can safely
  enter personal information,” when its meaning is closer to “Nobody can read
  your information on its way to the named website” (or, more accurately,
  “Nobody can read your information on its way to someone who was able to
  convince one of the dozens to hundreds of CAs configured in your browser
  that they are the named website, or who was able to compromise the named
  website later on. Unless your computer has been compromised already.”).
2 Usability is even more important for privacy
We described above that usability affects security in systems that aim to pro-
tect data confidentiality. But when the goal is privacy, it can become even more
important. Anonymizing networks such as Tor [8], JAP [3], Mixminion [6], and
Mixmaster [12] aim to hide not only what is being said, but also who is com-
municating with whom, which users are using which websites, and so on. These
systems have a broad range of users, including ordinary citizens who want to
avoid being profiled for targeted advertisements, corporations who don’t want
to reveal information to their competitors, and law enforcement and government
intelligence agencies who need to do operations on the Internet without being
noticed.
Anonymity networks work by hiding users among users. An eavesdropper
might be able to tell that Alice, Bob, and Carol are all using the network, but
should not be able to tell which of them is talking to Dave. This property is
summarized in the notion of an anonymity set—the total set of people who,
so far as the attacker can tell, might be the one engaging in some activity of
interest. The larger the set, the more anonymous the participants (assuming,
of course, that all participants are equally plausible: if the attacker suspects
Alice, Bob, and Carol equally, Alice is more anonymous than if the attacker is
98% suspicious of Alice and 1% suspicious of Bob and Carol, even though the
anonymity sets are the same size; because of this imprecision, research is moving
beyond simple anonymity sets to more sophisticated measures based on the
attacker’s confidence [7, 14]). When more users join the network, existing users
become more secure, even if the new users never talk to the existing ones! [1, 2]
Thus, “anonymity loves company,” a catch-phrase first made popular in our
context by the authors of the Crowds [13] anonymity network.
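As a rough illustration of why raw anonymity-set size can mislead, the following sketch (ours, with made-up numbers) computes an entropy-based “effective” anonymity set size from the attacker’s suspicion about each user, in the spirit of the confidence-based metrics cited above [7, 14]:

```python
import math

def effective_anonymity_set(suspicion):
    """Shannon entropy of the attacker's distribution over possible senders,
    reported as an 'effective' number of indistinguishable users (2**H)."""
    entropy = -sum(p * math.log2(p) for p in suspicion.values() if p > 0)
    return 2 ** entropy

# Three users in both cases, but very different levels of protection.
uniform = {"Alice": 1 / 3, "Bob": 1 / 3, "Carol": 1 / 3}
skewed = {"Alice": 0.98, "Bob": 0.01, "Carol": 0.01}

print(effective_anonymity_set(uniform))  # ~3.0: everyone is equally plausible
print(effective_anonymity_set(skewed))   # ~1.1: Alice is barely hidden at all
```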
In a data confidentiality system like PGP, Alice and Bob can decide by
themselves that they want to get security. As long as they both use the software
properly, no third party can intercept the traffic and break their encryption.
However, Alice and Bob can’t get anonymity by themselves: they need to par-
ticipate in an infrastructure that coordinates users to provide cover for each
other.
No organization can build this infrastructure for its own sole use. If a single
corporation or government agency were to build a private network to protect its
operations, any connections entering or leaving that network would be obviously
linkable to the controlling organization. The members and operations of that
agency would be easier, not harder, to distinguish.
Thus, to provide anonymity to any of its users, the network must accept
traffic from external users, so the various user groups can blend together.
In practice, existing commercial anonymity solutions (like Anonymizer.com)
are based on a set of single-hop proxies. In these systems, each user connects
to a single proxy, which then relays the user’s traffic. Single proxies provide
comparatively weak security, since a compromised proxy can trivially observe
all of its users’ actions, and an eavesdropper only needs to watch a single proxy
to perform timing correlation attacks against all its users’ traffic. Worse, all users
need to trust the proxy company to have good security itself as well as to not
reveal user activities.
The solution is distributed trust: an infrastructure made up of many inde-
pendently controlled proxies that work together to make sure no transaction’s
privacy relies on any single proxy. With distributed-trust anonymity networks,
users build tunnels or circuits through a series of servers. They encrypt their
traffic in multiple layers of encryption, and each server removes a single layer of
encryption. No single server knows the entire path from the user to the user’s
chosen destination. Therefore an attacker can’t break the user’s anonymity by
compromising or eavesdropping on any one server.
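To make the layering concrete, here is a minimal sketch (ours, not the design of any particular network) of a three-hop circuit built with the third-party Python cryptography package; real systems such as Tor negotiate a separate key with each relay as the circuit is built, and here we simply assume the client already shares one with each hop:

```python
# A toy three-hop circuit using layered ("onion") encryption. We assume the
# client already shares a symmetric key with each relay; real networks such
# as Tor establish these keys with a handshake as the circuit is built.
from cryptography.fernet import Fernet  # third-party: pip install cryptography

hop_keys = [Fernet.generate_key() for _ in range(3)]  # entry, middle, exit
hops = [Fernet(key) for key in hop_keys]

# The client wraps the payload once per hop, innermost layer first (exit hop),
# so the outermost layer belongs to the entry relay.
onion = b"GET http://example.com/"
for hop in reversed(hops):
    onion = hop.encrypt(onion)

# Each relay strips exactly one layer as the cell travels along the circuit.
for hop in hops:
    onion = hop.decrypt(onion)
print(onion)  # b'GET http://example.com/'
```

Because each relay can remove only its own layer, the entry relay sees who is talking but not to whom, the exit relay sees the destination but not the sender, and no single relay sees both.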
Despite their increased security, distributed-trust anonymity networks have
their disadvantages. Because traffic needs to be relayed through multiple servers,
performance is often (but not always) worse. Also, the software to implement a
distributed-trust anonymity network is significantly more difficult to design and
implement.
Beyond these issues of the architecture and ownership of the network, how-
ever, there is another catch. For users to keep the same anonymity set, they need
to act like each other. If Alice’s client acts completely unlike Bob’s client, or if
Alice’s messages leave the system acting completely unlike Bob’s, the attacker
can use this information. In the worst case, Alice’s messages stand out entering
and leaving the network, and the attacker can treat Alice and those like her as
if they were on a separate network of their own. But even if Alice’s messages
are only recognizable as they leave the network, an attacker can use this infor-
mation to break exiting messages into “messages from User1,” “messages from
User2,” and so on, and can now get away with linking messages to their senders
as groups, rather than trying to guess from individual messages [6, 11]. Some of
this partitioning is inevitable: if Alice speaks Arabic and Bob speaks Bulgarian,
we can’t force them both to learn English in order to mask each other.
What does this imply for usability? More so than with encryption systems,
users of anonymizing networks may need to choose their systems based on how
usable others will find them, in order to get the protection of a larger anonymity
set.
3 Case study: usability means users, users mean security
We’ll consider an example. Practical anonymizing networks fall into two broad
classes. High-latency networks like Mixminion or Mixmaster can resist strong
attackers who can watch the whole network and control a large part of the
network infrastructure. To prevent this “global attacker” from linking senders to
recipients by correlating when messages enter and leave the system, high-latency
networks introduce large delays into message delivery times, and are thus only
suitable for applications like email and bulk data delivery—most users aren’t
willing to wait half an hour for their web pages to load. Low-latency networks
like Tor, on the other hand, are fast enough for web browsing, secure shell, and
other interactive applications, but have a weaker threat model: an attacker who
watches or controls both ends of a communication can trivially correlate message
timing and link the communicating parties [5, 10].
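The following sketch (ours, with a toy traffic model) shows why this attack is so cheap for the attacker: an observer who can count packets per time window at both ends of a low-latency connection only has to compare the two patterns.

```python
# Toy model: packets per time window seen by an observer at each end of a
# low-latency network. Linking two flows only requires comparing their shapes.
import random

def traffic_pattern(seed, windows=200):
    """Packets sent per time window for one user (toy bursty traffic)."""
    rng = random.Random(seed)
    return [rng.choice([0, 0, 1, 5, 20]) for _ in range(windows)]

def correlation(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

alice_entry = traffic_pattern(seed=1)
bob_entry = traffic_pattern(seed=2)
# What the attacker sees leaving the network toward the destination is
# Alice's pattern with a little jitter, because nothing delayed or padded it.
jitter = random.Random(42)
exit_flow = [count + jitter.choice([0, 1]) for count in alice_entry]

print(correlation(alice_entry, exit_flow))  # close to 1.0: Alice is linked
print(correlation(bob_entry, exit_flow))    # near zero: Bob is ruled out
```

High-latency designs defeat exactly this comparison by delaying and batching messages, which is precisely the usability cost described above.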
Clearly, users who need to resist strong attackers must choose high-latency
networks or nothing at all, and users who need to anonymize interactive appli-
cations must choose low-latency networks or nothing at all. But what should
flexible users choose? Against an unknown threat model, with a non-interactive
application (such as email), is it more secure to choose security or usability?
Security, we might decide. If the attacker turns out to be strong, then we’ll
prefer the high-latency network, and if the attacker is weak, then the extra
protection doesn’t hurt.
But since many users might find the high-latency network inconvenient, sup-
pose that it gets few actual users—so few, in fact, that its maximum anonymity
set is too small for our needs. In this case, we need to pick the low-latency sys-
tem, since the high-latency system, though it always protects us, never protects
us enough; whereas the low-latency system can give us enough protection against
at least some attackers.
This decision is especially messy because even the developers who implement
these anonymizing networks can’t recommend which approach is safer, since they
can’t predict how many users each network will get and they can’t predict the
capabilities of the attackers we might see in the wild. Worse, the anonymity
research field is still young, and doesn’t have many convincing techniques for
measuring and comparing the protection we get from various situations. So even
if the developers or users could somehow divine what level of anonymity they
require and what their expected attacker can do, the researchers still don’t know
what parameter values to recommend.
4 Case study: against options
Too often, designers faced with a security decision bow out, and instead leave
the choice as an option: protocol designers leave implementors to decide, and
implementors leave the choice for their users. This approach can be bad for
security systems, and is nearly always bad for privacy systems.
With security:

- Extra options often delegate security decisions to those least able to
  understand what they imply. If the protocol designer can’t decide whether
  the AES encryption algorithm is better than the Twofish encryption
  algorithm, how is the end user supposed to pick?
- Options make code harder to audit by increasing the volume of code, by
  increasing the number of possible configurations exponentially, and by
  guaranteeing that non-default configurations will receive little testing in
  the field. If AES is always the default, even with several independent
  implementations of your protocol, how long will it take to notice if the
  Twofish implementation is wrong?
Most users stay with default configurations as long as they work, and only
reconfigure their software as necessary to make it usable. For example, suppose
the developers of a web browser can’t decide whether to support a given exten-
sion with unknown security implications, so they leave it as a user-adjustable
option, thinking that users can enable or disable the extension based on their
security needs. In reality, however, if the extension is enabled by default, nearly
all users will leave it on whether it’s secure or not; and if the extension is dis-
abled by default, users will tend to enable it based on their perceived demand
for the extension rather than their security needs. Thus, only the most savvy
and security-conscious users—the ones who know more about web security than
the developers themselves—will actually wind up understanding the security
implications of their decision.
The real issue here is that designers often end up with a situation where they
need to choose between ‘insecure’ and ‘inconvenient’ as the default configuration—
meaning they’ve already made a mistake in designing their application.
Of course, when end users do know more about their individual security
requirements than application designers, then adding options is beneficial, espe-
cially when users describe their own situation (home or enterprise; shared versus
single-user host) rather than trying to specify what the program should do about
their situation.
In privacy applications, superfluous options are even worse. When there are
many different possible configurations, eavesdroppers and insiders can often tell
users apart by which settings they choose. For example, the Type I or “Cypher-
punk” anonymous email network uses the OpenPGP encrypted message format,
which supports many symmetric and asymmetric ciphers. Because different users
prefer different ciphers, and because different versions of encryption programs
implementing OpenPGP (such as PGP and GnuPG) use different cipher suites,
users with uncommon preferences and versions stand out from the rest, and get
little privacy at all. Similarly, Type I allows users to pad their messages to a
fixed size so that an eavesdropper can’t correlate the sizes of messages passing
through the network—but it forces the user to decide what size of padding to
use! Unless a user can guess which padding size will happen to be most popular,
the option provides attackers with another way to tell users apart.
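A small sketch (ours, with invented numbers) of the partitioning effect: an eavesdropper who can see each message’s cipher choice, client software, and padding size only has to group users by those observable settings, and each user’s cover shrinks to their own group.

```python
# Each user's effective cover is only the set of users whose observable
# settings match theirs, not the whole user base.
from collections import Counter

# (cipher preference, client software, chosen padding size), one tuple per user
observed_configs = (
    [("AES", "GnuPG", "28KB")] * 6000 +
    [("AES", "PGP", "28KB")] * 2500 +
    [("3DES", "GnuPG", "28KB")] * 1200 +
    [("Twofish", "GnuPG", "8KB")] * 300
)

total = len(observed_configs)
for config, size in Counter(observed_configs).items():
    print(f"{config}: hides among {size} of {total} users")
# The user with the uncommon cipher and padding size is hiding among 300
# users, not 10,000, even though everyone is "on the same network."
```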
Even when users’ needs genuinely vary, adding options does not necessarily
serve their privacy. In practice, the default option usually prevails for casual
users, and therefore needs to prevail for security-conscious users even when it
would not otherwise be their best choice. For example, when an anonymizing
network allows user-selected message latency (like the Type I network does),
most users tend to use whichever setting is the default, so long as it works.
Of the fraction of users who change the default at all, most will not, in fact,
understand the security implications; and those few who do will need to decide
whether the increased traffic-analysis resistance that comes with more variable
latency is worth the decreased anonymity that comes from splitting away from
the bulk of the user base.
5 Case study: Mixminion and MIME
We’ve argued that providing too many observable options can hurt privacy, but
we’ve also argued that focusing too hard on privacy over usability can hurt
privacy itself. What happens when these principles conflict?
We encountered such a situation when designing how the Mixminion anony-
mous email network [6] should handle MIME-encoded data. MIME (Multipur-
pose Internet Mail Extensions) is the way a mail client tells the receiving mail
client about attachments, which character set was used, and so on. As a stan-
dard, MIME is so permissive and flexible that different email programs are al-
most always distinguishable by which subsets of the format, and which types of
encodings, they choose to generate. Trying to “normalize” MIME by convert-
ing all mail to a standard only works up to a point: it’s trivial to convert all
encodings to quoted-printable, for example, or to impose a standard order for
multipart/alternative parts; but demanding a uniform list of formats for multi-
part/alternative messages, normalizing HTML, stripping identifying information
from Microsoft Office documents, or imposing a single character encoding on each
language would likely be an impossible task.
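As an illustration of the “easy” end of that spectrum, the sketch below (ours, simplified) uses Python’s standard email and quopri modules to re-encode every text part as quoted-printable, so the sender’s original transfer encoding no longer distinguishes their mail client; normalizing HTML or stripping metadata from Office documents has no comparably simple fix.

```python
# Re-encode every text part as quoted-printable so the sender's original
# Content-Transfer-Encoding no longer distinguishes their mail client.
import quopri
from email import message_from_string

def normalize_encodings(raw_message):
    msg = message_from_string(raw_message)
    for part in msg.walk():
        if part.get_content_maintype() == "text":
            body = part.get_payload(decode=True)  # bytes, whatever encoding was used
            del part["Content-Transfer-Encoding"]
            part["Content-Transfer-Encoding"] = "quoted-printable"
            part.set_payload(quopri.encodestring(body).decode("ascii"))
    return msg.as_string()

raw = (
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "aGVsbG8gd29ybGQ=\n"
)
print(normalize_encodings(raw))  # same body, now uniformly quoted-printable
```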
Other possible solutions to this problem could include limiting users to a
single email client, or simply banning email formats other than plain 7-bit ASCII.
But these procrustean approaches would limit usability, and turn users away
from the Mixminion network. Since fewer users mean less anonymity, we must
ask whether users would be better off in a larger network where their messages
are likelier to be distinguishable based on email client, or in a smaller network
where everyone’s email formats look the same.
Some distinguishability is inevitable anyway, since users differ in their inter-
ests, languages, and writing styles: if Alice writes about astronomy in Amharic,
her messages are unlikely to be mistaken for Bob’s, who writes about botany in
Basque. Also, any attempt to restrict formats is likely to backfire. If we limited
Mixminion to 7-bit ASCII, users wouldn’t stop sending each other images, PDF
files, and messages in Chinese: they would instead follow the same evolutionary
path that led to MIME in the first place, and encode their messages in a variety
of distinguishable formats, with each client software implementation having its
own ad hoc favorites. So imposing uniformity here would not only drive away
users, but would probably fail in the long run, and lead to fragmentation at
least as dangerous as the kind we were trying to avoid.
We also had to consider threat models. To take advantage of format dis-
tinguishability, an attacker needs to observe messages leaving the network, and
either exploit prior knowledge of suspected senders (“Alice is the only user who
owns a 1995 copy of Eudora”), or feed message format information into traffic
analysis approaches (“Since half of the messages to Alice are written in English,
I’ll assume they mostly come from different senders than the ones in Amharic.”).
Neither attack is certain or easy for all attackers; even if we can’t defeat them
in the worst possible case (where the attacker knows, for example, that only
one copy of LeetMailPro was ever sold), we can provide vulnerable users with
protection against weaker attackers.
In the end, we compromised: we perform as much normalization as we can,
and warn the user about document types such as MS Word that are likely to re-
veal identifying information, but we do not forbid any particular format or client
software. This way, users are informed about how to blend with the largest pos-
sible anonymity set, but users who prefer to use distinguishable formats rather
than nothing at all still receive and contribute protection against certain attack-
ers.
6 Case study: Tor Installation
Usability and marketing have also proved important in the development of Tor,
a low-latency anonymizing network for TCP traffic. The technical challenges Tor
has solved, and the ones it still needs to address, are described in its design paper
[8], but at this point many of the most crucial challenges are in adoption and
usability.
While Tor was in its earliest stages, its user base was a small number of fairly
sophisticated privacy enthusiasts with experience running Unix services, who
wanted to experiment with the network (or so they say; by design, we don’t track
our users). As the project gained more attention from venues including security
conferences, articles on Slashdot.org and Wired News, and more mainstream
media like the New York Times, Forbes, and the Wall Street Journal, we added
more users with less technical expertise. These users can now provide a broader
base of anonymity for high-needs users, but only when they receive good support
themselves.
For example, it has proven difficult to educate less sophisticated users about
DNS issues. Anonymizing TCP streams (as Tor does) does no good if appli-
cations reveal where they are about to connect by first performing a non-
anonymized hostname lookup. To stay anonymous, users need either to configure
their applications to pass hostnames to Tor directly by using SOCKS4a or the
hostname-based variant of SOCKS5; to manually resolve hostnames with Tor
and pass the resulting IPs to their applications; or to direct their applications to
application-specific proxies which handle each protocol’s needs independently.
None of these is easy for an unsophisticated user, and when they misconfigure
their systems, they not only compromise their own privacy, but also provide no
cover for the users who are configured correctly: if Bob leaks a DNS request
whenever he is about to connect to a website, an observer can tell that anybody
connecting to Alice’s website anonymously must not be Bob. Thus, experienced
users have an interest in making sure inexperienced users can use the system
correctly. Tor being hard to configure is a weakness for everybody.
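For instance, an application written in Python can hand the hostname itself to Tor unresolved; this sketch (ours) uses the third-party PySocks library and assumes a Tor client is listening on its default SOCKS port, 9050:

```python
# Hand the hostname to Tor unresolved (rdns=True), so no local DNS lookup
# reveals the destination. Assumes a Tor client listening on 127.0.0.1:9050
# and the third-party PySocks library (pip install pysocks).
import socks

s = socks.socksocket()
s.set_proxy(socks.SOCKS5, "127.0.0.1", 9050, rdns=True)
s.connect(("example.com", 80))  # the exit relay resolves the name, not this host
s.sendall(b"GET / HTTP/1.0\r\nHost: example.com\r\n\r\n")
print(s.recv(200))
s.close()
```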
We’ve tried a few solutions that didn’t work as well as we hoped. Improving
documentation only helped the users who read it. We changed Tor to warn users
who provided an IP address rather than a hostname, but this warning usually
resulted in several email exchanges to explain DNS to the casual user, who
typically had no idea how to solve his problem.
At the time of this writing, the most important solutions for these users have
been to improve Tor’s documentation for how to configure various applications
to use Tor; to change the warning messages to refer users to a description of the
solution (“You are insecure. See this webpage.”) instead of a description of the
problem (“Your application is sending IPs instead of hostnames, which may leak
information. Consider using SOCKS4a instead.”); and to bundle Tor with the
support tools that it needs, rather than relying on users to find and configure
them on their own.
7 Case study: JAP and its anonym-o-meter
The Java Anon Proxy (JAP) is a low-latency anonymizing network for web
browsing developed and deployed by the Technical University of Dresden in
Germany [3]. Unlike Tor, which uses a free-route topology where each user can
choose where to enter the network and where to exit, JAP has fixed-route cas-
cades that aggregate user traffic into a single entry point and a single exit point.
The JAP client includes a GUI with an ‘anonymity meter’ that gives the user
an impression of the level of protection for his current traffic.
How do we decide the value that the anonym-o-meter should report? In JAP’s
case, it’s based on the number of other users traveling through the cascade at
the same time. But alas, since JAP aims for quick transmission of bytes from
one end of the cascade to the other, it falls prey to the same end-to-end timing
correlation attacks as we described above. That is, an attacker who can watch
both ends of the cascade won’t actually be distracted by the other users [5,
10]. The JAP team has plans to implement full-scale padding from every user
(sending and receiving packets all the time even when they have nothing to
send), but—for usability reasons—they haven’t gone forward with these plans.
As the system is now, anonymity sets don’t provide a real measure of security
for JAP, since any attacker who can watch both ends of the cascade wins, and
the number of users on the network is no real obstacle to this attack. However,
we think the anonym-o-meter is a great way to present security information to
the user, and we hope to see a variant of it deployed one day for a high-latency
system like Mixminion, where the amount of current traffic in the system is more
directly related to the protection it offers.
8 Bootstrapping, confidence, and reputability
Another area where human factors are critical in privacy is in bootstrapping
new systems. Since new systems start out with few users, they initially provide
only small anonymity sets. This starting state creates a dilemma: a new system
with improved privacy properties will only attract users once they believe it is
popular and therefore offers large anonymity sets; but a system cannot be popular
without attracting users. New systems need users for privacy, but need privacy
for users.
Low-needs users can break the deadlock [1]. The earliest stages of an anonymiz-
ing network’s lifetime tend to involve users who need only to resist weak attack-
ers who can’t know which users are using the network and thus can’t learn the
contents of the small anonymity set. This solution reverses the early adopter
trends of many security systems: rather than attracting first the most security-
conscious users, privacy applications must begin by attracting low-needs users
and hobbyists.
But this analysis relies on users’ accurate perceptions of present and future
anonymity set size. As in market economics, expectations themselves can bring
about trends: a privacy system which people believe to be secure and popular
will gain users, thus becoming (all things equal) more secure and popular. Thus,
security depends not only on usability, but also on perceived usability by others,
and hence on the quality of the provider’s marketing and public relations. Per-
versely, over-hyped systems (if they are not too broken) may be a better choice
than modestly promoted ones, if the hype attracts more users.
Yet another factor in the safety of a given network is its reputability: the
perception of its social value based on its current users. If I’m the only user of a
system, it might be socially accepted, but I’m not getting any anonymity. Add a
thousand Communists, and I’m anonymous, but everyone thinks I’m a Commie.
Add a thousand random citizens (cancer survivors, privacy enthusiasts, and so
on) and now I’m hard to profile.
The more cancer survivors on Tor, the better for the human rights activists.
The more script kiddies, the worse for the normal users. Thus, reputability is
an anonymity issue for two reasons. First, it impacts the sustainability of the
network: a network that’s always about to be shut down has difficulty attracting
and keeping users, so its anonymity set suffers. Second, a disreputable network
attracts the attention of powerful attackers who may not mind revealing the
identities of all the users to uncover the few bad ones.
While people therefore have an incentive for the network to be used for “more
reputable” activities than their own, there are still tradeoffs involved when it
comes to anonymity. To follow the above example, a network used entirely by
cancer survivors might welcome some Communists onto the network, though of
course they’d prefer a wider variety of users.
The impact of public perception on security is especially important during
the bootstrapping phase of the network, where the first few widely publicized
uses of the network can dictate the types of users it attracts next.
9 Technical challenges to guessing the number of users in
a network
In addition to the social problems we describe above that make it difficult for a
typical user to guess which anonymizing network will be most popular, there are
some technical challenges as well. These stem from the fact that anonymizing
networks are good at hiding what’s going on—even from their users. For example,
one of the toughest attacks to solve is that an attacker might sign up many users
to artificially inflate the apparent size of the network. Not only does this Sybil
attack increase the odds that the attacker will be able to successfully compromise
a given user transaction [9], but it might also trick users into thinking a given
network is safer than it actually is.
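A back-of-the-envelope sketch (ours, assuming relays are picked uniformly at random, which real path-selection algorithms do not quite do) of why an inflated network is doubly misleading: the fake relays make the network look larger while raising the chance that both the entry and the exit of a circuit, which is all an end-to-end attacker needs, belong to the adversary.

```python
# Toy model: an end-to-end attacker wins when it controls both the entry and
# the exit of a circuit. Relay selection is assumed uniform for simplicity.
def end_to_end_compromise_probability(honest_relays, sybil_relays):
    attacker_share = sybil_relays / (honest_relays + sybil_relays)
    return attacker_share ** 2

for sybils in (0, 50, 200, 800):
    p = end_to_end_compromise_probability(honest_relays=200, sybil_relays=sybils)
    print(f"{200 + sybils} apparent relays ({sybils} Sybil): "
          f"chance both ends are the attacker's = {p:.2f}")
```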
And finally, as we saw when discussing JAP above, the feasibility of end-to-
end attacks makes it hard to guess how much a given other user is contributing
to your anonymity. Even if he’s not actively trying to trick you, he can still
fail to provide cover for you, either because his behavior is sufficiently different
from yours (he’s active during the day, and you’re active at night), because
his transactions are different (he talks about physics, you talk about AIDS), or
because network design parameters (such as low delay for messages) mean the
attacker is able to track transactions more easily.
10 Bringing it all together
Users’ safety relies on them behaving like other users. But how can they predict
other users’ behavior? If they need to behave in a way that’s different from the
rest of the users, how do they compute the tradeoff and risks?
There are several lessons we might take away from researching anonymity
and usability. On the one hand, we might remark that anonymity is already
tricky from a technical standpoint, and if we’re required to get usability right as
well before anybody can be safe, it will be hard indeed to come up with a good
design: if lack of anonymity means lack of users, then we’re stuck in a depressing
loop. On the other hand, the loop has an optimistic side too. Good anonymity
can mean more users: if we can make good headway on usability, then as long
as the technical designs are adequate, we’ll end up with enough users to make
everything work out.
In any case, declining to design a good solution means leaving most users to
a less secure network or no anonymizing network at all. Cancer survivors and
abuse victims would continue communications and research over the Internet,
risking social or employment problems; and human rights workers in oppressive
countries would continue publishing their stories.
The temptation to focus on designing a perfectly usable system before build-
ing it can be self-defeating, since obstacles to usability are often unforeseen. We
believe that the security community needs to focus on continuing experimental
deployment.
References
1. Alessandro Acquisti, Roger Dingledine, and Paul Syverson. On the Economics
of Anonymity. In Rebecca N. Wright, editor, Financial Cryptography. Springer-
Verlag, LNCS 2742, January 2003.
2. Adam Back, Ulf Möller, and Anton Stiglic. Traffic Analysis Attacks and Trade-Offs
in Anonymity Providing Systems. In Ira S. Moskowitz, editor, Information Hiding
(IH 2001), pages 245–257. Springer-Verlag, LNCS 2137, 2001.
3. Oliver Berthold, Hannes Federrath, and Stefan Köpsell. Web MIXes: A system for
anonymous and unobservable Internet access. In H. Federrath, editor, Designing
Privacy Enhancing Technologies: Workshop on Design Issues in Anonymity and
Unobservability. Springer-Verlag, LNCS 2009, July 2000.
4. Lorrie Cranor and Mary Ellen Zurko, editors. Proceedings of the Symposium On
Usable Privacy and Security (SOUPS 2005), Pittsburgh, PA, July 2005.
5. George Danezis. The traffic analysis of continuous-time mixes. In David Martin and
Andrei Serjantov, editors, Privacy Enhancing Technologies (PET 2004), LNCS,
May 2004. http://www.cl.cam.ac.uk/users/gd216/cmm2.pdf.
6. George Danezis, Roger Dingledine, and Nick Mathewson. Mixminion: Design of a
type III anonymous remailer protocol. In 2003 IEEE Symposium on Security and
Privacy, pages 2–15. IEEE CS, May 2003.
7. Claudia Diaz, Stefaan Seys, Joris Claessens, and Bart Preneel. Towards measuring
anonymity. In Paul Syverson and Roger Dingledine, editors, Privacy Enhancing
Technologies, LNCS, April 2002.
8. Roger Dingledine, Nick Mathewson, and Paul Syverson. Tor: The Second-
Generation Onion Router. In Proceedings of the 13th USENIX Security Symposium,
August 2004.
9. John Douceur. The Sybil Attack. In Proceedings of the 1st International Peer To
Peer Systems Workshop (IPTPS), March 2002.
10. Brian N. Levine, Michael K. Reiter, Chenxi Wang, and Matthew K. Wright. Timing
attacks in low-latency mix-based systems. In Ari Juels, editor, Proceedings of
Financial Cryptography (FC ’04). Springer-Verlag, LNCS 3110, February 2004.
11. Nick Mathewson and Roger Dingledine. Practical Traffic Analysis: Extending and
Resisting Statistical Disclosure. In Proceedings of Privacy Enhancing Technologies
workshop (PET 2004), volume 3424 of LNCS, May 2004.
12. Ulf Möller, Lance Cottrell, Peter Palfrader, and Len Sassaman. Mixmaster Protocol
Version 2. Draft, July 2003. http://www.abditum.com/mixmaster-spec.txt.
13. Michael Reiter and Aviel Rubin. Crowds: Anonymity for web transactions. ACM
Transactions on Information and System Security, 1(1), June 1998.
14. Andrei Serjantov and George Danezis. Towards an information theoretic metric for
anonymity. In Paul Syverson and Roger Dingledine, editors, Privacy Enhancing
Technologies, LNCS, San Francisco, CA, April 2002.
We present the architecture, design issues and functions of a MIX-based system for anonymous and unobservable real-time Internet access. This system prevents traffic analysis as well as flooding attacks. The core technologies include an adaptive, anonymous, time/volumesliced channel mechanism and a ticket-based authentication mechanism. The system also provides an interface to inform anonymous users about their level of anonymity and unobservability.