Asirra: A CAPTCHA that Exploits
Interest-Aligned Manual Image Categorization
Jeremy Elson, John R. Douceur, Jon Howell
We present Asirra (Figure 1), a CAPTCHA that asks users to iden-
tify cats out of a set of 12 photographs of both cats and dogs. Asirra
is easy for users; user studies indicate it can be solved by humans
99.6% of the time in under 30 seconds. Barring a major advance
in machine vision, we expect computers will have no better than a
1/54,000 chance of solving it. Asirra’s image database is provided
by a novel, mutually beneficial partnership with Petfinder.com. In
exchange for the use of their three million images, we display an
“adopt me” link beneath each one, promoting Petfinder’s primary
mission of finding homes for homeless animals. We describe the
design of Asirra, discuss threats to its security, and report early de-
ployment experiences. We also describe two novel algorithms for
amplifying the skill gap between humans and computers that can
be used on many existing CAPTCHAs.
Over the past few years, an increasing number of public web
services have attempted to prevent exploitation by bots and auto-
mated scripts, by requiring a user to solve a Turing-test challenge
(commonly known as a CAPTCHA¹ or HIP²) before using the service.
Because the challenges must be easy to generate but difficult
(for non-humans) to solve, all CAPTCHAs rely on some secret
information that is known to the challenger but not to the agent
being challenged. For our purposes, we can divide CAPTCHAs
into two classes depending on the scope of this secret. In Class I
CAPTCHAs, the secret is merely a random number, which is fed
into a publicly known algorithm to yield a challenge, somewhat
analogous to a public-key cryptosystem. Class II CAPTCHAs em-
ploy both a secret random input and a secret high-entropy database,
somewhat analogous to a one-time-pad cryptosystem. A critical
problem in building a Class II CAPTCHA is populating the database
with a sufficiently large set of classified, high-entropy entries.
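The distinction between the two classes can be illustrated with a minimal Python sketch. It is not Asirra's implementation: the function names `class1_challenge` and `class2_challenge`, the toy in-memory database `TOY_DB`, and its image identifiers are hypothetical, and the rendering and distortion step of a text challenge is omitted.

```python
import random
import string

ALPHABET = string.ascii_uppercase + string.digits  # 36 case-insensitive symbols


def class1_challenge(rng, length=4):
    """Class I: the only secret is the random draw; the algorithm is public."""
    # A real system would render and distort this text into an image.
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def class2_challenge(rng, database, size=12):
    """Class II: a random draw combined with a secret, labeled database."""
    picks = rng.sample(sorted(database), size)  # distinct image identifiers
    answer = {img for img in picks if database[img] == "cat"}
    return picks, answer


# Hypothetical stand-in for a secret database of labeled images.
TOY_DB = {f"img{i:03d}": ("cat" if i % 2 else "dog") for i in range(100)}

rng = random.SystemRandom()  # OS-provided randomness for the secret draw
text_answer = class1_challenge(rng)
images, cats = class2_challenge(rng, TOY_DB)
```

In the Class I case an attacker who knows the algorithm needs only to invert it on the rendered output; in the Class II case the attacker must also recover the labels held in the database.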
¹ “Completely Automated Public Turing test to tell Computers and Humans
Apart.” CAPTCHA is a trademark of Carnegie Mellon University.
² “Human Interaction Proof”
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, to
republish, to post on servers or to redistribute to lists, requires prior specific
permission and/or a fee.
CCS’07, October 29–November 2, 2007, Alexandria, Virginia, USA.
Copyright 2007 ACM 978-1-59593-703-2/07/0011 ...$5.00.
Figure 1: An Asirra challenge. The user selects each of the 12 images
that depict cats. As the mouse is hovered over each thumbnail, a larger
image and “Adopt me” link appear. “Adopt me” first invalidates the
challenge, then takes the user to that animal’s page on Petfinder.com.
Class I CAPTCHAs have many virtues. They can be concisely
described in a small amount of software code; they have no long-term
secret that requires guarding; and they can generate a practically
unbounded set of unique challenges. On the other hand,
their most common realization, a challenge to recognize distorted
text, evinces a disturbingly narrow gap between human and non-human
success rates. Optical character recognition algorithms are
competitive with humans in recognizing distinct characters, which
has led researchers toward increasing the difficulty of segmenting
an image into distinct character regions. However, this increase
in difficulty affects humans as well. Although laboratory
experiments suggest that humans can segment text characters
accurately, CAPTCHAs deployed on commercial public web sites
continue to use cleanly segmented challenges (e.g., Fig. 2a), some
with difficult segmentation challenges (Fig. 2c). The owners of
commercial web sites have apparently decided that a user’s success
at navigating a CAPTCHA depends not only on whether they are
able to solve the challenge, but also on whether they are willing
to put forth the effort. Informal discussions with MSN and other
web site owners suggest even relatively simple challenges can drive
away a substantial number of potential customers.
Figure 2: A gallery of text CAPTCHAs. Simple text challenges, such
as a (register.com), are still common despite recent defeat by optical
character recognition. Researchers have begun to focus on schemes
that make letter segmentation difficult, as seen in b (Carnegie Mellon)
and c (Microsoft Research). Webmasters, wary of what users will
tolerate, dial back researchers’ noise parameters, as seen in d
(Microsoft Hotmail) and e (Yahoo! Mail).
Class II CAPTCHAs have the potential to overcome the main
weaknesses described above. Because they are not restricted to
challenges that can be generated by a low-entropy algorithm, they
can exercise a much broader range of human ability, such as
recognizing features of photographic images captured from the physical
world. Such challenges evince a broad gulf between human and
non-human success rates, not only because general machine vision
is a much harder problem than text recognition, but also because
image-based challenges can be made less bothersome to humans
without drastically degrading their efficacy at blocking automatons.
A significant issue in building a Class II CAPTCHA is popu-
lating the secret database. Existing approaches take one of two
directions: (a) mining a public database or (b) providing entertain-
ment as an incentive for manual image categorization. Examples
of the first group include the seminal work by Chew and Tygar,
which used Google Image Search; hotcaptcha, which references
the HotOrNot database; and KittenAuth, which
draws images from Wikimedia Commons. A problem with
these approaches is that the public source of categorized images is
small or available to attackers, so a small, fixed amount of effort
spent reconstructing the private database can return the ability to
solve an unbounded number of challenges. The second direction
was pioneered by the ESP-PIX CAPTCHA, whose database is
populated as a deliberate side effect of playing the ESP Game,
a very clever mechanism for enticing people to label images accu-
rately. Although potentially powerful, it is not yet clear whether
this approach will yield a sufficiently large set of categorized im-
ages. Furthermore, many of the images in the current implemen-
tation are rather abstract, which may make the challenge difficult
enough to drive away users.
In this paper, we present a new direction for populating image
databases for Class II CAPTCHAs, namely re-purposing a large,
continually evolving, private database of images that are manually
categorized. Although this may seem trivial, it is not a priori clear
why the owner of such a database would be willing to release the
images for use in Turing-test challenges. The answer is that there
can exist (and, in at least one instance, does exist) an alignment of
interests between a database owner and web-service owners wish-
ing to secure their sites. Both parties can benefit from selective,
wide-scale display of categorized images: the latter for security
and the former for advertising.
We present Asirra³, a CAPTCHA that asks users to categorize
photographs depicting either cats or dogs. An example is shown in
Figure 1. Asirra’s strength comes from an innovative partnership
with Petfinder.com, the world’s largest web site devoted to finding
homes for homeless animals. Petfinder has a database of over
three million cat and dog images, each of which is categorized with
very high accuracy by human volunteers working in thousands of
animal shelters throughout the United States and Canada. Petfinder
has granted ongoing access to its database, which grows by nearly
10,000 images daily, to the Asirra project. In exchange, Asirra
provides a small “Adopt me” link beneath each photo, promoting
Petfinder’s primary mission of exposing adoptable pets to potential
new owners. This partnership is mutually beneficial, and also produces
the dual social benefits of improving computer security and
animal welfare.
This paper describes Asirra and analyzes its strengths and
weaknesses. We also report our deployment experience, and the
results of two user studies involving 332 test subjects.
Asirra is easy for users; it can be solved by humans 99.6% of the
time in under 30 seconds (Section 6, Table 1). Barring a major ad-
vance in machine vision or compromise of our database, we expect
computers will have no better than a 1/54,000 chance of solving it
(Section 6, Table 2). Anecdotally, users seem to find the experience
of using Asirra much more enjoyable than a text-based CAPTCHA
that provides equal security.
The organization of this paper is as follows. In Section 2, we
review related work in more detail. In Section 3, we describe the
design of Asirra. §3.1 describes user experiments we performed
to quantify humans’ performance. §3.2 explores the other side of
the equation—potential attacks on Asirra, and how they can be re-
sisted. We developed two algorithms that can be used to improve
virtually all CAPTCHAs, including those that are text-based; these
improvements are described in Section 4. In Section 5 we describe
our scalable Asirra implementation. Finally, in Section 6, we sum-
marize our contributions and offer conclusions.
Asirra is available free at www.asirra.com.
2. RELATED WORK
Since the concept of a CAPTCHA was widely introduced by von
Ahn in 2000, hundreds of design variations have appeared. By
far, most are text-based: The computer generates a challenge by se-
lecting a sequence of letters, rendering them, distorting the image,
and adding noise. Text CAPTCHAs are popular because they are
simple, small, and easy to design and implement. Challenges as
short as four characters are robust against random guessing; there
are 36^4 ≈ 1.7 million possible four-character challenges consisting
of case-insensitive letters and digits.
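The size of this challenge space, and the corresponding odds of a blind guess, can be checked with a short calculation. This is a worked example only; the helper name `challenge_space` is hypothetical.

```python
import string

# Case-insensitive letters plus digits: 26 + 10 = 36 symbols.
ALPHABET = string.ascii_lowercase + string.digits


def challenge_space(length, alphabet_size=len(ALPHABET)):
    """Number of distinct challenges of the given length."""
    return alphabet_size ** length


space = challenge_space(4)
print(f"{space:,} possible challenges")          # 1,679,616 possible challenges
print(f"blind-guess success rate: {1 / space:.2e}")
```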
Unfortunately, computers can do far better than guess randomly.
Simard et al. showed that Optical Character Recognition (OCR)
can achieve human-like accuracy, even when letters are distorted,
as long as the image can be reliably segmented into its constituent
letters. Mori and Malik demonstrated that von Ahn’s original
GIMPY CAPTCHA can be solved automatically 92% of the time.
Consequently, recent text-based CAPTCHAs have focused on
making image segmentation difficult. Figure 2c shows a challenge
designed by Chellapilla et al., who claim it is hard for OCR because
the noise confounds known segmentation techniques. Microsoft’s
Hotmail (free email service) deployed it; however, due
³ “Animal Species Image Recognition for Restricting Access”
Rows (challenges attempted): 1 (≈15 sec); 2 (≈30 sec); 3 (≈45 sec)
Table 1: Expected user population that will pass Asirra after 1, 2, and
3 challenges, with and without PCA (§4.1). Assumes a 12-image chal-
lenge, 98.5% classification accuracy for an individual image (§3.1.3),
and 15 seconds required per challenge (§3.1.2).
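The stated assumptions can be turned into a rough estimate for the case without PCA. The sketch below additionally assumes that errors on individual images, and outcomes of successive challenges, are independent; the effect of PCA (§4.1) is not modeled, and the function names are hypothetical.

```python
def pass_probability(per_image_accuracy, images=12):
    """Chance of classifying every image in one challenge correctly."""
    return per_image_accuracy ** images


def cumulative_pass(per_challenge, attempts):
    """Chance of passing at least once within the given number of attempts."""
    return 1 - (1 - per_challenge) ** attempts


p = pass_probability(0.985)  # 12 images at 98.5% accuracy each
for attempts in (1, 2, 3):
    seconds = 15 * attempts  # 15 seconds per challenge
    print(f"{attempts} challenge(s), ~{seconds} sec: "
          f"{cumulative_pass(p, attempts):.1%}")
# 1 challenge(s), ~15 sec: 83.4%
# 2 challenge(s), ~30 sec: 97.2%
# 3 challenge(s), ~45 sec: 99.5%
```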
Rows (per-image classifier accuracy): 50% (random guessing); 60% (known techniques); 70% (major AI advance)
Column headings: Bot Success Rate; PCA + Tokens
Table 2: Expected success rates of bots with three hypothetical image
classifiers. Assumes a 12-image challenge. The effects of PCA (§4.1)
and Token Buckets (§4.2) with TB-Refill of 3 are also considered. A
detailed threat analysis can be found in §3.2.
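For the baseline case with no enhancements, a bot must classify all 12 images correctly, so its success rate is the per-image accuracy raised to the 12th power. The sketch below works this out with exact fractions, assuming independent per-image errors. It does not model PCA or Token Buckets, and the function name `bot_success` is hypothetical.

```python
from fractions import Fraction


def bot_success(per_image_accuracy, images=12):
    """Chance that a bot classifies every image in a challenge correctly."""
    return per_image_accuracy ** images


classifiers = (
    ("random guessing", Fraction(1, 2)),
    ("known techniques", Fraction(3, 5)),
    ("major AI advance", Fraction(7, 10)),
)
for label, accuracy in classifiers:
    odds = round(1 / bot_success(accuracy))
    print(f"{float(accuracy):.0%} ({label}): about 1 in {odds:,}")
# 50% (random guessing): about 1 in 4,096
# 60% (known techniques): about 1 in 459
# 70% (major AI advance): about 1 in 72
```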
are ever involved in servicing a single request: the machine which
receives the request from the client, and the machine that owns the
session state and receives the sub-request. The Asirra service is
therefore readily scalable; the overhead of parallelization will never
be more than 2x regardless of the total size of the farm.
In practice, we have observed lower overhead; request forward-
ing is not the common case. Many combinations of browser and
operating system continue to use the same IP address for a given
domain name for the duration of a browser session. This may be an
aspect of many browsers’ “DNS Pinning” policy meant to reduce
cross-site scripting attacks.
The web service is about 700 lines of Python code. It is currently
deployed on Amazon’s EC2 compute cloud platform, making
it easy to add resources as the service’s popularity increases.
6. SUMMARY AND CONCLUSIONS
In this paper, we presented Asirra, a CAPTCHA that asks users
to identify cats out of a set of 12 photographs of both cats and
dogs. Our image database is provided by a novel, mutually bene-
ficial partnership with Petfinder.com, which has a database of over
three million images of pets looking for new homes. In exchange
for the use of these images, Asirra displays an “adopt me” link be-
neath each one, promoting Petfinder’s primary mission of finding
homes for homeless animals. (Security implications of Adopt-Me
are discussed in Section 3.2.1.)
Asirra is attractive because humans can solve it quickly and ac-
curately (Table 1), but it provides significant protection from bots
(Table 2). Asirra provides a social benefit: improving animal wel-
fare. And, anecdotally, users report that Asirra is enjoyable: Spend-
ing a few moments looking at cute kittens is generally preferred to
squinting at deformed letters.
Asirra is a free web service, available at www.asirra.com.
Our contributions in this paper also include two techniques that
we use in Asirra, but can be added to virtually any CAPTCHA. The
Partial Credit Algorithm (§4.1) significantly improves the yield
of humans while only marginally improving the yield of bots.
Conversely, our use of CAPTCHA Token Buckets (§4.2) significantly
decreases the yield of brute-force bots while having a minimal ef-
fect on humans.
CAPTCHA design seems to be both an art and a science. CAPTCHAs
will never be strong in the sense of a cryptosystem or a proof; by
definition, they can be broken with nothing more than a few mo-
ments of casual human effort. In what may be an endless arms
race, perhaps the best we can do is ensure the latest weapon du jour
is fun for users. In this goal, at least, we hope Asirra succeeds.
 Frozen Bear. hotcaptcha. http://www.hotcaptcha.com.
 Kumar Chellapilla, Kevin Larson, Patrice Simard, and Mary
Czerwinski. Designing human friendly human interaction
proofs (HIPs). In Proceedings of ACM CHI 2005 Conference
on Human Factors in Computing Systems, volume 1 of Email
and security, pages 711–720, 2005.
 Monica Chew and J.D. Tygar. Image recognition
CAPTCHAs. In Proceedings of the 7th International
Information Security Conference (ISC 2004), pages
268–279. Springer, September 2004.
 Mark Everingham, Andrew Zisserman, Chris Williams, and
Luc Van Gool. The PASCAL visual object classes challenge
2006 (VOC2006) results. Technical report, University of
 Ralph Gross, Jianbo Shi, and Jeff Cohn. Quo vadis Face
Recognition? Technical Report CMU-RI-TR-01-17,
Carnegie Mellon University Robotics Institute, June 2001.
 Google Images. http://images.google.com.
 Eight Days Inc. Hot or not. http://www.hotornot.com.
 Greg Mori and Jitendra Malik. Recognizing objects in
adversarial clutter: Breaking a visual CAPTCHA. In
Conference on Computer Vision and Pattern Recognition
(CVPR ’03), pages 134–144. IEEE Computer Society, 2003.
 Jared Saul. Petfinder. http://www.petfinder.com.
 Amazon Web Services. Ec2 scalable compute cloud.
 Patrice Simard, David Steinkraus, and John C. Platt. Best
practices for convolutional neural networks applied to visual
document analysis. In International Conference on
Document Analysis and Recognition, pages 958–962. IEEE
Computer Society, 2003.
 Digg.com user DoubtfulSalmon. http://tinyurl.com/2stwu3,
 L. von Ahn, M. Blum, N.J. Hopper, and J. Langford. The
CAPTCHA web page. http://www.captcha.net.
 Luis von Ahn, Manuel Blum, Nicholas J. Hopper, and John
Langford. CAPTCHA: Using hard AI problems for security.
In Eli Biham, editor, Advances in Cryptology - EUROCRYPT
2003, International Conference on the Theory and
Applications of Cryptographic Techniques, Warsaw, Poland,
May 4-8, 2003, Proceedings, volume 2656 of Lecture Notes
in Computer Science, pages 294–311. Springer, 2003.
 Luis von Ahn and Laura Dabbish. Labeling images with a
computer game. In Elizabeth Dykstra-Erickson and Manfred
Tscheligi, editors, Proceedings of the 2004 Conference on
Human Factors in Computing Systems, CHI 2004, Vienna,
Austria, April 24 - 29, 2004, pages 319–326. ACM, 2004.
 Oli Warner. Kittenauth. http://www.thepcspy.com/kittenauth.
 WikiMedia Foundation. http://commons.wikimedia.org.
 Wen-Yi Zhao, Rama Chellappa, P. J. Phillips, and Azriel
Rosenfeld. Face recognition: A literature survey. ACM
Comput. Surv, 35(4):399–458, 2003.