Guest Editors’ Introduction

TDD: The Art of Fearless Programming

Ron Jeffries, independent consultant
Grigori Melnik, University of Calgary

Programmers! Cast out your guilt! Spend half of your time in joyous testing and debugging! Thrill to the excitement of the chase! Stalk bugs with care, and with method, and with reason. Build traps for them. Be more artful than those devious bugs and taste the joys of guiltless programming! —Boris Beizer, 1983

Published by the IEEE Computer Society, 0740-7459/07/$25.00 © 2007 IEEE

Test-driven development is a discipline of design and programming where every line of new code is written in response to a test the programmer writes just before coding. As TDD practitioners, we think of what small step in capability would be a good next addition to the program. We then write a test specifying just how the program should invoke that capability and what its result should be. The test fails, showing that the capability isn’t already present. We implement the code that makes the test pass and then verify that all prior tests are still passing. Finally, we review the code as it now stands, improving the design as we go in an activity called refactoring. Then we repeat the process, devising another test for another small addition to the program.
As we follow this simple cycle, shown in figure 1, the program grows into being and the design evolves with it. At the beginning of every cycle, the intention is for all tests to pass except the new one, which is “driving” the new code development. At the end of the cycle, the programmer runs all the tests, ensuring that each one passes and hence that every planned feature of the code still works.
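One turn of this cycle can be sketched in code. The following is a minimal illustration, not from the article; the `Cart` capability and all names in it are hypothetical:

```python
import unittest

# Step 1 (red): write the test first. It specifies how the program
# should invoke the new capability and what its result should be.
# `Cart` and its methods are hypothetical names for illustration.
class CartTest(unittest.TestCase):
    def test_total_of_added_items(self):
        cart = Cart()
        cart.add(3)
        cart.add(4)
        self.assertEqual(cart.total(), 7)

# Step 2 (green): implement just enough code to make the test pass.
class Cart:
    def __init__(self):
        self._prices = []

    def add(self, price):
        self._prices.append(price)

    def total(self):
        return sum(self._prices)

# Step 3: run all the tests; with them green, refactor and repeat.
result = unittest.TextTestRunner(verbosity=0).run(
    unittest.defaultTestLoader.loadTestsFromTestCase(CartTest)
)
```

Running the whole suite after every small change is what makes the later refactoring step safe: if a cleanup breaks `total`, the existing test fails immediately.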
TDD is a design and programming activity, not a testing activity per se. Its testing aspect is largely confirmatory, through the regression suite it produces. Professional testers must still perform investigative testing. (The potential for confusion is spawning new terms for the discipline, such as behavior-driven development1 and example-driven development.2)
This special issue of IEEE Software includes seven feature articles on various aspects of TDD and a Point/Counterpoint debate on the use of mock objects in applying it. Notably, these articles demonstrate the ways TDD is being used in nontrivial situations (database development, embedded software development, GUI development, performance tuning). This signifies an adoption level for the practice beyond the visionary phase and into the early mainstream.
In practice, of course, even the best programmers make mistakes. TDD’s growing collection of comprehensive tests (the regression suite) tends to detect these problems. No scheme is perfect, but TDD practitioners seem to experience a reduction in defects shipped, plus much faster problem detection.
Anyone who’s worked on legacy software recognizes the situation where a system continues to function but becomes more and more outdated until, at some point, it turns into a house of cards. No one wants to touch it because even a minor code change will likely lead to an undesired side effect. With TDD, developers organically develop a test suite while building their applications. This provides a safety net for the whole system, offering reasonable confidence that no part of the code is broken. As a result, TDD helps alleviate the fear of changing the code.
In the past, most developers programmed by first writing code and then testing it. We often performed the tests manually and often gave only a cursory look at whether we had broken any past tests. In spite of what now seems like careless work, we were always surprised when someone found a bug in our code. We might have chalked it up to “just a mistake” or vowed to try harder and be more careful next time. Those tricks rarely worked.
With TDD, things are different. Automated tests specify and constrain each functional bit of the program. While these tests tend to prevent errors and detect them when they do occur, when an error does come up, our best response is to write the test that was missing—the test that would have prevented the defect.
Programmers using TDD become justly confident in the code. As we become more confident, we can relax more as we work. Less stressed, we can focus more on quality because we’re keeping fewer balls in the air. We become practiced in thinking about what might not work, at testing whether it does, and at making it work.
The TDD approach extends the assertion Boris Beizer made in 1983: “The act of designing tests is one of the most effective bug preventers known.”3 As a practice, TDD first appeared as part of the Extreme Programming discipline, described in Kent Beck’s Extreme Programming Explained, which came out in 1999. In 2002, Beck released Test-Driven Development: By Example, and Dave Astels followed soon after with Test-Driven Development: A Practical Guide. More books appeared, covering various aspects of the technique, specific tools, and project experiences (see the “Recommended Books” sidebar).

Figure 1. The test-step cycle: design a failing test, implement code to pass the test, and improve the design.

TDD tools now exist for almost every computer language you can imagine, from C++ to Visual Basic, all the major scripting languages, and even some of the more exotic languages—current and past.
TDD has caught the attention of a large software development community that finds it to be a good, fast way to develop reliable code—and many of us find it to be an enjoyable way to work. It embodies elements of design, testing, and coding in a cyclical style based on one fundamental rule: never write a line of code except what’s necessary to make the current test pass. The process might sound tedious in the telling, but the practice is rhythmic, quite pleasant, and productive. The swing from test to code to test occurs as frequently as every five or 10 minutes. It’s been compared to a waltz, to the smooth grace of skating, and to the seemingly effortless movements of a yin-style martial art. As in all these analogous situations, the practitioner is fully engaged and concentrating, while the work just seems to flow. And like these other arts, the only way to really understand TDD is to practice it.
Applied at a higher level, TDD is known as executable acceptance TDD4 or storytest-driven development,5 and it helps with requirements discovery, clarification, and communication. Customers, domain experts, or analysts specify tests before the features are implemented. Once the code is written, the tests serve as executable acceptance criteria. In this issue, Jennitta Andrea makes a case for better acceptance testing tools in her article, “Envisioning the Next Generation of Functional Testing Tools” (pp. 58–66).
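To make the idea concrete, here is a minimal sketch, not from the article and far simpler than real Fit-style tools, of customer-specified examples serving as executable acceptance criteria. The discount rule and every name below are hypothetical:

```python
# Hypothetical customer rule: orders over 10000 cents ($100) get a
# 10% discount. The customer or analyst writes the example rows
# before the feature is implemented; each row pairs an input with
# the expected result, in the spirit of a Fit decision table.
# Amounts are integer cents to keep the examples free of
# floating-point rounding.
ACCEPTANCE_EXAMPLES = [
    # (order_total_cents, expected_price_cents)
    (5000, 5000),    # small order: no discount
    (10000, 10000),  # boundary: still no discount
    (20000, 18000),  # large order: 10% off
]

def discounted_price(order_total_cents):
    """The implementation under test (hypothetical)."""
    if order_total_cents > 10000:
        return order_total_cents * 90 // 100
    return order_total_cents

def failing_examples():
    """Return rows whose actual result differs from the customer's
    expectation; an empty list means the feature meets its
    acceptance criteria."""
    return [
        (total, expected, discounted_price(total))
        for total, expected in ACCEPTANCE_EXAMPLES
        if discounted_price(total) != expected
    ]
```

Before the feature exists, every row fails; once `failing_examples()` returns an empty list, the same table keeps running as a regression check.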
Recommended Books

Kent Beck, Test-Driven Development: By Example, Addison-Wesley, 2002 (introductory)
■ TDD primer with a basic example, idealized situation, baby-step demonstration. Appropriate for both academia and practitioners new to TDD.

Dave Astels, Test Driven Development: A Practical Guide, Prentice Hall, 2003 (intermediate)
■ Relentlessly practical TDD “how-to” guide with real problems, real solutions, and real code (including building a GUI test-first). Also includes an excellent overview of tools and introduction to mock objects.

James Newkirk and Alexey Vorontzov, Test-Driven Development in Microsoft .NET, Microsoft Press, 2004 (intermediate)
■ Extending TDD to more realistic scenarios (code with Web interfaces, databases, and so on).

Ron Jeffries, Extreme Programming Adventures in C#, Microsoft Press, 2004 (introductory)
■ Trip through a software engineering project with an expert sitting beside you, sharing the triumphs and failures.

Rick Mugridge and Ward Cunningham, Fit for Developing Software: Framework for Integrated Tests, Prentice Hall, 2005
■ Primary resource on customer acceptance testing; half the book written for business experts and the other half for programmers.

Martin Fowler et al., Refactoring: Improving the Design of Existing Code, Addison-Wesley, 1999 (introductory)
■ Fundamental introduction to refactoring.

Joshua Kerievsky, Refactoring to Patterns, Addison-Wesley, 2004
■ Guide to improving the design of existing code with patterns.

Michael Feathers, Working Effectively with Legacy Code, Prentice Hall, 2004 (advanced)
■ Start-to-finish strategies for working effectively with large, untested legacy code bases.

J.B. Rainsberger, JUnit Recipes: Practical Methods for Programmer Testing, Manning Publications, 2004
■ Concrete, practical expert advice for writing good programmer tests.

Gerard Meszaros, XUnit Test Patterns: Refactoring Test Code, Addison-Wesley, 2007
■ Guide on refactoring of both programmer and customer tests; includes a catalog of “smells” with root cause analysis and possible solutions.

Brian Marick, Everyday Scripting with Ruby: For Teams, Testers, and You, Pragmatic Bookshelf, 2007 (introductory)
■ Introduction to test-driving Ruby scripts and beyond.

Each test amounts to the first use of a planned new capability. This helps the developer focus on the code in actual use, not just as implemented. We can often improve a new capability’s design when we have a chance to see what we’re creating from the first user’s viewpoint. Each test makes us think concretely about how the proposed new feature will behave. What are suitable inputs? What behavior will be executed? How will we know what happened? When we turn to writing the code a few minutes later, the concrete example helps us focus on what the code needs to do.
The process is self-correcting. If the tests are
too simple, which is rare, the workflow will
feel choppy and without challenge. This will
encourage the developer to take larger bites.
On the other hand, if the tests are too difficult,
the longer time between successfully passed
tests will alert us that we might be off track.
Once developers gain some skill in TDD, they commonly report less stress during development, better requirements understanding, lower defect insertion rates, less rework, and, as a result, faster production of higher-quality code. Once a developer becomes “test infected,” as TDD aficionados call it, he or she rarely wants to go back to the old ways.
TDD practice has a special value as part of agile methods, which are all characterized by iterative delivery of increasingly capable system versions in short cycles—usually fixed lengths of a couple of weeks to a month. At the beginning, a simple architecture and simple design are sufficient to support the system’s capability. As it grows, however, the architecture and design need continuous improvement. Agile practitioners might or might not have the complete design in mind from the beginning. Either way, to deliver working system versions every few weeks, they must grow the complete design incrementally, not all at once.
Improving the design incrementally is the refactoring step in figure 1. It brings the whole design back into alignment—now just a little bit bigger and better. Changing the design in continual small steps is a good thing, in that we can deliver tangible features as we go along. But frequent design changes also carry the risk that we’ll break something that used to work.
The tests we write with TDD have been built, one by one, to cause some new software property to exist and to show that it works. So, as we refactor, we can run all the tests to verify that everything that should work, in fact, still does work. This makes TDD a powerful asset to incremental software development. It becomes a rule of software development hygiene. Robert C. Martin argues for that in his article, “Professionalism and Test-Driven Development,” pp. 32–36.
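As a small illustration (hypothetical, not from the article), the same recorded checks can pin down behavior while an implementation is swapped out during refactoring:

```python
# The regression cases specify the behavior; both the original and
# the refactored implementation must satisfy every one of them.
# All names here are hypothetical.
CASES = [([3, 1, 2], 2), ([4, 1, 3, 2], 2.5), ([7], 7)]

def median_before(values):
    # Original implementation: copy, sort in place, index manually.
    ordered = list(values)
    ordered.sort()
    n = len(ordered)
    if n % 2 == 1:
        return ordered[n // 2]
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2

def median_after(values):
    # Refactored implementation: same behavior, expressed more
    # directly with sorted() and divmod().
    ordered = sorted(values)
    mid, odd = divmod(len(ordered), 2)
    return ordered[mid] if odd else (ordered[mid - 1] + ordered[mid]) / 2

def passes_regression(fn):
    """True if fn gives the expected answer on every recorded case."""
    return all(fn(values) == expected for values, expected in CASES)
```

Because `passes_regression` holds for both versions, the swap is a pure design improvement: the suite, not the diff, is the evidence that nothing that used to work has broken.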
The state of TDD research
TDD also intrigues the research community, and a growing number of studies have investigated its effects. Tables 1 and 2 reflect the current state of TDD research, summarizing the productivity and quality impacts of industry and academic work, respectively.6–23 The results are sometimes controversial (more so in the academic studies). This is no surprise, given incomparable measurements and the difficulty in isolating TDD’s effects from many other context variables. In addition, many studies don’t have the statistical power to allow for generalizations. So, we advise readers to consider empirical findings within each study’s context and environment. We also invite more researchers to methodically investigate TDD practice and report on its effects, both positive and negative.
All researchers seem to agree that TDD encourages better task focus and test coverage. The mere fact of more tests doesn’t necessarily mean that software quality will be better, but the increased programmer attention to test design is nevertheless encouraging. If we view testing as sampling a very large population of potential behaviors, more tests mean a more thorough sample. To the extent that each test can find an important problem that none of the others can find, the tests are useful, especially if you can run them cheaply.
TDD is also making its way to university and college curricula. The IEEE/ACM 2004 guidelines for software engineering undergraduate programs include test-first as a desirable skill.24 Educators report success stories when using TDD in computer science programming assignments. In this issue, Bas Vodde and Lasse Koskela describe an effective exercise for introducing TDD to novices—practitioners or students—in their article, “Learning Test-Driven Development by Counting Lines” (pp. 74–79).
Other articles in this special issue give you a taste of TDD’s use in diverse and nontrivial contexts: control system design (see Thomas Dohmke and Henrik Gollee, “Test-Driven Development of a PID Controller,” pp. 44–50), GUI development (Alex Ruiz and Yvonne Wang Price, “Test-Driven GUI Development with TestNG and Abbot,” pp. 51–57), and database development (Scott W. Ambler, “Test-Driven Development of Relational Databases,” pp. 37–43). In addition, Michael J. Johnson, Chih-Wei Ho, E. Michael Maximilien, and Laurie Williams examine how to incorporate performance testing in TDD (pp. 67–73). In Point/Counterpoint (pp. 80–83), Steve Freeman and Nat Pryce debate Joshua Kerievsky on the role of mock objects in test-driving code.

Table 1. A summary of selected empirical studies of test-driven development: industry participants*

| Family of studies | Type | Development time analyzed | Legacy project? | Organization studied | Software built | Software size | No. of participants | Language | Productivity effect | Quality effect |
| Sanchez et al.6 | Case study | 5 years | Yes | IBM | Point-of-sale device | Medium | 9–17 | Java | Increased effort 19% | 40%† |
| Bhat and Nagappan7 | Case study | 4 months | No | Microsoft | Windows networking | Small | 6 | C/C++ | Increased effort 25–35% | 62%† |
| Bhat and Nagappan7 | Case study | ≈7 months | No | Microsoft | MSN Web services | Medium | 5–8 | C++/C# | Increased effort 15% | 76%† |
| Canfora et al.8 | Controlled experiment | 5 hours | No | Soluziona Software Factory | Text analyzer | Very small | 28 | Java | Increased effort by 65% | Inconclusive based on quality of tests |
| Damm and Lundberg9 | Multi-case study | 1–1.5 years | Yes | Ericsson | Components for a mobile network operator application | Medium | 100 | C++/Java | Total project cost increased by 5–6% | 5–30% decrease in fault-slip-through rate; 55% decrease§ |
| Melis et al.10 | Simulation | 49 days (simulated) | No | Calibrated using Klondike-Team project | Market information | Medium | 4‡ | Smalltalk | Increased effort 17% | 36% reduction in residual defect density |
| Mann11 | Case study | 8 months | Yes | PetroSleuth | Windows-based oil and gas project management with statistical analysis | Medium | 4–7 | C# | n/a | 81% customer and developers’ perception of improved quality |
| Geras et al.12 | Quasi-controlled experiment | 3 hours | No | Various companies | Simple database-backed information system | Small | 14 | Java | No effect | Inconclusive based on failure rates and on no. of tests |
| George and Williams13 | Quasi-controlled experiment | 4.75 hours | No | John Deere, Role Model Software | Bowling game | Very small | 24 | Java | Increased effort 16% | 18%# |
| Ynchausti14 | Case study | 8.5 hours | No | Monster Consulting | Coding exercises | Small | 5 | n/a | Increased effort | 38–267%† |

*Green box = improvement; orange box = deterioration
†Reduction in the internal defect density
‡Simulated in 200 runs
§Reduction in external defect ratio (can’t be solely attributed to TDD, but to a set of practices)
#Increase in percentage of functional black-box tests passed (external quality)

Table 2. A summary of selected empirical studies of TDD: academic participants*

| Family of studies | Type | Development time analyzed | Legacy project? | Organization studied | Software built | Software size | No. of participants | Language | Productivity effect | Quality effect |
| Flohr and Schneider15 | Quasi-controlled experiment | 40 hours | Yes | University of Hannover | Graphical workflow library | Small | 18 | Java | Improved productivity by 27% | Inconclusive |
| Abrahamsson et al.16 | Case study | 30 days | No | VTT | Mobile application for global markets | Small | 4 | Java | Increased effort by 0% (iteration 5) to 30% | No value perceived by developers |
| Erdogmus et al.17 | Controlled experiment | 13 hours | No | Politecnico di Torino | Bowling game | Very small | 24 | Java | Improved normalized productivity | No difference |
| Madeyski18 | Quasi-controlled experiment | 12 hours | No | Wroclaw University of Technology | Accounting application | Small | 188 | Java | n/a | –25 to –45%† |
| Melnik and Maurer19 | Multi-case study | 4-month projects over 3 years | No | University of Calgary/SAIT Polytechnic | Various Web-based systems (surveying, event scheduling, price comparison) | Small | 240 | Java | n/a | 73% of respondents perceive TDD improves quality |
| Edwards20 | Artifact analysis | 2–3 weeks | No | Virginia Tech | CS1 programming | Very small | 118 | Java | Increased effort 90% | 45%† |
| Pančur et al.21 | Controlled experiment | 4.5 months | No | University of Ljubljana | 4 programming assignments | Very small | 38 | Java | n/a | No difference |
| George22 | Quasi-controlled experiment | 1.75 hours | No | North Carolina State University | Bowling game | Very small | 138 | Java | Increased effort 16% | 16%† |
| Müller and Hagner23 | Quasi-controlled experiment | 10 hours | No | University of Karlsruhe | Graph library | Very small | 19 | Java | No effect | No effect, but better reuse |

*Green box = improvement; orange box = deterioration
†Increase in percentage of functional black-box tests passed (external quality)
TDD is becoming popular across all sizes and kinds of software development projects. A Cutter Consortium survey of companies about various software process improvement practices identified TDD as the practice with the second-highest impact on project success (after code inspections).25 Of course, like any other programming tool or technique, TDD is no silver bullet. However, it can help you become a more effective and disciplined developer—fearless and joyful, too.
We thank the 32 groups of authors who re-
sponded to our call for papers. Selecting seven articles
from this pool would have been impossible without
reviewers who contributed their expertise and effort.
Our profound gratitude goes to all of them.
References

1. D. Astels, “A New Look at Test-Driven Development.”
2. B. Marick, “Driving Software Projects with Examples,” 3 July 2004, www.exampler.com.
3. B. Beizer, Software Testing Techniques, Van Nostrand Reinhold, 1983.
4. F. Maurer and G. Melnik, “Driving Software Development with Executable Acceptance Tests,” Cutter Consortium Report, vol. 7, no. 11, 2006, pp. 1–30.
5. T. Reppert, “Don’t Just Break Software, Make Software: How Storytest Driven Development Is Changing the Way QA, Customers, and Developers Work,” Better Software, vol. 9, no. 6, 2004, pp. 18–23.
6. J. Sanchez, L. Williams, and E.M. Maximilien, “A Longitudinal Study of the Use of a Test-Driven Development Practice in Industry,” to appear in Proc. Agile 2007 Conf., IEEE CS Press, 2007.
7. T. Bhat and N. Nagappan, “Evaluating the Efficacy of Test-Driven Development: Industrial Case Studies,” Proc. Int’l Symp. Empirical Software Eng. (ISESE 06), ACM Press, 2006, pp. 356–363.
8. G. Canfora et al., “Evaluating Advantages of Test Driven Development: A Controlled Experiment with Professionals,” Proc. Int’l Symp. Empirical Software Eng. (ISESE 06), ACM Press, 2006, pp. 364–371.
9. L. Damm and L. Lundberg, “Results from Introducing Component-Level Test Automation and Test-Driven Development,” J. Systems and Software, vol. 79, no. 7, 2006, pp. 1001–1014.
10. M. Melis et al., “Evaluating the Impact of Test-First Programming and Pair Programming through Software Process Simulation,” J. Software Process Improvement and Practice, vol. 11, 2006, pp. 345–360.
11. C. Mann, “An Exploratory Longitudinal Case Study of Agile Methods in a Small Software Company,” master’s thesis, Dept. Computer Science, Univ. of Calgary, 2004.
12. A. Geras et al., “A Prototype Empirical Evaluation of Test Driven Development,” Proc. 10th Int’l Symp. Software Metrics (METRICS 04), IEEE CS Press, 2004.
13. B. George and L. Williams, “An Initial Investigation of Test Driven Development in Industry,” Proc. ACM Symp. Applied Computing, ACM Press, 2003.
14. R.A. Ynchausti, “Integrating Unit Testing into a Software Development Team’s Process,” Proc. 2nd Int’l Conf. Extreme Programming and Flexible Processes in Software Eng. (XP 01), 2001, pp. 79–83.
15. T. Flohr and T. Schneider, “Lessons Learned from an XP Experiment with Students: Test-First Needs More Teachings,” Proc. 7th Int’l Conf. Product Focused Software Process Improvement (Profes 06), LNCS 4034, Springer, 2006, pp. 305–318.
16. P. Abrahamsson, A. Hanhineva, and J. Jäälinoja, “Improving Business Agility through Technical Solutions: A Case Study on Test-Driven Development in Mobile Software Development,” Business Agility and Information Technology Diffusion, IFIP TC8 WG 8.6 Int’l Working Conf., Int’l Federation for Information Processing, 2005, pp. 1–17.
17. H. Erdogmus et al., “On the Effectiveness of the Test-First Approach to Programming,” IEEE Trans. Software Eng., vol. 31, no. 3, 2005, pp. 226–237.
18. L. Madeyski, “Preliminary Analysis of the Effects of Pair Programming and Test-Driven Development on the External Code Quality,” Software Engineering: Evolution and Emerging Technologies, K. Zielinski and T. Szmuc, eds., IOS Press, 2005, pp. 113–123.
19. G. Melnik and F. Maurer, “A Cross-Program Investigation of Students’ Perceptions of Agile Methods,” Proc. 27th Int’l Conf. Software Eng. (ICSE 05), ACM Press, 2005, pp. 481–489.
20. S. Edwards, “Using Software Testing to Move Students from Trial-and-Error to Reflection-in-Action,” ACM SIGCSE Bull., 2004, pp. 26–30.
21. M. Pančur et al., “Towards Empirical Evaluation of Test-Driven Development in a University Environment,” IEEE Region 8 Proc. EUROCON 2003, vol. 2, IEEE Press, 2003, pp. 83–86.
22. B. George, “Analysis and Quantification of Test Driven Development Approach,” master’s thesis, Dept. Computer Science, N. Carolina State Univ., 2002.
23. M. Müller and O. Hagner, “Experiment about Test-First Programming,” IEE Proc. Software, vol. 149, no. 5, 2002, pp. 131–136.
24. Joint Task Force on Computing Curricula, Software Engineering 2004: Curriculum Guidelines for Undergraduate Degree Programs in Software Engineering, tech. report, IEEE CS and ACM, 2004; http://sites.computer.
25. K. El Emam, “Evaluating ROI from Software Quality,” Cutter Consortium Report, vol. 5, no. 1, 2004, p. 20.
About the Authors

Ron Jeffries is an independent Extreme Programming author, trainer, coach, and practitioner and proprietor of www.xprogramming.com, one of the longest-running and certainly the largest one-person site on XP, comprising over 200 articles at this time. His research interests center on agile software development. He’s one of the 17 original authors and signatories of the Agile Manifesto (www.agilemanifesto.org). He has master’s degrees in mathematics and in computer and communication science. Contact him at email@example.com.

Grigori Melnik is a software engineer, researcher, and educator, currently affiliated with the University of Calgary and SAIT Polytechnic. His research interests include empirical evaluation of agile methods, executable acceptance-test-driven development, domain-driven design, e-business software engineering, the Semantic Web, and distributed cognition in software teams. He’s the research chair of the Agile 2007 conference and the program academic chair of Agile 2008. Contact him at firstname.lastname@example.org.