Assessment: The bridge between teaching and learning

Voices from the Middle, Volume 21 Number 2, December 2013
Wiliam | Assessment: The Bridge between Teaching and Learning
Dylan Wiliam
Our students do not learn what
we teach. It is this simple and
profound reality that means
that assessment is perhaps the central
process in effective instruction. If our
students learned what we taught, we
would never need to assess. We could
simply catalog all the learning experiences
we had organized for them, certain in
the knowledge that this is what they had learned.
But of course, anyone who has spent more than a
few hours in a classroom knows this hardly ever
happens. No matter how carefully we design and
implement the instruction, what our students
learn cannot be predicted with any certainty. It
is only through assessment that we can discover
whether the instructional activities in which we
engaged our students resulted in the intended
learning. Assessment really is the bridge between
teaching and learning.
Formative Assessment
Of course, the idea that assessment can help
learning is not new, but what is new is a growing
body of evidence that suggests that attention to
what is sometimes called formative assessment, or
assessment for learning, is one of the most pow-
erful ways of improving student achievement.
Different people have different views about what
exactly counts as formative assessment. Some
think it should be applied only to the minute-to-
minute and day-to-day interactions between stu-
dents and teachers, while others also see interim,
or benchmark, tests administered every six to ten
weeks as formative. For my part, I believe that
any assessment can, potentially, be formative,
which is why I suggest that to describe an assess-
ment as formative is what Gilbert Ryle (1949) de-
scribed as a “category mistake” (p. 16; ascribing
to something a property it cannot have).
The term formative should apply not to the
assessment but to the function that the evidence
generated by the assessment actually serves. For
example, a seventh-grade teacher had given her
students an English language arts test, under test
conditions, and collected the students’ test re-
sponses. Most teachers would then try to grade
the students’ responses, add helpful feedback,
and return the graded papers to the students the
following day. On this occasion, however, the
teacher did not grade the papers. She quickly
read through them and decided that the follow-
ing day each student would receive back her or
his own paper; in addition, groups of four stu-
dents would be formed, and each group would
be given one blank response sheet, so that they
could, as a group, produce the best composite pa-
per. When the groups had done this, the teacher
led a plenary discussion in which groups reported
back their agreed responses. What is interesting
about the example is that the assessment being
used had been designed entirely for summative
purposes, but the teacher had found a way of us-
ing it formatively.
If we accept that any assessment can be used
formatively, we need some way of defining for-
mative assessment in a way that is useful for class-
room practice. The way that I have found most
useful is to think of three key processes in learning:
1. Where the learner is right now
2. Where the learner needs to be
3. How to get there
It’s also vital to consider the respective roles
of teachers, students, and their peers. Regarding
the processes and the roles as independent (so
that teachers, students, and peers have a role in
each) suggests that formative assessment can be
thought of as comprising five “key strategies” as
shown in Figure 1. Each of the five strategies is
discussed in further detail in the sections that follow.
Key Strategies of Formative Assessment
Learning Intentions
The idea that teachers should share with their
students what it is intended that they learn from
a given instructional activity seems obvious, but
it is only within the past 20 years or so that this
has been routine in English language arts class-
rooms. While this is a welcome development, it
is also important to note that in many schools,
well-intentioned attempts to communicate learn-
ing intentions to students have made writing a
mechanistic process of checklist management.
It is true that rubrics can identify important ele-
ments of progression in writing, but they can too
easily become a straitjacket. To be sure, where
we can, with fidelity, specify what makes writing
good, we should do so, but we should also re-
member Albert Einstein’s advice: “Make things
as simple as possible, but no simpler.” Some-
times, we should accept that the best we can do
is help our students develop what Guy Claxton
has called “a ‘nose’ for quality” (1995, p. 339). In-
deed, some writers, such as Royce Sadler (1989),
have argued that this is an essential precondition
for learning:
The indispensable conditions for improvement are
that the student comes to hold a concept of quality
roughly similar to that held by the teacher, is able
to monitor continuously the quality of what is being
produced during the act of production itself, and has
a repertoire of alternative moves or strategies from
which to draw at any given point. (p. 121)
In recent years, it has been common for writ-
ers to advocate the “co-construction” of rubrics
with students. The idea is that rather than having
the teacher present the students with a rubric as
“tablets of stone,” the rubric is developed with the
students. A common method for doing this is for
the teacher to provide the students with a num-
ber of samples of work of varying quality (e.g.,
anonymous samples from a previous year’s class);
the students then rank them and begin to iden-
tify features that distinguish the stronger work
from the weaker work. This can be a very power-
ful process, but the end point of such a process
must reflect the teacher’s concept of quality. The
teacher knows what quality writing looks like;
students generally do not. Of course, the teach-
er’s views of quality work may shift in discussion
with a class, but the teacher is already immersed
in the discipline of English language arts and the
students are just beginning that journey.
Eliciting Evidence
Although feedback is considered by many to be
the heart of formative assessment, it turns out that
the quality of the feedback hinges on the qual-
ity of evidence that is elicited in the first place.
Knowing that a student has scored only 30% on
a test says nothing about that student’s learning
needs, other than that he or she has apparently
failed to learn most of what was expected. The
point is, effective feedback requires asking the
right questions. This may be obvious, but what is
less obvious is that effective feedback requires a
plan of action about what to do with the evidence
before it is collected.

Figure 1. The five “key strategies” of formative assessment
Teacher
   Where the learner is going: Clarifying, sharing, and understanding learning intentions
   Where the learner is right now: Engineering effective discussions, activities, and tasks that elicit evidence of learning
   How to get there: Feedback that moves learning forward
Peer: Activating students as learning resources for one another
Student: Activating students as owners of their own learning

Many schools and districts espouse a com-
mitment to data-driven decision making, but too
often, this entails the collection of large bodies of
data just in case they come in useful at some later
point. This is a particular problem when teachers
administer common formative assessments to all
students in a grade and then meet to discuss what
to do. By the time all the common assessments
have been graded and a meeting to discuss the
implications has been scheduled, the data are well
past their “sell-by” date; the teaching has moved
on. And even if the data were available in a timely
fashion, unless time has already been scheduled
for any additional instruction shown to be nec-
essary by the assessments, nothing useful can
happen. That is why data-driven decision mak-
ing is not a particularly helpful approach. What
is needed instead is a commitment to decision-
driven data collection. For example, rather than
an “end of unit test,” the teacher could sched-
ule a “three-fourths of the way through the unit
test.” Rather than grading the papers, the teacher
could use the information gleaned from the test
to decide which aspects of the unit need to be
re-taught or, if the students have all done well,
provide some extension material.
On an even shorter time-scale, a fifth-grade
teacher had been introducing students to five
kinds of figurative language: alliteration, hyperbole,
onomatopoeia, personification, and simile. Five
minutes before the end of the lesson, she listed
the five kinds of figurative language on the
whiteboard. She then read out a series of sentences,
asking the students to use “finger voting” to
indicate what kinds of figurative language they had
heard (e.g., hold up one finger if you hear alliter-
ation, five fingers if you hear a simile, and so on).
These are the sentences she read out:
A. He was like a bull in a china shop.
B. This backpack weighs a ton.
C. The sweetly smiling sunshine warmed
the grass.
D. He honked his horn at the cyclist.
E. He was as tall as a house.
Most of the students responded correctly to
the first two, but most of them chose to hold up
either one finger or four fingers for the third.
The teacher pointed out to the class that a few
students had held up one finger on one hand and
four on the other, because the sentence was an
example of both alliteration and personification,
while most students had assumed that a sentence
could only have one kind of figurative language.
With this misconception cleared up, most of the
students realized that the fourth statement was
both alliteration and onomatopoeia while the last
was both a simile and hyperbole.
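
For readers who like to see the logic laid out, the finger-voting exercise can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a record of what the teacher did: the finger codes follow the order in which the five kinds were listed, the key for sentences A and B is inferred from the sentences themselves, the 80% threshold is a hypothetical rule, and the votes are invented. What matters is that the rule for acting on the evidence is fixed before any evidence is collected.

```python
# Finger codes follow the order the five kinds were listed on the whiteboard.
FINGER_CODES = {
    1: "alliteration",
    2: "hyperbole",
    3: "onomatopoeia",
    4: "personification",
    5: "simile",
}

# Answer key: C, D, and E follow the discussion above;
# A and B are inferred from the sentences (an assumption).
ANSWER_KEY = {
    "A": {"simile"},
    "B": {"hyperbole"},
    "C": {"alliteration", "personification"},
    "D": {"alliteration", "onomatopoeia"},
    "E": {"simile", "hyperbole"},
}

# Hypothetical rule, decided before any votes are collected:
# discuss a sentence if fewer than 80% of students answer it fully.
DISCUSS_BELOW = 0.8


def share_correct(sentence, votes):
    """Return the fraction of students whose fingers match the key exactly.

    `votes` holds one entry per student: the set of finger counts that
    student held up (two hands allow two answers). Assumes a non-empty list.
    """
    key = ANSWER_KEY[sentence]
    correct = sum(
        1 for fingers in votes
        if {FINGER_CODES[f] for f in fingers} == key
    )
    return correct / len(votes)


def needs_discussion(sentence, votes):
    """Apply the pre-decided rule to the votes for one sentence."""
    return share_correct(sentence, votes) < DISCUSS_BELOW


# Invented votes for sentence C from a class of ten students.
votes_c = [{1}, {4}, {1}, {4}, {1, 4}, {4}, {1}, {1, 4}, {4}, {1}]
print(share_correct("C", votes_c))     # 0.2
print(needs_discussion("C", votes_c))  # True
```

In a real classroom the teacher does this tally at a glance; the sketch simply makes the decision rule explicit.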
The important point about this example is
that the teacher planned to collect the data only
once she had decided how she was going to use
it—in this case managing to administer, grade,
and take follow-up remedial action in less
than three minutes.
Feedback
In 1996, two researchers in the psychology
department at Rutgers University published an
extraordinary review of research studies on the
effects of feedback in schools, colleges, and
workplaces (Kluger & DeNisi, 1996). They
began by tracking down a copy of every single
published study on feedback they could find,
going back to 1905! They found around 3,000
(2,500 journal articles and 500 technical reports).
They then analyzed the studies to see whether
the conclusions could be trusted. They elimi-
nated those without a control group that was not
given feedback, those for which the effects could
not be attributed only to the feedback given, and
those for which there was not sufficient detail to
quantify the impact of feedback on achievement.
They were surprised to discover that only 131
studies made the cut. Even more surprising was
that while feedback did increase achievement on
average, in 50 of the studies (i.e., 38%), feedback
actually made performance worse!
They concluded that, from a scientific point
of view, most of the studies that had been un-
dertaken were a waste of time because they com-
pletely failed to take into account the reactions
of the recipient. The question “What kind of
feedback is best?” is meaningless, because while
a particular kind of feedback might make one
student work harder, it might cause another stu-
dent to give up. There can be no simple recipe
for effective feedback; there is just no substitute
for the teacher knowing their students. Why?
First, knowing the students allows the teacher
to make better judgments about when to push
each student and when to back off. Second, when
students trust the teacher, they are more likely
to accept the feedback and act on it. Ultimately,
the only effective feedback is that which is acted
upon, so that feedback should be more work for
the recipient than the donor.
Students as Learning Resources for One Another
There is a large body of literature on peer tu-
toring and collaborative learning—particularly
in English language arts—that it is neither pos-
sible nor necessary to review here (e.g., Brown &
Campione, 1995 [reciprocal instruction]; Slavin,
Hurley, & Chamberlain, 2003 [general models
of collaborative learning]). However, it is worth
noting that peers can be very effective assessors
of one another’s work, especially when the fo-
cus is on improvement rather than grading. One
sixth-grade class was working on suspense sto-
ries, and the teacher had co-constructed with the
students a checklist of four key phases that made
a good suspense story: establishment, build-up,
climax, and resolution. The class also decided
that it would be a good idea, just as an exercise if
nothing else, for the story to contain at least two
examples of figurative language.
The students worked on their stories and,
when everyone was done, exchanged their work
with a neighbor and switched roles from “au-
thor” to “editor.” The editor’s task was to “mark
up” the story by using four different colored pen-
cils to indicate the beginning of each phase, with
a fifth color to underline the two examples of
figurative language. With the editor’s approval,
a story could be submitted to the teacher (the
“chief editor”). Because each editor was responsi-
ble for ensuring that the required elements were
present, students took the role very seriously (not
least because they were accountable to the chief editor).
Of course, one could list many more exam-
ples of this kind, but what teachers routinely re-
port is that students tend to be much tougher on
one another than most teachers would dare to be.
This is important because it suggests that with
well-structured peer-assessment, one can achieve
better outcomes than would be possible with one
adult for every student.
Students Owning Their Own Learning
As Rick Stiggins (Stiggins, Arter, Chappuis, &
Chappuis, 2004) reminds us, the most important
instructional decisions are not made by teach-
ers—they are made by students. When students
believe they cannot learn, when challenging tasks
are just one more opportunity to find out that
you are not very smart, many students disengage.
And this is perfectly understandable. What stu-
dents are really doing when they disengage is de-
nying the teacher the opportunity to make any
judgment about what the student can do—after
all, it is better to be thought lazy than dumb. This
is why the most important word in any teacher’s
vocabulary is yet. When a student says, “I can’t do
it,” the teacher responds with “yet.” This is more
than just sound psychology. It also reflects what
we are learning about the nature of expertise and
where it comes from.
Of course individuals vary in their natural
gifts, but these differences are very small to begin
with. What happens is that the small initial advantages
of some students are quickly magnified when those
students work hard, engage, and improve, while those
who are slightly behind avoid challenge and thus miss
out on the chance to improve. As the title of a
recent book by Geoff Colvin (2010) makes clear,
“Talent is overrated.” It is practice that creates
expertise. Chess grandmasters don’t have higher
IQs than average chess players—they just prac-
tice more. Indeed, in almost all areas of human
expertise, from violin playing to radiography,
expertise is the result of ten years of deliberate
practice (Ericsson, Charness, Feltovich, & Hoff-
man, 2006).
When students come to believe that smart is
not something you are but something you get,
they seek challenging work, and in the face of
failure, they increase effort. Student athletes
get this. They know that to improve, they must
practice things they can’t yet do, rather than
simply rehearse the things they know how to do.
We need to get students to understand
this in the English language arts classroom, too.
Ultimately, we would want students to resent
work that does not challenge them, because they
would understand that easy work doesn’t help
them improve. In the best classrooms, students
would not mind making mistakes, because mis-
takes are evidence that the work they are doing is
hard enough to make them smarter.
People often want to know “what works” in ed-
ucation, but the simple truth is that everything
works somewhere, and nothing works every-
where. That’s why research can never tell teach-
ers what to do—classrooms are far too complex
for any prescription to be possible, and varia-
tions in context make what is an effective course
of action in one situation disastrous in another.
Nevertheless, research can highlight for teachers
what kinds of avenues are worth exploring and
which are likely to be dead ends, and this is why
classroom formative assessment appears to be so
promising. Across a range of contexts, attend-
ing not to what the teacher is putting into the
instruction but to what the students are getting
out of it has increased both student engagement
and achievement.
Different teachers will find different aspects
of classroom formative assessment more effec-
tive for their personal styles, their students, and
the contexts in which they work—so each teach-
er must decide how to adapt the ideas outlined
above for use in their practice. Of course, as al-
ways, “more research is needed,” but the breadth
of the available research suggests that if teachers
develop their practice focused on the principles
outlined above, they are unlikely to fail because
of the neglect of subtle or delicate features. There
will never be an optimal model, but as long as
teachers continue to investigate that extraordi-
narily complex relationship between “What did
I do as a teacher?” and “What did my students
learn?” good things are likely to happen.
Dylan Wiliam works with schools, districts, and state and national governments all over the world
to improve education. He lives virtually at and physically in New Jersey.
