
The Case against the Zero

  • Leadership and Learning Center
THIS IS not a trick question. If you are using a grading scale in which the numbers 4, 3, 2, 1, and 0 correspond to grades of A, B, C, D, and F, then what number is awarded to a student who fails to turn in an assignment? If you responded with a unanimous chorus of "zero," then you may have a great deal of company. There might be a few people who are familiar with the research that asserts that grading as punishment is an ineffective strategy,¹ but many of us curmudgeons want to give the miscreants who failed to complete our assignments the punishment that they richly deserve. No work, no credit. End of story.
Groups as diverse as the New York State United Teachers and the Thomas B. Fordham Foundation rally around this position.² Let us, for the sake of argument, accept the point. With the grading system described above, the failure to turn in work would receive a zero. The four-point scale is a rational system, as the increment between each letter grade is proportionate to the increment between each numerical grade: one point.
Even those who subscribe to the punishment theory of grading might want to reconsider the way they use zeros, Mr. Reeves suggests.
DOUGLAS B. REEVES is the chairman and founder of the Center for Performance Assessment, Boston, Mass. His most recent publications are Assessing Educational Leaders (Corwin Press, 2004) and Accountability for Learning (Association for Supervision and Curriculum Development, 2004).
But the common use of the zero today is based not on a four-point scale but on a 100-point scale. This defies logic and mathematical accuracy. On a 100-point scale, the interval between numerical and letter grades is typically 10 points, with the break points at 90, 80, 70, and so on. But when the grade of zero is applied to a 100-point scale, the interval between the D and F is not 10 points but 60 points. Most state standards in mathematics require that fifth-grade students understand the principles of ratios: for example, A is to B as 4 is to 3; D is to F as 1 is to zero. Yet the persistence of the zero on a 100-point scale indicates that many people with advanced degrees, including those with more background in mathematics than the typical teacher, have not applied the ratio standard to their own professional practices. To insist on the use of a zero on a 100-point scale is to assert that work that is not turned in deserves a penalty that is many times more severe than that assessed for work that is done wretchedly and is worth a D. Readers were asked earlier how many points would be awarded to a student who failed to turn in work on a grading scale of 4, 3, 2, 1, 0, but I'll bet not a single person arrived at the answer "minus 6." Yet that is precisely the logic that is employed when the zero is awarded on a 100-point scale.
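The ratio argument can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `to_four_point` and the two anchoring conventions are assumptions, not something the article specifies. It maps a percent score linearly onto the four-point scale at 10 percent points per grade point. Anchoring the lowest D (60) at 1 puts a percent-scale zero at minus 5, while anchoring a perfect A (100) at 4 puts it at minus 6, the figure quoted above. Under either convention the zero lands far below the F.

```python
def to_four_point(percent, anchor_percent, anchor_points, interval=10):
    """Linearly map a percent score onto the four-point scale.

    anchor_percent/anchor_points pin one known correspondence
    (hypothetical conventions, chosen for illustration); every
    `interval` percent points is worth one grade point.
    """
    return anchor_points + (percent - anchor_percent) / interval


# Convention 1 (assumed): the lowest D, 60 percent, equals 1 point.
zero_from_d = to_four_point(0, anchor_percent=60, anchor_points=1)

# Convention 2 (assumed): a perfect A, 100 percent, equals 4 points.
zero_from_a = to_four_point(0, anchor_percent=100, anchor_points=4)

# Gap between the D cut-off and a zero, in letter-grade intervals.
gap_in_intervals = (60 - 0) / 10

print(zero_from_d)       # -5.0
print(zero_from_a)       # -6.0
print(gap_in_intervals)  # 6.0
```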
There are two issues at hand. The first, and most important, is to determine the appropriate consequence for students who fail to complete an assignment. The most common answer is to punish these students. Evidence to the contrary notwithstanding, there is an almost fanatical belief that punishment through grades will motivate students. In contrast, there are at least a few educators experimenting with the notion that the appropriate consequence for failing to complete an assignment is to require the student to complete the assignment. That is, students lose privileges (free time and unstructured class or study-hall time) and are required to complete the assignment. The price of freedom is proficiency, and students are motivated not by threats of failure but by the opportunity to earn greater freedom and discretion by completing work accurately and on time. I know my colleagues well enough to understand that this argument will not persuade many of them. Rewards and punishments are part of the psyche of schools, particularly at the secondary level.
But if I concede this first point, the second issue is much more straightforward. Even if we want to punish the little miscreants who fail to complete our assignments (and I admit that on more than one occasion with both my students and my own children, my emotions have run in that direction), then what is the fair, appropriate, and mathematically accurate punishment? However vengeful I may feel on my worst days, I'm fairly certain that the appropriate punishment is not the electric chair. Even if I were to engage in a typically fact-free debate in which my personal preference for punishment were elevated above efficacy, I would nevertheless be forced to admit that giving a zero on a 100-point scale for missing work is a mathematical inaccuracy.
If I were using a four-point grading system, I could give a zero. If I am using a 100-point system, however, then the lowest possible grade is the numerical value of a D, minus the same interval that separates every other grade. In the example in which the interval between grades is 10 points and the value of D is 60, then the mathematically accurate value of an F is 50 points. This is not, contrary to popular mythology, "giving" students 50 points; rather, it is awarding a punishment that fits the crime. The students failed to turn in an assignment, so they receive a failing grade. They are not sent to a Siberian labor camp.
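As a minimal sketch of the arithmetic in this paragraph, assuming the break points of 90, 80, 70, and 60 used in the example (the helper name `proportional_floor` is hypothetical), the lowest grade is the D value minus the common interval:

```python
def proportional_floor(break_points):
    """Return the lowest grade implied by equal intervals.

    break_points: descending percent cut-offs for A, B, C, D.
    The floor is the D cut-off minus the common interval.
    """
    intervals = {hi - lo for hi, lo in zip(break_points, break_points[1:])}
    if len(intervals) != 1:
        raise ValueError("break points are not evenly spaced")
    (interval,) = intervals
    return break_points[-1] - interval


floor = proportional_floor([90, 80, 70, 60])
print(floor)  # 50
```

The same helper works for any evenly spaced set of cut-offs; unevenly spaced ones are rejected because the "same interval" argument no longer applies to them.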
There is, of course, an important difference. Sentences at Siberian labor camps ultimately come to an end, while grades of zero on a 100-point scale last forever. Just two or three zeros are sufficient to cause failure for an entire semester, and just a few course failures can lead a student to drop out of high school, incurring a lifetime of personal and social consequences.
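To illustrate how quickly zeros dominate an average, here is a small sketch with made-up numbers: a hypothetical student earns 85 on seven of ten equally weighted assignments and misses three. The scores, the equal weighting, and the `letter` and `mean` helpers are assumptions for illustration, not data from the article.

```python
def letter(average):
    """Letter grade on a 100-point scale with break points 90/80/70/60."""
    for cutoff, grade in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if average >= cutoff:
            return grade
    return "F"


def mean(scores):
    """Plain arithmetic mean of equally weighted scores."""
    return sum(scores) / len(scores)


completed = [85] * 7                # seven assignments at a solid B (hypothetical)
with_zeros = completed + [0] * 3    # three missing assignments scored 0
with_fifty = completed + [50] * 3   # the same three scored at a 50-point F

print(mean(with_zeros), letter(mean(with_zeros)))  # 59.5 F
print(mean(with_fifty), letter(mean(with_fifty)))  # 74.5 C

# The same record on a four-point scale: seven B's (3) and three F's (0).
four_point = [3] * 7 + [0] * 3
print(mean(four_point))  # 2.1
```

In this made-up record, three zeros pull a B student to a failing average on the 100-point scale, while the same missing work scored as an F on the four-point scale leaves the average in the C range.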
This issue is as emotional as anything I have encountered since the phonics versus whole language debate. Scholars regress to the persuasive tactics of professional wrestlers (no offense intended to wrestlers; this article will generate enough hate mail as it is), and research and logic are subordinated to vengeance masquerading as high standards. Because the emotional attachment to the zero is so strong, I have given up advocating that 50 points should represent the lowest grade. What I do think we can do to preserve some level of sanity in our grading system is to return to a four-point system. A's no longer equal 100 points, but four points. If there is a need for greater specificity, then we can choose an infinite number of digits to the right of the decimal point and thus differentiate between the 3.449 and 3.448 to our hearts' content. But at the end of the day in such a system, the F is a zero, one point below the D. It is fair, accurate, and, some people may believe, motivational. But at least the zero on a four-point scale is not the mathematical travesty that it is when applied to a 100-point system.
1. Thomas R. Guskey and Jane M. Bailey, Developing Grading and Reporting Systems for Student Learning (Thousand Oaks, Calif.: Corwin Press,
2. Clarisse Butler, "Are Students Getting a Free Ride?," New York Teacher, 2 June 2004, available at 040602grading.html; and Thomas B. Fordham Foundation, "Minimum Grades, Minimum Motivation," The Education Gadfly, 3 June 2004, available at 151#1850.
DECEMBER 2004 325
Copyright Notice
Phi Delta Kappa International, Inc., holds copyright to this article, which
may be reproduced or otherwise used only in accordance with U.S. law
governing fair use. MULTIPLE copies, in print and electronic formats, may
not be made or distributed without express permission from Phi Delta
Kappa International, Inc. All rights reserved.
Note that photographs, artwork, advertising, and other elements to which
Phi Delta Kappa does not hold copyright may have been removed from
these pages.
Please fax permission requests to the attention of Kappan Permissions
Editor at 812/339-0018 or e-mail permission requests to
Douglas B. Reeves, "The Case Against the Zero," Phi Delta Kappan,
Vol. 86, No. 4, December 2004, pp. 324-325.
File Name and Bibliographic Information