INVISIBLE LEARNINGS? A COMMENTARY ON JOHN HATTIE'S BOOK
VISIBLE LEARNING: A SYNTHESIS OF OVER 800 META-ANALYSES RELATING
TO ACHIEVEMENT
Amended version published as I. Snook et al. (2009), New Zealand Journal of Educational Studies, 44(1), 93-106.
Evidence does not supply us with rules for action but only with hypotheses for
intelligent problem solving, and for making inquiries about our ends in education. (John
Dewey, quoted in Hattie, 2008, p. 147)
INTRODUCTION
This book by Professor John Hattie of Auckland University is the result of decades of careful
research. He has synthesised some 800 meta-analyses comprising more than 50,000 studies
and involving some 146,000 effect sizes. The announcement of the book has already led to a
good deal of discussion both in New Zealand and overseas and seems to have captured the
attention of policy makers. It is, therefore, important that members of the educational
research community pay John Hattie the courtesy of subjecting his conclusions to critical
scrutiny in a spirit of mutual truth-seeking to ensure that: (1) discussions are based on a
careful reading of the book, rather than on half-baked reactions in the popular media; (2) the
caveats which Hattie himself sets out are carefully noted so that decisions are not made in
opposition to the message of this book; and (3) the findings are not appropriated by political
and ideological interests and used in ways which the data do not substantiate.
THE METHODOLOGY UNDERLYING THE BOOK
Hattie derives his results from working on a large sample of research studies. His method
involves a synthesis of a large number of meta-analyses of studies about education variables.
A meta-analysis is a statistical technique for amalgamating, summarising, and reviewing
primary research. It combines the results of various studies which address a set of research
hypotheses. It is used in many branches of knowledge such as medicine, psychotherapy,
business and education. All the findings in this book derive from John Hattie's synthesis of
800 meta-analyses of more than 50,000 quantitative studies of variables affecting the
achievement of students.
A major aim is to determine effect sizes. From looking at a large number of research studies it
is relatively easy to determine that there are certain effects: for example, overall, drug A is
more successful in lowering blood pressure than drug B. But the key question is: how much
more successful? Effect size is a way of answering this question. It involves taking the
difference between the mean scores of the two groups and dividing it by the standard
deviation (Coe & Rowe, 2004). Thus, studies can be plotted along a continuum from very low
effect size to very high effect size. At either end of the continuum a judgment is needed, for
although it is not disputed that an effect size of 1.0 is large, there are debates about where a
small effect size ends and a moderate or large effect size begins. Hattie adopts 0.4 as the
cut-off point, basically ignoring effect sizes lower than 0.4. Thus, for example, class size is
interpreted as a small effect size since it is 0.2 (in public debate this tends to turn into “class
size has no effect at all”). Selecting a cut-off
point is a hazardous exercise, as it means that potentially important effects may be
overlooked. An effect size of 0.2 means that the difference between the two comparison
groups (e.g. small classes and large classes) is 0.2 (20%) of a standard deviation of the test or
measurement scores. Much depends, therefore, on the quality of the research studies in the
various meta-analyses. If the sample is large and random (hence increasing the validity and
reliability of the measurement), a small effect size is of considerable significance. On the
other hand, large effect sizes from small samples are meaningless at best and positively
dangerous when lumped together with other studies to produce an average.
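To make this arithmetic concrete, here is a minimal sketch in Python. The scores and the function are invented purely for illustration; the sketch simply applies the definition above (the difference between two group means divided by a pooled standard deviation).

```python
# Illustrative only: a minimal effect-size (Cohen's d) calculation,
# applying the "difference in means over pooled SD" definition above.
from statistics import mean, stdev

def effect_size(group_a, group_b):
    """Difference between group means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # The pooled SD weights each group's variance by its degrees of freedom.
    pooled_sd = (((n_a - 1) * stdev(group_a) ** 2 +
                  (n_b - 1) * stdev(group_b) ** 2) / (n_a + n_b - 2)) ** 0.5
    return (mean(group_a) - mean(group_b)) / pooled_sd

# Hypothetical test scores for two comparison groups (say, small vs large classes).
small_classes = [52, 55, 58, 60, 61, 63, 65]
large_classes = [50, 53, 54, 56, 58, 60, 61]

print(round(effect_size(small_classes, large_classes), 2))
# prints 0.74: with these invented scores the groups differ by about
# three-quarters of a standard deviation.
```

Note that the number on its own says nothing about the quality or size of the samples behind it, which is precisely the point made above.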
Hattie claims that he has made a synthesis of 800 meta-analyses, and insists that his is not a
meta-analysis of meta-analyses. What is a synthesis? According to the Evidence Informed
Policy Network (undated), the term “research synthesis” is defined as “a systematic and
transparent summary of the best available evidence relevant to a policy decision”. The key
point is that a synthesis must include “the development of a protocol, the use of systematic
and explicit methods, data collection, analysis, interpretation and reporting of the results”.
Hattie says that he is not concerned with the quality of the research in the 800 studies but, of
course, quality is everything. Any meta-analysis that does not exclude poor or inadequate
studies is misleading, and potentially damaging if it leads to ill-advised policy developments.
He also needs to be sure that restricting his data base to meta-analyses did not lead to the
omission of significant studies of the variables he is interested in.
Just as this commentary was being finalised, the Ministry of Education and NZCER released
an excellent paper on effect sizes. It repeats many of the reservations which we express in our
commentary: Ian Schagen and Edith Hodgen, How Much Difference Does It Make? Notes on
Understanding, Using and Calculating Effect Sizes for Schools (2009). It is interesting that
the advice of John Hattie is acknowledged in this paper, so we can perhaps assume that he
agrees with many of our concerns about the use of effect sizes.
QUALIFICATIONS OF HIS STUDY
John Hattie himself acknowledges some of the problems associated with his approach:
Social effects/background/context effects are ruled out
[This] is not a book about what cannot be influenced in schools - thus critical
discussions about class, poverty, resources in families, health in families, and nutrition
are not included but this is NOT because they are unimportant, indeed they may be more
important than many of the issues discussed in this book. It is just that I have not
included these topics in my orbit. (Hattie, 2008, pp. x-xi)
As we shall see, social class background is indeed more important than many of the issues
discussed in this book and hence policy decisions cannot be drawn in isolation from the
background variables of class, poverty, health in families and nutrition.
The various studies have not been appraised for their validity
[This] is not a book about criticism of research and I have deliberately not included
much about moderators of research findings based on research attributes (quality of
study, nature of design) not because these are unimportant… but because they have been
dealt with elsewhere. (p. ix)
However, he is not entirely consistent on this. In his discussion of extra-curricular activities,
he cautions against taking the finding (0.47) too seriously since it is based on a random
effects model, which may lead to inflated effect sizes. In relation to charter schools vs regular
schools, he cites a study which reports an effect size of 0.2, “but when the lower quality
studies were excluded, this difference dropped to zero” (p. 66). In his treatment of learning
styles he is justifiably suspicious of the motives behind much of the research and
appropriately sceptical of many of the results. Thus, although he finds an effect size of 0.41
overall, he dismisses it as not credible (pp. 195-197). Might something like this not be the
case with some of the other effect sizes reported? Once again, it is Hattie's right to define how
he will approach the data but he cannot complain if policy makers are cagey about drawing
policy conclusions from meta-analyses of studies, the merits of which have not been
investigated.
The research is limited to one dimension of schooling.
Of course there are many outcomes of schooling such as attitudes, physical outcomes,
citizenship, and a love of learning. This book focuses on student achievement and that is
a limitation of this review. (p. 6)
To be more accurate, he is concerned not with achievement but with achievement that is
amenable to quantitative measurement. New knowledge, skills and dispositions are all
achievements of one form or another but they are generally more difficult to measure. At
times, his restricted scope leads to rather odd conclusions. Writing about the effects of
programmes of moral education, he says: “The major outcome from moral education
programmes is the facilitation of moral judgement… and as this is not strictly achievement as
typically defined, these are not included in the tables” (p. 149). He also has to concede that
the form of learning which he discusses is, itself, severely limited. Having distinguished
three levels of learning (surface, deep, and conceptual), he says in one of his conclusions: “A
limitation of many of the results in this book is that they are more related to the surface and
deep knowing and less to conceptual understanding” (p. 249). And yet, conceptual knowing
or understanding is what he thinks should be the result of good teaching. Clearly there is less
to be drawn from his synthesis than commentators have suggested. Much depends on the kind
of learning that is desired in formal education. Policy makers have to take a broad view of
schooling: they have to be interested not just in achievement on narrow tests or even in
deeper conceptual knowledge, important as this clearly is, but in the attitudes which students
bring to their lives as workers and citizens. Employers, for example, often stress the
importance of the attitudes which young people bring to work - perseverance, flexibility,
cooperation - rather than only the cognitive qualities that they can demonstrate.
The research may not be applicable to ordinary teachers
Most of the successful effects come from innovations and these effects from innovations
may not be the same as the effects of teachers in regular classrooms. (p. 6)
This is particularly telling when, as in the case of the Picking up the Pace studies, we are told
that the class size was kept artificially low for the duration of the study (Ministry of
Education, undated).
Correlation must not be confused with causation.
He also has a very interesting discussion on the importance of not confusing correlation with
causation and of not moving too readily from “this is significant statistically” to “this is what
teachers should do” (pp. 3-4). As an example, after finding that feedback is important,
Hattie adds: “It would be an incorrect understanding of the power of feedback if a teacher
were to encourage students to provide more feedback” (p. 4). He concedes, though, that the
fundamental term in meta-analysis, effect size, implies causation (what is the effect of a on b),
and this claim is often not defensible (p. 237).
PROBLEMS WITH THE USE OF META-ANALYSIS
Hattie has set out some of the major problems with the methodology that he has used for this
study. First, comparing disparate studies can be like comparing apples and oranges. Each
study can be very different. Second, in seeking averages, studies ignore the complexity of
classrooms and the wide variety of results. Third, what is so sacred about an average score?
Fourth, the studies are historical; i.e. they report past findings and cannot show that the
future must be the same. Fifth, they do not distinguish the quality of different studies and
hence could merit Eysenck's judgment: “garbage in, garbage out”. Hattie tries to minimise these
criticisms of his methodology but they need to be taken into account before accepting the
analyses as sound enough for policy recommendations.
There are some other problems (not centrally acknowledged by Hattie) associated with meta-
analyses:
(i) Bias is not normally controlled in meta-analyses: thus a meta-analysis (however well
designed) of poorly designed studies will inevitably lead to unreliable conclusions. It is a
serious matter when government agencies use such unreliable conclusions to justify some
educational policy.
(ii) There is a heavy reliance on published results. As we know, particularly in relation to
studies commissioned by drug companies (but also from studies of lucrative educational fads
such as learning styles), this often means that studies which fail to support favoured
conclusions do not make it into publications or into the meta-analyses. Once again, this has
important ramifications for policy-making.
(iii) There is a particular problem in relation to education: the difficulty of clearly defining the
variables. In medicine, for example, Drug A can be carefully compared to Drug B in terms of
their respective chemical qualities but it is not nearly so easy when one is talking about such
things as child-centred teaching vs teacher-centred teaching. There is no clear operational
definition of either of the variables. In these matters there is usually a continuum and,
therefore, subjective judgments have to be made: where is the line to be drawn? It is
interesting that on one occasion, at least, Hattie himself draws attention to this problem.
Writing about the effects of whole language teaching in reading he notes discrepant results
from two meta-analyses in which “there was much overlap in the studies used… and the
difference is a function of how the authors classified some key studies, and the coding of what
constituted whole language” (p. 137). We suspect that this sort of problem might be
widespread.
(iv) There is also the difficulty which arises from amalgamating a large number of disparate
studies. When results of many studies are averaged, the complexity of education is ignored:
variables such as age, ability, gender, and subject studied are set aside. An example of this
problem can be seen in Hattie's treatment of homework: does homework improve learning or
not? Overall, Hattie finds that the effect size of homework is 0.29. Thus a media
commentator, reading a summary, might justifiably report: “Hattie finds that homework does
not make a difference.” When, however, we turn to the section on homework we find that, for
example, the effect sizes for elementary (primary in our terms) and high school students are
0.15 and 0.64 respectively. Putting it crudely, the figures suggest that homework is very
important for high school students but relatively unimportant for primary school students.
There were also significant differences in the effects of homework in mathematics (high
effects) and science and social studies (both low effects). Results were high for low ability
students and low for high ability students. The nature of the homework set was also
influential (pp. 234-236). All these complexities are lost in an average effect size of 0.29; a
short numerical sketch of this masking follows at the end of this list.
(v) There is also the issue of how generalisable the results are. Hattie points out that most of
the studies were carried out in highly developed English-speaking countries (mainly the USA)
and should not be generalised to non-English speaking or developing countries. It has been
shown, for example, that in developing countries, school effects (as against teacher effects)
are huge, due no doubt to the wide variety of schools. It could easily be that New Zealand
schools, teachers and students are in fact rather different from those of the USA and hence we
should exercise great care in relating the meta-analyses to New Zealand education.
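The masking described in point (iv) is easy to show numerically. In the sketch below (Python), the subgroup effect sizes are those Hattie reports for homework; the weights are our invention, chosen purely so that the weighted average lands near his overall figure, since the book does not report the underlying weights in a reusable form.

```python
# Illustrative only: how averaging subgroup effect sizes hides heterogeneity.
# The subgroup effects (0.15, 0.64) are those reported for homework
# (pp. 234-236); the weights are hypothetical.
subgroups = {
    "elementary": {"effect": 0.15, "weight": 0.71},   # assumed share of studies
    "high school": {"effect": 0.64, "weight": 0.29},  # assumed share of studies
}

overall = sum(g["effect"] * g["weight"] for g in subgroups.values())
print(round(overall, 2))  # prints 0.29: close to neither subgroup's effect
```

A reader who sees only the averaged 0.29 has no way of recovering the two very different numbers underneath it.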
SCHOOL EFFECTS
Hattie acknowledges the important role of socio-economic status and home background but
chooses to ignore it. That is his choice: but it is easy for those seeking to make policy
decisions to forget this significant qualification. There is some debate about the extent of the
contribution made by a student's social background but the following conclusions are typical:
(i) Gray, Jesson and Jones (1986) summarised their large-scale research in Britain:
“Around 80% of the difference can be explained by the intake” and they say that this
has held up over all the schools and LEAs studied. They went on to say that half the
remaining difference (the 20%) may be explained by the schools' examination
policies. This would leave only 10% to be explained by other variables within the
school.
(ii) Based on his research in New Zealand (and consistent with many overseas studies)
Richard Harker has claimed that anywhere between 70-80% of the between-school
variance is due to the student mix, which means that only between 20% and 30% is
attributable to the schools themselves (including, of course, the teachers) (Harker,
1995, p. 74). Certainly, he found quite significant differences between schools in their
results even after the influence of social background is controlled (the “value added”
effect) (Harker, 1996).
(iii) According to a recent OECD volume on the importance of quality teaching, it is
possible to draw three “broad conclusions” from the research on student learning.
The first and most solidly based finding is that the largest source of variation in
student learning is attributable to differences in what students bring to school
their abilities and attitudes, and family and community. Such factors are difficult
for policy makers to influence, at least in the short-run. (OECD, 2005, p. 2).
(iv) Hattie in fact seems to acknowledge this. Although he does not discuss social
background, he refers to “student influences” on learning and “home influences” on
learning. In another publication he ascribes 50% of the variance to what the student
brings and 10% to the contribution of the home (Hattie, 2003, pp. 1-2). Of course,
under student influences he includes IQ but seems to see this as a fixed (inherited?)
quality rather than the largely socially determined one it is now known to be (Nash,
2004). This leaves only 40% to be explained by school and teacher influences. This
is, admittedly, rather larger than most other estimates, but still much smaller than the
influence of social background on achievement.
There are in fact, two different types of research on school effects. One compares the
relative contribution made by social variables on the one hand and school variables on the
other. The former includes social status, parental education, home resources and the like; the
latter includes all variables within the school: curriculum, principal, buildings, and the work
of teachers. These studies typically find that most of the variance comes from the social
variables and only a small part from the school (including the teachers).
The other kind of study is that which ignores the social variables and asks simply: which of
the school variables are most important: policies, principal, buildings, school size, curriculum,
teachers? These, unsurprisingly, tend to find that the teacher is the most important variable,
that is, more important than the principal, the curriculum, the school size or the policies. It is
easy to get these two types of studies confused. A former Minister of Education, badly
advised by his Ministry, made a fool of himself for some months: in saying “teachers are the
most important variables in student learning” he was talking about studies of the second type,
and only after being publicly criticised did he begin to add the crucial qualifier, “within the
school”. Sadly, in our contemporary politicised and uncritical social climate neither his
egregious error nor his retraction was noted by the media, the Ministry, or, by and large,
academic commentators.
OTHER ISSUES
Isolating the variables to be analysed
For example, with small vs large classes: how does one define small and large? Similarly,
with open vs traditional classes: how to estimate the extent of openness, etc? Equally, with
streamed (tracked) vs unstreamed schools or classes: how much streaming or selective
grouping etc is acceptable while the class is still classified as unstreamed? Comparing such
abstract variables is not at all like comparing Drug A with Drug B in medical research or even
urban vs rural differences in sociology. Classrooms are very complex and relevant variables
are hard to pin down.
Interpretation of small vs big differences
Hattie adopts (arbitrarily) a cut-off at 0.4 and above, but other researchers are content with a
lower cut off point. To some extent the choice is arbitrary but, as we said earlier, what is
important is not the effect size per se but the quality of the research underlying the meta-
analyses. This is what should make the difference when suggestions are made for policy. In
fact, Hattie concedes that in some areas a much lower threshold can be significant. In
medicine, it was demonstrated that the effect size of taking one low-dose aspirin to decrease
the risk of heart attack was a mere 0.07, but it translates into the conclusion that 34 out of 1000
people would be saved from heart attack. “This sounds worth it to me,” he says (p. 9). Indeed,
Hattie is not always thoroughly consistent in relying on an effect size of at least 0.4. Writing
of the studies of outdoor education he finds it “most exciting” that the follow-up effects
were (untypically) “positive”. The effect size was 0.17, well below his usual cut-off point (p.
157). Why is he so excited by this rather modest result when effect sizes higher than this are
often written off as insignificant?
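As a rough check on the aspirin figure, and as a reminder of how a “small” effect size can be translated into lives, the sketch below converts d = 0.07 into a difference in event rates using Rosenthal and Rubin's binomial effect size display (BESD). The choice of conversion is ours; we cannot be certain it is the calculation behind the figure Hattie quotes, but it lands in the same region.

```python
# Illustrative only: converting a small effect size into "people per 1000"
# via the binomial effect size display (BESD). This is one common conversion,
# not necessarily the one behind the figure Hattie cites.
import math

d = 0.07                      # effect size of low-dose aspirin (p. 9)
r = d / math.sqrt(d**2 + 4)   # convert d to a correlation coefficient
per_1000 = r * 1000           # BESD: the difference in success rates equals r
print(round(per_1000))        # prints 35, the same order as Hattie's 34 per 1000
```

The exercise underlines the point of this section: whether 0.07, 0.2, or 0.4 “matters” is a judgment about consequences, not a property of the number itself.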
TWO PARTICULAR ISSUES
Class size
Hattie has been cited as finding that class size is not important and this has excited the
attention of those concerned about financing of schools, who conclude that they can
economise on class size. In fact, the significance of class size is much more complicated than
that, even in terms of John Hattie's synthesis. What is a small class: 5, 15, 20? What is a large
class: 25, 40, 80? (Really large classes are common in tertiary education.) It is interesting to
note that in the STAR studies (discussed below) classes of 22-25 were defined as large, when
in many studies these would be seen to be desirably small compared to, say, classes of 30+. It
is also important to determine how the assessment is made: on the basis of teacher/pupil
ratios in a whole school? (This is quite common, hence we do not know how large any actual
class is); on average attendance over a period of time? Or, an actual count on the days the
teaching is done and the testing carried out? (This would seem to be the most desirable
method.) Studies vary greatly in relation to these ways of estimating class-size. A meta-
analysis often ignores such problems.
Hattie concludes that the effect size for class size is around 0.2, which is in his category of a
small effect. On this basis he seems to dismiss it and commentators in the popular media have
played this up. However, some points can be made. First, this is not negligible; other
researchers believe that any difference above 0.0 is worth noting. Second, many studies have
suggested a much higher rating for class size. Prominent among these is the well-known
Student/Teacher Achievement Ratio (STAR) study.
STAR was set up as a result of some inconclusive debate about class size. Smith and Glass
(1980) did a meta-analysis of studies on class size and concluded that well-designed studies
produced quite different results from studies with minimal controls (p. 429). Adopting
stricter criteria they found that small classes have a decided advantage in relation to the
attitudes of students (0.47) and teachers (1.03) (A massive effect size in Hatties terms,
though, of course, he explicitly excludes attitudinal variables from his synthesis) and also in
relation to test performance in reading (0.30) and maths (0.32) (Hattie reports lower effect
sizes from this study). However, these findings were challenged and the STAR project was
set up to try to resolve the impasse. It studied 76 elementary schools in Tennessee in a
randomized experiment. Small was defined as 13-17, large as 22-25 students. Teachers
and students were randomized into small and big classes. The study of achievement was
carried out after two years when 6,750 children were subjected to standardised tests of
reading and maths on a pass/fail basis where 80% was a pass. Effects sizes varied but there
were some at 0.64, 0.66, and 0.62 which are clearly well above Hatties cut off for
significance (0.4) and about the same as most of the variables which he regards as very
important (Finn & Achilles, 1990). They claim that there was a clear positive effect,
particularly for minority groups and particularly in the early years. Predictably, their research
has also been criticised.
Similarly, in Britain, Blatchford and others came to the conclusion that previous studies
lacked the design features which would enable sound conclusions to be drawn and they set up
The Institute of London Class Size Study. They drew their sample from 8 LEAs, 199 schools,
330 classrooms and 7,142 students. They found many positive results for various process and
affective aspects of smaller classes and, in relation to attainment, which is the focus of the
Hattie study, they found that “there is clear effect of class size on children's academic
attainment over the Reception year and there is a clear case for small class sizes during the
first year of schooling” for both literacy and numeracy (Blatchford, 2003, p. 164). The
superior results for literacy were particularly obvious for lower ability children. While the
effects on individuals tended to continue into the second year, the researchers found no clear
evidence of class size differences beyond Year 1. Their data provide another cautionary tale:
in comparing classes of 15 with classes of 23, large differences were found; but there were
only negligible differences between classes of, say, 20 and 25 - sometimes in favour of the
larger class! This again indicates that “small” and “large” are not clearly defined terms and
one must constantly be aware of what a particular researcher is studying.
Hattie concedes (2008, p. 86) that the low effect size for class size may be due to the fact
that teachers of smaller classes do not always vary their teaching to take advantage of the
smaller group. This is important. Simply reducing class size does nothing to the teaching-
learning process. Only if changes are also made to the teaching-learning interaction are any
achievement effects possible. This point was demonstrated by Murnane and Levy (1996) who
looked at the effects of additional resourcing (US$300,000 per annum per school for five
years) in a sample of fifteen extremely poorly performing (as measured on mandatory state-
wide achievement tests) Texas primary schools serving low income, minority group children.
Thirteen of the fifteen schools showed no significant changes in student achievement over the
course of the study. In these schools, the additional resourcing was used primarily to reduce
class size by hiring additional teachers. This result is consistent with Hatties view that
reducing classes makes comparatively little difference to achievement. The other two schools
also used much of the money to reduce class sizes, but they also did other things: the principal
worked with parents and teachers to confront the problem of low achievement; children with
special needs were included in regular (now smaller) classes; teachers' pedagogies were
changed by introducing reading and mathematics programmes previously only provided to
gifted and talented children in the district; health service provision was brought into the
schools; parents became heavily involved in school governance. After five years, attendance
at these two schools was among the highest in the city and test scores had risen to the city
average. In terms of accurately analysing the relationship between resources (including
smaller classes) and achievement, the study authors make three key points.
First, if the analysis of estimated effects had been conducted after only one year, the data
would have shown no effects because the changes in these two schools took several years to
take effect. Second, if estimated in conjunction with the data across all fifteen schools, the
analysis would have shown a small negative relationship to achievement (the average of large
effects in two schools and no effects in thirteen schools). Third, and most significantly, they
argued that if a model were devised that included interactions between class size,
instructional techniques, and investments in raising student attendance and increasing parental
involvement, the results would show that the package of changes had enormous effects. In
contrast, lowering class size and not changing anything else, especially not changing
instructional techniques, had no effect on achievement (Murnane & Levy, 1996, p. 95).
How would this study have been categorised by Hattie and where would it sit in his league
table of intervention effects? Was this Texas case a study of “reduced class size”, “changed
instructional techniques”, “full-service schools”, “parent governance” or something else? One
also has to assume that the class size studies Hattie reviewed did not all have the identical 0.2
effect size (it was an average), nor did they all have identical conditions (they did not all
replicate the one study design). In other words, even in a study of what he chose to classify as
“class size”, other confounding variables would necessarily have been at play, which would
also have had an impact on achievement. Hattie recognises that class size cannot usefully be
considered in isolation from other potentially important, pedagogically related variables.
Reducing class size may have only a small effect when considered in isolation but that's not
the issue. What matters is that reducing class size permits the teacher (and children) to do
things differently.
This is acknowledged by the Ministry of Education (undated) when commenting on the
PACE research.
The project findings point to a significant relationship between class sizes for new
entrants and the gains made in their achievement levels…. For maximum benefit from
this kind of approach, it is recommended that class sizes for children in their first year of
schooling in low decile schools should not exceed 18. The study showed that while
class size did make a difference, the smaller the classes the better the outcomes, but only
in conjunction with professional development. Without professional development, class
size may make no difference (emphasis ours).
Interestingly, the issue of class size was emphasised by the co-principal of one of the schools
in the PACE study:
The success of the programme has also been attributed to the board of trustees' decision
last year to reduce junior class size from 28 students to 15… This has had an amazing
impact because the programme has to be done with groups of three children. When
you're involved with each group for 10-15 minutes at a time you can't have large
numbers in the classroom unless you have the support of a teacher aide. Smaller
numbers mean teachers are able to interact a lot easier with the groups and on a more
regular basis. (Stewart, 2001, unpaginated)
The claimed successes of the PACE programme have been ascribed to innovative teaching
techniques but could just as easily be ascribed to the smaller classes, or more likely, to the
interaction between the variables.
The point of mentioning these studies is not to prove that Hattie is wrong but to indicate
that drawing policy conclusions about the unimportance of class size would be premature and
possibly very damaging to the education of children, particularly young children and lower
ability children. A much wider and in-depth debate is needed.
Performance pay
Hattie's conclusions about the importance of what teachers do have led some to advocate
performance pay (sometimes, in the past, called merit pay or payment by results). There
have been many attempts at instituting this, particularly in the USA. The judgment of a group
of researchers some years ago still stands: “The promise of merit pay is dimmed by
knowledge of its history; most attempts to implement merit pay for public school teachers
over the past twenty-five years have failed” (Murnane & Cohen, 1986).
The idea has been mooted in New Zealand. In 1985-86 a parliamentary select committee
produced the excellent Report on the Enquiry into the Quality of Teaching (The Scott Report)
(Education and Science Select Committee, 1986). Among the five members of this committee
was Ruth Richardson, who was campaigning for a voucher system of education. She would be
a “dry” Minister of Finance in the National Government after 1990 and was certainly no
bleeding-heart liberal or lackey of the teacher unions. As was to be expected from its
composition, the committee produced a hard-hitting report which argued that measures of
teacher performance were urgently needed but acknowledged that the process of developing
such measures “will be lengthy and complex”, and it advocated the setting up of a research unit
based at a university to try to develop sound measures. No such group has ever been set up
and no such measures have been developed for New Zealand schools. This might suggest that
rushing into a scheme in the 21st Century would not be a smart idea, particularly as the public
are rightly shocked at seeing huge (performance!) payouts to managers whose enterprises
have failed.
In the USA in particular there have been many more attempts to institute performance pay
over the past 25 years and there are varying reports of their successes and failures. However,
we have seen no evidence at all to support the claim that performance pay improves teaching
or learning, and there is nothing in Hattie's massive research which even remotely suggests
that it does. On the contrary, much of what he says suggests the very opposite. He says, for
example,
School leaders and teachers need to create school, staffroom, and classroom
environments where error is welcome as a learning opportunity, where discarding
incorrect knowledge and understandings is welcomed, and where participants can feel
safe to learn, re-learn, and explore knowledge and understanding. (p. 239)
He goes on to add that what is needed for school improvement is “a caring, supportive staff
room, a tolerance for errors, and for learning from other teachers, a peer culture among
teachers of engagement, trust, shared passion, and so on” (p. 240). Such a co-operative,
trusting, and self-critical school atmosphere is the very kind of atmosphere which regimes of
performance pay destroy.
SIGNIFICANCE FOR POLICY AND PRACTICE
Teachers must learn to take account of research findings even when (particularly when) they
go against long-held beliefs. Hattie draws attention to a situation (p. 258) where teachers
ignored evidence in favour of their own deeply held beliefs. Teaching will never make
progress as a profession while this unwillingness persists.
However, the following comment of the late Roy Nash (whose contribution to debates on
these topics was unequalled and is deeply missed) is apposite.
There is something quite dangerous about the use of quantitative research for
propaganda purposes. It is likely that not one sociologist of education in ten is
competent to critique statistical methods in their own terms, and it is unlikely that the
proportion of teachers so equipped is any greater. (Nash, 2004, p. 49)
Policy makers must learn that research data cannot be automatically applied to practice.
Knowing, for example, that teachers should establish good feedback arrangements with pupils
does not tell any teacher what she is to do. Research knowledge has to be synthesised and
integrated with the teacher's beliefs, values and experience. Hattie fully acknowledges this,
following Dewey in holding that “evidence does not supply us with rules for action but only
with hypotheses for intelligent problem solving, and for making inquiries about our ends in
education” (p. 247). There is also an irreducible value component to every teaching decision:
is the benefit of X sufficient to justify the cost (in terms of money and energy) of instituting
it? The presumed benefit must also be weighed against possible or proven harm: for example,
attainment on a test might be improved a little by methods which inhibit the creativity of
students or damage their ability to relate to others.
Teacher educators must resist the temptation to simplify research evidence for students under
facile claims that “research has shown…”. Unfortunately, the kind of conclusions presented
in this book readily lend themselves to such treatment even though Hattie explicitly warns
against using his material in this way.
CONCLUSION
In conclusion, we want to repeat our belief that John Hattie's book makes a significant
contribution to understanding the variables surrounding successful teaching and think that it
is a very useful resource for teacher education. We are concerned, however, that:
(i) despite his own frequent warnings, politicians may use his work to justify policies which
he does not endorse and his research does not sanction;
(ii) teachers and teacher educators might try to use the findings in a simplistic way and not,
as Hattie wants, as a source for “hypotheses for intelligent problem solving”;
(iii) the quantitative research on school effects might be presented in isolation from the
historical, cultural and social contexts, and their interaction with home and community
backgrounds; and
(iv) there may be insufficient discussion about the aims of education and the purposes of
schooling without which the studies have little point.
It is important that students preparing for teaching learn about the research process and how
easily it leads to error rather than truth. They need to respect research but be acutely aware of
its limitations. The research that they need to know about goes beyond what happens in
schools and classrooms. As this review has shown, what students bring from their social class,
family, culture, home background and prior experiences is more important than what happens
in the school, even though what happens in the school (particularly what teachers are and do)
is very important. The secret of school improvement lies in the recognition of these factors
and their integration into a social, economic and educational programme.
REFERENCES
Blatchford, P. (2003). The class size debate: Is small better? Maidenhead, UK: Open
University Press.
Coe, R., & Rowe, K. (2004). What is an effect size? Camberwell, VIC.: Australian Council
for Educational Research.
Education and Science Select Committee (1986). Report on the enquiry into the quality of
Teaching (The Scott Report). Wellington: Government Printer.
Evidence Informed Policy Network (undated). Policy synthesis. Retrieved 13 February
2009 from http://www.evipnet.org/php/level.php?lang=en&component=101&item=2
Finn, J.D. & Achilles, C.M. (1990). Answers and questions about class size: A statewide
experiment. American Educational Research Journal, 27(3), 557-577.
Gray, J., Jesson, D., & Jones, B. (1986). Towards a framework for interpreting examination
results. In R. Rodgers (Ed.), Education and social class (pp. 51-57). London: Falmer
Press.
Harker, R. (1995). Further comment on “So Schools Matter?”. New Zealand Journal of
Educational Studies, 30(1), 73-76.
Harker, R. (1996). On “First year university performance as a function of type of secondary
school attended and gender”. New Zealand Journal of Educational Studies, 32(2), 197-198.
Hattie, J. (2003, October). Teachers make a difference: What is the research evidence? Paper
presented to the Australian Council for Educational Research annual conference: Building
teacher quality. Retrieved 13 February 2009 from
http://www.leadspace.govt.nz/leadership/articles/teachers-make-a-difference.php
Hattie, J. (2008). Visible learning: A synthesis of over 800 meta-analyses relating to
achievement. London: Routledge.
Ministry of Education (undated). Picking up the Pace. Retrieved 16 February 2009 from
http://www.minedu.govt.nz/educationSectors/PasifikaEducation/ResearchAndStatistics/PickingUpThePace.aspx
Murnane, R., & Cohen, D. (1986). Merit pay and the evaluation problem: Why most merit pay
plans fail and a few survive. Harvard Educational Review, 56(1), 1-17.
Murnane, R. & Levy, F. (1996). Evidence from fifteen schools in Austin, Texas. In G.
Burtless (Ed.). Does money matter? The effect of school resources on student achievement
and adult success (pp. 93-96). Washington: Brookings Institution Press.
Nash, R. (2004). Teacher effects and the explanation of social disparities. New Zealand
Journal of Teachers' Work, 1(1), 42-50.
OECD (2005). Teachers matter: Attracting, developing and retaining effective teachers.
Overview. Paris: OECD.
Retrieved 13 February 2009 from http://www.oecd.org/dataoecd/39/47/34990905.pdf
Schagen, I. & Hodgen, E. (2009). How much difference does it make? Notes on
understanding, using and calculating effects sizes for schools. Retrieved 30 March 2009
from http://www.educationcounts.govt.nz/publications/schooling/36097/36098
Smith, M. L., & Glass, G. V. (1980). Meta-analyses of research on class size and its relationship
to attitudes and instruction. American Educational Research Journal, 17, 419-433.
Stewart, K. (2001). Elated but not sated. Education Gazette, 80 (22). Retrieved 13 February
2009 from http://www.fulbright.org.nz/fulbrighthays-2003/projects/resources.html