How to Engage in Pseudoscience With Real Data: A Criticism of John Hattie’s Arguments in Visible Learning From the Perspective of a Statistician

Abstract

This paper presents a critical analysis, from the point of view of a statistician, of the methodology used by Hattie in Visible Learning, and explains why it must absolutely be called pseudoscience. We first discuss what appear to be the intentions behind Hattie’s approach. Then we describe the major mistakes in Visible Learning, before reviewing the set of questions a researcher should ask when investigating studies and surveys based on data analyses, including meta-analyses. We give concrete examples explaining why Cohen’s d (the measure of effect size used in Visible Learning) simply cannot be used as some sort of universal measure of impact. Finally, we propose solutions to better understand and implement studies and meta-analyses in education.
Cite this article:
Bergeron, Pierre-Jérôme, & Rivard, Lysanne (trans.). "How to Engage in Pseudoscience With Real Data: A Criticism of John Hattie’s Arguments in Visible Learning From the Perspective of a Statistician." McGill Journal of Education 52(1) (2017): 237–246. DOI: 10.7202/1040816ar
HOW TO ENGAGE IN PSEUDOSCIENCE WITH
REAL DATA: A CRITICISM OF JOHN HATTIE’S
ARGUMENTS IN VISIBLE LEARNING FROM THE
PERSPECTIVE OF A STATISTICIAN
PIERRE-JÉRÔME BERGERON University of Ottawa
LYSANNE RIVARD (trans.) McGill University, Centre de recherche du Centre hospitalier
de l’Université de Montréal
ABSTRACT. This paper is a forum contribution that appeared in issue 51-2 in
French. Due to the “positive buzz” it garnered following its publication, the MJE
editorial team has made its translation available to our English readers. The
original version can be accessed here: http://mje.mcgill.ca/article/view/9394.
HOW TO ENGAGE IN PSEUDOSCIENCE WITH REAL DATA: A CRITIQUE OF JOHN HATTIE’S STATISTICAL ARGUMENTS IN VISIBLE LEARNING BY A STATISTICIAN
ABSTRACT. This article appeared in French in issue 51-2. Due to the very favourable response from readers upon its publication, the MJE editorial team is making an English version available to anglophone readers. The original article is available at: http://mje.mcgill.ca/article/view/9394.
The work of John Hattie on education contains, seemingly, the most com-
prehensive synthesis of existing research in the field. Many consider his book,
Visible Learning (Hattie, 2008), to be a Bible or a Holy Grail: “When this work
was published, certain commentators described it as the Holy Grail of educa-
tion, which is without a doubt not too much of a hyperbole” (Baillargeon,
2014, para. 13).
For those who are unaccustomed to dissecting numbers, such a synthesis does
seem to represent a colossal and meticulous task, which in turn gives the
impression of scientific validity. For a statistician familiar with the scientific
method, from the elaboration of research questions to the interpretation of
analyses, appearances, however, are not sufficient. According to the legend,
the Holy Grail is kept in the elusive castle of the Fisher King. When taking
the necessary in-depth look at Visible Learning with the eye of an expert, we
find not a mighty castle but a fragile house of cards that quickly falls apart.
This article offers a critical analysis of the methodology used by Hattie from
the point of view of a statistician. We can spin stories from real data in an
effort to communicate results to a wider audience, but these stories should
not fall into the realm of fiction. We must therefore absolutely qualify Hattie’s
methodology as pseudoscience. The researcher from New Zealand obviously has
laudable intentions, which we describe first and foremost. Good intentions,
nevertheless, do not prevent major errors in Visible Learning — errors which we
will discuss afterwards. The analysis process then leads to a list of questions
researchers should ask themselves when examining studies and enquiries based
on data analyses, including meta-analyses. Afterwards, in an effort to better understand, we give concrete examples that demonstrate how Cohen’s d (Hattie’s measure of effect size) simply cannot be used as a universal measure of impact. Finally, to ensure that our quest does not remain unfinished, we offer avenues for solutions with the objective of demystifying and encouraging the correct use of statistics in the field of education.
JOHN HATTIE’S INTENTIONS
The basic idea behind Hattie’s research, that is, to identify “what works best
in education” using scientific data, is not bad in and of itself. The desire for
rigor and concrete data is essential in order to describe the impact of mea-
sures on teaching and learning. Hattie draws from meta-analyses, which are
relatively complex statistical methods frequently used in, among many other
fields, medical and health research. The size of his synthesis appears impres-
sive: over 800 meta-analyses, comprising over 50,000 studies and millions of
individuals. Starting with over 135 effect sizes, it seems capable of measuring
a wide array of interventions with the potential to improve learning. Hattie is
not afraid of numbers, which is apparently not that common among researchers
in the field of education; this therefore gives the appearance of scientific rigor
to his work. Consequently, for a statistician, this seems like a very good start.
Unfortunately, in reading Visible Learning and subsequent work by Hattie
and his team, anybody who is knowledgeable in statistical analysis is quickly
disillusioned. Why? Because data cannot be collected in any which way nor
analyzed or interpreted in any which way either. Yet, this summarizes the New
Zealander’s actual methodology. To believe Hattie is to have a blind spot in
one’s critical thinking when assessing scientific rigor. To promote his work is
to unfortunately fall into the promotion of pseudoscience. Finally, to persist
in defending Hattie after becoming aware of the serious critique of his meth-
odology constitutes willful blindness.
METHODOLOGICAL ERRORS
Fundamentally, Hattie’s method is not statistically sophisticated and can be
summarized as calculating averages and standard deviations, the latter of which
he does not really use. He uses bar graphs (not histograms) and is capable of using a formula that converts a correlation into Cohen’s d (which can be found in Borenstein, Hedges, Higgins, & Rothstein, 2009), without understanding the prerequisites for this type of conversion to be valid. He is guilty of
many errors, but his main errors correspond to two of the three major errors
in science cited by Allison, Brown, George, and Kaiser (2016) in Nature:
1. Miscalculation in meta-analyses
2. Inappropriate baseline comparisons
His most blatant calculation error is the case of the common language effect size (CLE), which takes the form of a probability. Noticed in 2012 by a Norwegian researcher (Topphol, 2012), the error is flagrant to the point of producing negative probabilities or probabilities greater than 100%. Hattie only had to put together a small table (see Table 1) to help the reader (and himself) see the relation between the effect size and the CLE.
TABLE 1. Correspondence between selected values of Cohen’s d and CLE equivalents

d      0.00   0.20   0.40   0.60   0.80   1.00   1.20   1.40   2.00   3.00
CLE     50%    56%    61%    66%    71%    76%    80%    84%    92%    98%
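Under the usual definition for two normal populations with equal variance, the CLE corresponding to an effect size d is Φ(d/√2): the probability that a randomly chosen member of the first group outscores a randomly chosen member of the second. Being a probability, it can never be negative nor exceed 100%. A minimal sketch in Python, assuming this standard definition, reproduces Table 1:

    from math import erf, sqrt

    def cle(d):
        # Common language effect size for two equal-variance normal groups
        # separated by d standard deviations: Phi(d / sqrt(2)).
        z = d / sqrt(2)
        return 0.5 * (1.0 + erf(z / sqrt(2)))  # standard normal CDF via the error function

    for d in (0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 2.0, 3.0):
        print(f"d = {d:.2f}  ->  CLE = {cle(d):.0%}")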
To not notice the presence of negative probabilities is an enormous blunder
to anyone who has taken at least one statistics course in their lives. Yet, this
oversight is but the symptom of a total lack of scientific rigor, and the lesser
of reasoning errors in Visible Learning. If Hattie had taken the trouble to
consult with an experienced statistician, he would not have committed such a
huge mistake. According to R. A. Fisher: “To consult the statistician after an
experiment is finished is often merely to ask him to conduct a post mortem
examination. He can perhaps say what the experiment died of” (Allison et
al., 2016, p. 28). The other calculation errors are not so much numerical as
they are related to inappropriate baseline comparisons and to the absence of
methodological rigor. Hattie believes that we can compare effect sizes because Cohen’s d is a measure without a unit, and he gives examples of such calculations: some compare two independent groups, while others compare the same group’s grades before and after an intervention. These two types of effects are not equivalent and cannot be directly compared.
We will come back to this later on. A statistician would already be asking many
questions and would have an enormous doubt towards the entire methodology
in Visible Learning and its derivatives.
QUESTIONS TO ASK
If there is a moral to the story of Perceval, or the Story of the Grail, Chrétien
de Troyes’ unfinished novel, it is that we must not hesitate to ask questions.
When confronted with any set of data, we must always know what is the main
question to which we are seeking an answer. Relatedly, we must know which
variables were measured and the way in which they were measured. What is
the target population? How was the sample collected? With comparison groups,
and especially when measuring an intervention, we must ask how individuals
were allocated to different groups. If individuals were not randomly assigned
to different groups, observed differences can result from the nature of the
groups rather than the treatment or the intervention. Another important ques-
tion is at which level were the variables measured (individual, group, school,
provincial, national)? These questions enable us to understand what a study
or a meta-analysis actually measured and in which context. Without knowing
the exact context, it is easy to misinterpret results and these misinterpreta-
tions can sometimes have significant consequences. The disaster of the space
shuttle Challenger is one example: the data selected to authorize the launch
indicated an absence of a relation between temperature and the risk of an ac-
cident because cases without any incidents had been excluded from the data
set (Kenett & Thyregod, 2006).
Hattie talks about success in learning, but within his meta-analyses, how do we
measure success? An effect on grades is not the same as an effect on graduation
rates. An effect on the perception of learning or on self-esteem is not neces-
sarily linked to “academic success” and so on. A study with a short timespan
will not measure the same thing as a study spanning a year or longer. And,
of course, we cannot automatically extend observations based on elementary
school students to secondary school or university students. The same applies to
the way we group different factors under a category without defining inclusion
and exclusion criteria. For example, the gender effect reported by Hattie is, in
fact, a mean of differences between boys and girls in the set of studies selected,
regardless of the duration, the level, or the populations studied.
A NON-EXISTENT UNIVERSAL MEASURE
Basically, Hattie computes averages that do not make any sense. A classic ex-
ample of this type of average is: if my head is in the oven and my feet are in
the freezer, on average I’m comfortably warm. Another humorous example: the average person has one testicle and one ovary and is thus a hermaphrodite. We wouldn’t say that the person making this kind of statement holds
the Holy Grail of biology research, yet this is exactly what Hattie does when
he aggregates every gender difference under the same effect. This is also true
for his other aggregations, whether they be “major” contribution sources (the
student, the home, the school, the teacher, the programme, or the teaching
method) or “individual” influences, such as the “disease” effect which com-
bines together disparate health problems, including cancer, diabetes, sickle-cell
anemia, and digestive problems. It goes without saying that certain of these
individual influences are much less frequent than others.
The fundamental problem here is that every effect size, despite the absence
of a unit, is a relative measure that provides a comparison to a set, group, or
baseline population, even if it may be implicit. To compare two independent
groups is not the same as comparing grades before and after an intervention
implemented with the same group. Hattie’s comparisons are arbitrary and he
is completely unaware of it. The selection of a baseline comparison defines the
direction (the positive or negative sign) of the effect size. In his “barometer,”
Hattie says that negative effects are reverse effects, which is not necessarily
the case since the comparison is often arbitrary. Would we say that the dif-
ferences in academic success that benefit girls are bad, whereas those that
benefit boys are good?
The effect of class size (under the “significant” bar according to Visible Learning,
which is 0.4) is positive and we suppose that we are comparing small classes
to larger classes (smaller classes have greater academic success). We could have
compared larger classes to smaller ones, and the effect would have been negative
(larger classes are less successful than smaller ones), and Hattie’s interpretation
(class size does not have a significant impact) would be completely different,
since a negative impact is considered to be harmful.
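A minimal sketch, with invented class-size data, shows that only the choice of baseline changes the sign of the effect, not the underlying comparison:

    import numpy as np

    rng = np.random.default_rng(1)
    small = rng.normal(77, 5, 200)   # hypothetical test scores in smaller classes
    large = rng.normal(75, 5, 200)   # hypothetical test scores in larger classes

    def cohens_d(x, y):
        # mean difference divided by the pooled standard deviation
        sp = np.sqrt((x.var(ddof=1) + y.var(ddof=1)) / 2)
        return (x.mean() - y.mean()) / sp

    print(cohens_d(small, large))   # positive: small classes taken as the "treatment"
    print(cohens_d(large, small))   # same magnitude, negative: only the baseline changed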
The same is true for socio-economic status. The effect size is large (0.59),
but since Hattie cannot change the socio-economic status of students, he
cares little about it. The implicit comparison is that wealthier students are
more successful than poorer students. As such, the baseline comparison is
comprised of poorer students. We could just as well compare poorer students
to wealthier students and, because the disadvantaged are less successful, the
socio-economic effect would be -0.59, the most negative of all, if nothing else
is changed. Subsequently, it becomes of interest to study how an education
system can help mitigate the effect of social inequalities, perhaps by drawing
from examples from Finland where this approach seems successful, according
to their PISA tests results (Reinikainen, 2012).
The other arbitrary decision is the creation of aggregates in order to calculate
average effects. Here, in addition to mixing multiple and incompatible dimen-
sions, Hattie confounds two distinct populations: 1) factors that influence
academic success and 2) studies conducted on these factors. As an analogy,
we could enumerate everything sold in a grocery store according to price
and conclude that seafood products have the greatest impact on one’s overall
grocery bill because the price of caviar is exorbitant. Obviously, since the average consumer rarely, if ever, purchases caviar, a weighted approach to prices is needed in order to better reflect the actual products and the quantities purchased by the average consumer. Now, let’s go back to the example of gender
and academic success. According to Hattie, the gender impact effect is 0.12
and therefore in favour of boys. If this number was representative of any sort
of reality, this would mean that boys are on average a little more successful in
school than girls. This is not the case in Quebec nor in most industrialized
countries (Legewie & DiPrete, 2012).
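The weighting issue in the grocery analogy can be made concrete with a minimal sketch (the items, prices, and purchase frequencies below are invented purely for illustration): an unweighted average of prices is dominated by caviar, while an average weighted by how often items are actually bought reflects the real grocery bill.

    # Hypothetical prices (in dollars) and weekly purchase counts, for illustration only.
    items = {
        "caviar": (120.0, 0.01),
        "bread":  (3.0,   2.0),
        "milk":   (2.5,   3.0),
        "apples": (4.0,   1.5),
    }

    unweighted = sum(price for price, _ in items.values()) / len(items)

    total_spent  = sum(price * bought for price, bought in items.values())
    total_bought = sum(bought for _, bought in items.values())
    weighted = total_spent / total_bought   # average price per item actually purchased

    print(f"unweighted average price:  ${unweighted:.2f}")   # about $32, driven by caviar
    print(f"purchase-weighted average: ${weighted:.2f}")     # about $3, the realistic figure

Averaging effect sizes over whatever studies happen to exist, regardless of how representative they are of the factor being measured, commits the same mistake as the unweighted average above.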
Hattie’s interpretation of effects is therefore not in the least objective. As men-
tioned earlier, according to his quadrant, effects below zero are bad. Between
0 and 0.4 we go from “developmental” effects to “teacher” effects. Above 0.4
represents the desired effect zone. There is no justification for this classifica-
tion. First of all, there is no reference point on a universal baseline to center
his null effect and to talk about development. Can a person who is alone and
without instruction learn by him/herself in a way that is measurable? If the
effects due to teachers fall between 0.15 and 0.4, why is the impact of teachers’
knowledge of subject matter only at 0.09? Can we say that someone unlearns
when the effect is negative? Does this mean that a person without sickle cell
disease or who is born full-term has inherent knowledge since Hattie decided
to put a positive effect on the absence of disease?
Finally, Hattie confounds correlation and causality when seeking to reduce
everything to an effect size. Depending on the context, and on a case by case
basis, it can be possible to go from a correlation to Cohen’s d (Borenstein et al., 2009) using

d = 2r / √(1 − r²),

but we absolutely need to know in which mathematical space the data are located in order to go from one scale to another. This formula is extremely hazardous to use since it quickly explodes as correlations approach 1
and it also gives relatively strong effects for weak correlations. A correlation
of .196 is sufficient to reach the zone of desired effect in Visible Learning. In
a simple linear regression model, this translates to 3.85% of the variability explained by the model, leaving 96.15% as unexplained random noise, and therefore a very weak impact in reality. It is with this formula that Hattie obtains,
among others, his effect of creativity on academic success (Kim, 2005), which
is in fact a correlation between IQ test results and creativity tests. It is also
with correlations that he obtains the so-called effect of self-reported grades, the
strongest effect in the original version of Visible Learning. However, this turns
out to be a set of correlations between reported grades and actual grades, a set
which does not measure whatsoever the increase of academic success between
groups who use self-reported grades and groups who do not conduct this type
of self-examination.
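A minimal Python sketch of this conversion, using the .196 example above:

    import math

    def r_to_d(r):
        # Conversion from a correlation r to Cohen's d (Borenstein et al., 2009).
        return 2.0 * r / math.sqrt(1.0 - r**2)

    r = 0.196
    print(f"d for r = {r}:  {r_to_d(r):.2f}")      # ~0.40, already in the zone of desired effects
    print(f"variance explained: {r**2:.1%}")        # under 4%: a very weak relationship
    print(f"d for r = 0.99:  {r_to_d(0.99):.1f}")   # the conversion explodes as r approaches 1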
EXAMPLE: THREE WAYS TO CALCULATE AN EFFECT SIZE
There are multiple valid ways to analyze a given data set; each of these methods
will illustrate a different aspect of the problem under study. For this reason,
one must absolutely ensure that they are using the right scale and the same
perspective when performing meta-analyses or computing effect size averages.
We can consider the following example: four independent groups with identi-
cal normal distributions (with, for example, an average of 75 and a standard
deviation of 5). The four groups are taught initially with the “standard” teach-
ing method. For the next teaching module, each group is randomly assigned
to one of three new teaching methods, labelled 1, 2, and 3, while one group
continues with the standard method, labelled method 0. At the end of the
module, the four groups pass an identical test and the results are compared
to measure an effect size. Let’s suppose that the increase in grades follows a normal distribution and that method i increases individual grades by i points on average, with a standard deviation of i points. The grades of the control group do not change (in effect, an increase of 0 with a standard deviation of 0).
Like Hattie, we use three effect-size formulas to rank the teaching methods in order to identify the “best” one. To start, we can compare the experimental groups
to the control group (a). Then, we can look at the before and after grades of
each group (b), and finally, we can use a correlation between the before and
after grades of each group and convert into Cohen’s d (c). The effect sizes
are in Table 2.
TABLE 2. Comparison of the different methods used to calculate effect size

Group      (a)     (b)     (c)
Control    0.00    N/A     Infinity
Method 1   0.14    1.00    10.00
Method 2   0.27    1.00    5.00
Method 3   0.39    1.00    3.33
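One way to reproduce the values in Table 2 exactly is to assume that the grade increases are independent of the pre-test scores, and that formula (a) standardizes the mean difference by the standard deviation of the difference between one treated individual and one control individual, √(σ₀² + σᵢ²). These assumptions are a reconstruction for illustration; a minimal Python sketch under them:

    import math

    sigma0 = 5.0                     # pre-test standard deviation, identical in all four groups

    for i in (1, 2, 3):              # method i: mean gain of i points, sd of the gain = i points
        var_post = sigma0**2 + i**2  # gains assumed independent of the pre-test scores

        # (a) treated vs. control on the post-test, standardized by the sd of the
        #     difference between two independent individuals
        d_a = i / math.sqrt(sigma0**2 + var_post)

        # (b) mean gain within the group divided by the sd of the gain
        d_b = i / i

        # (c) correlation between pre- and post-test scores, converted to d
        r = sigma0 / math.sqrt(var_post)
        d_c = 2 * r / math.sqrt(1 - r**2)

        print(f"Method {i}:  (a) = {d_a:.2f}   (b) = {d_b:.2f}   (c) = {d_c:.2f}")

    # Method 1:  (a) = 0.14   (b) = 1.00   (c) = 10.00
    # Method 2:  (a) = 0.27   (b) = 1.00   (c) = 5.00
    # Method 3:  (a) = 0.39   (b) = 1.00   (c) = 3.33
    # For the control group, the pre/post correlation is exactly 1, so conversion (c) diverges.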
According to the effect sizes measured by formula (a), method 3 is the best
one and the only one that almost falls into the desired effect zone. Formula
(b) leads us to believe that the three methods are equivalent (even if in fact,
the real effect varies from one method to another), but all are very high in
the desired effect zone. Finally, according to formula (c), the standard method
is infinitely better than the others, and the order is completely reversed in
comparison to formula (a). What is going on?
Formula (a) compares independent groups between themselves and subse-
quently includes noise due to group variability. We are trying to distinguish
between the heavily overlapping four normal curves illustrated in Figure 1.
Luckily, the groups were identical before the intervention and the teaching
methods were randomly assigned. Thus, the measured effects are those of the
teaching methods.
FIGURE 1. Grade distribution according to group
Since formula (b) measures grade increases within each group, we compare
each group to itself, which in turn makes the source of noise disappear (the
difference between groups). The measured effect is more “pure” but we lose the
capacity to compare groups between themselves since the standard deviation
changes from one group to another. By dividing the average increase by the
standard deviation, we lose a dimension. Normal distribution curves of grade
changes are represented in Figure 2. Although these curves are very different,
the measured effects are identical.
FIGURE 2. Distribution of differences between before and after grades according to group
Finally, as it is the case for many effects based on correlations, formula (c) does
not measure increases in grades (the effect of the teaching method) but measures
the noise surrounding this change. As the standard deviation of the increase
in grades grows, the correlation weakens. Subsequently, the conversion into d
results in a weaker effect for larger standard deviations (but enormous effects
in comparison to formulas (a) and (b)).
SOLUTION: CONSULT WITH A STATISTICIAN
The examples above describe but a small fraction of the fundamental reasoning
errors found in Visible Learning. We could spend ages dissecting each meta-
analysis, evaluating to which extent there are calculation and interpretation
errors, and describing the actual limits of the original analyses. There is also
a lack of space to explain the complexity and subtleties of proper modeling
of intervention effects calculated from different observational or experimental
studies, including questions on dose-effect relationships, geographic locations,
and time. All of this is completely lost when one decides to reduce everything
to one single number, because it is insufficient to represent reality.
In summary, it is clear that John Hattie and his team have neither the knowl-
edge nor the competencies required to conduct valid statistical analyses. No
one should replicate this methodology because we must never accept pseudo-
science. This is most unfortunate, since it is possible to do real science with
data from hundreds of meta-analyses.
Statistics and modern data science offer an array of rigorous tools that allow for a better understanding of collected data and the extraction of useful and applicable conclusions. It goes without saying that the development of the
education system must be analyzed in a scientific manner, and for this, the
solution remains the same as the one proposed by Fisher many decades ago
(cited in Allison et al., 2016): we must consult with a statistician before data
collection. And during data collection. And after. But above all, at each step of the study. We cannot allow ourselves to simply be impressed by the quantity
of numbers and the sample sizes; we must be concerned with the quality of
the study plan and the validity of collected data.
For this, we must call upon experienced statisticians who will know how to
keep a watchful eye and to think critically. Every self-respecting university
offers a statistics consultation service to support scientific research. It is also
possible to obtain these services from private companies or consultants. There
is no reason why faculties of education should not call upon such services. It
is imperative to do so, because, as we have seen in Indiana Jones and the Last
Crusade, the consequences of choosing the wrong Grail are tragic.
REFERENCES
Allison, D. B., Brown, A. W., George, B. J., & Kaiser, K. A. (2016). Reproducibility: A tragedy of errors. Nature, 530, 27-29.
Baillargeon, N. (2014, 23 February). Visible learning [Blog post]. Retrieved from https://voir.ca/normand-baillargeon/2014/02/23/visible-learning/
Borenstein, M., Hedges, L., Higgins, J., & Rothstein, H. (2009). Introduction to meta-analysis. Hoboken, NJ: John Wiley & Sons.
Hattie, J. (2008). Visible learning: A synthesis of over 800 meta-analyses relating to achievement. London, United Kingdom: Routledge.
Kenett, R. S., & Thyregod, P. (2006). Aspects of statistical consulting not taught by academia. Statistica Neerlandica, 60(3), 396-411.
Kim, K. H. (2005). Can only intelligent people be creative? A meta-analysis. Prufrock Journal, 16(2-3), 57-66.
Legewie, J., & DiPrete, T. A. (2012). School context and the gender gap in educational achievement. American Sociological Review, 77(3), 463-485.
Reinikainen, P. (2012). Amazing PISA results in Finnish comprehensive schools. In H. Niemi, A. Toom, & A. Kallioniemi (Eds.), Miracle of education (pp. 3-18). Rotterdam, Netherlands: Sense.
Topphol, A. K. (2012). Kan vi stole på statistikkbruken i utdanningsforskinga? [Can we rely on the use of statistics in education research?]. Norsk Pedagogisk Tidsskrift, 95(6), 460-471.
PIERRE-JÉRÔME BERGERON is a private statistical consultant. He is also an adjunct profes-
sor at the Department of Mathematics and Statistics at the University of Ottawa and
holds a PhD in Statistics from McGill University. pierrejerome.bergeron@mail.mcgill.ca
LYSANNE RIVARD holds a PhD in Education from McGill University. She has conducted research in a variety of fields including girls’ education, gender and development, physical activity, and youth mental health. She is currently a Planning, Programming and Research Officer for the Youth Mental Health and Technology Lab (CRCHUM) and an Education Specialist for the International Baccalaureate Organization. lysanne.rivard@mail.mcgill.ca
PIERRE-JÉRÔME BERGERON is a private statistical consultant. He is also an adjunct professor in the Department of Mathematics and Statistics at the University of Ottawa and holds a PhD in Statistics from McGill University. pierrejerome.bergeron@mail.mcgill.ca
LYSANNE RIVARD holds a PhD in Education from McGill University. She has conducted research in several fields, including girls’ education, gender and development, physical activity, and youth mental health. She is currently a Planning, Programming and Research Officer for the Youth Mental Health and Technology Lab (CRCHUM) and an Education Specialist for the International Baccalaureate Organization. lysanne.rivard@mail.mcgill.ca