Australia has declared its ambition to be within the ‘top five’ in the Programme for International Student Assessment (PISA) by 2025. So serious is it about this ambition, that the Australian Government has incorporated it into the Australian Education Act, 2013. Given this focus on PISA results and rankings, we go beyond average scores to take a close look at Australia's performance in PISA, examining rankings by different geographical units, by item content and by test completion. Based on this analysis and using data from interviews with measurement and policy experts, we show how uninformative and even misleading the ‘average performance scores’, on which the rankings are based, can be. We explore how a more nuanced understanding would point to quite different policy actions. After considering the PISA data and Australia's ‘top five’ ambition closely, we argue that neither the rankings nor such ambitions should be given much credence.
Discourse: Studies in the Cultural Politics of Education
Leaning too far? PISA, policy and Australia's ‘top
five’ ambitions
Radhika Gorur & Margaret Wu
To cite this article: Radhika Gorur & Margaret Wu (2015) Leaning too far? PISA, policy and
Australia's 'top five' ambitions, Discourse: Studies in the Cultural Politics of Education, 36:5,
647-664, DOI: 10.1080/01596306.2014.930020
Published online: 30 Jun 2014.
The Victoria Institute, Victoria University, Melbourne, VIC, Australia
Keywords: PISA; education policy; Australian education reforms; objectivity
What’s the good of [the rankings]? What is the benefit to the US to be told that it is number
seven or number 10? It’s useless, meaningless, except for a media beat up and political
huffing and puffing. It’s very important for the US to know, having defined certain goals like
improving participation rates for impoverished students from suburbs in large cities –
whether in fact that is happening, and if it is, why it is happening and if not, why not. And it
is irrelevant whether Chile or Russia or France is doing better or worse – that doesn’t help
one bit – in fact it probably hinders. Makes people feel uncertain, unsure, nervous, and they
rush over there and find out why they are doing better. (Malcolm Skilbeck, former Deputy
Director of Education, OECD; interview transcript)
In August 2012, Julia Gillard, then Prime Minister of Australia, declared that Australia would
strive to be ranked in the ‘top five’ in international education assessments by 2025. This
generated a great deal of media attention. Soon after this declaration, the results of two
international assessments – Trends in International Mathematics and Science Study (TIMSS) and
Progress in International Reading Literacy Study (PIRLS), both conducted by the International
Association for the Evaluation of Educational Achievement (IEA) – were released and
Australia had ranked rather low in both these assessments: 18th and 25th in TIMSS
mathematics and science, respectively, and 27th on PIRLS, for fourth-grade students
(Thomson et al., 2012). The TIMSS and PIRLS rankings heightened the anxiety that
Australian politicians had already been expressing with regard to the ‘slide’ in the country’s
performance in another international assessment, the Programme for International Student
Assessment (PISA), conducted by the Organisation for Economic Co-operation and
Development (OECD), and it reinforced the government’s determination to get into the
‘top five’ in international rankings. So strong is this ambition that it has been inscribed into
the Australian Education Act of 2013 as its very first objective, which reads: ‘Australia to be
placed, by 2025, in the top 5 highest performing countries based on the performance of
school students in reading, mathematics and science’ (Australian Education Act, 2013) – as
measured in PISA. This objective of being placed in the ‘top five’ has led to an intensification
of Australia’s desire to learn from the systems that are currently in the ‘top five’ in PISA, so
that Australia may displace one of them on the PISA league table.
PISA rankings are based on the average performance scores of students in tests of
reading, mathematical and scientific literacy. Average performance score, however, is
only one of many possible measures on which education systems can be ranked on the
basis of PISA data. Just as the ranking of countries depends on what is tested, who is
tested and which tests are used, so too does it depend on what kinds of analyses are
performed. More relevantly for this discussion, the average score rankings obscure a great
deal of variation and are not particularly useful for developing strategies to improve the
So, given Australia’s ambition to be in the top five by 2025, this paper looks beyond
‘average performance’ and interrogates PISA data in some detail, examining rankings by
different geographical units, by item content and by test completion. Based on this
analysis, and supported by interviews with expert informants – policy-makers, OECD
officials and measurement experts – we argue that PISA data are quite complex and need
to be examined very closely and understood with great nuance: aggregations such as
average scores hide more than they reveal. Using examples, we make the case that, in part
because of the complexity of international educational assessments and comparisons, the
leap from ‘data’ to ‘policy’ is a treacherous one.
We begin with a brief overview of Australia’s engagement with PISA and sketch the
recent history that led up to the declaration of Australia’s ‘top five in PISA’ ambition.
Next, we present a brief survey of the critique of PISA and explain our methodological
and analytical approach. This is followed by a detailed analysis of Australia’s
performance, where three aspects of performance are examined – unit of analysis, item
content and test completion – to demonstrate how average scores could be quite
misleading, particularly if used as the basis for policy decisions. Finally, we explain why
the leap from PISA average score data to policy is problematic.
Australia, PISA and the ‘top five’ ambition
PISA was conceptualised in the late 1990s, and the first PISA survey was conducted in
2000. Australia has been actively involved with PISA from its very inception. An Australian
organisation, the Australian Council for Educational Research (ACER), led the consortium
that successfully bid for and later developed and managed PISA until the 2012 survey.
An Australian professor, Barry McGaw, was at the helm, as Director for Education at the
OECD, when PISA was introduced. Australian psychometricians, statisticians, analysts and
academics actively use PISA data to produce various reports and working papers for the
OECD, ACER and the federal and state governments in Australia.
Australia has been ranked ‘high quality’ (i.e., having performance scores above the
average for OECD countries) consistently in each PISA survey so far. But as more
nations and systems joined the survey (43 in 2000, 58 in 2006 and 65 in 2012),
Australia’s rankings have ‘slipped’ – as elaborated in Table 1 below. Australia’s scores
have also declined between 2000 and 2009 (see Table 2). The OECD also examines the
equity of school systems, based on the correlation between the performance of students
and their socio-economic status (SES). Systems where this correlation is higher than the
average for OECD countries are labelled ‘low equity’. Australia was rated ‘low equity’ in
PISA 2000, but it recovered from this position and has been consistently placed in the
‘high quality, high equity’ quadrant in all the subsequent PISA surveys (for a detailed
account of how equity is measured in PISA, see Gorur, 2014; Rutkowski &
Rutkowski, 2013).
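How far the standard errors in Table 2 support a reading of 'decline' can be checked with a rough calculation. The following is a minimal sketch under stated assumptions, not the OECD's trend procedure: it treats the 2000 and 2009 samples as independent and leaves out the linking error that official PISA trend comparisons add, so it understates the uncertainty. The function name `change_and_z` is hypothetical.

```python
import math

# Australia's mean scores and standard errors, copied from Table 2: (mean, SE).
scores = {
    "reading":     {2000: (528, 3.5), 2009: (515, 2.3)},
    "mathematics": {2000: (533, 3.5), 2009: (514, 2.5)},
    "science":     {2000: (528, 3.5), 2009: (527, 2.5)},
}


def change_and_z(first, last):
    """Return (change in mean, z statistic) for two cycles.

    Assumption: the two samples are independent and no linking
    error is added, so the z statistic is only a rough guide.
    """
    (mean_first, se_first), (mean_last, se_last) = first, last
    diff = mean_last - mean_first
    se_diff = math.sqrt(se_first ** 2 + se_last ** 2)
    return diff, diff / se_diff


results = {domain: change_and_z(cycles[2000], cycles[2009])
           for domain, cycles in scores.items()}

for domain, (diff, z) in results.items():
    flag = "beyond" if abs(z) > 1.96 else "within"
    print(f"{domain:12s} change {diff:+4d}  z = {z:+.2f}  ({flag} +/-1.96)")
```

On these figures the changes in reading and mathematics fall outside ±1.96, while the one-point change in science does not, which suggests that the 'declining scores' story is not uniform across domains.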
Australia’s performance in 2011 in another major international test, IEA’s TIMSS,
also had not improved compared to the 2007 results (Thomson et al., 2012). Coinciding
with the 2011 TIMSS, Australia participated for the first time in PIRLS, which tests
students’ reading literacy in Grade 4 (Year 4 in Australia). Australia was placed 27 out of
45 systems in PIRLS with performance significantly lower than Ireland and Northern
Ireland, the USA, England, Canada, Hong Kong, Singapore and Chinese Taipei
(Thomson et al., 2012). Australia’s poor performance in the 2011 TIMSS and PIRLS
has reinforced the alarm over the state of the education system based on its PISA results.
Australia’s ‘declining performance’ has been taken up by Australian politicians and
policy-makers and reported widely in the media.
In response to this challenge of declining scores, there is, currently, a huge appetite
in Australia for borrowing from the policies and practices of four of the PISA ‘top five’ –
the East Asian systems of Shanghai (China), Korea, Singapore and Hong Kong (China).
The recent focus in Australia on Asia – with the Henry Report on the Asian Century
(Commonwealth of Australia, 2012) – has accentuated this desire to learn from the
‘high-performing’ systems of East Asia.
Table 1. Ranking of Australia in PISA 2000, 2003, 2006 and 2009 on the reading, mathematics
and science literacy scales.
Year    Reading    Mathematics    Science
2000    4          5              7
2003    4          11             6
2006    7          13             8
2009    9          15             10

Table 2. Average scores of Australia in PISA 2000, 2003, 2006 and 2009 on the reading,
mathematics and science literacy scales.
Year    Reading      Mathematics    Science
2000    528 (3.5)    533 (3.5)      528 (3.5)
2003    525 (2.1)    524 (2.1)      525 (2.1)
2006    513 (2.1)    520 (2.2)      527 (2.3)
2009    515 (2.3)    514 (2.5)      527 (2.5)
Note: Standard error in brackets.

Interest in learning from the Asian PISA elite is illustrated by the process used in
developing an influential report that was published in 2012 – the Grattan Institute’s
Catching Up: Learning from the Best School Systems in East Asia (Jensen, Hunter,
Sonnemann, & Burns, 2012). In the section titled ‘How we wrote the report and how to
read it’, the authors say:
In September, 2011, Grattan Institute brought together educators from Australia and four of
the world’s top five school systems: Hong Kong, Shanghai, Korea and Singapore. The
Learning from the Best Roundtable, attended by the Prime Minister, Julia Gillard, and the
Federal Minister for School Education, Early Childhood and Youth, Peter Garrett, sought to
analyse the success of these four systems, and what practical lessons it provided for Australia
and other countries.
Following the Roundtable, researchers from Grattan Institute visited the four education
systems studied in this report. They met educators, government officials, school principals,
teachers and researchers. They collected extensive documentation at central, District and
school levels. Grattan Institute has used this field research and the lessons taken from the
Roundtable to write this report. (2012, p. 6)
While acknowledging that practices cannot simply be plucked from one context and
uncritically adopted in another, Jensen et al. go on, nevertheless, to explain what it is that
these nations are doing that places them at the top of the PISA tables, promising that their
report shows how these practices can be adopted to improve Australia’s performance.
There are several issues with both the substance and the process in this approach of
learning from the best practices. Rushing off to observe the practices of high-performing
systems and then concluding that these practices are the reason for their success can lead
to erroneous conclusions; the same practices could well be prevalent in low-performing
systems as well – that information is not available, since only high-performing nations are
observed. Moreover, there is no way of knowing how much better the scores of these
nations might have been, had they been using other practices. So there is a basic flaw in
the premise upon which this kind of ‘learning from the best’ research rests. Further,
setting up such a forum where ministers from ‘successful’ systems explain how they
achieved their excellent results renders the subsequent field work practically redundant –
the ‘lessons’ are already presented to policy-makers before the field work has begun.
This practice of ‘learning from the best’ is, however, neither new nor specific to
Australia. In 2007, McKinsey had done similar work in their report How the World’s Best-
performing Schools come out on Top (Barber & Mourshed, 2007), based on the studies of
PISA high performers, and America’s Common Core published the report Why We’re
Behind: What Top Nations Teach Their Students But We Don’t (Common Core, 2009)
which examined the policies and practices of countries that performed better than the
USA in order to argue for particular approaches to effect improvement.
Critiquing PISA and its use in policy
The widespread influence of OECD and PISA on education policies has not gone
unnoticed by critics, and there is a large body of literature on the subject. Some of the
critique has contextualised PISA within broader discussions about globalisation and the
spread of neoliberalism, new public management and marketization (for example, Grek,
2007, 2009; Rizvi & Lingard, 2010; Stronach, 2010). In these types of critique, PISA is
often a ‘policy object’ that is a symptom, an example, a consequence or one of the causes
of the coalition of practices which make up the neoliberal imaginary. PISA is also well
covered in the press in many countries, and the effects of media attention on the
discourses and public perceptions of PISA have also been discussed (for example, Whitty,
2009). Studies of the effects of PISA on the education policies and reforms in particular
nations also abound (for example, Breakspear, 2012; Simola, 2005). The focus in these
critiques is often not an engagement with the actual PISA data; more often, it is PISA’s
uptake in policy and the media, its use in justifying policies and its influence on political
narratives and policy practices.
Assessment and measurement experts, on the other hand, focus their critiques on
‘technical’ aspects of PISA – examining the fitness and effects of particular methodological
choices and the validity and reliability of the modeling and the calculations.
Bracey (2008), for example, has argued that PISA’s use of One Dimensional Item
Response Theory limits its analytical possibilities. Bautier and Rayou (2007) demonstrated
that the reasons for students’ correct or wrong answers cannot be predicted by an
a priori analysis of the items, thus calling PISA’s reliability into question. The modeling
that underpins survey items related to ‘interest in science’ has been examined by Ainley
and Ainley (2011) who argue that constructs such as ‘interest’ are premised upon Western
understandings, and conclusions drawn from the responses of students from other
cultures, whose responses are determined by different sets of social and cultural histories,
could be distorted. In such critique, PISA is seen as a technical exercise and the effort is
to assess the extent to which PISA produces accurate representations of realities. The goal
is to encourage PISA to become more precise and accurate in representing an already
existing world. Focusing on measurement as a technical exercise deflects attention from
the performativity and politics of such calculations (Barad, 2003; Desrosières, 1998;
Porter, 1995, 2003; Stengers, 2011).
In this paper, we attempt to promote critique that does not make a cut between the
‘political’ on the one hand and the ‘technical’ on the other. The result of a collaboration
between a statistician and a sociologist of measurement with a common interest in PISA,
our analysis attempts to pick its way carefully, treating PISA neither as purely political,
nor as purely technical, but as a hybrid: a socio-technical object. Our symmetrical
analytical approach and its political impulse are based in the theoretical resources of
actor-network theory (ANT). In particular, we use the approach of a ‘sociology of
measurement’ (Derksen, 2000; Gorur, 2014) – a term abbreviated from Woolgar’s (1991)
use of the term ‘sociology of measurement technologies’ – to draw attention, as Woolgar
did, to the social and instrumental nature of measurement, as well as its productive
capacity. Here instrumentality refers both to the influence of the instruments and
methodologies used in measurement, and to the way in which things are made to work,
through cajoling, persuading, coercing, compromising and so on – what in ANT is called
the translation of interests. The instrumentality of apparently ‘objective’ statistical
practices can be understood by observing the everyday practices of statisticians (see
Gorur, 2011, for an example). ‘Productive capacity’ invokes the idea that measurement is
not merely representative or descriptive but also productive of realities (Latour, 2005). In
other words, unlike ‘technical’ critique, which takes measurement to be representational
or descriptive, our approach sees measurement as world-making. Whilst not ‘critical’ or
‘political’ in the traditional sense, our critique nevertheless seeks to influence policy. The
aim is to persuade by empirical analysis rather than through normative or theoretical
Our argument incorporates statistical analysis using the PISA database, which is
available online, as well as methodologies more usually associated with policy
ethnographies, such as interviews. These interviews occurred over a period of several
years, starting in 2009, and were conducted across several related projects including the
ongoing strand of work of the first author. The interviewees were experts who were often
uniquely placed, by virtue of their specialised expertise and their official positions, to
provide insights about the phenomena being studied that were not available to others. As
such, these interviews were more in the nature of conversational and collegial
opportunities to explore the phenomena under discussion (large-scale comparisons,
PISA, contemporary issues in education policy and so on), than ‘data’ to be analysed by
parsing out themes or for performing discourse analysis. In keeping with the ANT
tradition, there was no effort to overlay the interview data with particular social theories
or to seek to understand what lay ‘behind’ the words of the informants; the purpose of the
interviews was to let the actors narrate their own theories and to get them to explain how
they made sense of their worlds (Latour, 2005). The interviewees’ expertise often resulted
in explanations that were elegant and economical; so where appropriate, their explana-
tions have been presented verbatim in this paper. In some cases, the interview data
provoked our analysis. In others, they helped us interpret the analysis and to link it to the
way it played out in policy.
Beyond average performance
If a nation is looking to introduce reforms to raise its PISA performance, there is good
reason to look beyond the rankings and the average scores, as these provide no guidance
for policy reform. ‘Average performance’ rankings provide little of practical benefit by
way of lessons to learn. They simply point out that there is room for improvement, but
they can provide no pointers about where to focus policy efforts. One OECD official
explained why ‘top five’ is quite a complex idea, and why one needs to ask complex
questions to inform policy:
Top five in what? [F]or which students? The average student in Canada, in Korea,
Finland, Shanghai, China – that’s one thing. If you then look at high-performing students or
how low performing students do, then we may get a completely different picture. And that’s
where policy efforts are most interesting for me.
To raise its average performance, Australia would need to develop a nuanced strategy,
targeting particular aspects of the system or particular groups, and focusing resources and
attention on areas of reform that would have the most gratifying and immediate effects on
its performance scores, particularly since resources are never unlimited. For example, it
could aim at raising the performance of the lowest-performing students, or it could focus
on the top 10% of achievers, or direct greater attention towards specific groups such as
Indigenous students or refugee migrants. Alternatively, it could focus attention on
particular content areas in which its students traditionally perform poorly and try to
improve instruction in those areas. So in this section, we examine PISA data more closely
to see what we can learn. We explore how these understandings might inform policy and
explore the complexities of the data as well as the difficulties of making inferences on
their basis.
Unit of analysis
Initially, PISA was designed for the purpose of measuring the educational performance of
the OECD nations. However, with each subsequent round of PISA, more and more
education systems have shown interest in participating in PISA. In some cases, these
education systems have been associated with cities or provinces rather than whole
countries. China, for example, does not participate as a whole, but Hong Kong and
Shanghai have begun to participate. So PISA’s league tables report on jurisdictions of
various sizes, and comparisons include countries like Australia and provinces like
There are some issues with this kind of comparison. First, demographic characteristics
are often associated with geography, which is also typically tied to SES. The populations
of cities, in general, have some distinct characteristics – urban populations generally fare
better than rural ones in terms of educational performance. In some cases, certain
provinces or cities might have a preponderance of a certain ethnic group or a particular
SES group and this also produces differences in performance. Importantly, governance
structures and the challenges of governing would vary greatly between, for example, a
country like Australia and a city like Shanghai. So the unit of analysis is an important
consideration in making comparisons. But in PISA rankings, differences in size,
demography and governance structures are ignored – all ‘systems’ are ranked as if they
were the same, whether they are cities, provinces, small countries or very large nations.
Where entire countries participate, PISA presents country-level average performance
on its league tables (although it may provide state- or province-level data to the countries
where the sample size is adequate to produce such data). Where only specific provinces
participate, such as Shanghai-China or Hong Kong-China, the province data appear in the
table. But the performance of a country is not usually uniform throughout the country –
the country average often masks wide variations between one state or province and
another within the country. Table 3 shows the variation in Australia’s reading
performance in PISA 2009, by jurisdiction.
As Table 3 illustrates, while Australian Capital Territory (ACT) and Western
Australia (WA) have averages very close to that of some of the ‘top five’ systems,
Tasmania (TAS) and Northern Territory (NT) score below the OECD average. This
variation could be attributed to demographic differences between Australian jurisdictions.
For example, ACT has proportionally more public servants than other states. In contrast,
NT has proportionally more remote schools than other states. There are also differences in
the SES characteristics between states. The PISA results are likely to reflect the
demographic differences between states more than differences between education systems.

Table 3. Australian jurisdiction mean scores in PISA 2009 reading.
State              Mean score    95% confidence interval
ACT                531           520–543
WA                 522           510–534
Queensland         519           505–532
New South Wales    516           505–527
Victoria           513           504–523
South Australia    506           497–516
TAS                483           472–495
NT                 481           469–492
Australia          515           510–519
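The size of the within-country variation can be made concrete with a short calculation on the Table 3 figures. This is a minimal sketch, assuming the published intervals are symmetric normal-theory intervals (mean ± 1.96 standard errors); the helper `implied_se` is a hypothetical name, and the recovered standard errors are approximations limited by the rounding of the published bounds.

```python
# (mean, CI lower bound, CI upper bound), copied from Table 3.
table3 = {
    "ACT": (531, 520, 543),
    "WA": (522, 510, 534),
    "Queensland": (519, 505, 532),
    "New South Wales": (516, 505, 527),
    "Victoria": (513, 504, 523),
    "South Australia": (506, 497, 516),
    "TAS": (483, 472, 495),
    "NT": (481, 469, 492),
    "Australia": (515, 510, 519),
}

Z95 = 1.96  # assumption: intervals were built as mean +/- 1.96 * SE


def implied_se(lower, upper):
    """Approximate standard error recovered from a 95% interval."""
    return (upper - lower) / (2 * Z95)


se = {name: implied_se(lo, hi) for name, (_mean, lo, hi) in table3.items()}

# Spread of jurisdiction means, excluding the national row.
state_means = [mean for name, (mean, _lo, _hi) in table3.items()
               if name != "Australia"]
spread = max(state_means) - min(state_means)

print(f"Spread between jurisdictions: {spread} points")
for name, value in se.items():
    print(f"{name:16s} implied SE = {value:.1f}")
```

The gap between the highest and lowest jurisdiction means is 50 points, which is larger than the 41-point gap between Australia (515) and Shanghai-China (556) reported in Table 4. The implied standard errors for single jurisdictions are also larger than the national one, so comparisons between individual states carry more uncertainty than the national figure does.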
If we compare Australian jurisdictions with jurisdictions such as Shanghai or small
countries such as Singapore, we get some interesting results (see Table 4).
Comparing Australian jurisdiction results with those of the East Asian PISA elite,
ACT is ranked fifth internationally by mean score. So a part of Australia is already in the
PISA ‘top five’! Indeed, ACT’s performance is not statistically significantly different
from second-ranking Korea. WA is ranked eighth internationally by mean score, and it is
also not significantly different from Korea’s performance. On the other hand, TAS and NT
both have mean scores significantly below the OECD mean, with rankings close to those
of Greece and Spain. With such a diverse range of mean scores between Australian
jurisdictions, a focus on ‘Australia’s’ international ranking is not a very useful way to
assess Australian education systems.
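The re-ranking described here can be reproduced from the Table 4 figures. The sketch below is illustrative only: it sorts the listed systems by mean score and then uses overlap of the 95% confidence intervals as a crude stand-in for a significance test. That overlap rule is an assumption made for the example, not the test behind the significance statements in the text, and the helper `overlaps` is a hypothetical name.

```python
# (jurisdiction, mean, CI lower bound, CI upper bound), copied from Table 4.
table4 = [
    ("Shanghai-China", 556, 551, 561),
    ("Korea", 539, 532, 546),
    ("Finland", 536, 531, 540),
    ("Hong Kong-China", 533, 529, 537),
    ("ACT", 531, 520, 543),
    ("Singapore", 526, 524, 528),
    ("Canada", 524, 521, 527),
    ("WA", 522, 510, 534),
    ("New Zealand", 521, 516, 525),
    ("Japan", 520, 513, 527),
    ("Australia", 515, 510, 519),
]

# Rank every listed system by mean score, highest first.
ordered = sorted(table4, key=lambda row: row[1], reverse=True)
rank = {row[0]: position for position, row in enumerate(ordered, start=1)}

intervals = {name: (lo, hi) for name, _mean, lo, hi in table4}


def overlaps(a, b):
    """True if the 95% intervals of systems a and b share any value.

    Assumption: overlap is only a rough proxy for 'not significantly
    different'; it is not a formal test of the difference in means.
    """
    lo_a, hi_a = intervals[a]
    lo_b, hi_b = intervals[b]
    return lo_a <= hi_b and lo_b <= hi_a


for name in ("ACT", "WA", "Australia"):
    print(f"{name:10s} rank {rank[name]:2d}  "
          f"interval overlaps Korea: {overlaps(name, 'Korea')}")
```

Under this rough rule the intervals for ACT and WA both overlap Korea's while the national interval does not, which is consistent with the comparison made above. The wide intervals for the two jurisdictions are a reminder that their ranks are much less certain than the national one.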
What does this insight mean for policy reform? With ACT and WA already in the
PISA ‘top five’, perhaps Australia could use these states as role models to improve its
performance, rather than look to distant and culturally radically different systems such as
Shanghai or Korea. Given that education is deeply culturally embedded, practices and
policies might not ‘travel’ that well across cultures. So it would make eminent sense to
find within-country role models.
Table 4. PISA 2009 reading literacy.
Rank    Jurisdiction       Mean score    95% confidence interval
1.      Shanghai-China     556           551–561
2.      Korea              539           532–546
3.      Finland            536           531–540
4.      Hong Kong-China    533           529–537
5.      ACT*               531           520–543
6.      Singapore          526           524–528
7.      Canada             524           521–527
8.      WA*                522           510–534
9.      New Zealand        521           516–525
10.     Japan              520           513–527
11.     Australia*         515           510–519
        OECD average       493           492–494
Note: Scores for Australia, ACT and WA are marked with an asterisk.

This ‘context’ argument – i.e., the argument that because education is deeply
culturally embedded, what works in one context may not work the same way in another
context, so caution is to be exercised in such borrowing – has been made robustly in
education. Alexander (2012) has argued with great clarity that the problem is not with the
desire to learn from others, but with the import from distant shores of miracle cures
advocated by school improvement experts. He endorses Sadler’s (1990) idea that ‘[t]he
practical value of studying in a right spirit and with scholarly accuracy the working of
foreign systems of education is that it will result in our being better fitted to study and
understand our own’ (our emphasis). In other words, observations of other nations’
practices should be used as provocations and reference points to reflect on our own
practices. Similarly, Jasanoff (2005, p. 15) sees ‘melioration through imitation’ as a
practical ambition that is not to be denigrated, but advocates going beyond seeking
prescriptions of decontextualized best practices for an imagined global administrative
elite, towards comparisons as opportunities to investigate the complex interplay of
science and politics and their implications for governance at particular locations. This
idea of the cultural specificity of education was reiterated by an OECD official:
You should compare yourself with countries which have the same way of living as you. If
you are in Korea, your parents will spend all their money for you to study and in France they
will keep their money to live – we have some of the information but we need to interpret
correctly and to always give the context of what is the situation in each of the countries.
Another interviewee described this issue as the problem of the unmeasured, arguing that
looking at the differences between jurisdictions within countries in PISA was more useful,
because people within a country would have a good idea about the particular interplay of
factors that could be influencing student performance. But in many countries, such as the
USA, they do not have the sample size to make a state-by-state comparison of PISA
results. In the most recent Trends in International Mathematics and Science Study
(TIMSS) round, eight US states participated in sufficient numbers to obtain state-level data,
allowing for a limited amount of within-country comparison. As a result, one senior US
Government official explained:
We’ve been encouraging people to look at state comparisons as opposed to other countries –
[I]f I were a state that was not doing well that was looking for [policy lessons] I think
I would look at Massachusetts instead of Singapore because I don’t know what else is
happening in Singapore. Korea is a great example of that I think. Here, pretty frequently we
get questions about time spent on learning, and Korea is often thrown up as ‘they don’t spend
that much time’ … but you know what is really happening? They are spending TWICE as
much time [in the after-school private coaching classes], and we don’t have that measured
very well.
Australia ‘over samples’ in PISA – in other words, more Australian students participate in
the PISA survey than the sample size stipulated by PISA for national-level results. As a
result, Australia’s PISA results can be differentiated in much greater detail than is
possible in many other countries. As one PISA expert explained:
In PISA 2000, Australia’s sampling was sufficient (roughly 6,500 students) to provide
reliable state estimates. From 2003 onwards, Australia’s sampling was even more extensive
(roughly 14,000 students), so there was sufficient data to support Australia’s major
longitudinal study, the Longitudinal Study of Australian Youth (LSAY). In 2012, Australia
increased the sample size even further (to roughly 18,000 students), and also changed the
design of the sampling, so that it included more schools with fewer students at each school.
Instead of the previous fifty students per school, it sampled 25–30 per school and nearly
doubled the number of participating schools. This allows Australia to get quite detailed data
on sub-groups within states.
These extensive sample sizes provide Australia with a great deal of data, and make it
possible to analyse and understand the patterns of performance, identify issues and areas
upon which to focus and point towards examples of excellent practices within the country
itself. Comparison with other nations, therefore, is probably the least useful way for
Australia to use the PISA data.
Rankings by item content
Unlike TIMSS, PISA tests are not based on the curricula of the participant nations.
Instead, they are based on an ideal that expert committees deem students should know
and be able to do by the age of 15 in order to succeed in the world beyond school.
Experts in each domain devise the tests, with input from all the member nations, and
there are extensive field trials of the test items.
However, despite the participation of the foremost experts in developing test
questions, PISA questions are constrained by a number of limitations. It is not possible
to assess, in a standardised way, everything that is valued in terms of being ‘well
prepared’ for the world. As one senior PISA official put it:
Reading, science and maths are there [in the PISA test] largely because we can do it. We can
build a common set of things that are valued across the countries and we have the technology
for assessing them. There are other things like problem-solving or civics and citizenship –
that kind of thing – where there would just be so much more difficulty in developing
agreement about what should be assessed. And then there are other things like team work
and things like that. I just don’t know how you’d assess them in any kind of standardised
way. So you are reduced to things that can be assessed. They’ve tried writing but the
cross-cultural language effect seems too big to be comparable.
Even within each domain, there are constraints with regard to what can be included. Each
student takes a total of about two hours of the test: one hour for the ‘major domain’ and
a half hour each for the ‘minor domains’ (the three literacies rotate to take turns at being
the major domain). Testing students’ ‘preparedness for life’ in a certain domain of
knowledge within such constraints of time is very challenging, as one member of the
Science Functional Expert Group described:
Science was a minor domain in the first two tests and here is an international team with 7
or 8 people and we are told that the maximum testing time you've got is 30 minutes. To test
preparedness for life! So [a member of the Committee] said, this is ridiculous, we can't
possibly do everything, so why don't we decide on one thing we will test, one aspect of
scientific literacy, and we argued about what that one thing would be for quite a long time
but in the end we decided we would try to construct a test about how well 15-year-olds
could critically appraise a media report involving science and technology.
This constraint on time is further exacerbated by the ‘application’ focus of PISA. Because
the questions are not curriculum based, students have to be presented with a situation and
asked questions based on the situation presented. This means that within the half hour,
time has to be made available for reading the information on which to base responses,
further reducing the number of questions which can be asked within the half hour. When
the number of test questions is small, the response to each question has a significant
impact on the overall score; in other words, every right or wrong answer can significantly
and perhaps disproportionately affect the average, thus making the assessment less
reliable. To overcome this problem of too few questions, PISA creates a larger set of
questions and distributes them across several students; in other words, a ‘test’ is answered
by several students, each of whom is administered a different set of questions.
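The arithmetic behind this point can be sketched in a few lines. The example below is illustrative only: it assumes a simple proportion-correct score (PISA itself reports scaled scores), the item counts are hypothetical, and the round-robin booklet split is a simplified stand-in for PISA's actual rotation design, which is not described here.

```python
def single_item_impact(n_items: int) -> float:
    """Shift in a proportion-correct score when one answer flips."""
    if n_items <= 0:
        raise ValueError("n_items must be positive")
    return 1.0 / n_items


def split_into_booklets(item_pool, n_booklets):
    """Deal items round-robin so that each booklet holds a different subset.

    Hypothetical helper: together the booklets cover the whole pool,
    while each student answers only one booklet.
    """
    booklets = [[] for _ in range(n_booklets)]
    for position, item in enumerate(item_pool):
        booklets[position % n_booklets].append(item)
    return booklets


# Hypothetical test lengths: the fewer the items, the more each one matters.
for n in (10, 20, 40):
    print(f"{n} items: one answer moves the score by {single_item_impact(n):.3f}")

# A hypothetical pool of 12 items shared across 3 booklets of 4 items each.
pool = [f"item_{i:02d}" for i in range(1, 13)]
for number, booklet in enumerate(split_into_booklets(pool, 3), start=1):
    print(f"booklet {number}: {booklet}")
```

With the same testing time per student, spreading a larger pool across booklets increases the number of items behind the system-level average, which is the motivation given above for the design.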
These practical constraints and methodological choices also have bearing on how we
might understand the results. When a subject is in the minor domain, and therefore has
fewer test items, each particular item that features on the test will have a more significant
impact on a score. For example, if questions of probability feature in the minor domain
mathematics test, and students have not been exposed to much study of probability at
school, the average mathematics score would look poorer than if those questions had
been left out, or if they had been part of the test when mathematical literacy was the
major domain.
Australia was placed 15th in PISA 2009 in mathematical literacy. But student scores
in a domain could differ quite widely between items. The PISA database also publishes
results by item. Looking beyond Australia's average score on mathematical literacy and
examining PISA results by item, we find that Australian students do exceptionally well in
answering certain questions, and exceptionally poorly in others. Table 5 shows
Australia's performance on two PISA 2009 mathematics items.
As we can see, Australia performed extremely well on Items M408Q01TR and
M420Q01TR, ranking third and second, respectively, internationally. For Item
M408Q01TR, Shanghai-China ranked 20th, despite the fact that Shanghai took the top
spot internationally on mathematics literacy, with a mean score much higher than the
second-place country, Singapore. For Item M420Q01TR, Australia outperformed all top-
ranking countries.
Table 5. Percentage correct on two PISA 2009 mathematics items for a subset of countries.
Country            Item M408Q01TR    Country            Item M420Q01TR
Hong Kong-China    0.60              New Zealand        0.66
Finland            0.56              Australia          0.64
Australia          0.56              Canada             0.64
Chinese Taipei     0.55              Ireland            0.62
UK                 0.55              Shanghai-China     0.62
New Zealand        0.55              UK                 0.60
Macao-China        0.53              USA                0.59
Iceland            0.52              Chinese Taipei     0.58
Ireland            0.51              Singapore          0.57
Singapore          0.50              Denmark            0.57
Canada             0.49              Netherlands        0.57
Spain              0.49              Norway             0.57
Germany            0.49              Czech Republic     0.56
Sweden             0.46              Finland            0.55
France             0.46              Belgium            0.55
Switzerland        0.45              Liechtenstein      0.55
Liechtenstein      0.45              Poland             0.54
Belgium            0.44              Hong Kong-China    0.54
Portugal           0.44              Germany            0.53
Shanghai-China     0.43              Hungary            0.52
More countries                       More countries
Score for Australia is given in bold.
In contrast, on Item M462Q01DR, Australia ranked 43rd internationally, with an
average score of only 0.1 out of a maximum of two, while Shanghai had an average score
of 1.5 out of a maximum of two.
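The item-level ranks quoted for Items M408Q01TR and M420Q01TR can be checked directly against the figures in Table 5. The sketch below is a minimal illustration: the values are the two-decimal figures from the table, the function name is our own, and ties in the rounded values are broken by the order in which the table lists the countries (an assumption, since the unrounded figures are not shown).

```python
# Values copied from Table 5, in the order the table lists them.
M408Q01TR = [
    ("Hong Kong-China", 0.60), ("Finland", 0.56), ("Australia", 0.56),
    ("Chinese Taipei", 0.55), ("UK", 0.55), ("New Zealand", 0.55),
    ("Macao-China", 0.53), ("Iceland", 0.52), ("Ireland", 0.51),
    ("Singapore", 0.50), ("Canada", 0.49), ("Spain", 0.49),
    ("Germany", 0.49), ("Sweden", 0.46), ("France", 0.46),
    ("Switzerland", 0.45), ("Liechtenstein", 0.45), ("Belgium", 0.44),
    ("Portugal", 0.44), ("Shanghai-China", 0.43),
]
M420Q01TR = [
    ("New Zealand", 0.66), ("Australia", 0.64), ("Canada", 0.64),
    ("Ireland", 0.62), ("Shanghai-China", 0.62), ("UK", 0.60),
    ("USA", 0.59), ("Chinese Taipei", 0.58), ("Singapore", 0.57),
    ("Denmark", 0.57), ("Netherlands", 0.57), ("Norway", 0.57),
    ("Czech Republic", 0.56), ("Finland", 0.55), ("Belgium", 0.55),
    ("Liechtenstein", 0.55), ("Poland", 0.54), ("Hong Kong-China", 0.54),
    ("Germany", 0.53), ("Hungary", 0.52),
]


def rank_of(country, item_scores):
    """1-based position of `country` when scores are sorted high to low.

    Assumption: ties in the rounded values keep the table's order,
    which Python's stable sort preserves.
    """
    ordered = sorted(item_scores, key=lambda pair: pair[1], reverse=True)
    names = [name for name, _ in ordered]
    return names.index(country) + 1


for label, scores in (("M408Q01TR", M408Q01TR), ("M420Q01TR", M420Q01TR)):
    print(label,
          "Australia:", rank_of("Australia", scores),
          "Shanghai-China:", rank_of("Shanghai-China", scores))
```

Because only the first twenty countries are listed for each item, these ranks are positions within the listed subset; they match the international ranks only for countries that appear in that top twenty.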
How are we to understand this variation in Australia's performance across the
different questions in the mathematics literacy survey? Given that Australian students
answer some questions exceedingly well, it is difficult to make the case, based on these
scores, that there is a crisis in mathematics literacy among Australian students, and that
wide-ranging reforms are required. One explanation could be that the Australian
curriculum does not cover, by the age of 15, some of the questions that are in the
PISA survey. Unfortunately, PISA does not publish all the actual items (only a small set
of items is released), so it is not possible to identify the particular skills or knowledge
that Australia needs to focus on in order to improve its scores. But perhaps at least
Australia could feel a bit more secure about its performance, knowing that it definitely is
in the ‘top five’ in at least some aspects of mathematical literacy. If Australia were keen
to raise PISA scores, further sample-based tests could be done to identify topics in which
students do well or badly, using PISA-like questions, and then curriculum and pedagogic
reforms could target those areas.
Another issue is relevant for discussion here. Mathematics was the major domain in
2003, when Australia ranked 11th. Australian media and policy-makers talk about the
‘slide’ in Australia's mathematics performance, comparing the results of 2003 with those
of 2000, 2006 and 2009, when mathematical literacy was a minor domain. This produces
a skewed comparison, because, as we noted earlier, the test content in the minor-domain
survey has a greater impact on the average scores. It would be more useful to compare, if
one must, between 2003 and 2012; i.e., across the two surveys when mathematical
literacy was the major domain.
But even this does not really solve the problem. Over a period of nine years, there are
many changes in the cohorts of students; for example, the proportion of immigrant
students might have increased. There are also many changes brought about by a range of
education reforms. In Australia, between 2003 and 2012, we have witnessed the
increasing centralisation through a national curriculum, the introduction of the National
Assessment Program – Literacy and Numeracy (NAPLAN), the setting up of the My
School website, many attempts to introduce performance-based pay for teachers; indeed,
we have had a whole ‘Education Revolution’ (Gorur, 2013; Gorur & Koyama, 2013). The
education system has not remained stable, so such comparisons are problematic.
Introducing large-scale reforms on the strength of such apparent trends could result in
diverting resources from crucial areas and placing unnecessary, and perhaps impossible,
demands on teachers and schools.
Test completion
A well-known phenomenon in statistical data collection and measurement is that interest
and accuracy in responding to questionnaires and tests do not follow an even pattern
throughout the duration of the exercise. Generally, questions at the beginning of a
questionnaire tend to be answered with greater interest and accuracy than in the latter
parts, particularly in longer tests. Arguably, how well students maintain their motivation
to respond to the best of their ability depends on how important they deem their
performance on the test to be. Since there are no ‘stakes’ attached to performance in PISA
for individual students, this motivation must come from elsewhere, and is often
influenced by cultural factors. Some cultural differences have been identified in the ways
students from different countries approach tests, and in how seriously they take such
tasks. Using the notion of perceived task value, Sjøberg (2007) argues that while tests
are premised upon the idea that all students will intend to do their best, students in
different countries vary in their behaviour in this regard. He claims that in many modern
societies, several students are unwilling to give their best performance if they find the
PISA items long, unreadable, unrealistic, and boring, in particular if bad test results have
no negative consequence for them (p. 203–304). These variations in students’ approach
to the tests may also be the result of how seriously these tests are taken by parents,
schools and society at large. Sjøberg describes the observations at one Taiwanese school
where students were taking the TIMSS test, which illustrates the importance accorded to
international tests and educational performance:
An observer from Times Educational observed the TIMSS testing at a school in Taiwan, and
he noticed that pupils and parents were gathered in the schoolyard before the big event, the
TIMSS testing. The director of the school gave an appeal in which he also urged the students
to perform their utmost for themselves and their country. Then they marched in while the
national hymn was played. Of course, they worked hard; they lived up to the expectations
from their parents, school and society. (Sjøberg, 2007, p. 221)
Sjøberg argues that interest in doing well and in persisting for two and a half hours
(2 hours for the test and 30 minutes for the student background survey) is therefore
uneven across cultures, and it could be another variable that explains differences in
One measure that can serve as a proxy for motivation to complete, or ‘willingness to
answer’ (Torija, n.d.), is the number of unanswered items in each test booklet. PISA
provides a code of ‘8’ for these un-attempted items (termed ‘not-reached’ in PISA). In
Table 6, the average number of not-reached items is computed by country. The first
34 countries in order of the average number of not-reached items are shown in Table 6.
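The computation behind Table 6 can be sketched as follows. This is a minimal illustration, not the procedure actually used: the record layout, the country labels and the function name are hypothetical, the toy responses are invented for the example, and sampling weights are ignored. Only the code ‘8’ for not-reached items comes from the description above.

```python
from collections import defaultdict

NOT_REACHED = "8"  # code for not-reached items, as described in the text


def mean_not_reached(records):
    """Average number of not-reached items per student, by country.

    `records` is an iterable of (country, responses) pairs, where
    `responses` is one student's list of item codes (hypothetical layout).
    """
    totals = defaultdict(int)    # not-reached items summed per country
    students = defaultdict(int)  # number of students per country
    for country, responses in records:
        totals[country] += sum(1 for code in responses if code == NOT_REACHED)
        students[country] += 1
    return {country: totals[country] / students[country] for country in totals}


# Invented toy data: two students in each of two hypothetical countries.
toy_records = [
    ("Country A", ["1", "0", "1", "8", "8"]),
    ("Country A", ["1", "1", "0", "1", "1"]),
    ("Country B", ["1", "0", "1", "1", "8"]),
    ("Country B", ["0", "1", "1", "1", "1"]),
]

averages = mean_not_reached(toy_records)

# List countries from fewest to most not-reached items, as Table 6 does.
for position, (country, value) in enumerate(
        sorted(averages.items(), key=lambda kv: kv[1]), start=1):
    print(f"{position}. {country} {value:.2f}")
```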
While it is possible that the number of not-reached items is related to students’
proficiency in the test domain, the relationship is not so clear. Examining Table 6, we find
that top-performing countries do not necessarily have fewer not-reached items. PISA is
not designed to be a speed test, so the variation of the number of not-reached items across
different countries could at least in part reflect motivation issues. Australia's rank is 34 in
this table, somewhat inconsistent with Australia's ranks in reading, mathematics and
science domains (9th, 15th and 10th, respectively). In Australia, 91% of students reached
the end of the test. In contrast, 95% of US students and 98% of Shanghai students
reached the end of the test. We could conclude that Australia's average score is negatively
affected by the 9% of students who did not complete the test. Australia's average score
could be as much a reflection of this willingness to answer or motivation to complete as it
is of the students’ ‘literacy’.
If motivation can be raised and more Australian students prevailed upon to complete
the test, would Australia get a rank within the top five, negating the need to engage in the
kind of extensive reforms now being considered, including stringent accountability
measures and incentives like performance pay for teachers? This question is worth
Table 6. Average number of not-reached items.
Country                   Average number of not-reached items
 1. Shanghai-China        0.11
 2. Korea                 0.16
 3. Netherlands           0.18
 4. Hong Kong-China       0.39
 5. Croatia               0.43
 6. Hungary               0.45
 7. Chinese Taipei        0.46
 8. Slovenia              0.48
 9. Finland               0.49
10. Tamil Nadu-India      0.54
11. Poland                0.55
12. Austria               0.56
13. USA                   0.57
14. Czech Republic        0.60
15. Estonia               0.61
16. Slovak Republic       0.61
17. Lithuania             0.61
18. UK                    0.63
19. Switzerland           0.66
20. Japan                 0.68
21. Germany               0.69
22. Singapore             0.69
23. Romania               0.78
24. Liechtenstein         0.82
25. Canada                0.85
26. Latvia                0.87
27. Turkey                0.89
28. Belgium               0.90
29. New Zealand           1.01
30. Denmark               1.03
31. Ireland               1.04
32. Serbia                1.05
33. Norway                1.10
34. Australia             1.11
More countries
Score for Australia is given in bold.
From data to policy: a treacherous leap
International comparisons, however sophisticated and rigorous, are beset with a number
of inevitable limitations. Comparability across a vast diversity of contexts, histories and
cultures can only be achieved through narrowing what is compared (Gorur, 2010; Scott,
1998). Moreover, the ‘success’ of education systems, even when narrowly defined for the
purposes of comparison, is deeply affected by a wide variety of often interrelated
factors, and statistical methodologies are not very good at performing analyses that are
sensitive to the relationality of phenomena. Many factors that affect educational
performance cannot be included in such analyses, leading to what some of our
interviewees referred to as the problem of the unmeasured. This makes it difficult to
draw parallels or conclusions from these data. In any case, causation cannot be
established through numbers alone; it can only be attributed through expert interpretation.
In addition, surveys such as PISA are a snapshot of a point in time, and such cross-
sectional studies are limited in the information they can give and the conclusions they can
Longitudinal analyses using such surveys are also challenged by a range of issues.
With PISA, each round of the survey focuses on a different ‘major domain’ (reading,
mathematical or scientific literacy), and comparisons from one three-year cycle to the
next are not like to like. The tests are also not sensitive enough to pick up small changes
in performance, and changes over a period of three years are usually small at a system
level. Depending on the size and nature of the system, it may take many years before the
effects of any reforms are reflected as changes in test scores on PISA. So PISA can at best
only be a very rough description of the state of an education system.
More broadly, our analysis demonstrates the complexity of translating the world into
numbers (Gorur, 2010) and the ongoing challenge of finding certainty and clarity
through these translations (Gorur & Koyama, 2013). The OECD claims that, thanks to
PISA, we now have an unprecedented comparative knowledge base of school systems
and their outcomes (OECD, 2007, p. 6). But turning PISA data into meaningful and
useful knowledge is a challenging enterprise.
We are particularly anxious about Australia's desire to emulate the East Asian
systems. The high scores of the East Asian nations are linked to an obsession with
educational success (Anderson & Kohler, 2012), driving families to invest heavily in
private cram-schools, an investment that could be very costly in several ways:
[Private tuition] normally maintains or exacerbates social and economic inequalities; it may
dominate children's lives and restrict their leisure times in ways that are psychologically and
educationally undesirable; and it can be perceived in some settings as a form of corruption
that undermines social trust. (Bray, 2009, pp. 13–14)
The punishing schedule, with students spending long hours at coaching classes after
school; the high levels of competitiveness; and the shame experienced by students who
do less well are linked to high rates of depression and suicide (Ahn & Baek, 2013).
Anderson and Kohler (2012) have linked the East Asian ‘education fever’ to significant
drops in fertility rates, as levels of parent investment required to raise successful
offspring have risen dramatically. Ironically, as the authors point out, a low fertility rate
could be a threat to future economic success of these countries.
Much of the response from critics to Australia's desire to learn from the East Asian
nations is based on the argument that the contexts of the East Asian nations differ greatly
from those of Australia, and that practices effective in Shanghai or Singapore would not
necessarily work as well here (see, for instance, Buckingham, 2012; Dinham, 2012).
Similar arguments have been made by a host of critics on the issue of policy borrowing
or policy learning. Whilst we agree that the ‘context’ argument is both valid and
important, our concern is that the ‘context’ critique leaves intact (or at least, it offers no
challenge to) the causal connection drawn between practices and test performance. The
context argument is also easily countered, as, for example, has been done by Jensen et al.
(2012), using a simple disclaimer, declaring that while we cannot unthinkingly adopt
policies from elsewhere, we can nevertheless learn from them. Crucially, debates about
context direct attention away from examining and understanding Australia's PISA
performance in greater detail. As a result, the idea that Australia's performance in
international studies is slipping, and that the system is heading towards a crisis, has
Our analysis demonstrates that some of our jurisdictions are already in the ‘top five’ and
that there is no generalised crisis in education that can be inferred based on a detailed
reading of the PISA data. We suggest that the data point towards the need for a more
focused and targeted approach, rather than sweeping national reforms.
The fact that Australia's performance varies markedly across mathematics items
provides a nuanced picture, pointing to possible differences between what is valued in
Australian curricula and what is tested in PISA. If we were to take seriously the
desirability of improving PISA scores, our analysis points to the need for further research
to locate the particular topics in which Australian students might be less well prepared,
rather than large-scale, system-wide reform.
In the case of ‘willingness to respond’ also, the analysis points away from the
conclusion that Australian education is in a generalised crisis. While reiterating that
raising PISA scores is not a self-evidently good policy objective, we would conclude that
one way of improving Australia's scores would be to improve students’ attitude towards
and commitment to test completion, rather than engage in expensive, stressful reforms
such as NAPLAN and My School.
If Australia insists on using PISA to inform policy, it would do well to explore the
data in much greater detail. Average scores obscure far more than they reveal. Using
average performance scores and rankings to inform policy is leading to damaging policy
1. The next PISA survey in 2015 will be implemented by a consortium led by the Educational
Testing Service based in the USA. For details, see
2. Information for Tables 1 and 2 was sourced from
... Globally, teaching has been shaped by the normalising and homogenising effects of The Programme for International Student Assessment (PISA) test, which, by the nature of its chosen areas of study, push a focus on literacy and numeracy (Kolber, 2022a;Kolber 2022b;Gorur & Wu, 2015). Surprisingly, despite including science within its remit, this 'back to basics' focus removes this concern in favour of literacy and numeracy as primary focuses of schooling. ...
Full-text available
With the ever present need to improve teaching and develop approaches that prepare our students as democratic citizens (Heggart, 2020), the challenge remains as to how best to achieve this lofty goal. In this theoretical and practical piece, Steven Kolber, a practising teacher explores the ways that teachers may combine the best of modern teaching technologies with ancient techniques. The Alice Springs (Mparntwe) Education Declaration sets lofty and important goals for Australian education, which requires teachers to take a different approach if they are ever likely to be achieved (Education Council, 2019; Moodie and Patrick, 2017). In light of these goals, and to deliver on the promises of Democratic education in light of the many challenges faced by modern teachers, a slightly shifted approach to pedagogy seems timely and crucial, especially amid post-pandemic education. By combining instructional video production with flipped learning, we can achieve a modern style of classroom teaching. These techniques and approaches make more classroom time available, with which teachers are encouraged to democratise their class spaces and adopt a Socratic approach to knowledge creation centred around questioning and discussion. The combination of these collections of approaches are selected as it is posited that they allow the teacher to face the significant workload challenges of teaching, increasing and expanding curriculum, and allows teachers to find time and space for human, collaborative, democratic interactions. This paper is a theoretical piece that aims to explore the boundaries and possibilities of combining ancient and modern pedagogies, that individually are still complexly debated and contested within the broader research literature.
... However, despite this growth, nations have taken different approaches to how information and data from such tests is used to inform policy. For example, Australia's mediocre performance on TIMSS and PIRLS in 2012 spurred an initiative to move Australia to the top five places on international assessments (Gorur & Wu, 2015). Despite performing relatively well on PISA assessments, some countries, including Canada and Finland, used the results to make changes to the education systems (Baird et al., 2016). ...
... For example, Türkiye is not in a good position regarding not-reached items. In a table provided by Gorur and Wu (2015), Türkiye ranks 27 th among thirty-four countries in terms of not-reached item rates. The authors explain this mainly in terms of test motivation and point out that better achievement can be possible even by improving motivation instead of many reforms. ...
Full-text available
In the 21st century, international interaction in social, economic, cultural, and educational fields has increased. Consequently, international standards have become essential in national education policies, reforms, and practices. As an international assessment, PISA has started to function as a prominent tool in this regard. However, the impact of PISA differs across the participating countries, depending on how the concept, methodology, and practices are handled. One of the domains where this difference is seen is reading literacy. Although this domain expresses a broader and richer phenomenon, the content and scope of the concept are not accurately understood, and its influences vary across the participating countries. In Türkiye, reading literacy is mainly considered and discussed in the scope of Turkish language lessons. This perspective, which focuses on the Turkish language lessons, leads to misunderstanding of the issue and the inaccurate change and transformation of the curriculum and content. Attempting to succeed in this domain through test language lessons deepens the problems instead of solving them. In this article, misconceptions and misuses about reading literacy are explained based on a literature review, and it is pointed out that reading literacy should be addressed in a broader context, including the curricula of other subjects, rather than test language lessons.
... International benchmarking assessments such as TIMSS and PISA promote competitiveness and fuel the desire to dominate in STEM education often leading to knee-jerk policy frameworks (Gorur and Wu, 2015). The competitiveness at national and international levels tragically means that STEM education and examination are rarely practiced as mutually inclusive (Blackley and Howell, 2015). ...
Full-text available
Science, technology, engineering, and mathematics (STEM) education is increasingly viewed as a vehicle for global dominance and a panacea to economic downturns, environmental challenges, and food security. However, divergences in STEM education agendas at regional and national levels imply disparities in policy formulation and implementation in the Global North and Global South. This study sought to explore what informs the drivers of STEM education in the two geo-economic blocks with a view to understanding contextual factors that inform practice. A focus on STEM education in the Global North and Global South becomes necessary, given the widespread calls for collaborative work, for example, shared interests in addressing sustainable development goals, and research on the COVID-19 pandemic. A theoretical approach, based on a review of relevant literature, was adopted. Ideology critique informed the analysis and was used to make sense of the salient themes. In the Global North, STEM education is historically driven by ambitions of political dominance, the need to curb economic slumps and address critical skills shortages, and growing desire for extra-terrestrial colonization. Within this context we argue that a neoliberal agenda drives the STEM education enterprise. In the Global South, massification with equity dominates policy formulation and implementation as countries battle to redress past colonial imbalances. The Global South countries generally sign up to regional and global STEM education agendas but financial constraints compounded by an unabated brain drain result in stagnation at policy adoption at vocational level. Convenient partnerships are increasingly fashionable as countries in the Global North seek to exploit the geographical advantage of those in the Global South in order to fully utilise the extra-terrestrial space, resources for biomedical science and indigenous natural resources, among others. 
Collaboration endeavors between the Global North and Global South need to be mutually beneficial. The Global North needs to redistribute the aspects of power it holds in relation to STEM to move towards more equitable policies and practices across these geopolitical realms. We recommend greater vocationalisation of STEM education hinged on STEM integration with the humanities in the Global South and balanced, mutually beneficial STEM collaboration endeavors with the Global North countries.
... The extent of the decline is widespread and equivalent to a generation of Australian school children falling short of their full learning potential (Australian Government 2018, vii-viii) In all the policy documents mentioned here, underperformance was identified using the mechanism of quantification of school education via the OECD's Programme of International Assessment (PISA). The OECD's PISA has been influential in shaping domestic education policy and in establishing the importance of large-scale learning metrics in education as noted extensively by education scholars (Grek 2009(Grek , 2014Gorur and Wu 2015;Lewis, Sellar, and Lingard 2016). ...
This chapter examines policies relating to the underperformance of school systems, schools and students through the lens of problematization. The problematization approach focuses on problem representations as constituted by the very issues they are attempting to address. In this view, problems do not exist independently from the political process through which they emerge. In policy studies, Carol Bacchi has operationalized problematization through the ‘What is the Problem Represented to be?’ (WPR) approach. WPR uses a question-based method to interrogate the ideas that lie behind the policy that is presented as a solution. Using a case study of the problem of underperformance in school education policy, this chapter uses Bacchi's approach to explore the effects that school education policy has on parents and caregivers as key policy subjects in school education.
This study explores the perceptions of government officials, teachers, and parents in Scotland regarding the use of Programme for International Student Assessment (PISA) results to evaluate national education performance. International large-scale assessments (ILSAs) such as PISA have been increasingly influencing education policymaking worldwide, but there is limited understanding of how education actors perceive these assessments. This study uses Scotland’s recent declining performance in PISA after the implementation of the Curriculum for Excellence (CfE) reform in 2010 as a case study. Interviews with key officials involved in implementing the CfE, and parents’ and teachers’ representatives reveal their doubts about the validity of PISA as a means of assessing Scottish education performance. This is because PISA data lack comparability and accountability, as students’ academic performance is not impacted by PISA scores and teacher evaluations are not based on them. They do not consider that the PISA results reflect the effects of the CfE reform. These views contradict the Scottish government policy that PISA, among other assessments, is meant to inform policymakers and evaluate reform effects. Our findings have implications for countries that routinely use PISA to assess their educational performance and do not consider the views of education actors.
This chapter reflects on the Global Childhoods project in the global city of Melbourne. After reiterating the focus and approach of the project, it provides a summary of the key points in each chapter of the book. The chapter then reflects on connections between children’s lifeworlds and educational “success,” focusing on a more productive view of educational “success” beyond solely academic achievement; the need to move beyond learning concepts and skills relating to English and mathematics to applying these to demonstrate literacy and numeracy skills; and connecting learning in and beyond school. The chapter ends by looking to the future, including considering the impact of the global COVID-19 pandemic.KeywordsChildren’s lifeworldsEducational “success”Children’s learningHome-school connections“Authentic” learningCOVID-19 pandemic and children
Full-text available
In many countries, including the UK, the potential of international student achievement surveys such as TIMSS and PISA is being subverted by political and media fixation on the resulting league tables. These prompt not just well-founded efforts to learn from others’ success but also ill-founded assertions about educational cause and effect, inappropriate transplanting of the policies to which success is attributed, and even the reconfiguring of entire national curricula to respond less to national culture, values and needs than to the dubious claims of ‘international benchmarking’ and ‘world class’ educational standards – the latter equated with test scores in a limited spectrum of human learning. Informing such responses are the attractively simple nostrums of high profile and highly selective literature reviews that massage policymakers’ urge for the quick fix by playing down the complex interplay of culture and schooling and ignoring the kinds of evidence that can provide a truer and more nuanced picture of education systems in action. Using a typology developed by the US National Research Council, the paper critiques three recent and influential examples of this paradigm before illustrating an alternative approach. This draws on the author’s comparative studies of culture and pedagogy to show how explicating the principles that underpin observed classroom practice, rather than copying national policies, can lead to genuine transformation of the quality and outcomes of student learning. The paper ends by contending that PISA panic and the supremacist mindset it feeds have dangerously distorted the debate about what a ‘world class’ education should entail. With PISA 2012 now in progress, policymakers are urged to redress the balance.
As part of its flagship educational study - the Programme for International Student Assessment (PISA) - the Organisation for Economic Co-operation and Development (OECD) has undertaken extensive work to create an internationally relevant composite indicator aimed at measuring socioeconomic background. However, the degree to which a single measure of socioeconomic background is reliable and valid for all participating countries is not widely discussed. To fill this gap, the authors examine the home possessions index, which is a key component of PISA's socioeconomic indicator, and highlight a number of issues surrounding this index. In particular, they take a psychometric approach to investigating the reliability and some facets of the validity of the home possessions index in a number of participating PISA countries. Their findings suggest that there are notable concerns with the current index, including highly variable reliability by country, poor model-to-data consistency on a number of subscales, and evidence of poor cultural comparability. They couch their discussion in the context of educational and policy research and propose one possible method for improving these measures for participating countries.
The Organisation for Economic Cooperation and Development (OECD) has developed impressive machinery to produce international comparative data across more than 70 systems of education and these data have come to be used extensively in policy circles around the world. In many countries, national and international comparative data are used as the bases for significant, high-stakes policy and reform decisions. This article traces how international comparability is produced, using the example of equity measurement in OECD's Programme for International Student Assessment (PISA). It focuses on the construction of the objects of comparison and traces the struggles to produce equivalence and commensurability across diverse and complex worlds. Based on conversations with a number of measurement experts who are familiar with the OECD and PISA, the article details how comparability is achieved and how it falters and fails. In performing such an analysis, this research is not concerned with ‘exposing’ the limitations of comparison or challenging their validity. Rather, based on the work of Steve Woolgar and other scholars, it attempts to mobilise a ‘sociology of measurement’ that explores the instrumentalism and performativity of the technologies of international comparisons.
It is hard to deny that Koreans’ faith and investment in education have played a pivotal role in rebuilding the nation after a devastating war and achieving its economic growth. To Koreans, education has meant the best way to improve the quality of life. Ironically, Koreans’ zeal for education became a burden, and because of it, the quality of life, especially that of youth, is being compromised. Since high academic achievement becomes the most important goal for Korean youth, other aspects of their development are largely ignored, and this can have serious effects on the overall quality of their life. In this chapter, we highlight the serious state of psychological well-being of Korean youth by examining Korean society’s orientation toward academic achievement and various aspects of the psychological state of Korean adolescents and the connection between them.
Despite the frequent, recurrent and often damaging criticisms of citation analysis in recent years, the use of citations continues apace. In the context of contemporary anxieties about the increasing use of ‘performance indicators’, this paper argues that it is important to understand the sources of resistance to criticisms of the use of citations. A series of insights from the sociology of scientific knowledge are used to develop the basis for a sociological understanding of the citation debate. In particular, it is suggested that, when citation analysis is construed as a measurement technology, we can see how the very system of measuring and manipulating citations redefines the phenomenon it is supposed to measure. Finally, some brief thoughts are offered on possible practical responses to the increasing institutionalisation of citation analysis.