Metacognitive Activities in
Text-Studying and Problem-Solving:
Development of a taxonomy
J. Meijer*, Marcel V. J. Veenman, and Bernadette H. A. M. van Hout-Wolters
Graduate School of Teaching and Learning and SCO-Kohnstamm Institution, University of Amsterdam; Department of Developmental and Educational Psychology, University of Leiden and Graduate School of Teaching and Learning, University of Amsterdam; and Graduate School of Teaching and Learning, University of Amsterdam, The Netherlands
(Received 13 January 2005; accepted 7 July 2005)
This article describes the construction of a hierarchical taxonomy of metacognitive activities for the interpretation of thinking-aloud protocols of students in secondary education who studied texts on history and physics. After an initial, elaborate taxonomy was tested by multiple raters on a restricted number of protocols, it appeared that the interrater correspondence was well below par: the categories in the taxonomy were specified in too much detail. Categories were therefore combined and tested on new protocols in a cyclic fashion. The revised taxonomy was then used for coding 16 history protocols and 16 physics protocols. Frequencies of occurrence of metacognitive activities were obtained, as well as judgements of the quality of the metacognitive activities of the participants. There is a reasonable correlation between the frequency method and the quality method for coding thinking-aloud protocols. Also, there is a substantial covariation of the number of metacognitive activities across tasks.
*Corresponding author. Graduate School of Teaching and Learning and SCO-Kohnstamm Institution, Faculty of Social and Behavioural Sciences, University of Amsterdam, Wibautstraat 4, 1091 GM Amsterdam, The Netherlands. E-mail: J.Meijer@Uva.nl

Educational Research and Evaluation
Vol. 12, No. 3, June 2006, pp. 209–237
ISSN 1380-3611 (print)/ISSN 1744-4187 (online)/06/030209–29
© 2006 Taylor & Francis

Metacognitive activity is essential in the strategic application of metacognitive knowledge to achieve cognitive goals. It enables the regulation and control of cognitive processes (Alexander, Carr, & Schwanenflugel, 1995). Components of metacognitive activity can be distinguished at various levels of specificity (Van Hout-Wolters, Simons, & Volet, 2000). For instance, at the highest level, components such
as planning, monitoring, and evaluation can be discriminated. At an intermediate
level, more speciﬁc components are found, such as selection of information,
recapitulation, and reﬂection on the learning process. On the lowest level,
metacognitive activity is usually deﬁned at task-level, for instance deciding to infer
the meaning of an unknown word from its context, conﬁrming or rejecting former
inferences based on subsequent text, or examining a special case of a problem and
considering a slightly modiﬁed problem (Pressley, 2000; Schoenfeld, 1987).
Although the relation of metacognition with learning results is the subject of many
educational studies (Sperling, Howard, Miller, & Murphy, 2002; Veenman, Elshout, &
Meijer, 1997; Wang, Haertel, & Walberg, 1990), it is by no means clear which
particular metacognitive activities are related to learning results. Identifying these activities may yield suggestions for metacognitive training.
In the present article, a description is given of an attempt to build a hierarchical
taxonomy of metacognitive activities. The taxonomy should be suitable for the
interpretation of statements in thinking-aloud protocols. Although it is possible to
look for metacognitive activities by means of questionnaires, they are more likely to be found among the activities that persons execute during actual tasks (Veenman, in press; Veenman, Prins, & Verheij, 2003). These metacognitive activities can be captured by analysing thinking-aloud protocols. If the taxonomy is to be applied for coding such protocols, there should be a reasonable degree
of correspondence with other existing methods of coding thinking-aloud protocols,
which emphasise metacognitive activity.
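The degree of correspondence between coding methods or coders can be quantified in several ways. As an illustrative sketch only (the article does not specify which statistic was used, and the rater codes below are hypothetical), chance-corrected agreement between two coders' category assignments can be computed with Cohen's kappa:

```python
from collections import Counter

def cohens_kappa(codes_a, codes_b):
    """Chance-corrected agreement between two coders' category assignments."""
    assert len(codes_a) == len(codes_b)
    n = len(codes_a)
    # Proportion of statements to which both coders assigned the same category.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Agreement expected by chance, from each coder's marginal category frequencies.
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    categories = set(codes_a) | set(codes_b)
    expected = sum((freq_a[c] / n) * (freq_b[c] / n) for c in categories)
    return (observed - expected) / (1 - expected)

# Hypothetical codes for ten protocol statements by two raters:
rater1 = ["plan", "monitor", "plan", "evaluate", "orient",
          "plan", "monitor", "evaluate", "orient", "plan"]
rater2 = ["plan", "monitor", "orient", "evaluate", "orient",
          "plan", "evaluate", "evaluate", "orient", "plan"]
print(round(cohens_kappa(rater1, rater2), 2))  # → 0.73
```

With many narrowly specified categories, exact agreement at the lowest hierarchical level of a taxonomy is naturally harder to reach than at the superordinate level, whichever agreement index is chosen.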
Metacognitive activity need not always be task-speciﬁc. There is evidence of the
generality of metacognitive activity across various tasks and domains (Schraw,
Dunkle, Bendixen, & Roedel, 1995; Veenman et al., 1997; Veenman & Verheij,
2001; Veenman, Wilhelm, & Beishuizen, 2004). This raises the question to what
extent a taxonomy of metacognitive activities should be task-speciﬁc. It may be
expected that at least some metacognitive activities can be applied to various tasks,
whereas others may be applied only to speciﬁc tasks or domains. The generality
versus speciﬁcity of metacognitive activity is an important issue, because it may have
consequences for designing programmes for training such activity. If metacognitive
activities are entirely context-bound, as some researchers contend (Kelemen, Frost, &
Weaver, 2000), there is little hope that transfer of these skills between domains will
occur spontaneously. If, however, such activities are by nature partly domain-general,
this opens up new perspectives for learning to transfer these activities.
By looking very closely at the statements in thinking-aloud protocols, concrete activities rather than relatively abstract skills come under investigation. Many of these
activities will appear to be cognitive in nature rather than metacognitive. However, it
is not uncommon to infer metacognitive activity from cognitive activity. For example,
rereading a certain part of a text is not a metacognitive activity as such, but the
decision to do so because that particular part was not completely understood, is.
Thus, overt cognitive activities are sometimes taken to represent covert metacognitive
activities. Of course, such inferences should be based on some clue in a thinking-aloud protocol.

Summarising, we seek answers to the following questions. First, how useful is the
constructed taxonomy of metacognitive activities for the interpretation and scoring of
thinking-aloud protocols? For instance, by using the taxonomy it should be possible
to categorise each statement in thinking-aloud protocols. Another point of interest in
this respect concerns the interrater correspondence. Second, what is the domain-
speciﬁcity of metacognitive activities? And ﬁnally, is there convergence with other
existing methods for interpreting and scoring thinking-aloud protocols? Before we
turn to the development of the taxonomy, a brief overview of other classiﬁcations of
metacognition will be given. An outline of these classiﬁcations, which is by no means
exhaustive, is given in Table 1.
Classiﬁcations of Metacognition
Flavell (1979) distinguished metacognitive knowledge, that is declarative knowledge
about the interplay between person, task, and strategy characteristics, from regulation
of activities. In the latter, Flavell made a distinction between planning, monitoring,
and evaluation as the main components at the highest hierarchical level of
metacognitive activities before commencing a task, during execution of the task,
and upon completion of the task, respectively. Schraw and Moshman (1995)
described metacognitive control processes as the way people use their metacognitive
knowledge to regulate their cognition. These control processes were subdivided into
planning, monitoring, and evaluation, as in Flavell’s distinction. Veenman (1993)
and Veenman et al. (1997) distinguish four categories of metacognitive skills. Three
of these are closely related to the aforementioned categories. The ﬁrst category is
orientation, which is supposed to precede planning. The second category is
systematical orderliness, which includes planning. The third is evaluation, which
includes monitoring. The fourth category is elaboration, comprising recapitulating,
drawing conclusions, and reﬂection.
Pintrich, Wolters, and Baxter (2000) distinguish three main aspects of metacogni-
tion, that is metacognitive knowledge, metacognitive judgements and monitoring,
and self-regulation and control. Likewise, Alexander et al. (1995) distinguish
declarative metacognitive knowledge, cognitive monitoring, and strategy regulation
and control. Nelson (1996) and Winne (1996) speak of metacognitive monitoring
and metacognitive control. In their view, metacognitive monitoring refers to the ﬂow
of information from the object level to the metacognitive level, whereas control
processes concern the ﬂow of information in the opposite direction. At the object
level, information refers to the external world or the internal cognitive world, and
information at the metacognitive level refers to information on the object level. For
instance, monitoring of calculation errors during mathematics problem-solving may
generate control processes at the metacognitive level, which may lead to correction of
these errors. Related to this conception of the existence of an object level and a
metacognitive level, Fernandez-Duque, Baird, and Posner (2000) state that
metacognitive regulation refers to processes which coordinate cognition. According to these authors, bottom-up cognitive monitoring includes error detection and source monitoring in memory retrieval, among others, whereas top-down control processes include conflict resolution, error correction, inhibitory control, planning, and resource allocation. These latter processes are metacognitive activities at a more specific or subordinate level.

Table 1. Taxonomies of metacognition

[The multi-page comparison table could not be recovered from the source. Its rows list component terms such as planning, orientation, monitoring, evaluation, analysis, exploration, verification, and revision; its columns list the classifications reviewed in this section. An ‘‘x’’ in the table means that the authors use the term listed in the far left column; other entries give different terminology with high resemblance to the term listed in the far left column.]
In line with Pintrich and De Groot (1990), O’Neil and Abedi (1996) view
metacognition as consisting of planning, monitoring, cognitive strategies, and
awareness. It should be mentioned that Pintrich and De Groot also emphasise the
existence of metacognitive strategies for modifying one’s cognitions. Following
Brown, Bransford, Ferrara, and Campione (1983), Ford, Weissbein, Smith, Gully,
and Salas (1998) also draw attention to the revision of goal-appropriate behaviour as a
major component of metacognition apart from planning and monitoring. Kincannon,
Gleber, and Kim (1999) used the Metacognitive Awareness Inventory (Schraw &
Dennison, 1994) to assess the effect of teaching metacognitive strategies. The
instrument comprises scales for metacognitive knowledge and regulation, wherein
knowledge comprises declarative knowledge, procedural knowledge, and conditional
knowledge, and regulation is divided into planning, information management,
monitoring, debugging, and evaluation. In a meta-analysis of learning skill
interventions, Hattie, Biggs, and Purdie (1996) construe metacognition as the self-
management of learning, wherein planning, implementing, and monitoring learning
efforts, as well as conditional knowledge of the use of tactics and strategies, play a
dominant role. In her description of self-regulated learning, Butler (1998) focuses on
task-analysis, interpreting task requirements, goal-setting, the selection, adaptation,
and invention of appropriate strategies, monitoring of progress, generation of internal
feedback, the adjustment of learning approaches, and the use of motivational and
volition-control strategies. Starting from a theoretical distinction between metacognitive knowledge, metacognitive skills, metacognitive beliefs, and metacognitive conditional knowledge, Desoete, Roeyers, Buysse, and De Clercq (2002) factor-analysed measures of these components and concluded that metacognitive knowledge
and skills cluster together and that metacognitive conditional knowledge mainly
involves prediction and evaluation, which they call ‘‘off-line’’ metacognition.
In the realm of mathematical problem-solving, somewhat different categories are
found. For instance, Schoenfeld (1985, 1987) distinguishes between analysis,
exploration, and veriﬁcation. In the older literature, these were called heuristics
(Polya, 1957). Lester and Garofalo (1982) hold that verification is preceded by orientation, organisation, and execution, rather than by analysis and exploration.
Looking at the manifold deﬁnitions of metacognition and the great number of
components that are distinguished by the above authors, similarities can still be
identiﬁed. Metacognitive activities which involve an executive control function are
often mentioned under the labels self-regulation and control, strategy regulation, self-
management of learning, metacognitive regulation, self-regulated learning, and so on.
Self-regulation appears to be used as a rather general category, encompassing ﬁner-
grained distinctions such as monitoring and evaluation. Planning, monitoring, and
evaluation appear to be principal executive control functions in this respect, although
some authors use orientation rather than planning, and veriﬁcation rather than
evaluation. Some authors add other components on the higher level, for example
implementation, analysis, exploration, and even beliefs, although the latter do not
refer to activities. Furthermore, many more speciﬁc components are described. Some
of these concern subordinate types of metacognitive knowledge, such as declarative,
procedural, and conditional metacognitive knowledge. These are of less interest in
the context of the present study, because it focuses on activities rather than
declarative knowledge. Other distinctions concern more speciﬁc executive control
activities, such as error detection, resource allocation, and goal-setting.
The focus of this study is the management or executive control function of
metacognition. In other words, we are concerned with metacognitive activities, not
with metacognitive knowledge as such, although some of the authors mentioned in
Table 1 use the latter in their classiﬁcation. Taxonomies concerning motivational and
affective regulation were not included in the table, nor were classifications of metacognitive confidence ratings, such as feelings of knowing and judgements of learning. The table only includes classifications that focus at least partly on self-regulation.

As mentioned before, researchers differ in their views on what constitutes the
executive control function of metacognition. However, it is clear that related concepts
are listed at the right of the entry ‘‘self-regulation and control’’ in Table 1. It must be
emphasised that the hierarchical arrangement of the taxonomies is by no means clear.
For example, although Flavell (1979) saw planning, monitoring, and evaluation as
subordinate components of metacognitive regulation, his initial superordinate
categories were metacognitive knowledge, experiences, goals, and strategies.
Development of a Taxonomy of Metacognitive Activities
The main assumption is that metacognitive activities comprise many components,
which can be modelled hierarchically in a taxonomy. The overview in Table 1 shows
that there are few distinctions made on different hierarchical levels in other, existing
systems. Consequently, these existing systems provide little opportunity to identify
metacognitive activities on a detailed level which describe the exact nature of these
activities. For example, many different activities could be subsumed under
metacognitive control, regulation, or skill. In short, there are two related problems
with the existing classiﬁcations. The ﬁrst is that hierarchical levels in the
classiﬁcations are often confounded, making it impossible to decide on which level
certain activities belong. The second problem is that the labels given to the categories
in the taxonomies can easily describe a whole set of activities, whereas it is desirable to
use quite concrete descriptions of activities, particularly because the taxonomy to be
developed will be used as a framework for coding thinking-aloud protocols of
students in secondary education. Because one of the research questions concerns the
generality versus the domain-speciﬁcity of metacognitive activity, it was decided to
choose two scholastic task-domains, which were diverse in character. During their
school careers, students in secondary education are confronted with various subjects.
A common division of subjects is sciences versus social studies. In social studies, text-
studying is a frequent activity, whereas problem-solving occurs mainly in sciences.
These diverse domains were considered appropriate areas. Within social studies,
the study of a history text was chosen as an assignment and within sciences, problem-
solving in physics. Therefore, the ﬁrst requirement is that the taxonomy should
be suitable for metacognitive activities during text-studying in history as well as
problem-solving in physics. Second, the taxonomy should be complete in terms of its
components in order to be able to cover all statements that go beyond the literal texts
in each assignment. Third, the taxonomy should be related to other known
taxonomies of metacognitive activity in contemporary literature. If this relation is
absent, there will most probably be only slight convergence of the new method for
coding thinking-aloud protocols with already existing methods. Finally, the taxonomy
should focus on metacognitive activities rather than cognitive activities.
However, this requirement poses a problem, because cognitive and metacognitive activities are often very hard to distinguish. In many cases, the execution of metacognitive activity is not literally put
into words, especially where children are concerned. As was already noted,
metacognitive activity is mostly inferred from thinking-aloud protocols when it is
apparent that the execution of certain cognitive activities has been decided upon.
Although metacognition can simply be deﬁned as cognition about cognition
(Dunlosky & Hertzog, 1998), more elaborate descriptions are usually given, also
involving processes and affective states. The abilities to monitor and regulate one’s
knowledge, processes, and affective states in a deliberate and conscious fashion are
construed as important facets of metacognition as well (Hacker, Dunlosky, &
Graesser, 1998). In other words, metacognition does not only refer to passive,
reﬂective knowledge, but is thought to include active monitoring and control.
Another issue pertains to the distinction between metacognition and problem-
solving. The relation between metacognition and problem-solving has been described
by Sternberg (1985), who emphasises the existence of basic information processing
components, which operate on internal representations or symbols. Sternberg
distinguishes knowledge acquisition components, execution components, and
metacomponents. The latter are processes on a higher level, which are used in
planning, monitoring, and in taking decisions. These strategic processes obviously
play an important role in problem-solving. Planning a solution strategy, deciding to
execute it and subsequently monitoring progress are essential features of successful
problem-solving, but they must be distinguished from knowledge acquisition and
execution. In this view, such metacognitive activities form a part of problem-solving.
Metacognitive activities operate at what Nelson (1996) calls the metacognitive level,
whereas all cognitive activity takes place at the object level (see above).
Construction of the Taxonomy
First, six superordinate categories of metacognitive activity were postulated. A
parsimonious main distinction is Flavell’s (1979) original three-fold categorisation of
planning, monitoring, and evaluation. Sometimes, orientation precedes actual
planning, as seen for instance in expert problem-solving behaviour (Schoenfeld,
1992; Van Streun, 1990). Experts spend much more time on orientation activities
such as trying to identify the type of problem, compared to beginners, who spend
more time on execution activities. Consequently, this orientation category was
added. Furthermore, execution processes were added as an extra main category.
These are supposed to occur mostly directly after planning and before monitoring.
Examples are executing an action plan, note taking, reading only a part of a text, and
estimating a solution to a problem. Some subordinate categories of execution
activities appear to be of a cognitive nature rather than a metacognitive nature.
However, they are mostly overt cognitive activities from which covert metacognitive
activities are inferred. For example, to read only particular sections of a text is a
cognitive activity in itself, but the decision to select only those sections is of a metacognitive nature.

Finally, evaluation is sometimes followed by reﬂection, which is usually less bound
to the particular task under consideration, but rather aims at the learning experience
and consequences for future occasions. Thus, reﬂection was added as the last
superordinate category to the taxonomy.
The preliminary frame of the taxonomy contained six categories, that is orientation,
planning, execution, monitoring, evaluation, and reﬂection, which more or less reﬂect
the temporal course of the reading and problem-solving process. However, certain
shifts in this temporal organisation will undoubtedly occur. For instance, persons will
often resort to intermediate evaluative activities before they have ﬁnished a task.
Construction of the Taxonomy for Text-Studying
The next step in building the taxonomy was to look for elaborate descriptions of
metacognitive activities in reading. Pressley (2000) offers a very detailed, presumably
exhaustive, overview of metacognitive activities whilst reading. The overview is based
on the analysis of thinking-aloud protocols of persons of various levels of expertise
who were engaged in reading tasks of various natures. It is based on the so-called
constant comparison method, which is mostly used in qualitative research. The
method renders what Pressley calls ‘‘grounded’’ theories. It involves an interaction
between data collection, data analyses, and drawing conclusions. On the basis of 40
studies wherein verbal protocols were gathered, Pressley and Afﬂerbach (1995)
started out looking for regularities in the protocols, and then labelled and categorised
these. Subsequently, evidentiary support for the categories is sought by reviewing data
that were already processed or by gathering and analysing new data. The process is
recursive in that the researcher is always attentive to new regularities, which are then
labelled and categorised again and tested further. The process ends if no new
regularities are encountered. The constant comparison method appears to render
elaborate, presumably exhaustive results. Because our taxonomy should not preclude
any metacognitive activities, it was decided to use Pressley’s (2000) classiﬁcation as a
basis for its development.
The superordinate categories in Pressley’s (2000) overview are identifying and
learning text content, monitoring, and evaluating. Like the six-fold distinction of the
hierarchical taxonomy we are developing, Pressley’s three main categories as well as
the subordinate categories are ordered temporally. The taxonomy for metacognitive
activities in reading as adopted from Pressley was subsequently checked for
completeness and redundancies. This led to the addition and omission of some
categories. The resulting taxonomy for text-studying was much more speciﬁc than the
initial review of literature on metacognition suggested. Undoubtedly, this is due to the
circumstance that Pressley’s results of the constant comparison method were used
from the outset. Instead of condensing Pressley’s categories, it was decided to extend the
description with any other categories which were found in the literature. Empirical
examination of the validity of the taxonomy may reduce the number of relevant
categories later on. The categories in the taxonomy were ordered hierarchically. In
most cases, the hierarchy was three or four levels deep. For instance, ‘‘recapitulation’’
was subsumed under ‘‘evaluation’’. ‘‘Summarising’’ was subsumed under ‘‘recapi-
tulation’’ and the subordinate categories of ‘‘summarising’’ were ‘‘listing pieces of
information in text’’ and ‘‘construction of a cohesive summary of the text’’.
Construction of the Taxonomy for Problem-Solving
Problem-solving is obviously quite different from reading a text and trying to
understand its meaning. Nevertheless, reading in the sense of tackling the task to
grasp the meaning of a text, can surely be described as a problem-solving process as
well. Bearing this in mind, the ﬁrst step in the construction of the taxonomy of
metacognitive activities in problem-solving was the translation of the taxonomy for
reading into problem-solving terms, if appropriate at all. Sometimes, categories from
the reading taxonomy could be transferred to the taxonomy for problem-solving
without any modiﬁcation, such as ‘‘activating prior knowledge and related
knowledge’’, together with its subordinate categories ‘‘instantiating prior knowledge
schemata’’, and ‘‘stating relation with own background knowledge or experience’’.
These belong to the superordinate orientation category in both taxonomies. Other
categories of the reading model needed little modiﬁcation, such as ‘‘noting
interrelationships between (ideas in) parts of text, writing them down or keeping
them in working memory’’. This was translated into ‘‘noting interrelationships
between variables in problem, keeping them in working memory’’.
Many categories, which belonged to the superordinate execution category in
reading were moved to orientation in problem-solving. For example: ‘‘reading parts
of text very carefully’’ was translated into ‘‘reading parts of problem statement very
carefully’’ and moved from the execution category in the reading model to the
orientation category in the problem-solving model. Although readers may refer back
or ahead to sections other than the one they are currently reading, the typical
recursive nature of most science problem-solving processes is less present in text-
studying processes. The above move reﬂects that reading appears to be a much more
linearly organised process than problem-solving.
Categories in Veenman’s (1993) system for coding thinking-aloud protocols in
problem-solving were added and classiﬁed into the six major categories in the new
taxonomy. The works of experts in mathematics education such as Lester and Garofalo
(1982) and Schoenfeld (1985, 1987) were consulted in order to test the completeness of
the preliminary taxonomy. The orientation and reﬂection categories needed no
additions. The planning category was extended with quite a few metacognitive activities,
which are typical for problem-solving, such as ‘‘examining a special case of the problem’’,
with its subordinate: ‘‘ﬁlling in a particular value for some variable in the problem’’.
Other additions to the planning category, with increasing speciﬁcity, were: ‘‘considering
essentially equivalent problems’’, ‘‘reformulating the problem’’, and ‘‘assuming a
solution and determining its properties’’. There was only one addition to the execution
category, namely: ‘‘holding all variables ﬁxed except one to determine its impact’’. The
monitoring category also required only one addition: ‘‘trade-off decisions, such as speed
versus accuracy or degree of elegance’’. Together with the category ‘‘reading parts of
problem statement very carefully; rereading’’, these were the only unique categories that
were derived from the work of Lester and Garofalo (1982). All others showed overlap
with the works of Schoenfeld (1987).
Apart from recapitulation and inferring, the evaluation category was extended with a
typical metacognitive problem-solving skill, namely verifying, which plays an important
role in Schoenfeld’s (1987) as well as Lester and Garofalo’s (1982) work. Verifying
comprises checking activities, intermediate as well as ﬁnal, such as checking whether
outcomes of calculations are correct, and more complex activities, such as checking
whether the ﬁnal solution can be used to generate a known case. Analogous to the
taxonomy for text-studying, the categories were classiﬁed hierarchically, mostly on three
to four levels. For example, ‘‘concluding’’ was subordinate to ‘‘recapitulation’’, which
was in its turn subordinate to ‘‘evaluation’’. Subordinate categories of ‘‘concluding’’ were
‘‘constructing alternative solutions’’ and ‘‘taking another perspective on the problem’’.
As explained earlier, it was decided to investigate the suitability of the taxonomy by
analysing thinking-aloud protocols. Pressley’s (2000) classiﬁcation scheme, which lay
at the basis of the taxonomy, was also constructed by a thorough analysis of thinking-
aloud protocols. Moreover, we believed that the metacognitive activities that were
described in the taxonomy could be observed better while participants were working
on actual scholastic tasks than through interviews or questionnaires. In order to articulate these activities, thinking aloud is a tried and tested method.

After the construction of the taxonomies for history text-studying and problem-
solving in physics, tasks in the domains of history and physics were administered to
participants. The tasks consisted of studying a text and answering a few questions
about the text. This was especially important for the history assignment. The
questions might elicit metacognitive activity, whereas merely reading out the text
aloud could have resulted in very meagre protocols. After the transcription of
the protocols, every statement made by participants that went beyond the literal text,
was interpreted by means of the categories in either taxonomy.
Participants

Participants were sixteen 13-year-olds in senior secondary general education and pre-
university education, 10 girls and 6 boys. They were recruited from two separate
schools in the northern part of The Netherlands. Their average age was 13 years and
3 months; the youngest participant was 12 years and 10 months old, the eldest 13
years and 7 months (standard deviation 0.28 years). All students were paid €20 for
their participation in three sessions, which lasted about 2 hours each.
Procedure, Tasks, and Instruments
Participants were confronted with two tasks, which were presented in counter-
balanced order. One was a task in the domain of history, the other in the domain of
physics. The ﬁrst task required the participant to study a history text about the
economic depression in the United States of America in the period around 1930 and
answer two questions at the end of the text. The text contained 1,621 words; the
average number of words per sentence was 12.9. The second task required the
participant to study a physics text about motion and complete four assignments that were
embedded in the text. The text about motion required understanding linear
association, that is the relation between velocity, time, and distance covered. It
contained 1,144 words; the average number of words per sentence was 10.5. There
were ﬁve tables, ﬁve diagrams and ﬁgures, and two formulae in the text. Participants
were required to read both texts aloud and study them as they would when preparing
for a test. They were required to verbalise any thoughts that arose during reading as
well. They were also asked to carry on thinking aloud while they were answering the
questions that accompanied the history text, as well as while making the assignments
in the physics text. During the sessions, the experimenter who was present
encouraged the participant to report his or her thoughts without offering any help
with respect to the content of the task. Experimenters used four prompts to stimulate
participants: ‘‘Keep on thinking aloud’’, ‘‘Will you think aloud?’’, ‘‘Keep on talking,
please’’, and ‘‘What are you doing now?’’. Students were allowed to use calculators
for the physics assignments in the text. All sessions took place in separate classrooms
with the experimenter sitting opposite to the student. Micro tape recorders were used
to record the sessions.
Metacognitive Activities 221

Revision of the Preliminary Taxonomies

In an investigation of the usefulness of both taxonomies, five thinking-aloud protocols were scored by four judges, working independently. Two of them were research assistants and the other two were the first two authors. The correspondence between
these judges in terms of allocating subjects’ statements to identical categories on the
lowest hierarchical level of the taxonomies was insufﬁcient in general. Seldom did all
four judges agree. Within the text-studying protocols, however, the category
‘‘rereading’’ showed reasonable concordance among judges, usually when partici-
pants reread substantial portions of the text, mostly in order to memorise it. Another
category for which concordance was found was ‘‘executing action plan step by step’’
within the physics problem-solving protocols. This is illustrated by the statement of a
participant when solving the problem in Figure 1.
The participant states, apparently while drawing: ‘‘So it drives with a speed of
40 km/h all the time and that is a straight line. Then it brakes and the line should go
downwards’’. This statement was categorised identically by all four judges as
‘‘executing action plan step by step’’. The following statement is an example which
was categorised differently by each judge: ‘‘Figure a is usually called an s–t diagram. Distance, oh!’’ Shortly before, it had been explained in the text that s stands for strada
(from Latin), meaning the distance that was covered. One judge classiﬁes this
statement as ‘‘reading parts of problem statement very carefully; rereading’’, another
as ‘‘looking for words, concepts, ideas or patterns in problem statement’’, the third
ﬁnds the statement indicative of ‘‘rereading part of problem statement because it is
not yet understood’’, and the last judge thinks that it is typical for ‘‘noting conﬂict
between expectations and new interim outcome’’. In the history text-studying
protocols, the statement: ‘‘If England would sell more clothing, that would be bad
for . . . well, it doesn’t say here’’, was classiﬁed as ‘‘noting conﬂict between
expectations and new information in text’’ by two judges, as ‘‘failure to achieve
reading goal’’ by another judge, and as ‘‘deciding which pieces of text are important
and dismissing, skipping or skimming other pieces’’ by the last judge.
The cross-classiﬁcations by the four judges may have been caused by the fact that
the categories leave too much room for interpretation as to the purpose of the
actions of participants. Another problem was that the original taxonomy was too
speciﬁc and did not seem to be very well suited to the tasks used and the
participants involved. In many cases, the judges found it hard to allocate statements
of participants to any of the categories in the taxonomy, which seemed not to
describe the activities of the students particularly well. For these reasons and
because the correspondence between judges was clearly below par, it was decided to
construct new taxonomies.
Figure 1. An assignment in the physics text
222 J. Meijer et al.
First, an inventory was made of different categories assigned by the judges for the
same statements in the protocols. Then these were taken together and given a new,
more parsimonious label. It must be emphasised that this exercise was always carried
out with regard to the particular protocol fragment under consideration. That is to
say, each new label was not only based upon consideration of the common elements
in the descriptions of the former categories, but the corresponding protocol fragments
were also studied closely in order to identify quite concrete activities of the
participants. Thus, a combination of top-down (i.e., theoretically driven) and
bottom-up (i.e., empirically driven) strategies was used by combining pre-coded
categories and observed statements of participants simultaneously. For example, in
the taxonomy for text-studying, the former categories ‘‘noting conﬂict between
expectations and new information in text’’, ‘‘failure to achieve reading goal’’, and
‘‘deciding which pieces of text are important and dismissing, skipping or skimming
other pieces’’ were reinterpreted as one new category with the label ‘‘information
required not found’’. This new category was given its label on the basis of the former
categories assigned by the judges to the protocol fragment cited last. ‘‘Noting that the
end of a unit of meaning has occurred’’, ‘‘noting that overall comprehension has been
reached’’ and ‘‘noting that reading goal has been accomplished’’ were taken together
and labelled as the new category ‘‘reading goal(s) accomplished’’. An accompanying
protocol fragment is the statement: ‘‘Ok, that’s question one’’ from a participant who
has answered the ﬁrst question at the end of the text about the economic depression
by referring to particular parts of the text. ‘‘Rereading text searching for connections
between sentences or relating currently read text to a previously read portion’’,
‘‘mindful navigating through the text’’, ‘‘formulating reading plan’’, and ‘‘searching
text for information related to current point’’ were taken together in the new category
‘‘selecting particular piece of text to look for required information’’. A corresponding
protocol fragment reads: ‘‘Ehm, how did the crisis arise, just searching if it says
something about why it looked so bad’’. This was part of the participant’s response to an assignment to describe the causes of the depression.
For the problem-solving taxonomy, the same procedure was repeated. The former
categories ‘‘paraphrasing part of problem statement into more familiar terms’’,
‘‘noting interrelationships between variables in problem, keeping them in working
memory’’, and ‘‘summarising’’ were combined in the new category ‘‘paraphrasing,
summarising what was read’’. A corresponding protocol fragment is: ‘‘So she’s going
to drive around the race-circuit with Jos Verstappen and she will keep record of how
fast she’s going’’. This statement followed a fragment in the physics text, describing a
child that is taken for a ride by a Dutch racing driver.
At the beginning of the problem-solving task, that is, the text about motion with accompanying assignments, a table with entries for the speed of the racing car driving around the circuit is converted into a speed–time diagram. A participant remarks
after having looked at the diagram: ‘‘Yes, because you can also see where it stops and
so on . . .’’, apparently noting the correspondence between the diagram and the table.
This statement was scored by the four judges with the following categories: ‘‘deciding
which pieces of problem statement are relevant and dismissing, skipping or skimming
other pieces’’, ‘‘building a mental model of the task’’, ‘‘drawing temporary
conclusions, attempts to get the ‘big picture’ whilst abstracting from details’’, and
‘‘constructing alternative solutions’’. The combination was labelled as the new
category ‘‘transferring one representation into another’’. The same participant states
after having read the assignment given in Figure 1: ‘‘So it is a speed–time diagram just
like the one of Jos Verstappen’’. This was scored by the judges with the former
categories: ‘‘instantiating prior knowledge schemata’’, ‘‘look for related or analogous
problem with known solution method’’, and ‘‘formulating step-by-step action plan’’.
The new label given to this combination is ‘‘ﬁnding similarities, analogies’’. The
statement: ‘‘That I don’t understand’’ was categorised as ‘‘comprehension monitoring’’ and ‘‘failure to understand problem’’ by the judges. These two categories were
combined and given the new label ‘‘comprehension failure’’. The former categories
‘‘noting unfamiliar terms in problem statement’’ and ‘‘pinpointing confusions’’ were
used by the judges to characterise the statement of a clearly confused participant: ‘‘The
quantity has been put on the horizontal axis?’’. The new label for the combination of
these former categories is: ‘‘noticing inconsistency, confusion’’. And as a ﬁnal
example, the statement ‘‘Then I should calculate ﬁrst how many times ﬁve minutes go
into one hour’’ was scored as ‘‘decompose problem in subgoals, work on these case by
case’’ and ‘‘formulating step-by-step action plan’’ by the judges. The participant’s
statement clearly indicates the use of means-end analysis, a method very common
among children of this age when solving science or mathematics problems (Meijer &
Riemersma, 1986). Therefore, the more general label ‘‘subgoaling’’ was used for the
combination of these former categories.
After all divergent categorisations of the four judges had been combined and given
new, more general characterisations, a preliminary novel classiﬁcation scheme was
devised. It is shown in Figure 2.
This scheme was used to classify all statements in the ﬁve protocols that were
scored before by the four judges. This was done in order to check the completeness of
the scheme for the ﬁve protocols. It appeared that extra categories were needed,
because various statements could not be classiﬁed into any of the novel categories in
Figure 2. They are listed in Figure 3, accompanied by the statement that seemed to
make the addition necessary.
Because the re-analysis of only ﬁve protocols revealed the necessity of the extra 13
categories listed in Figure 3, it was decided to check the completeness of the
taxonomies using all 32 protocols, that is the 16 protocols of the history task and
another 16 protocols concerning the physics text. As expected, it was found that many
subjects conﬁned themselves to linear reading of the history text. Only when
answering the questions with the text or when encountering difﬁculties in
comprehending the text did participants show other activities as well. Many new
categories had to be added in order to be able to classify all statements found in the
protocols. Moreover, as the analysis proceeded, it appeared that many task-speciﬁc
categories, that is those that were used exclusively for coding either the text-studying
or problem-solving thinking-aloud protocols, had to be moved to the classiﬁcation of
general activities that could occur either during text-studying or problem-solving.
Figure 2. Preliminary novel classiﬁcation scheme for the interpretation of thinking-aloud protocols
Figure 3. Additional categories based on re-analysis of ﬁve protocols
For instance, note-taking also occurred while participants worked on the physics task,
for example: ‘‘The symbol for distance is s (of strada), just write that down’’. Prior
knowledge was also activated while reading the text about the economic depression in
the era of president Roosevelt, for example: ‘‘But isn’t that called Black Tuesday? Or
Black Thursday, that was in the history lessons last year’’.
The new categories were allocated to the original superordinate categories
‘‘orientation’’, ‘‘planning’’, ‘‘execution’’, ‘‘monitoring’’, and ‘‘evaluation’’ in order
to restore the hierarchical character of the taxonomy. None of the new categories
appeared to fit the superordinate ‘‘reflection’’ category. Instead, concluding, connecting by reasoning, inferring, paraphrasing, summarising, and commenting were construed as elaborative rather than reflective. Thus, the superordinate
reﬂection category was replaced by elaboration.
Finally, in order to condense the taxonomy and to avoid redundancies, the newly
derived categories were scrutinised for similarities. Similar categories were combined
and given a new, more encompassing label. In Figure 4, the new combinations are
listed. The complete revised hierarchical taxonomy is given in the Appendix.
Domain-Speciﬁcity of Metacognitive Activities
All statements of participants that were not literal citations of the text they were
reading, were coded according to the new taxonomy. The new taxonomy contains 70
categories, some of which were only sparsely found in the protocols. Because a
quantitative analysis on all categories would be cumbersome and not necessary for
establishing the correspondence between metacognitive activity across the history task
and the physics task, the codes within the six main categories for each task were counted. Thus a score for each of the six main categories in both domains was
established. The question concerning the domain-speciﬁcity of metacognitive
activities was investigated by means of conﬁrmatory factor analysis (CFA). It turned
out that models using all six main categories of metacognitive activities in both
domains did not converge, that is the parameters of these models could not be
estimated. It was therefore decided to revert to the more parsimonious distinction of
Flavell (1979). Thus, scores for planning and orientation were summed, as were those for evaluation and elaboration, resulting in three scores for metacognitive activities in history and in physics: planning, monitoring, and evaluation. The
scores for execution activities were left out, because most of these activities are of a
cognitive nature rather than metacognitive.
One could inspect the 15 correlations between the six variables under consideration,
and compare the magnitude of the correlations within domains and across domains,
respectively, to obtain an impression of the domain-speciﬁcity of metacognitive activity.
However, testing a single model that contains all variables is preferable. The three scores
for metacognitive activities in each domain were regressed on two latent variables,
representing metacognition in studying a history text and solving problems in physics,
respectively. The scores for metacognitive activity in history were regressed only on the history factor, and the scores for metacognitive activity in physics only on the physics factor. This implies that all factor loadings across domains are fixed at zero.
The correlation between both latent variables or factors was then estimated. This is a
correlation corrected for unreliability of the observed variables, since the error variances
of the observed variables are estimated as well. The computer programme AMOS 4 (Arbuckle & Wothke, 1999) was used to test the tenability of the model. A graphical
representation of the model is given in Figure 5.
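The model just described, two correlated latent factors with three indicators each and cross-loadings fixed at zero, implies the covariance structure Σ = ΛΦΛᵀ + Θ. A minimal numpy sketch of that structure; the loadings below are illustrative placeholders, not the estimates from the study:

```python
import numpy as np

# Two-factor CFA structure for the six metacognitive scores. Columns of lam:
# history factor, physics factor; cross-loadings are fixed at zero, as in the
# model described in the text. All numbers are hypothetical.
lam = np.array([
    [0.70, 0.00],   # planning history
    [0.80, 0.00],   # monitoring history
    [0.50, 0.00],   # evaluation history
    [0.00, 0.85],   # planning physics
    [0.00, 0.90],   # monitoring physics
    [0.00, 0.80],   # evaluation physics
])
phi = np.array([[1.00, 0.60],
                [0.60, 1.00]])                 # factor correlation set to .60 here
theta = np.diag(1.0 - (lam**2).sum(axis=1))    # residual variances (standardised)

sigma = lam @ phi @ lam.T + theta              # model-implied correlation matrix
```

With standardised indicators, the implied correlation between, say, planning in history and planning in physics is simply .70 × .60 × .85; estimating the free parameters against the observed matrix is what a program such as AMOS does.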
As can be seen, the factor loadings of the physics variables are consistently higher than .80, unlike those of the history variables. Thus, in the physics task,
planning, monitoring, and evaluation all appear to be good indicators of metacognitive activity.

Figure 4. Combinations of new categories

Monitoring appears to be the best indicator of metacognitive activity in history text-studying, whereas evaluation seems to be least related to
metacognitive activity in the history domain. However, of main interest is the
correlation between both factors, which represents the relationship between
metacognitive activity across both task-domains. This correlation of .60 has a
standard error of .23, showing that it is statistically signiﬁcant (one-tailed p ¼ .005).
value associated with the model is acceptable (w
¼ 13.17, df ¼ 8, p ¼ .11),
but the root mean square error of approximation (RMSEA) equals .21 with a 90%
conﬁdence interval ranging from 0 to .40. This large range is due to the small sample
size. Hu and Bentler (1999) recommend not to use RMSEA in small samples since
this ﬁt measure tends to overreject true-population models in such samples.
Incremental fit measures (IFI = .87, CFI = .85) do not resolve the issue, although they are generally too low as well when criteria for relatively large samples (i.e., N > 250) are applied. However, structural equation modelling has been used
successfully before with data from small samples (Veenman et al., 1997). Hayduk
(1987) showed that data from small samples can be analysed quite well with these
techniques, provided the data are gathered in experimental procedures rather than
mere correlational designs. Jackson (2003) indicates that it is not merely absolute
sample size that should be considered when evaluating the appropriateness of conducting
covariance structure modelling. Other factors, such as the number of indicators and
their reliabilities, as well as the ratio of sample size to the number of parameters to be
estimated, are also important. In a simulation study, Jackson found statistically
significant effects of the ratio of sample size to the number of parameters to be estimated on various measures of goodness of fit, but the effects were not dramatic.

Figure 5. Model for the relation between metacognitive activities during history-text reading and problem-solving in physics. Note: error variances at far right
The smallest ratio involved in his study was 1.25, whereas in the model presented here
it is approximately 1.23, that is 16 participants to 13 parameters (six factor loadings,
six error variances, and one correlation). In an attempt to shed more light on the
matter, the model was compared to two other models. In the ﬁrst alternative, the
correlation between both latent variables was constrained to unity, whereas it was ﬁxed
at zero in the second alternative. The χ² values associated with these respective models were 16.39 and 17.62, both with nine degrees of freedom. Only the difference between the χ² values associated with the base model and the second alternative model (17.62 − 13.17 = 4.45) exceeds the critical value of χ² with one degree of freedom (3.84 at α = .05). Thus, it is more likely that there is a
substantial correlation between metacognitive activity in both domains than that there
is no correlation across domains. Although one can doubt the fit of the model and the suitability of applying CFA to data derived from such a small sample, the model provides support for the domain-surpassing nature of metacognitive activities.
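The nested-model comparison above amounts to a χ² difference test with one degree of freedom. A short scipy sketch using the χ² values reported in the text:

```python
from scipy.stats import chi2

# Chi-square difference test for the nested model comparisons in the text.
# Base model: factor correlation free (chi2 = 13.17, df = 8). Alternatives:
# correlation fixed at 1 (chi2 = 16.39) or at 0 (chi2 = 17.62), df = 9 each.
base_chi2 = 13.17
alt_chi2 = {"correlation fixed at 1": 16.39,
            "correlation fixed at 0": 17.62}

critical = chi2.ppf(0.95, df=1)       # difference in df is 9 - 8 = 1
decisions = {label: (value - base_chi2) > critical
             for label, value in alt_chi2.items()}
```

Only fixing the correlation at zero is rejected (4.45 > 3.84), matching the conclusion that a substantial cross-domain correlation is more likely than none.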
Convergence with Other Methods for Scoring Protocols
Apart from the described scoring method using the new taxonomy, the protocols
were also scored by two other methods, based on Veenman’s work (Veenman, 1993;
Veenman et al., 1997; Veenman & Verheij, 2001; Veenman et al., 2004). The ﬁrst
method is quantitative, that is, based on counting the number of times participants
appeared to exhibit what Veenman calls metacognitive skill. Statements in the
thinking-aloud protocols that contained evidence of metacognitive skill were allocated
to four different categories and subsequently counted. First, each metacognitive
activity was categorised as orientation, planning, evaluation, or elaboration.
Orientation concerns familiarisation activities (e.g., ‘‘what is expected from me?’’).
Examples of planning are navigating through the text (e.g., ‘‘I am going back to the
previous page for a bit’’) and strategic statements (e.g., ‘‘ﬁrst I’ll make a summary of
this piece and then I’ll read on’’). Evaluation concerns comprehension monitoring
(e.g., ‘‘three slash ﬁve, three and a half, or three quarters ﬁve or three to ﬁve, yes,
that’s it’’) and error detection (e.g., ‘‘no, wait, that’s wrong’’). Finally, examples of
elaboration are recapitulation (e.g., ‘‘suppose that there is more overproduction and
that many people get ﬁred because there is not enough, eh, work, eh, and you need
machines, well, that’s about it’’), paraphrasing (e.g., ‘‘ﬁrst ﬁve minutes twelve
hundred metres’’), and drawing conclusions (e.g., ‘‘eh, well, than that’s the answer’’).
In the quantitative scoring method, the numbers of orientation, planning, evaluation, and elaboration activities were simply counted. Thus, the difference between this scoring method and the one based on the new taxonomy is that the latter uses a much more elaborate categorisation scheme, 70 versus 4 categories, although the 70 categories were afterwards collapsed into 6 main categories.
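The frequency method thus reduces each coded statement to one of four labels and tallies them; a stdlib sketch with hypothetical codes (not data from the study):

```python
from collections import Counter

# Hypothetical sequence of coded statements from one protocol; the labels
# are the four main categories of the quantitative scoring method.
coded = ["orientation", "planning", "elaboration", "evaluation",
         "planning", "elaboration", "elaboration", "evaluation"]

counts = Counter(coded)                     # frequency per category
quantity_sumscore = sum(counts.values())    # total for metacognitive skill
```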
The second method used by Veenman et al. (1997) is qualitative, that is, based on
judging the quality of participants’ metacognitive skills. In the qualitative scoring
method, every categorised statement was scored on a scale from 1 to 4, depending on
the apparent depth of processing of the activity. For example, one could get more
credits for elaboration if a completely new conclusion that was not present in the text,
was formulated instead of citing text literally. Alternatively, error detection without
correction (e.g., ‘‘no, wait, that’s wrong’’) would render fewer credits than error
detection with correction (e.g., ‘‘in one second the moped covers 24 times 0.3 metres
is 2 point 7 . . . eh . . . 7 point 2 metres’’), which in turn would render less credit
than error detection accompanied by subsequent analysis of the cause of the
error (e.g., ‘‘I’ve done it all wrong, I should have put the time here and the velocity . . .’’).
In both methods, a sumscore was subsequently calculated, reﬂecting a total for the
quantity and quality of metacognitive skill, respectively. Conﬁrmatory factor analytic
models for establishing the correspondence between the latter two methods and our
present nominal categorisation of metacognitive activity did not converge. An
alternative is to look at the Spearman rank order correlations between measures
derived from the new taxonomy and the other two methods. They are given in Table 2.
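Such rank-order correlations can be computed per pair of measures with scipy; the per-participant scores below are hypothetical stand-ins for two of the measures, not the study's data:

```python
from scipy.stats import spearmanr

# Hypothetical scores for six participants on two scoring methods:
# counts from the new taxonomy versus Veenman-style quality sumscores.
taxonomy_counts = [12, 7, 19, 4, 15, 9]
quality_sumscores = [30, 22, 41, 18, 35, 25]

rho, p = spearmanr(taxonomy_counts, quality_sumscores)
# The two rankings coincide in this toy example, so rho is exactly 1.0.
```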
There appears to be substantial agreement between our new method for scoring
metacognitive activity and the older methods for scoring metacognitive skill, especially
within domains (see upper left and lower right six entries in Table 2). The only
exception is that planning activity in physics is not related to the quality of
metacognitive skill in the same domain. Across domains, there is correspondence as well, although to a lesser extent. Planning, monitoring, and evaluation in physics are all three related to the quantity of metacognitive skill in history. This may reflect a general propensity to engage in metacognitive activity, regardless of domain. However, the quantity of metacognitive skill in physics is only related to evaluation activity in history. For the remainder, the scores based on the new taxonomy for history are related neither to the quality nor to the quantity of metacognitive skill in physics.
Discussion and Conclusion
The original taxonomy was handled differently by various judges in most cases. In this respect, Pressley and Afflerbach’s (1995) and Pressley’s (2000) descriptions of reading processes appear to be less adequate for use as a coding system for our thinking-aloud protocols. Pressley’s descriptions are based on exhaustive analyses of think-aloud data. It is possible that these descriptions are not applicable to the thinking-aloud protocols gathered in this study; that is, there may be a lack of fit between the rich repertoire of activities of the varied readers in the studies Pressley analysed and the present sample of relatively unsophisticated secondary school students. Another possibility is that different judges merely have different interpretations of the statements of participants and therefore allocate these to different specified categories. It appears that in order to achieve correspondence between judges scoring metacognitive activities in thinking-aloud protocols, one requires a taxonomy that is not too detailed.

Table 2. Spearman rank order correlations between various methods for scoring metacognitive activity

                     Quality history   Quantity history   Quality physics   Quantity physics
Planning history     .47*              .58**              −.04              .23
Monitoring history   .56*              .64**              .05               .19
Evaluation history   .65**             .65**              .40               .55*
Planning physics     .23               .47*               .29               .58**
Monitoring physics   .41               .62**              .45*              .81**
Evaluation physics   .47*              .59**              .68**             .85**

Note: *p < .05, **p < .01. After Bonferroni correction for the number of correlations tested, only values higher than .63 are statistically significant (p = .10/24 = .004). After Holm’s sequential correction, only values higher than .59 are significant.
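The Holm correction mentioned in the note to Table 2 is a step-down procedure: sort the p values and compare the i-th smallest (1-based) against α/(m − i + 1), stopping at the first failure. A stdlib sketch:

```python
# Holm's sequential correction: compare the i-th smallest p value against
# alpha / (m - i + 1); once one test fails, all larger p values fail too.
def holm_reject(pvals, alpha=0.05):
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    reject = [False] * m
    for step, idx in enumerate(order):          # step is 0-based
        if pvals[idx] <= alpha / (m - step):
            reject[idx] = True
        else:
            break
    return reject
```

Holm is uniformly more powerful than plain Bonferroni (α/m for every test), which is why the critical correlation in the note to Table 2 drops from .63 to .59.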
A disadvantage of increasing grain-size in the descriptions of metacognitive
activities is that one loses sight of the exact nature of the activities involved. For
instance, the new category for error detection comprises recording of the error itself,
possible correction of the error, but also keeping track of reading position in a text or
keeping track of the steps in a calculation, which appear to be difﬁcult for the judges
to distinguish. However, correspondence between judges is a prerequisite for
achieving interrater reliability, which in turn is a precondition for using coded
protocols as a basis for compiling quantitative data. Protocol analyses of new data, gathered among a new sample of forty-three 13-year-olds, show that the interrater reliability based on the new taxonomy is satisfactory. Pearson’s contingency
coefﬁcient for nominal data was larger than .96 for all pairs of raters. This high
level of correspondence would probably not have been reached without extensive
preparatory sessions of the judges. In these sessions independently scored thinking-
aloud protocols were compared and discussed. Another disadvantage of the newly
derived taxonomy is that the original distinction of three to four hierarchical levels has
vanished. However, there is no reason to maintain more levels if the activities that
were uncovered in the exploration of the thinking-aloud protocols cannot be
fashioned into a hierarchy themselves.
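The interrater figure reported above, Pearson's contingency coefficient, is C = √(χ²/(χ² + N)) computed on a pair of raters' cross-classification table. A sketch with a hypothetical 3 × 3 table:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical cross-classification for one pair of raters: rows are the
# categories assigned by rater 1, columns those assigned by rater 2.
table = np.array([[10, 0, 0],
                  [0, 10, 0],
                  [0, 0, 10]])
chi2_stat, p, dof, expected = chi2_contingency(table, correction=False)
n = table.sum()
C = np.sqrt(chi2_stat / (chi2_stat + n))   # Pearson's contingency coefficient
```

Note that C cannot reach 1: for a k × k table with perfect agreement its maximum is √((k − 1)/k), about .82 here, so values above .96 presuppose tables with many categories.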
There is support for the convergent validity of the new method for scoring
protocols and the two methods developed earlier by Veenman (1993). Because the
methods used by Veenman deviate substantially from the method that was developed in the present study, it may be concluded that counts of metacognitive activities co-vary with judgements of the quality of metacognitive skill. However, it
is possible that mere counts of metacognitive activities as operationalised in various
coding systems are biased by the frequency of statements of participants. For
example, in this study every statement of participants which deviated from the
original text, was scrutinised and coded subsequently. Merely counting such
statements may result in an unwarranted operationalisation of (meta)cognitive
activity. One should therefore be careful in attaching too much value to the
evidence concerning convergent validity. We should not infer a high level of
metacognitive activity merely because a participant is verbose.
A substantial correlation between metacognitive activities across both task-domains
was established. Metacognitive activities partially surpass the domains of studying
text and answering questions in history and studying text and completing assignments in
physics, respectively. Students who tend to exert metacognitive activity in one of
these domains, will tend to do so as well in the other. This implies that metacognitive
activity is not completely task-speciﬁc. This ﬁnding may very well bear on the transfer
of the use of metacognitive strategies.
The resulting classiﬁcation scheme is merely a taxonomy and therefore static,
whereas in fact, the protocols reveal interdependencies between the various
categories. That is to say, the strategies used by the participants consist of sequences
of activities. An example is given in Figure 6. It is a diagram of a frequently found
sequence of activities when participants are looking for information in a text when
required to answer a question about that text.
Instead of counting the metacognitive activities in the taxonomy encountered
in the thinking-aloud protocols, one could alternatively look for patterns of
sensible sequences of these metacognitive activities. Establishing the correlation of
the occurrences of these sequences with learning results may shed light on the
metacognitive strategies that contribute to learning.
Notwithstanding the correlation that was found between metacognitive activities in
both domains, one might wonder if the exercise of revising and rebuilding a
presupposed taxonomy should be repeated for each task anew. Considering that
almost 60% of the categories in the taxonomies (41 out of 70) were applicable to the
history task as well as for the physics task, and were indeed found in both types of
thinking-aloud protocols, it may be expected that at least part of the taxonomy will
stand up when other tasks are involved. A last issue pertains to other variables that
may inﬂuence the categories in the taxonomy, such as age of the participants and
difﬁculty of the task. It seems plausible to assume that the metacognitive repertoire
expands with age until adulthood is reached. It is thus likely that new, more complex
categories must be added to the taxonomy when older participants are involved. At
present, a study with 15-year-olds is underway. Task difﬁculty should provoke
metacognitive activity. As it increases, ready-to-use strategies will fail and one must resort to the use of general, so-called weak strategies, wherein metacognition plays an important role.

Figure 6. Sequence of activities in a text-searching strategy
Acknowledgements
This research was made possible by a grant from the Netherlands Organisation for
Scientiﬁc Research. The authors wish to thank Manita van der Stel for her contribution
to the gathering of data and the development of parts of the materials used in the research.
References
Alexander, J. M., Carr, M., & Schwanenﬂugel, P. J. (1995). Development of metacognition in
gifted children: Directions for future research. Developmental Review, 15, 1 – 37.
Arbuckle, J. L., & Wothke, W. (1999). Amos 4.0 user’s guide. Chicago: SmallWaters Corporation.
Brown, A. L., Bransford, J. D., Ferrara, R. A., & Campione, J. C. (1983). Learning, remembering,
and understanding. In J. H. Flavell & E. M. Markman (Eds.), Handbook of child psychology
(Vol. 3, pp. 77 – 166). New York: Wiley.
Butler, D. L. (1998). The strategic content learning approach to promoting self-regulated learning:
A report of three studies. Journal of Educational Psychology, 90(4), 682 – 697.
Desoete, A., Roeyers, H., Buysse, A., & De Clercq, A. (2002). Dynamic assessment of
metacognitive skills in young children with mathematics-learning disabilities. In G. M. v. d.
Aalsvoort, W. C. M. Resing, & A. J. J. M. Ruijssenaars (Eds.), Learning potential assessment and
cognitive training: Actual research and perspectives in theory building and methodology (Vol. 7,
pp. 307 – 333). Oxford: Elsevier.
Dunlosky, J., & Hertzog, C. (1998). Training programs to improve learning in later adulthood:
Helping older adults educate themselves. In D. Hacker, J. Dunlosky, & A. Graesser (Eds.),
Metacognition in educational theory and practice (pp. 249 – 275). Hillsdale, NJ: Erlbaum.
Fernandez-Duque, D., Baird, J. A., & Posner, M. I. (2000). Executive attention and metacognitive
regulation. Consciousness and Cognition, 9, 288 – 307.
Flavell, J. H. (1979). Metacognition and cognitive monitoring: A new area of cognitive-developmental inquiry. American Psychologist, 34(10), 906 – 911.
Ford, J. K., Weissbein, D. A., Smith, E. M., Gully, S. M., & Salas, E. (1998). Relationships of goal
orientation, metacognitive activity, and practice strategies with learning outcomes and
transfer. Journal of Applied Psychology, 83(2), 218 – 233.
Hacker, D., Dunlosky, J., & Graesser, A. (Eds.). (1998). Metacognition in educational theory and
practice. Hillsdale, NJ: Erlbaum.
Hattie, J., Biggs, J., & Purdie, N. (1996). Effects of learning skills intervention on student learning:
A meta-analysis. Review of Educational Research, 66(2), 99 – 136.
Hayduk, L. A. (1987). Structural equation modeling with LISREL: Essentials and advances. Baltimore, MD: Johns Hopkins University Press.
Hu, L., & Bentler, P. M. (1999). Cutoff criteria for ﬁt indexes in covariance structure analysis:
Conventional criteria versus new alternatives. Structural Equation Modeling, 6(1), 1 – 55.
Jackson, D. L. (2003). Revisiting sample size and number of parameter estimates: Some support for
the N:q hypothesis. Structural Equation Modeling, 10(1), 128 – 141.
Kelemen, W. L., Frost, P. J., & Weaver, C. A. (2000). Individual differences in metacognition:
Evidence against a general metacognitive ability. Memory & Cognition, 28(1), 92 – 107.
Kincannon, J., Gleber, C., & Kim, J. (1999, February). The effects of metacognitive training on
performance and use of metacognitive skills in self-directed learning situations. Paper presented at the
National Convention of the Association for Educational Communications and Technology.
Lester, F. K., & Garofalo, J. (1982). Mathematical problem solving. Issues in research. Philadelphia:
Franklin Institute Press.
Meijer, J., & Riemersma, F. (1986). Analysis of thinking aloud protocols. Instructional Science,
15(1), 3 – 19.
Nelson, T. O. (1996). Consciousness and metacognition. American Psychologist, 51, 102 – 116.
O’Neil, H. F., & Abedi, J. (1996). Reliability and validity of a state metacognitive inventory:
Potential for alternative assessment. The Journal of Educational Research, 89(4), 234 – 245.
Pintrich, P. R., & De Groot, E. V. (1990). Motivational and self-regulated learning components of
classroom academic performance. Journal of Educational Psychology, 82(1), 33 – 40.
Pintrich, P. R., Wolters, C. A., & Baxter, G. P. (2000). Assessing metacognition and self-regulated
learning. In G. Schraw & J. C. Impara (Eds.), Issues in the measurement of metacognition
(pp. 43 – 97). Lincoln, NE: Buros Institute of Mental Measurements.
Polya, G. (1957). How to solve it. Princeton, NJ: Princeton University Press.
Pressley, M. (2000). Development of grounded theories of complex cognitive processing:
Exhaustive within- and between-study analyses of think-aloud data. In G. Schraw &
J. C. Impara (Eds.), Issues in the measurement of metacognition (pp. 261 – 296). Lincoln, NE:
Buros Institute of Mental Measurements.
Pressley, M., & Afﬂerbach, P. (1995). Verbal protocols of reading: The nature of constructively responsive
reading. Hillsdale, NJ: Erlbaum.
Schoenfeld, A. H. (1985). Mathematical problem solving. Orlando, FL: Academic.
Schoenfeld, A. H. (1987). Cognitive science and mathematics education. Hillsdale, NJ: Erlbaum.
Schoenfeld, A. H. (1992). Learning to think mathematically: Problem solving, metacognition and
sense making in mathematics. In D. A. Grouws (Ed.), Handbook of research on mathematics
teaching and learning (pp. 334 – 370). New York: Macmillan.
Schraw, G., & Dennison, R. S. (1994). Assessing metacognitive awareness. Contemporary
Educational Psychology, 19, 460 – 475.
Schraw, G., Dunkle, M. E., Bendixen, L. D., & Roedel, T. D. (1995). Does a general monitoring
skill exist? Journal of Educational Psychology, 87, 433 – 444.
Schraw, G., & Moshman, D. (1995). Metacognitive theories. Educational Psychology Review, 7(4),
351 – 371.
Sperling, R. A., Howard, B. C., Miller, L. A., & Murphy, C. (2002). Measures of children’s
knowledge and regulation of cognition. Contemporary Educational Psychology, 27, 51 – 79.
Sternberg, R. J. (1985). Beyond IQ: A triarchic theory of human intelligence. Cambridge: Cambridge University Press.
Van Hout-Wolters, B. H. A. M., Simons, P. R. J., & Volet, S. (2000). Active learning: Self-directed
learning and independent work. In P. R. J. Simons, J. L. van der Linden, & T. Duffy (Eds.),
New learning (pp. 73 – 89). Dordrecht, The Netherlands: Kluwer.
Van Streun, A. (1990). Heuristisch wiskunde-onderwijs [Heuristic mathematics education].
Groningen, The Netherlands: Rijksuniversiteit Groningen.
Veenman, M. V. J. (1993). Intellectual ability and metacognitive skill: Determinants of discovery learning
in computerized learning environments. Doctoral dissertation, University of Amsterdam.
Veenman, M. V. J. (in press). The assessment of metacognitive skills: What can be learned from
multi-method designs? In B. Moschner & C. Artelt (Eds.), Lernstrategien und Metakognition:
Implikationen für Forschung und Praxis [Learning strategies and metacognition: Implications for
research and practice]. Berlin, Germany: Waxmann.
Veenman, M. V. J., Elshout, J. J., & Meijer, J. (1997). The generality vs. domain-specificity of
metacognitive skills in novice learning across domains. Learning and Instruction, 7(2), 187 – 209.
Veenman, M. V. J., Prins, F. J., & Verheij, J. (2003). Learning styles: Self-reports versus thinking-
aloud measures. British Journal of Educational Psychology, 73, 357 – 372.
Veenman, M. V. J., & Verheij, J. (2001). Technical students’ metacognitive skills: Relating general vs.
speciﬁc metacognitive skills to study success. Learning and Individual Differences, 13(3), 259 – 272.
Veenman, M. V. J., Wilhelm, P., & Beishuizen, J. J. (2004). The relation between intellectual and
metacognitive skills from a developmental perspective. Learning and Instruction, 14(1), 89 – 109.
Wang, M. C., Haertel, G. D., & Walberg, H. J. (1990). What inﬂuences learning? A content
analysis of review literature. Journal of Educational Research, 84(1), 30 – 43.
Winne, P. H. (1996). A metacognitive view of individual differences in self-regulated learning.
Learning and Individual Differences, 8(4), 327 – 353.
Appendix. The new taxonomy of metacognitive activities
Activating prior knowledge (APK)
Establishing task demands (ETD)
Identifying or repeating important information (to be remembered) (IMP)
Studying, rereading question carefully (SQC)
Fill in a value, establish givens (FV)
Observing (tables, diagrams, and so on) (O)
Keep on reading hoping for clarity further on (KRH)
Looking for particular information in text (LPI)
Organising thought by questioning oneself (OT)
Selecting particular piece of text to look for required information (SPP)
Using external source to get explanation (UES)
Change of strategy by reversing arguments (e.g., cause and consequence) (CSA)
Backward reasoning, decision to chain backward (BR)
Deciding to read difficult parts of text again (DRD)
Choosing units (CUN)
Reading notes (RN)
Decision to change strategy on basis of interim outcome (DCS)
Formulate action plan (FAP)
Give meaning to axes of graphs, setting up a coordinate system (GMA)
Simplifying problem by dropping restriction(s) or identifying restrictions for solution (SIM)
Commenting on (explanation in) text (CET)
Error in technical reading (ETR)
Note-taking, underlining, circling, highlighting (NUL)
Reacting to question of experimenter (RE)
Reading aloud (R)
Skipping word(s) (SK)
Concluding, answering without checking text, offering explanations (CEC)
Converting units (CU)
Empathising (EM)
Estimating (EST)
Executing action plan (EAP)
Reading out symbolic convention (e.g., m/s) literally or failure to pronounce it (RSC)
Transferring from one representation into another (TR)
Checking memory capacity (CMC)
Claiming (partial) understanding (CPU)
Comprehension failure (CF)
Error detection (plus correction), keeping track (ED)
Found required information (FRI)
Information required not found (IRF)
Noticing inconsistency, confusion, checking plausibility (NIPS)
Noticing unfamiliar words or terms (NUT)
Noticing retrieval failure (NRF)
Commenting on task demands or available time (TD)
Deliberately pausing, going back in text (DP)
Claiming progress in understanding (CLU)
Noting lack of knowledge (NK)
Give meaning to symbols or formulae (GMS)
Noticing differences (ND)
Using former interim outcome (UFO)
Explaining strategy, justifying (EGJ)
Finding similarities, analogies (FSA)
Uncertainty about conclusion (UC)
Reading goal(s) accomplished (RGA)
Give up, quit (GQ)
Connecting parts of text by reasoning (CPR)
Paraphrasing, summarising what was read (PS)
Summarising by rereading (sub)headings or words in bold print (SRH)
Summarising (entire) text by dates and events, checking representations, words and symbols;
preparing for posttest (SUM)
Commenting on difﬁculty of problem (CDP)
Commenting on personal habit (CPH)
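As an illustration of how the taxonomy supports the frequency method of protocol coding described in the article, the sketch below (hypothetical Python, using only a small excerpt of the activity codes above; the variable names and the coded protocol are invented for the example) tallies how often each activity code occurs in a sequence of coded thinking-aloud protocol segments.

```python
from collections import Counter

# Excerpt of the code-to-activity mapping from the appendix; a full
# implementation would list every abbreviation in the taxonomy.
TAXONOMY = {
    "APK": "Activating prior knowledge",
    "ETD": "Establishing task demands",
    "PS": "Paraphrasing, summarising what was read",
    "ED": "Error detection (plus correction), keeping track",
    "GQ": "Give up, quit",
}

def frequency_profile(coded_segments):
    """Tally occurrences of each taxonomy code in a coded protocol
    (the 'frequency method'); unknown codes are ignored."""
    counts = Counter(code for code in coded_segments if code in TAXONOMY)
    return {code: counts.get(code, 0) for code in TAXONOMY}

# Hypothetical coded protocol: one activity code per protocol segment.
protocol = ["APK", "PS", "PS", "ED", "APK", "PS"]
print(frequency_profile(protocol))
# -> {'APK': 2, 'ETD': 0, 'PS': 3, 'ED': 1, 'GQ': 0}
```

Such per-code frequency profiles, computed per participant and per task, are the kind of data that could then be compared across domains (history vs. physics) or correlated with quality judgements.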