Comparison of the Reliability and Validity of Scores from Two Concept-Mapping Techniques


Abstract

This paper reports the results of a study that compared two concept-mapping techniques, one high-directed, "fill-in-the-map," and one low-directed, "construct-a-map-from-scratch." We examined whether: (1) skeleton map scores were sensitive to the sample of nodes or linking lines to be filled in; (2) the two types of skeleton maps were equivalent; and (3) the two mapping techniques provided similar information about students' connected understanding. Results indicated that fill-in-the-map scores were not sensitive to the sample of concepts or linking lines to be filled in. Nevertheless, the fill-in-the-nodes and fill-in-the-lines techniques were not equivalent forms of fill-in-the-map. Finally, high-directed and low-directed maps led to different interpretations about students' knowledge structure. Whereas scores obtained under the high-directed technique indicated that students' performance was close to the maximum possible, the scores obtained with the low-directed technique revealed that students' knowledge was incomplete compared to a criterion map. We concluded that the construct-a-map technique better reflected differences among students' knowledge structure. © 2001 John Wiley & Sons, Inc. J Res Sci Teach 38: 260-278, 2001.
Maria Araceli Ruiz-Primo,
Susan E. Schultz, Min Li, and Richard J. Shavelson
Stanford University/CRESST, School of Education, 485 Lasuen Mall, Stanford,
California 94305-3096
Received 1 September 1999; accepted 31 August 2000
*Correspondence to: Maria Araceli Ruiz-Primo. E-mail:
The original version of this paper was presented at the 1998 AERA Annual Meeting, San Diego, CA.
Concept maps have been used to assess students' knowledge structure, especially in science
education (Novak, 1990). The justification for assessing students' knowledge structures is based
on the idea that relating concepts that belong to the same domain is an important characteristic
of scientific literacy (e.g., Bybee, 1996; Moore, 1995). Theory and research have shown that
understanding a subject domain such as science is associated with a rich set of relations among
important concepts in the domain (Novak, 1998; Novak & Gowin, 1984; Novak, Gowin, &
Johansen, 1983; Novak & Ridley, 1988). We know, for example, that successful learners develop
elaborate and highly integrated frameworks of related concepts (Mintzes, Wandersee, & Novak,
1997), just as experts do (Chi, Glaser, & Farr, 1988; Glaser, 1991). Research has shown that
highly organized structures facilitate problem solving and other cognitive activities (e.g.,
generating explanations or rapidly recognizing meaningful patterns; Baxter, Elder, & Glaser,
1996; Mintzes et al., 1997), and that differences in the performance of experts and novices are
due, largely, to how knowledge is structured in their memories (Chi et al., 1988; Glaser, 1991).
Concept maps provide a "picture" of how key concepts in a domain are mentally organized/
structured by students. With this assessment technique, students are asked to link pairs of
concepts in a science domain and label the links with a brief explanation of how the two concepts
go together. Although concept maps have been used in both large-scale and classroom
assessment, a wide variety of techniques are called concept maps, and little is known about the
reliability and validity of scores produced by these different mapping techniques (e.g., Ruiz-
Primo & Shavelson, 1996). We suspect that the observed characteristics of the representation of
a student's knowledge structure depend to a large extent on how the representation is elicited.
Simply put, the method used to ask students to represent their knowledge can affect the
representation they provide as well as the scores they obtain (Ruiz-Primo, Schultz, & Shavelson,
1996; Ruiz-Primo & Shavelson, 1996; Ruiz-Primo, Shavelson, & Schultz, 1997).
Through a series of studies we sought to increase our understanding of how different
mapping techniques affect the representation and interpretation of a student's knowledge
structure. In this paper, we provide reliability and validity evidence on the effects of two mapping
techniques, "fill-in-the-map" and "construct-a-map."
Concept Map Assessment
We define a concept map as a graph in which the nodes represent concepts, the lines between
nodes represent relations, and the labels on the lines represent the nature of the relations. The
combination of two nodes and a labeled line is called a proposition, the fundamental unit of the
map. Our characterization of a concept map assessment as based on its three components (task,
response format, and scoring system) has revealed the enormous variation in mapping
techniques used in research and practice (see Ruiz-Primo & Shavelson, 1996).
The characteristics of the task, the response format, and the scoring system hold the key for
tapping what concept-map-based assessments are intended to evaluate: knowledge structure (or
"connected understanding" for some authors). The assessment task, for example, can vary in the
constraints (directedness) it imposes on a student in eliciting her representation of structural
knowledge. One dimension in which directedness varies lies in what is provided for use in the
concept map (Figure 1; see Surber, 1984).
If the characteristics of the assessment task fall on the left extreme, the student's
representation is probably determined more by the mapping technique (or the assessor, if you will)
than by the student's own knowledge or connected understanding. If the assessment task falls on
the right extreme, the student is free to decide which and how many concepts to include in her
map, which concepts are related, and which words to use for explaining the relation. Asking the
student to generate the concepts to construct her map provides a good piece of information about
the student's knowledge in a particular domain (e.g., are the concepts selected by the student
relevant/essential to the topic?). However, this openness may also be undesirable in practice. In
one of our studies (Ruiz-Primo et al., 1996) we compared two mapping techniques that differed
on whether the concept sample was student-generated or assessor-generated. We found that
under the student-generated condition, some students provided related but not relevant/essential
concepts to the topic. An irrelevant but related concept (e.g., "chemistry" within the topic of
ions, molecules, and compounds) led students to provide many accurate but irrelevant
relationships between concepts within the topic in which students were assessed (e.g., compounds
"is a" concept in chemistry). This situation led to artificially high scores. Furthermore, the
student-generated sample technique proved challenging when developing a scoring system since
each concept map might have a unique set of concepts and relations.
The characteristics of the assessment task have an impact on the response format and the scoring system. For
example, a task that provides the structure of the map will probably provide such a structure in the student's
response format. If the task provides the concepts to be used, the scoring system will not focus on the
"appropriateness of the concepts" used in a map. The combination of the task, the response format, and the
scoring system is what determines a mapping technique.
We suspect the cognitive demands imposed on students by high-directed techniques are
different from low-directed techniques. Furthermore, although high-directed techniques may be
responded to and scored quickly, they also are more likely to misrepresent a student's knowledge
structure by imposing a structure on their responses. In this study we examined the reliability and
validity of two mapping techniques, one high-directed and the other low-directed.
Defining the Two Mapping Techniques
Some researchers (e.g., Schau & Mattern, 1997) have argued that asking students to draw a
map from scratch imposes too high a cognitive demand to produce a meaningful representation
of their knowledge. They proposed an alternative technique, "fill-in-the-map." In what follows
we describe both techniques, fill-in-the-map and construct-a-map.
Fill-in-the-Map. The fill-in-the-map technique provides students with a concept map where
some of the concepts and/or the linking words have been left out. Students fill in the blank nodes
or blank linking lines (e.g., Anderson & Huang, 1989; McClure & Bell, 1990; Schau, Mattern,
Weber, Minnick, & Witt, 1997; Surber, 1984). The response format is straightforward; students
fill in the blanks and their responses are scored correct or incorrect. Arguments can be made for
(e.g., ease of administration, scoring, and retrieval of propositions from long-term memory) and
against (e.g., imposes a structure on a student's knowledge) the technique. We posit that as
students' subject matter knowledge increases, the structure of their maps should increasingly
reflect the structure of the domain as held by experts (see Glaser, 1996; Shavelson, 1972, 1974).
When a structure is imposed on the relations between concepts, it is difficult to know whether or not
students' knowledge structures are becoming increasingly similar to experts'. Structure of
representation, however, is not the only issue to consider. Usually, with "fill-in" students are
provided with linking words in the skeleton map and they select the concepts from a list of
concepts. Therefore, there is less evidence gathered on students' connected understanding. In our
research using the construct-a-map technique we found that the linking words students used to
relate two concepts provided an insight into a student's understanding in a particular content
domain (e.g., Ruiz-Primo et al., 1996).
Figure 1. Degree of directedness in the concept map assessment task.
Construct-A-Map From Scratch. The "construct-a-map" technique varies as to how much
information is provided by the assessor (Figure 1). The assessor may provide the concepts and/or
linking words or may ask students to construct a hierarchical or non-hierarchical map. The
response format is simply a piece of paper on which students construct a map. Scoring systems
vary from counting the number of nodes and linking lines (not recommended) to evaluating the
accuracy of propositions (see Ruiz-Primo & Shavelson, 1996).
This mapping technique, however, has been considered problematic for large-scale
assessment because students need to be trained to use maps and scoring is difficult and
time-consuming (e.g., Schau et al., 1997). Our research has tried to overcome these two problems (see
Ruiz-Primo et al., 1996, 1997). We designed a 50-minute program to teach students how to
construct concept maps. The program proved to be effective in achieving this goal with more
than 100 high school students. Moreover, to find an efficient scoring system we have explored
different types of scores: some based only on the propositions, others using a criterion map. Map
propositions can be scored for accuracy and comprehensiveness or simply on whether the
propositions are correct or incorrect. Based on this differentiation we have studied three types of
scores: proposition accuracy score (the sum of individual proposition scores obtained on a
student's map); convergence score (the proportion of accurate propositions in the student's map
out of all possible propositions in the criterion map); and salience score (the proportion of correct
propositions out of all propositions in the student's map). All three score types have yielded high
interrater reliability coefficients, above .90, even when the quality and accuracy of the
propositions are judged.
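The three score types can be illustrated with a minimal sketch. It assumes each proposition in a student's map has already been given a numeric quality score in which 0 means incorrect (as in the 5-point scale described later in the paper), and it treats any proposition scored above 0 as "accurate" or "correct." The function names, the list representation, and that threshold are illustrative assumptions, not the authors' scoring software.

```python
# Hypothetical sketch of the three score types; names and the ">0 means
# accurate" rule are assumptions made for illustration.

CRITERION_LINKS = 38  # number of links in the study's criterion map


def proposition_accuracy(scores):
    """Sum of the individual proposition scores on a student's map."""
    return sum(scores)


def convergence(scores, criterion_links=CRITERION_LINKS):
    """Accurate propositions as a proportion of the criterion-map links.

    Note: this simple version does not check that each accurate
    proposition also appears in the criterion map, so it can exceed 1
    for a student who draws more accurate links than the criterion map.
    """
    accurate = sum(1 for s in scores if s > 0)
    return accurate / criterion_links


def salience(scores):
    """Correct propositions as a proportion of all propositions drawn."""
    if not scores:
        return 0.0
    return sum(1 for s in scores if s > 0) / len(scores)


# Example: a map with six propositions, two of them incorrect.
example = [4, 3, 0, 2, 1, 0]
print(proposition_accuracy(example), convergence(example), salience(example))
```

The sketch makes the difference in denominators explicit: convergence is judged against the criterion map, whereas salience is judged against the student's own map.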
This study explored the technical characteristics of the "fill-in-the-map" and
"construct-a-map" techniques. More specifically, we examined whether: (a) the mapping techniques can be
considered equivalent; (b) fill-in-the-map scores are sensitive to the nodes (concepts) selected to
be filled in (construct-a-map scores have proven not to be sensitive to the sample of concepts
used; Ruiz-Primo et al., 1996); and (c) fill-in-the-map scores are sensitive to the linking lines
selected to be filled in (linking words).
One hundred and fifty-two high school chemistry students and two chemistry teachers from
the Bay area participated in the study. Seventy-three were males and 79 were females. The
majority of the students were Caucasian (58.5%), followed by Asian (13.2%), Hispanic (4.6%),
African American (3.3%), and other ethnicities (1.5%; e.g., Indian). The rest of the students,
about 19%, did not provide information about their ethnicity. The proportion of ethnicity was
consistent across seven classes.
Standardized Testing and Reporting (STAR) test scores on science, language, and
expression were collected for about 51% of the students. The remaining students either did
not permit review of their academic files or did not take the STAR test. The test was taken the
same year in which the data were collected in this study.
Students were in one of seven chemistry classes. Classes 1, 2, 5, 6, and 7, were taught by
Teacher 1 (6 years of teaching experience), and classes 3 and 4 by Teacher 2 (1 year of teaching
experience). Classes 1, 2, 3, and 4 (96 students) were considered advanced, the remainder (56
students) were regular chemistry classes.
Students and teachers were trained to construct concept maps, including the fill-in-the-map
technique, with the same 50-minute training program used in previous studies (Appendix A; see
Ruiz-Primo et al., 1996, 1997). To evaluate the training, 25% of the maps constructed by
students at the end of the training session were randomly sampled and analyzed. The analysis
focused on whether students used the concepts provided on the list, labeled the lines,
and provided accurate propositions. Results indicated that 92% of the students used all the
concepts provided in the list, all used labeled lines, and all provided four or more accurate
propositions. We concluded that the program succeeded in teaching students to construct
concept maps.
Selection of Concepts and Development of the Criterion/Skeleton Maps
To identify the structure of the skeleton map for the fill-in mapping technique, we assumed
that: (1) there is some "agreed-upon organization" that adequately reflects the structure of a
content domain; (2) "experts" in that domain (in this context, the teachers who participated in
the study) can agree on the structure; and (3) experts' concept maps provide a reasonable
representation of the subject domain (e.g., Glaser, 1996). Therefore, the skeleton maps used were
based on the criterion map.
We chose the topic, "Chemical Names and Formulas," as the domain for sampling the
concepts used in the study.
The two teachers of the seven classes and researchers (the second
author was a high school chemistry teacher for 10 years) were involved in the process of
selecting the concepts and creating the criterion map. Teachers were asked to identify the
concepts they considered to be the most important in the unit. Researchers also selected the most
important concepts by carefully reviewing the text used to teach the topic. Appendix B provides
a brief description of the procedure followed to select the key concepts and to define the criterion
map (for details see Ruiz-Primo et al., 1996).
The "agreed" upon links across teachers' and researchers' maps were represented in the
criterion map and considered the "substantial" links that students were expected to know after
instruction on the topic (Figure 2). The criterion map was used as the master map for the purpose
of constructing the four skeleton maps. Concepts selected for the blank nodes on the skeleton
maps were randomly sampled from the key-concept list. Linking lines selected to be filled in on
the skeleton maps were sampled from the linking lines on the criterion map. Propositions
provided in the skeleton maps were taken from the criterion map. The concepts for the construct-
a-map technique were all those on the key-concept list.
Although we used this topic ("Chemical Names and Formulas") in previous studies, the selection of concepts for
mapping was carried out again since different teachers participated on this occasion.
To evaluate whether the fill-in-the-map scores were sensitive to the sample of nodes or
linking lines to be filled in, we used a 2 × 2, node (concept) sample by linking-line sample
design. Four 20-node skeleton maps were constructed. In two of the maps, 12 nodes (60% of the
nodes) were left blank. In the other two skeleton maps, 12 linking lines (31.5% of the linking
lines in the criterion map) were left blank (i.e., no linking words). Concepts and linking lines to
be left blank were randomly selected from the list of key concepts and the list of propositions in the
criterion map. The four skeleton maps were as follows: A, skeleton map with Sample 1 of
nodes left blank; B, skeleton map with Sample 2 of nodes left blank; C, skeleton map with
Sample 1 of linking lines left blank; and D, skeleton map with Sample 2 of linking lines left
blank (Figure 3).
Within each of the seven classes students were randomly assigned to one of four sequences
of skeleton maps: Sequence 1, skeleton map A followed by skeleton map C; Sequence 2,
skeleton map A followed by skeleton map D; Sequence 3, skeleton map B followed by skeleton
map C; and Sequence 4, skeleton map B followed by skeleton map D. Students were tested on
four occasions (Table 1): On Occasion 1, all students constructed a concept map from scratch
using the 20 concepts provided by the assessors. On Occasion 2, half the students filled in
skeleton map A and half filled in skeleton map B. On Occasion 3, half the students filled in
skeleton map C and half filled in skeleton map D. On Occasion 4, all students received a 30-item
multiple-choice test designed by the teachers and researchers.
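The within-class assignment described above can be sketched as follows. The paper states only that students were randomly assigned to the four sequences within each class; the balanced round-robin after shuffling, the function name `assign_sequences`, the seed, and the student identifiers are all assumptions made for illustration.

```python
import random
from collections import Counter

# Sequence number -> (fill-in-the-nodes map, fill-in-the-lines map)
SEQUENCES = {1: ("A", "C"), 2: ("A", "D"), 3: ("B", "C"), 4: ("B", "D")}


def assign_sequences(students_by_class, seed=0):
    """Randomly assign students to sequences 1-4 within each class.

    Assumption: assignment is balanced by shuffling each class roster
    and dealing students to the four sequences in turn.
    """
    rng = random.Random(seed)
    assignment = {}
    for class_id, students in students_by_class.items():
        roster = list(students)
        rng.shuffle(roster)
        for position, student in enumerate(roster):
            assignment[student] = 1 + position % 4
    return assignment


# Hypothetical rosters for two classes of eight students each.
rosters = {
    "class_1": [f"c1_s{i}" for i in range(8)],
    "class_2": [f"c2_s{i}" for i in range(8)],
}
assigned = assign_sequences(rosters, seed=1)
print(Counter(assigned.values()))
```

Because Sequences 1 and 2 share map A and Sequences 3 and 4 share map B, a balanced assignment of this kind gives roughly half the students each node map on Occasion 2 and roughly half each line map on Occasion 3, as in the design.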
The two mapping techniques varied in their task demands and constraints on students. Table
2 provides a profile of the directedness of the assessment tasks across techniques. The
construct-a-map technique asked students to construct a map using the 20 concepts provided by the
assessor. Students were encouraged to provide detailed propositions (linking words) to explain
the relationship between the two concepts they were linking. No restriction was imposed on the
type of structure students could use in the map (e.g., students were not instructed to create a
hierarchical structure).
Figure 2. Criterion map.
Figure 3. Fill-in-the-nodes, map A (top map) and fill-in-the-lines, map C (bottom map) skeleton maps.
The fill-in-the-map technique asked students to fill in two skeleton maps, one with blank
nodes and the other with blank linking lines. After randomly selecting nodes, seven nodes were
the same on skeleton map A and skeleton map B. For the blank-linking line maps, only two
propositions were the same across skeleton map C and skeleton map D. Students' responses on
each skeleton map were scored as correct or incorrect. A maximum of 12 points could be
awarded to each student on each skeleton map.
As in previous studies, to score students' constructed maps we developed a proposition
inventory to account for variation in the quality of students' propositions. This inventory
contained the 190 possible relations between all pairs of concepts in the key-concept list. Based
on this inventory, each proposition was scored on a 5-point scale, from 0 for inaccurate/incorrect
to 4 for excellent/outstanding. Table 3 provides the de®nitions of the categories and one example
of the proposition inventory. For example, the accurate excellent proposition between acids and
compounds should be read, according to the direction of the arrow (<), as follows: compounds
that give off H+ when dissolved in water are acids. The maximum score for a map constructed by
students was based on the criterion map: the number of links (38) in the criterion map was
multiplied by 4 (assuming all propositions were scored as excellent).
Table 1
Within-class design
Sequence   Occasion 1        Occasion 2                            Occasion 3                            Occasion 4
1          Construct-a-map   Fill-in-the-nodes: Sample 1, Map A    Fill-in-the-lines: Sample 1, Map C    Multiple-choice test
2          Construct-a-map   Fill-in-the-nodes: Sample 1, Map A    Fill-in-the-lines: Sample 2, Map D    Multiple-choice test
3          Construct-a-map   Fill-in-the-nodes: Sample 2, Map B    Fill-in-the-lines: Sample 1, Map C    Multiple-choice test
4          Construct-a-map   Fill-in-the-nodes: Sample 2, Map B    Fill-in-the-lines: Sample 2, Map D    Multiple-choice test
Table 2
Directedness profile of the mapping techniques
                    Map Components
Technique           Concepts                    Linking Lines     Linking Words               Structure of the Map
Construct-a-map     Provided in a list:         Not provided      Not provided                Not provided
                    students use the
                    concepts in the list
                    for constructing the
                    map
Fill-in-the-nodes   Provided in a list:         Provided in the   Provided in the             Provided in the
                    student selects,            skeleton map      skeleton map                skeleton map
                    from the list, the
                    concept to fill in a
                    node
Fill-in-the-lines   Provided in the             Provided in the   Provided in a list:         Provided in the
                    skeleton map                skeleton map      student selects,            skeleton map
                                                                  from the list, the
                                                                  linking words to
                                                                  fill in a line
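The two counts used in scoring the constructed maps (190 possible relations among the 20 key concepts, and a maximum score of 38 links times 4 points) can be checked with a short calculation. This is only an arithmetic sketch; the variable names are illustrative.

```python
from math import comb

N_CONCEPTS = 20        # concepts on the key-concept list
CRITERION_LINKS = 38   # links in the criterion map
TOP_QUALITY = 4        # score for an "excellent" proposition

# Every unordered pair of concepts is a possible relation in the inventory.
possible_relations = comb(N_CONCEPTS, 2)

# Maximum score if every criterion-map link were scored as excellent.
max_map_score = CRITERION_LINKS * TOP_QUALITY

# Share of all possible pairs that the criterion map actually links.
linked_share = CRITERION_LINKS / possible_relations

print(possible_relations, max_map_score, linked_share)
```

Under these figures the criterion map links one fifth of all possible concept pairs, which is why the proposition inventory (all 190 pairs) is much larger than the criterion map itself.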
The multiple-choice test was designed by both teachers and researchers. The test items were
both conceptual and mechanical (see Appendix C for examples of the multiple-choice test). A
maximum of 30 points could be awarded to each student on this test. The internal consistency of
the multiple-choice test was .74.
We examined whether: (1) the skeleton map scores were sensitive to the sample of nodes or
linking lines left blank; (2) the two forms of skeleton maps were equivalent; and (3) the two mapping
techniques provided similar information about students' connected understanding. Results are
organized in three sections: fill-in-the-map technique; construct-a-map technique; and
comparison across techniques.
Before focusing on a detailed discussion of the results, variation between classrooms needs
to be addressed. We compared the seven classes using two measures, the STAR-science and the
multiple-choice test scores. The one-way ANOVA results for both measures indicated a
significant difference between groups (F = 3.98, p = .002 for STAR-science and
F(6, 143) = 2.59, p = .02 for the multiple-choice test).
Tukey's HSD (at the .05 level) indicated that differences in the STAR-science mean scores were due to
classes 5 and 6, which both differed significantly only from Class 2. Differences in the
multiple-choice test were due only to the difference between Class 6 and Class 2. Moreover, a split-plot
ANOVA, type of skeleton map (T) by sequence (S) by class (C), indicated no significant
interaction of class with any other within- or between-subjects factor (F = 1.53, p = .09;
F = 1.95, p = .08; F = .90, p = .58). Since no other differences were found across the
advanced and regular classes, for simplicity and brevity, we decided to collapse the seven classes
and present overall results. However, all statistical analyses were run by class and results are
available from the authors. These analyses do not change the overall analyses reported herein.
Finally, no significant mean differences in gender or ethnicity were found across any of the
comparisons made. Therefore, analyses considering these two variables are not presented.
Table 3
Quality of proposition categories
Quality of Proposition    Descriptions and Examples
Excellent (4)             Outstanding proposition. Complete and correct. It shows a deep
                          understanding of the relation between the two concepts.
                          acids–compounds: < that gives off H+ when dissolved in water are
Good (3)                  Complete and correct proposition. It shows a good understanding of
                          the relation between the two concepts.
                          acids–compounds: > are examples of
Poor (2)                  Correct but incomplete proposition. It shows partial understanding
                          of the relation between the two concepts.
                          acids–compounds: < form
Don't Care (1)            Although accurate, the proposition does not show understanding of
                          the relationship between the two concepts.
                          acids–compounds: > is a different concept
Inaccurate/invalid (0)    Incorrect proposition.
                          acids–compounds: > made of
Fill-In-The-Map Techniques
In this section we focus on the fill-in-the-map (skeleton map) technique by assessing first,
whether the fill-in-the-map scores are sensitive to the sample of nodes or linking lines selected
for the skeleton map; and second, whether the two types of skeleton maps, fill-in-the-nodes and
fill-in-the-lines, can be considered equivalent mapping techniques.
For each of the four skeleton maps developed, two fill-in-the-nodes and two fill-in-the-lines,
we calculated the internal consistency. On average, the alpha for the fill-in-the-nodes maps was
.71 and for fill-in-the-lines was .85.
Comparing Fill-In-The-Map Scores Across Samples of Concepts and Linking Lines. To
determine whether the fill-in-the-map scores were sensitive to the sample of nodes (concepts) or
linking lines (propositions) left blank, we compared the means and variances of scores between
skeleton maps A and B (with blank nodes) and between skeleton maps C and D (with blank
linking lines).
The mean scores and standard deviations for the fill-in-the-nodes skeleton maps A and B and
fill-in-the-lines skeleton maps C and D are presented in Table 4. Overall, students' performance
across the two types of skeleton maps and samples was high. However, it was higher for
fill-in-the-nodes maps than for fill-in-the-lines. An independent-samples t-test indicated no significant
difference between the two samples of concept means (t = 1.57, p = .12) or the two samples of
linking line means (t = 1.64, p = .10). The Levene test indicated that variances were not
homogeneous across samples (F = 6.77 and F = 2.16; p < .20). However, since the
interquartile range across samples was the same or very similar (nodes: sample 1, IQR = 2.00
and sample 2, IQR = 2.00; linking lines: sample 1, IQR = 4.00, and sample 2, IQR = 6.00), we
concluded that both samples of nodes and linking lines were equivalent and that students' scores
were not affected by the particular sample used in the skeleton maps. Similar results using
different samples of concepts for constructing a map were found in one of our previous studies
(Ruiz-Primo et al., 1996).
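The independent-samples comparisons can be approximately reproduced from the summary statistics in Table 4. The sketch below assumes a pooled-variance (Student's) t-test; the paper does not say whether a pooled or unequal-variance version was used, and the function name `pooled_t` is illustrative. Because the inputs are rounded means and standard deviations, small discrepancies from the reported values are to be expected.

```python
import math


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Student's t for two independent samples, from summary statistics.

    Returns the t statistic and its degrees of freedom (n1 + n2 - 2).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, df


# Fill-in-the-nodes: Map A versus Map B (values from Table 4).
t_nodes, df_nodes = pooled_t(80, 11.21, 1.43, 72, 10.81, 1.74)

# Fill-in-the-lines: Map C versus Map D (values from Table 4).
t_lines, df_lines = pooled_t(78, 9.77, 2.74, 73, 8.99, 3.09)

print(f"nodes: t({df_nodes}) = {t_nodes:.2f}")
print(f"lines: t({df_lines}) = {t_lines:.2f}")
```

Under the pooled-variance assumption, both statistics land close to the reported t = 1.57 and t = 1.64, which is consistent with the conclusion that the two samples of each kind did not differ in mean score.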
Table 4
Means and standard deviations by type of skeleton map and sample
Type of Skeleton Map      n     Mean (Max. 12)    SD
Fill-in-the-nodes
  Sample 1: Map A         80    11.21             1.43
  Sample 2: Map B         72    10.81             1.74
Fill-in-the-lines
  Sample 1: Map C         78    9.77              2.74
  Sample 2: Map D         73    8.99              3.09
Comparing Fill-In-The-Nodes and Fill-In-The-Lines Skeleton Maps. For the fill-in-the-nodes
and fill-in-the-lines techniques to be considered equivalent, they at least need to produce
similar means and variances. We carried out a 2 × 4 (skeleton map type by sequence) split-plot
ANOVA to evaluate whether the type of skeleton map (i.e., fill-in-the-nodes and fill-in-the-lines)
and the sequence in which students took the different forms of skeleton maps (e.g., skeleton map
A followed by skeleton map C or skeleton map A followed by skeleton map D) affected their
scores.
Table 5 provides the mean scores and standard deviations for each type of skeleton map and
sequence. As mentioned before, mean scores were higher for fill-in-the-nodes than
fill-in-the-lines, independent of the sequence in which students took the assessments.
ANOVA results indicated a significant interaction between type of skeleton map (T) and
sequence (S) (F = 2.73, p = .046; η² = .05) and a significant difference for type of map
(F = 65.95, p = .000; η² = .31); but no significant difference was observed for sequence
(F = .63, p = .599).
A closer examination of the interaction showed that it was ordinal. The mean difference in
scores between nodes and linking lines skeleton maps was not statistically significant for those
students under Sequence 3 (F = 4.06, p = .052) whereas it was for those students under the
other three sequences (F = 16.50, p = .000; F = 26.57, p = .000; F = 23.93, p = .000).
Filling in the nodes using Map B somehow facilitated filling in the lines when Map C was
used. A closer look into the skeleton maps revealed that the number of propositions students
needed to read for filling in the nodes (Map B) and filling in the lines (Map C) overlapped more
in Sequence 3 than in any other sequence.
For the purposes of the study, however, a more important result was the one related to the
differences between the two types of skeleton maps, fill-in-the-nodes and fill-in-the-lines. The
split-plot ANOVA indicated that means differed significantly. The magnitude of η² indicated a
large effect due to type of skeleton map (about 31% of the variance was accounted for by this
factor). Furthermore, an F test indicated that variances of the two types of maps were
different (F = 3.35, at the .05 level). We concluded that fill-in-the-nodes and fill-in-the-lines were
not equivalent forms of skeleton maps. Fill-in-the-nodes maps were easier for students than
fill-in-the-lines maps.
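As a rough check on the variance comparison, the ratio of the two total variances in Table 5 can be computed directly. Treating the reported F as a simple ratio of the larger to the smaller sample variance is an assumption; the exact variance test the authors ran is not specified in this section.

```python
# Total standard deviations from Table 5.
SD_NODES = 1.60   # fill-in-the-nodes
SD_LINES = 2.93   # fill-in-the-lines

# Variance ratio, larger variance in the numerator (assumed form of the test).
variance_ratio = SD_LINES ** 2 / SD_NODES ** 2

print(f"variance ratio = {variance_ratio:.2f}")
```

The ratio agrees with the reported F of 3.35: scores on the fill-in-the-lines maps were more than three times as variable as scores on the fill-in-the-nodes maps, where most students were near the 12-point ceiling.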
Since the two samples of nodes and linking lines were considered equivalent (Table 4), we
ignored the sample of nodes or linking lines used in the skeleton maps and calculated a
pooled-within-sequence correlation between the fill-in-the-nodes and fill-in-the-lines maps. The
magnitude of the pooled correlation was .56, suggesting that students were ranked somewhat
differently across the two types of maps. However, the magnitude of the correlation may be
lowered due to the restriction of range observed in the fill-in-the-nodes maps. The correlation
corrected for attenuation was .72.
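The corrected value can be reproduced with the classical (Spearman) correction for attenuation, which divides the observed correlation by the square root of the product of the two reliabilities. The sketch assumes the reliabilities used were the average alphas reported earlier for the two map types (.71 and .85); the paper does not state this explicitly.

```python
import math


def correct_for_attenuation(r_xy, rel_x, rel_y):
    """Classical disattenuation: r_xy / sqrt(rel_x * rel_y)."""
    return r_xy / math.sqrt(rel_x * rel_y)


# Observed pooled correlation and the (assumed) reliabilities:
# average alpha of .71 for fill-in-the-nodes, .85 for fill-in-the-lines.
corrected = correct_for_attenuation(0.56, 0.71, 0.85)

print(f"corrected correlation = {corrected:.2f}")
```

With these inputs the corrected correlation rounds to .72, matching the value in the text.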
Table 5
Means and standard deviations by type of skeleton map and sequence
                                   Fill-in-the-Nodes    Fill-in-the-Lines
Sequence                    n      Mean      SD         Mean      SD
1  Nodes 1 – Lines 1        43     11.09     1.52       9.72      2.84
2  Nodes 1 – Lines 2        36     11.33     1.33       9.31      3.06
3  Nodes 2 – Lines 1        35     10.63     1.82       9.83      2.65
4  Nodes 2 – Lines 2        37     10.97     1.67       8.68      3.14
Total                       151    11.01     1.60       9.39      2.93
Note. One student did not fill out the second skeleton map.
Construct-a-Map Technique
In this section we examine the consistency of scores across raters for the construct-a-map
technique, characterize students' constructed maps, and compare types of scores.
Interrater Reliability. All construct-a-maps were scored for accuracy and compre-
hensiveness. For each student we calculated a proposition accuracy scoreÐthe sum of the
scores obtained on all propositions; convergence scoreÐthe proportion of accurate propositions
in a student's maps out of all possible propositions in the criterion map; and salience scoreÐthe
proportion of valid propositions out of all the propositions in the student's map.
A sample of 55 students' maps (more than a third of the sample) was scored by three raters.
To examine the generalizability of scores across raters, three person (p) by rater (r) G studies
were carried out, one for each type of score (Table 6).
Raters introduced negligible error. Both relative ($\hat{\rho}^2$) and absolute ($\Phi$)
generalizability coefficients were very high across types of scores. Based on these results, the
remaining 97 concept maps were randomly distributed among the three raters and only one rater
scored each map. The randomization was done within each of the seven classes. Thus, all three
raters scored a sample of students' maps across the seven classes.
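The coefficients can be approximately recomputed from the estimated variance components reported in Table 6. The sketch below uses the usual formulas for a crossed person-by-rater design and assumes that the coefficients refer to the average over the three raters (n_raters = 3). That assumption and the function name are ours rather than something stated in the text, and because the published components are rounded, recomputed values may differ from the table in the last digit.

```python
def g_coefficients(var_p, var_r, var_pr_e, n_raters):
    """Relative and absolute G coefficients for a crossed p x r design."""
    relative_error = var_pr_e / n_raters             # affects rank ordering only
    absolute_error = (var_r + var_pr_e) / n_raters   # also counts rater leniency
    relative = var_p / (var_p + relative_error)
    absolute = var_p / (var_p + absolute_error)
    return relative, absolute


# Estimated variance components (persons, raters, residual) from Table 6.
components = {
    "proposition accuracy": (290.54, 0.36, 10.92),
    "convergence": (0.03114, 0.00011, 0.00064),
    "salience": (0.02863, 0.00020, 0.00126),
}

coefficients = {
    name: g_coefficients(var_p, var_r, var_pr_e, n_raters=3)  # n_raters is assumed
    for name, (var_p, var_r, var_pr_e) in components.items()
}

for name, (relative, absolute) in coefficients.items():
    print(f"{name}: relative = {relative:.3f}, absolute = {absolute:.3f}")
```

Under that assumption all six coefficients fall between .98 and .99, in line with the values reported for the three score types.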
Students' Maps. Table 7 provides information about the characteristics of students'
constructed maps. Two-thirds of the students used all 20 concepts provided in the list to
construct their maps. Another fifth used 18–19 concepts, and only one student used just 14 concepts.
A surprising finding was that 6.6% of the students provided more than 38 links in their maps,
which is the number of links on the criterion map. Furthermore, a few of the students provided
better propositions than those in the criterion map! This led us to re-score the criterion map using
the same criteria applied for students. As a result, some propositions in the criterion map became
"Good" instead of "Excellent," and one proposition became "Poor." The original maximum
score of 152 was corrected to 135.
Table 6
Estimated variance components and generalizability coefficients for person by rater G study across types of scores

                          Proposition accuracy       Convergence                Salience
Source of variation       Estimated    Percent of    Estimated    Percent of    Estimated    Percent of
                          variance     total         variance     total         variance     total
                          component    variability   component    variability   component    variability
Persons (p)               290.54       96.26         0.03114      97.65         0.02863      95.15
Raters (r)                0.36         0.12          0.00011      0.34          0.00020      0.66
pr,e                      10.92        3.62          0.00064      2.00          0.00126      4.19
$\hat{\rho}^2$ (relative) .99                        .99                        .98
$\Phi$ (absolute)         .99                        .99                        .98
In fact, 18% of the students provided between 25 and 38 links.
Types of Scores. Table 8 provides the means, standard deviations, and correlations across
the three types of scores used for the construct-a-map technique. Information provided about
students' connected understanding varies across the types of scores. Whereas the mean salience
score indicated that students' performance was close to the maximum, the proposition accuracy
and convergence scores indicated that students' knowledge was rather partial.
The high correlation between proposition accuracy and convergence scores (.95) was very
similar to correlations we have found in other studies (e.g., Ruiz-Primo et al., 1996, 1997).
However, correlations between proposition accuracy and convergence scores with salience
scores (.73 and .75, respectively), were lower than the ones we have observed before (.85).
When G theory was used to evaluate the dependability of these measures (see Ruiz-Primo
et al., 1996, 1997), we found that the percent of variability among persons was higher for
proposition accuracy and convergence scores than for salience scores. This indicated that these
two measures better reflected the differences in students' knowledge structures than did salience
scores.
The general conclusion about construct-a-map scores is consistent with our previous
research. Proposition accuracy and convergence scores reflect the differences in students'
knowledge structure better than salience scores. Based on practical (e.g., scoring time) and
technical (e.g., stability of scores) arguments, we concluded that the convergence score was the
most efficient.
Comparing Students' Scores Across Assessment Techniques
In this section we focused first on evaluating the extent to which the scores on the two
mapping techniques, fill-in-the-map and construct-a-map, converged. Then, we evaluated the
extent to which the two mapping technique scores converged with multiple-choice scores. A
correlational approach was used to compare techniques because of differences between score
scales.

Table 7
Means and standard deviations of students' concept map components

                                          Minimum score   Maximum score
Map components           Mean     SD      observed        observed
Nodes                    19.34    1.23    14              20
Linking lines            25.41    6.60    14              43
Accurate propositions    18.88    7.44    0               42

Table 8
Means, standard deviations, and correlations across the three types of construct-a-map scores

                             Descriptive statistics             Correlations
Type of score                n      Maximum   Mean     SD       PA     CON    SAL
Proposition Accuracy (PA)    152    135       53.91    22.17    –
Convergence (CON)            152    1         .50      .19      .95    –
Salience (SAL)               152    1         .73      .17      .73    .75    –

Table 9 provides the descriptive statistics for the three types of assessments administered to
the students: construct-a-map, fill-in-the-map, and multiple-choice test.
Mean scores across the forms of assessments do not provide the same picture about
students' knowledge of the topic. Whereas ®ll-in-the-map and multiple-choice scores indicate
that the students' score was close to the maximum criterion, the convergence score indicated that
the students' knowledge was rather partial compared to the criterion map.
Table 10 provides a multiscore–multitechnique matrix. In the matrix, reliability coefficients
are enclosed in parentheses on the main diagonal. Along with the observed correlations, we
present correlations corrected for unreliability when appropriate. However, because different
reliability estimates were used in the matrix, and hence measurement error was defined
differently, some of these corrections may not be accurate and must be interpreted cautiously.
Therefore, we focus on the observed correlations.

Table 9
Means and standard deviations across the three types of assessments

                        n      Maximum   Mean     SD
Convergence             152    1         .50      .19
Fill-in-the-nodes       152    12        11.02    1.59
Fill-in-the-lines       151    12        9.39     2.93
Multiple-choice test    150    30        24.05    3.74

Table 10
Correlations between mapping technique scores and types of assessments

Types of assessment and techniques    CON         NOD         LIN         MC
Convergence score (CON)               (.99)[a]
Fill-in-the-nodes (NOD)
  Observed                            .47         (.71)[b]
  Corrected                           .56
Fill-in-the-lines (LIN)
  Observed                            .44         .53         (.85)[b]
  Corrected                           –[c]
Multiple-choice (MC)
  Observed                            .44         .37         .65         (.74)[d]
  Corrected                           .51         .51         .82

[a] Interrater reliability.
[b] Internal consistency averaged between the two-sample skeleton maps.
[c] Both assessments are reliable, therefore correction was not calculated.
[d] Internal consistency.
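The corrected entries in Table 10 appear consistent with dividing each observed correlation by the square root of the product of the two reliabilities on the diagonal. The sketch below is a minimal check under that assumption; the dictionary layout and function name are illustrative, and the caution above still applies because the reliabilities are of different kinds.

```python
from math import sqrt

# Reliability estimates from the main diagonal of Table 10.
reliability = {"CON": 0.99, "NOD": 0.71, "LIN": 0.85, "MC": 0.74}

# Observed correlations from Table 10 for which a corrected value is reported.
observed = {
    ("CON", "NOD"): 0.47,
    ("CON", "MC"): 0.44,
    ("NOD", "MC"): 0.37,
    ("LIN", "MC"): 0.65,
}


def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Observed correlation divided by the square root of both reliabilities."""
    return r_xy / sqrt(r_xx * r_yy)


corrected = {
    pair: round(correct_for_attenuation(r, reliability[pair[0]], reliability[pair[1]]), 2)
    for pair, r in observed.items()
}

for pair, value in corrected.items():
    print(pair, value)
```

The four values come out as .56, .51, .51, and .82, matching the corrected rows of the table.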
Mapping Technique Scores. If the construct-a-map and fill-in techniques measure the same
construct, we should expect a high correlation between these scores. Yet, correlations were lower
than expected (r = .48 averaged across types of scores), indicating that students were ranked
differently according to the technique used. It seems that different aspects of the students'
connected understanding were being tapped with the different techniques. Restriction of range
observed in both types of fill-in-the-map scores may have contributed to the magnitude of the
correlations; the low coefficients should therefore be interpreted with caution.
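To make the restriction-of-range argument concrete, the sketch below simulates two correlated scores and then places a ceiling on one of them, as happens when a 12-point task is easy for most students. Everything in it is hypothetical: the underlying correlation of .7, the sample size, and the scaling of the easy task are arbitrary choices for illustration, not estimates from this study.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


rng = random.Random(0)   # fixed seed so the sketch is repeatable
true_rho = 0.7           # hypothetical correlation before any ceiling
n_students = 5000        # simulated sample, not the study's 152 students

construct_scores, fill_in_latent, fill_in_observed = [], [], []
for _ in range(n_students):
    first = rng.gauss(0, 1)
    second = true_rho * first + sqrt(1 - true_rho ** 2) * rng.gauss(0, 1)
    construct_scores.append(first)
    fill_in_latent.append(second)
    # Easy 12-point task: scores are rounded and cannot exceed the maximum of 12.
    fill_in_observed.append(max(0, min(12, round(11 + 2 * second))))

r_latent = pearson(construct_scores, fill_in_latent)
r_observed = pearson(construct_scores, fill_in_observed)
print(f"without ceiling: {r_latent:.2f}, with ceiling: {r_observed:.2f}")
```

In this simulation the correlation computed on the capped scores is lower than the one computed on the uncapped scores, which is the direction of bias the paragraph above warns about.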
Comparing Mapping and Multiple-Choice Scores. Magnitudes of the correlations between
construct-a-map scores and multiple-choice scores and between fill-in-the-map scores and
multiple-choice scores were very close to each other. The correlations between fill-in-the-map
scores and multiple-choice scores were quite surprising. The correlations between
fill-in-the-nodes and multiple-choice tests reported by Schau et al. (1997) were higher
(.75 on average) than the one we found in this study (.37).
Two issues may explain these differences: restriction of range observed in the
fill-in-the-nodes skeleton map scores (i.e., the skeleton map was very easy for students in our
study) and differences between the characteristics of the fill-in-the-nodes maps used in the two
studies. For example, Schau et al. (1997) used 37 nodes and 50% were left blank; we used 20, and
60% were left blank. Also, the propositions in the skeleton map used by Schau et al. were less
complex than the ones used in ours.
Whether the characteristics of the maps can affect students' scores deserves to be studied
more carefully. For example, how many nodes in a skeleton map is optimum? How many nodes
need to be left blank? What is the best way to select the nodes left blank?
Notice, however, that the correlation with fill-in-the-lines (.65) was the highest among all
the correlations between mapping scores and multiple-choice scores. Differences between these
two forms of fill-in maps deserve more attention.
An important finding for our purposes was that the pattern of correlations is not the same
across mapping techniques. Mapping techniques, then, did not provide similar information about
students' knowledge structure or connected understanding.
We think that the construct-a-map technique better reflects students' knowledge structures.
We based this conclusion on the fact that this technique is the only one whose scores accurately
reflected the differences we saw among students' responses. The fill-in-the-map
score distributions were negatively skewed (skewness values ranged from −.755 for
fill-in-the-lines to −1.538 for fill-in-the-nodes), indicating that most students obtained high
scores, whereas the convergence scores were normally distributed (a Kolmogorov–Smirnov normality
test confirmed that only convergence scores were normally distributed; p > .200). It seems, then,
that convergence scores better reflect the differences in students' knowledge than the other
scores.
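For readers less familiar with skewness, the sketch below shows why scores that pile up near the maximum produce a negative value while a symmetric distribution gives zero. Both score lists are invented for illustration, and the simple moment-based formula used here is an assumption; statistical packages often apply a small-sample adjustment, so this will not reproduce the reported values, and the normality test is not attempted.

```python
def skewness(values):
    """Moment-based sample skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical 12-point fill-in scores: most students at or near the maximum.
ceiling_scores = [12] * 8 + [11] * 5 + [10] * 3 + [9] * 2 + [7, 5]

# Hypothetical convergence scores (as percentages), symmetric around 50.
symmetric_scores = [20, 30, 40, 40, 50, 50, 50, 60, 60, 70, 80]

skew_ceiling = skewness(ceiling_scores)
skew_symmetric = skewness(symmetric_scores)
print(f"ceiling-heavy scores: {skew_ceiling:.2f}")
print(f"symmetric scores:     {skew_symmetric:.2f}")
```

The long tail of a few low scores below a crowded maximum is what drives the first value negative, and it also means the scores separate students poorly at the top of the scale.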
What, then, is the fill-in-the-map technique tapping? What aspect of students' knowledge is
being measured with this form of assessment? A closer look at the cognitive activities displayed
in this technique is needed. Talk-aloud protocols may help to better define the cognitive activities
reflected by both techniques.
In this study we asked the following questions: Are fill-in-the-map (skeleton map) scores
sensitive to the nodes and linking lines selected to be filled in? Are fill-in-the-nodes skeleton
maps equivalent to the fill-in-the-lines skeleton maps? Does the fill-in-the-map technique
provide the same picture of a student's connected understanding as the construct-a-map
technique?
Our results led to the following tentative conclusions. (1) Skeleton map scores were not
sensitive to the sample of concepts or linking lines to be filled in. Probably the selected
concepts and propositions reflected the key content of the unit and were cohesive enough that
any combination of concepts or propositions could provide similar information about students'
knowledge. (2) Fill-in-the-nodes and fill-in-the-lines techniques are not equivalent forms of
fill-in-the-map. Further research is needed to define which of these two forms provides the most
accurate information about students' knowledge or connected understanding. (3) The relationship
between the two mapping techniques suggests that they tap somewhat similar but not identical
aspects of students' connected understanding. Students' talk-aloud protocols may provide insight
into the cognitive activities involved in constructing and filling in a map. (4) Construct-a-map
scores most accurately reflected the differences across students' knowledge structure. (5) The
different pattern of correlations between scores from the multiple-choice test and both mapping
techniques confirmed that the mapping techniques were not equivalent. (6) Convergence scores,
the proportion of accurate propositions in the students' maps to the number of all possible
propositions in the criterion map, are the most efficient indicator when scoring construct-a-map
concept maps.
Our overall conclusion is that we need to invest time and resources in finding out more about
what aspects of students' knowledge are tapped by different forms of the concept map
assessment. Which technique should be considered the most appropriate for large-scale
assessment? Practical issues, though, cannot be the only criterion for selection. Constraints and
affordances imposed by different forms of assessments affect the way students perform. To
resolve the issue of what is being measured with these different techniques, we need information
about the cognitive activity displayed in each of them.
The work reported herein was supported, in part, by the Educational Research and Development
Centers Program (No. R305B60002), as administered by the Office of Educational Research and
Improvement, U.S. Department of Education. The findings and opinions expressed in this report do
not reflect the positions or policies of the National Institute on Student Achievement,
Curriculum, and Assessment, the Office of Educational Research and Improvement, or the
U.S. Department of Education.
Appendix A
Training For Constructing Concept Maps
The training lasts about 50 minutes and has four major parts. The first part focuses on
introducing concept maps: what they are, what they are used for, what their components are (i.e.,
nodes, links, linking words, propositions), and examples (outside the domain to be mapped) of
hierarchical and non-hierarchical maps. The second part emphasizes the construction of concept
maps. Four aspects of mapping are highlighted: identifying a relationship between a pair of
concepts; creating a proposition; recognizing good maps; and redrawing a map. Students are
then given two lists of common concepts to collectively construct a map. The first list focuses on
the "water cycle," a non-hierarchical map; the second list focuses on "living things," a
hierarchical map. The third part of the program provides each individual with nine concepts on
the "food web" to construct a map individually. The fourth part of the program is a discussion of
students' questions after they have constructed their individual maps.
The program has proved to be effective in achieving this goal with more than 100 high
school students. To evaluate effectiveness of the training, we have randomly sampled
individually constructed maps at the end of the training within each group. These analyses
have focused on three aspects of the maps: use of the concepts provided on the list; use of labeled
links; and the accuracy of the propositions. Results across studies (see Ruiz-Primo et al., 1996,
1997) have indicated that: (a) more than 94% of the students used all the concepts provided on
the list; (b) 100% used labeled lines, and (c) more than 96% provided one or more valid
propositions. We have concluded that the training program has succeeded in training students to
construct concept maps.
Appendix B
Procedure Used To Construct A Criterion Map
1. Select a panel. Usually, it is composed of experts in the content domain to be tested,
teachers, and the researchers or assessors.
2. Ask each panel participant to provide a list of the "X" number of the most important
concepts in the subject domain.
3. Have panel participants compare and discuss their lists of selected concepts until a
consensus is reached about which are the most important concepts. This will be
considered the "Key-Concept List."
4. Ask each participant to construct a concept map with the key concepts.
5. Construct a concept map with relations that appear in at least 80% of the participants'
concept maps.
6. Discuss and modify the resulting concept map with participants until a consensus is
reached about which relations should be present in the map.
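Step 5 of the procedure above is mechanical enough to sketch in code. The example below keeps every relation that appears in at least 80% of the panelists' maps. The representation of a relation as an unordered pair of concepts, the helper names, and the five small panel maps are all hypothetical choices made for illustration; the discussion in steps 3 and 6 remains a human judgment that code does not replace.

```python
from collections import Counter


def relation(concept_a, concept_b):
    """Store a relation as a sorted pair so that direction does not matter."""
    return tuple(sorted((concept_a, concept_b)))


def draft_criterion_map(panel_maps, threshold=0.8):
    """Return the relations present in at least `threshold` of the panel's maps.

    panel_maps: one set of relations per panelist.
    """
    if not panel_maps:
        return set()
    counts = Counter(rel for panel_map in panel_maps for rel in panel_map)
    n_panelists = len(panel_maps)
    return {rel for rel, count in counts.items() if count / n_panelists >= threshold}


# Hypothetical maps from a five-person panel.
panel = [
    {relation("ions", "cations"), relation("ions", "anions"), relation("atoms", "electrons")},
    {relation("cations", "ions"), relation("ions", "anions"), relation("atoms", "ions")},
    {relation("ions", "cations"), relation("anions", "ions"), relation("atoms", "electrons")},
    {relation("ions", "cations"), relation("ions", "anions")},
    {relation("ions", "cations"), relation("atoms", "electrons"), relation("atoms", "ions")},
]

draft = draft_criterion_map(panel)
print(sorted(draft))
```

Here only the two relations drawn by at least four of the five panelists survive; the relations drawn by two or three panelists would be left for the panel to debate in step 6.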
Appendix C
Examples Of the Multiple-Choice Test
Examples of the Conceptual Items:
1. Negative ions are formed from neutral atoms by:
(a) gaining in atomic number
(b) losing in atomic number
(c) losing of electrons
(d) gaining of electrons
2. Which one of the following pairs of elements would you expect to form molecular
(a) sodium and bromine
(b) calcium and chlorine
(c) nitrogen and oxygen
(d) aluminum and sulfur
3. Binary ionic compounds are composed of:
(a) two monoatomic cations
(b) two monoatomic anions
(c) one or two polyatomic ions
(d) a monoatomic cation and a monoatomic anion
4. A cation is any atom or group of atoms that:
(a) gains electrons and has a positive charge
(b) gains electrons and has a negative charge
(c) loses electrons and has a positive charge
(d) loses electrons and has a negative charge
Examples of the Mechanical Items:
1. An ite or ate ending on the name of a compound indicates that the compound:
(a) is a binary ionic compound
(b) is a binary molecular compound
(c) contains a polyatomic anion
(d) contains a polyatomic cation
2. The compound formula formed when aluminum reacts with sulfur is:
(a) Al
(b) Al
(c) Al
(d) Al
3. The name of the chemical compound, Fe
(a) iron sulfate
(b) iron II sulfate
(c) iron III sulfate
(d) iron VI sulfate
4. Select the correct formula:
(a) Rb
(b) Na
(c) Al
(d) Mg
References
Anderson, T.H., & Huang, S-C.C. (1989). On using concept maps to assess the
comprehension effects of reading expository text. Technical Report No. 483. Urbana-Champaign:
Center for the Study of Reading, University of Illinois at Urbana-Champaign
(ERIC Document Reproduction Service No. ED 310 368).
Baxter, G.P., Elder, A.D., & Glaser, R. (1996). Knowledge-based cognition and performance
assessment in the science classroom. Educational Psychologist, 31, 133–140.
Bybee, R.W. (1996). The contemporary reform of science education. In J. Rhoton & P.
Bowers (Eds.), Issues in science education (pp. 1–14). Arlington, VA: National Science
Teachers Association, National Science Education Leadership Association.
Chi, M.T.H., Glaser, R., & Farr, M.J. (1988). The nature of expertise. Hillsdale, NJ:
Lawrence Erlbaum.
Glaser, R. (1991). Expertise and assessment. In M.C. Wittrock & E.L. Baker (Eds.), Testing
and cognition (pp. 17–39). Englewood Cliffs, NJ: Prentice Hall.
Glaser, R. (1996). Changing the agency for learning: Acquiring expert performance. In K.A.
Ericsson (Ed.), The road to excellence: The acquisition of expert performance in the arts and
sciences, sports, and games (pp. 303–311). Mahwah, NJ: Lawrence Erlbaum.
McClure, J.R., & Bell, P.E. (1990). Effects of an environmental education-related STS
approach instruction on cognitive structures of preservice teachers. University Park, PA:
Pennsylvania State University (ERIC Document Reproduction Service No. ED 341 582).
Mintzes, J.J., Wandersee, J.H., & Novak, J.D. (1997). Teaching science for understanding.
San Diego: Academic Press.
Moore, J.A. (1995). Cultural and scientific literacy. Molecular Biology of the Cell, 6, 1–6.
Novak, J.D. (1990). Concept mapping: A useful tool for science education. Journal of
Research in Science Teaching, 27(10), 937–949.
Novak, J.D. (1998). Learning, creating, and using knowledge: Concept maps as facilitative
tools in schools and corporations. Mahwah, NJ: Lawrence Erlbaum.
Novak, J.D., & Gowin, D.B. (1984). Learning how to learn. New York: Cambridge
University Press.
Novak, J.D., Gowin, D.B., & Johansen, G.T. (1983). The use of concept mapping and
knowledge vee mapping with junior high school science students. Science Education, 67(5), 625–645.
Novak, J.D., & Ridley, D.R. (1988). Assessing student learning in light of how students
learn. Paper prepared for The AAHE Assessment Forum. American Association for Higher
Education (ERIC Document Reproduction Service No. ED 299 923).
Ruiz-Primo, M.A., Shavelson, R.J., & Schultz, S.E. (1997, March). On the validity of
concept map based assessment interpretations: An experiment testing the assumption of
hierarchical concept-maps in science. Paper presented at the AERA Annual Meeting, Chicago,
IL.
Ruiz-Primo, M.A., & Shavelson, R.J. (1996). Problems and issues in the use of concept
maps in science assessment. Journal of Research in Science Teaching, 33, 569–600.
Ruiz-Primo, M.A., Schultz, S.E., & Shavelson, R.J. (1996, April). Concept-map based
assessment in science: An exploratory study. Paper presented at the AERA Annual Meeting,
New York, NY.
Schau, C., & Mattern, N. (1997). Use of map techniques in teaching applied statistics
courses. The American Statistician, 51, 171–175.
Schau, C., Mattern, N., Weber, R., Minnick, K., & Witt, C. (1997, March). Use of fill-in
concept maps to assess middle school students' connected understanding of science. Paper
presented at the AERA Annual Meeting, Chicago, IL.
Shavelson, R.J. (1972). Some aspects of the correspondence between content structure and
cognitive structure in physics instruction. Journal of Educational Psychology, 63, 225–234.
Shavelson, R.J. (1974). Methods for examining representations of a subject-matter structure
in a student's memory. Journal of Research in Science Teaching, 11, 231–249.
Surber, J.R. (1984). Mapping as a testing and diagnostic device. In C.D. Holley & D.F.
Dansereau (Eds.), Spatial learning strategies: Techniques, applications, and related issues (pp.
213–233). Orlando: Academic Press.
... Though the observed structural issues in this study may have partly resulted from the open-endedness of the task, the choice not to direct students in constructing their concept maps was welladvised. Directing students could disrupt their reasoning and result in a concept map that does not represent their own knowledge, which, in turn, could hamper reflection (Gouli et al., 2003;Ruiz-Primo et al., 2001). That said, it would be interesting to investigate whether higher quality concept maps, constructed according to the intended structure, would increase the effectiveness of the feedback (i.e., the comparison with the expert example) and reflection prompts. ...
... Creating concept maps from scratch has often been compared to more 'restricted' concept mapping tasks (e.g., where students have to complete incomplete concept maps or where they have to construct a concept map with only predefined concepts and linking labels) (e.g., Cañas et al., 2012;Gouli et al., 2003;O'Donnell et al., 2002;Ruiz-Primo et al., 2001;Strautmane, 2012). The degree of restrictiveness influences the way students structure their concept map and may affect their knowledge acquisition: the more restricted the task, the more likely that students produce higher quality concept maps that resemble the intended structure (Cañas et al., 2012;Ruiz-Primo et al., 2001). ...
... Creating concept maps from scratch has often been compared to more 'restricted' concept mapping tasks (e.g., where students have to complete incomplete concept maps or where they have to construct a concept map with only predefined concepts and linking labels) (e.g., Cañas et al., 2012;Gouli et al., 2003;O'Donnell et al., 2002;Ruiz-Primo et al., 2001;Strautmane, 2012). The degree of restrictiveness influences the way students structure their concept map and may affect their knowledge acquisition: the more restricted the task, the more likely that students produce higher quality concept maps that resemble the intended structure (Cañas et al., 2012;Ruiz-Primo et al., 2001). Another way to reduce the structural issues in students' ...
Full-text available
Background Creating concept maps can help students overcome challenges of accurate knowledge monitoring and thus foster learning. However, students' knowledge often contains gaps and misconceptions, even after concept map creation. Theoretically, students could benefit from additional support, but it is unclear whether this might also be the case for (more practical‐oriented) secondary vocational students. Objectives This study investigated whether the effectiveness of concept maps for learning could be improved by providing students with expert examples and reflection prompts in addition to their self‐generated concept maps. Methods First‐year secondary vocational students (N = 91, Mage = 17.3 years) participated in this study, which utilized a pretest‐intervention‐posttest design. Regarding the intervention, students worked in two successive online learning environments, in which they had to present their knowledge in concept maps. After creation, students' concept maps were, depending on condition, supplemented with (1) an expert example with comparative feedback (a combined concept map) and related reflection prompts, (2) the combined concept map only, or (3) no combined concept map and no prompts. Results and Conclusions Analyses based on students' domain knowledge demonstrate that students significantly increased their knowledge in all conditions. Data indicate that there was no significant difference in knowledge gain between conditions. Further analysis showed that students in the experimental conditions demonstrated higher learning gains if they consulted the combined concept map more often than their peers. Implications Access to an example in addition to students' self‐generated concept maps seems promising in fostering their knowledge acquisition. However, secondary vocational students might need additional ways of support to guarantee higher learning gains. Avenues to increase the effectiveness of support are discussed.
... A proposition, the basic unit of meaning in a concept map, is the combination of two nodes (i.e., concepts) and the linking phrases between them (Ruiz-Primo & Shavelson, 1996); for example, mass influences acceleration. Concept maps have been widely used as conceptual learning tools to communicate the flexible understanding of complex issues and to organize subject knowledge for higher-order thinking and meaningful learning such as linking concepts based on conceptual and causal relationships (Pedaste et al., 2013;Ruiz-Primo et al., 2001;Schwendimann & Linn, 2016). Concept maps have also been deployed to assist with inquiry learning or problem solving by externalizing conceptual thinking about the subject knowledge required to solve a problem. ...
... However, few studies have analyzed studentconstructed cognitive maps that reflect the thinking underpinning students' inquiry task performance. While some studies have assessed student-constructed concept maps, they have focused on the conceptual knowledge represented in the maps and these maps were not used for inquiry learning (e.g., Ruiz-Primo et al., 2001). ...
... Table 1 outlines the coding sheet, with examples for illustration. This scoring method has been demonstrated to be reliable and valid and simple to master by raters (Ruiz-Primo et al., 2001). For example, in the concept map presented in Fig. 1, the concept nodes "bacteria" and "dissolved oxygen" are connected by the linking phrase "need," to form the proposition "Bacteria need dissolved oxygen." ...
Full-text available
Higher-order thinking is crucial to inquiry learning. It is important to investigate how students think in inquiry contexts. Given the tacit nature of higher-order thinking, cognitive maps (e.g., concept maps, reasoning maps) have been used to externalize thinking and have shown promising effects in terms of improving inquiry task performance. However, few studies have analyzed student-constructed maps that reflect the thinking underpinning students’ inquiry task performance. This study aimed to address this gap. Sixty-nine 11th grade students worked in small groups to explain a fish die-off phenomenon in a virtual ecosystem and collaboratively constructed an integrative cognitive map to facilitate thinking during the task. The map comprised a concept map (representing conceptual thinking about relevant subject knowledge) and a reasoning map (representing the reasoning process). Regression analyses showed that the quality of the student-constructed maps, particularly the reasoning maps, was a significant predictor of inquiry task performance assessed based on students’ written explanations of the phenomenon. Although the quality of the concept maps was not a significant predictor of inquiry task performance, it did predict the quality of the reasoning maps. Student thinking reflected in concept mapping and that reflected in reasoning mapping play different roles in inquiry learning.
... The construction style of a concept map can be categorized into two kinds: open-ended and closed-ended maps (Taricani and Clariana, 2006;Ruiz-Primo et al., 2001;Hirashima, 2019). Open-ended fashion is a mapping activity that allows learners to add links, concepts, and form propositions freely that express their knowledge. ...
... Open-ended fashion is a mapping activity that allows learners to add links, concepts, and form propositions freely that express their knowledge. It enables the teacher to reveal the difference between students' knowledge structures (Ruiz-Primo et al., 2001;Hirashima, 2019). However, it is hard to assess (Taricani and Clarina, 2006) and provides feedback to learners. ...
Conference Paper
Full-text available
Extension concept mapping extends an existing original map by linking it to a new additional map. This technique encourages learners to review and improve their knowledge structure. Previous studies have demonstrated the difference of knowledge structure achievements between the original and additional maps in the Extended Scratch-Build (ESB) and Extended Kit-Build (EKB) approaches. However, no information has been provided related to the extent of the tightness between the concept maps. The tight interconnectedness of knowledge structures represents expertise and depth of personal knowledge. This study investigated the effect of different concept mapping tools on the student's ability to connect concept maps. Fifty-five second-year university students participated and were divided into two groups: the control group utilized the ESB map, and the experimental group used the EKB map. Extension Relationships (ER) scores were used to confirm that learners could associate prior existing with new concept maps. ER is a particular link that tightly interconnected the previous original map with the additional map. ER scores evaluate both the quantity and quality of relations that the learners have made. The statistical analysis results emphasized that the experimental group outperformed the control group regarding the number and quality of ER scores.
... In this paper, we use mind-mapping as means for problem exploration, which has been proven to be useful for reflection, communication, and synthesis during idea generation [28,29]. The structure of mind-maps thus facilitates a wide-range of activities ranging from note-taking to information integration [30] by highlighting the relationships between various concepts and the organization of topic-oriented flow of thoughts [31,32]. In design ideation, studies have also shown that there are positive correlations between the total quantity of generated nodes and the depth of nodes versus idea uniqueness in a mind-map [33]. ...
Mind-mapping is useful for externalizing ideas and their relationships surrounding a central problem. However, balancing between the exploration of different aspects (breadth) of the problem with respect to the detailed exploration of each of its aspects (depth) can be challenging, especially for novices. The goal of this paper is to investigate the notion of “reflection-in-design” through a novel interactive digital mind-mapping workflow that we call “QCue”. The idea behind this workflow is to incorporate the notion of reflective thinking through two mechanisms: (1) offering suggestions to promote depth exploration through user's queries (Q), and (2) asking questions (Cue) to promote reflection for breadth exploration. This paper is an extension of our prior work where our focus was mainly on the algorithmic development and implementation of a cognitive support mechanism behind QCue enabled by ConceptNet (a graph-based rich ontology with “commonsense” knowledge). In this extended work, we first present a detailed summary of how QCue facilitated the breadth-depth balance in a mind-mapping task. Second, we present a comparison between QCue and conventional digital mind-mapping i.e. without our algorithm through a between-subjects user study. Third, we present new detailed analysis on the usage of different cognitive mechanisms provided by QCue. We further consolidate our prior quantitative analysis and build a connection with our observational analysis. Finally, we discuss in detail the different cognitive mechanisms provided by QCue to stimulate reflection in design.
... The open-ended method is a potential approach for reflecting and measuring students' knowledge structure. Still, it has a drawback in the achievement of understanding scores that tends to be less than closed-ended method [11]. ESB attempts to overcome the shortcomings of the usual open-ended method by offering an expansion of a two-phase concept map: Phase 1 and Phase 2. ...
Extended concept mapping is a technique to integrate prior existing knowledge structure with new relevant information. Some researchers argue that the extended concept mapping approach is efficient in building a knowledge base and facilitating improved meaningful learning. However, no information has been presented regarding the distribution of knowledge structure. This study aimed to compare two extended concept mapping approaches and investigate the distribution of quality of students' knowledge structures. A total of 55 college students participated and were divided into two groups, control and experimental. The students in the control group used the Extended Scratch-Build (ESB) technique, and those in the experimental group used the Extended Kit-Build (EKB) map. The results suggested that the students in the experimental group not only outperformed the control group in terms of quality of knowledge structure scores but also had an equitable distribution of achievement in material subtopics. Students who used the EKB were able to maintain performances consistently from the beginning to the end of material subtopics.
... Concept maps consist of concepts and their interrelations linked to each other with lines (Novak & Cañas, 2008). In a concept map, interrelated concepts form a meaningful statement together (Novak & Cañas, 2008; Ruiz-Primo, Schultz, Li, & Shavelson, 2001). For example, "a bird can fly" is a meaningful statement. ...
... In the development of concept maps, the technique "fill in the blanks in the drawn map," which is a highly guiding technique, or the technique "draw a map from scratch," which is a less-guiding technique, can be used. In their study, Ruiz-Primo, Schultz, Li, & Shavelson (2001) discussed these two different techniques and concluded that the technique "draw a map from scratch" reflected the difference between students' knowledge structures more effectively. Besides, concept maps, which can be prepared by students individually, can help students acquire a strong grasp of the subject being taught as well as develop high-order thinking skills such as problem-solving skills (Eden, 1988). ...
Concept maps are used to assess and improve prospective teachers' conceptual understanding levels. In this research, the aim was to describe prospective science teachers' conceptual understanding of the atom by using concept maps. The research employed the case study approach, one of the qualitative research patterns. The research group consisted of 15 fourth-year prospective science teachers. The concept maps drawn by the participants were used to describe their conceptual understanding of the atom. For data analysis, the descriptive analysis method, one of the qualitative analysis methods, was used. The data obtained from the concept maps were divided into the categories previously defined by the researcher. The created categories were evaluated by two academics with expertise in physics education, and a correspondence analysis was conducted. As a result of the research, it was concluded that prospective teachers could establish successful and meaningful propositions in concept maps; however, most of the propositions were collected in the categories of "meaningless," "improvable," or "acceptable."
To investigate the cognitive validity of scientific and quantitative reasoning items, “think-alouds” (verbal solutions) were elicited for a general education instrument. Several items were not aligned in terms of anticipated versus actual content, and the instrument's accuracy is questioned. We discuss study weaknesses and merits of this framework and analysis.
A concept map consists of a task, a response format, and a scoring system. Variation in tasks, response formats, and scoring systems may elicit different
Randomly divided high school students into instruction (n = 28) and control (n = 12) groups to investigate the correspondence between the structure of stimulus material (content structure) and the structure of S's memory during learning (cognitive structure). The instruction group read sections of a text on Newtonian mechanics in 5 daily sessions; controls received no instruction. All Ss were given pre- and postexperimental achievement and word association tests, and a word association test after each session. Digraph analysis indicates that the structure of the content was "tight" and "formal." In the instruction group: (a) posttest achievement scores increased significantly, (b) cognitive structure (word association data) changed considerably during instruction, (c) key concepts were interrelated more closely at the end of instruction than at the beginning and (d) cognitive structure corresponded more closely to content structure at the end of instruction. Similar changes for the control group were not observed. (29 ref.) (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Science education has undergone a revolution in recent years, shifting its emphasis from breadth and memorization to depth and understanding. Teaching Science for Understanding begins with an overview of the changes in science education. It then presents a review of each major instructional strategy, information about how it is best used, and the effectiveness of the strategies for understanding and retention of information. The book presents the main strategies used to achieve this depth of understanding, including the use of computer simulations, small laboratories, and journal writing, and it discusses how to use each strategy at the elementary, secondary, and college level.
Differentiation of sciences was recently replaced by their integration. The old pedagogical disciplines amalgamated into the new comprehensive science of education. Science education is the most dynamic part of the science of education. The previous special didactics are replaced with the 3P-model (Pedagogy, Psychology and Philosophy). Processes in social systems are so complicated that their description cannot be done without using complex interdisciplinary and multidisciplinary instruments. The scientific and practical activities in education need a proper medium for effective discussion and exchange of ideas, results and good practices. The multidisciplinary scholarly journal can take such a role.
Students who have completed applied statistics courses often lack knowledge of the interconnections among the important concepts they have studied. According to a cognitive network model of knowledge, they lack connected understanding about statistics, and so are unable to apply these concepts. Connected understanding can be represented visually in the form of a map. Mapping techniques, including graphic organizers and concept maps, are useful: (1) for instructional planning, (2) as a learning tool, and (3) for assessment. We discuss each of these uses in statistics education, with an emphasis on assessment.
Changes in knowledge underlie the cognitive capabilities that are displayed in competent performance and the acquisition of improved performance. It is important to bring these knowledge-generated processes to attention because they represent possibilities for instructional design that might improve learning. In this article, the role of performance assessments in making relevant cognitive activity apparent to teachers and students is discussed. Descriptions of the cognitive activity of fifth-grade students carrying out a science performance assessment reveal critical differences between those who think and reason well with their knowledge of circuits and those who do not. Differences in quality of explanations, adequacy of problem representation, appropriateness of solution strategies, and frequency and flexibility of self-monitoring indicate more or less effective learning of the subject matter. Awareness of and attention to these cognitive characteristics of competent performance in an assessment situation provides teachers the necessary feedback to construct classroom environments that encourage reasoning and knowledge integration. In this way, performance assessments not only evaluate student performance but suggest changes in instructional practice to support effective learning in the elementary science classroom.
Concept maps provided a measure of subjects' cognitive structures before and after completion of an environmental education course. Concept maps were constructed from expressions taken from the issue "global climate." Expressions were assigned to one of three domains: science, technology or society. Maps were analyzed by constituent propositions, which were categorized by various characteristics including the domains of the expressions connected, the relationship expressed and the strength, determined by a protocol developed for this study. Significant differences were found in the frequencies of occurrence for various proposition characteristics and these were correlated with previous academic experiences. Some proposition characteristics were also correlated with the results of a final examination. Comparison of concept maps prepared before and after an environmental education course showed some changes in proposition characteristics. A brief description of the course, the expressions used in the concept mapping activity, a description of "networking" symbols and a sample map, the protocol used to evaluate the concept maps, and the results of the statistical analysis are appended. (Author/KR)
understanding how expertise is acquired poses a great challenge to learning theory / [suggest] the development of structured knowledge is a central feature of cognitive ability in early and later learning, and conditions of learning and experience significantly influence the kinds of knowledge structures that are acquired / infer a major principle or hypothesis underlying the acquisition of competence, which can be labeled a "change in agency," that is, a change in the agency for learning as expertise develops and performance improves / consider the learning and cognitive processes that underlie the course of acquisition of expert performance processes of acquisition [patterns and organized structured knowledge, self-regulation, environment and the discipline, representation and procedural knowledge, the self as an agent for learning and conditions of experience] (PsycINFO Database Record (c) 2012 APA, all rights reserved)
The highest levels of performance and achievement in sports, games, arts, and sciences have always been an object of fascination, but only within the last couple of decades have scientists been studying these empirical phenomena within a general theoretical framework. [This book] brings together [research] on specific domains of expertise and related theoretical issues, such as the importance of individual differences in ability and innate talent for attaining expert levels of performance. (PsycINFO Database Record (c) 2012 APA, all rights reserved)