Note: This is a preprint of the paper to be published in IJTEL
Erroneous Examples: Effects on Learning Fractions in a Web-Based Setting
Dimitra Tsovaltzi*, Erica Melis+, Bruce M. McLaren^, Ann-Kristin Meyer+
*Educational Technology / ^Center for e-Learning Technology (CeLTech), Saarland University, Campus, Building C5.4, D-66123 Saarbrücken, Germany
+DFKI GmbH, German Research Centre for Artificial Intelligence, Stuhlsatzenhausweg 3 (Building D3.2), D-66123 Saarbrücken, Germany
Dimitra.tsovaltzi@mx.uni-saarland.de
Abstract. Learning from errors can be a key 21st century competence, especially for
informal learning where such metacognitive skills are a prerequisite. We investigate
whether, how and when web-based interactive erroneous examples promote such
competence, and increase understanding of fractions and learning outcomes. Erroneous
examples present students with common errors or misconceptions. Three studies were
conducted with students of different grade levels. We compared the cognitive,
metacognitive, conceptual, and transfer learning outcomes of three conditions: a control
condition (problem solving), a condition that learned with erroneous examples without
help, and a condition that learned with erroneous examples with error detection and
correction support. Our results indicate significant metacognitive learning gains of
erroneous examples with help for 6th-graders. They also show cognitive and conceptual
learning gains for 9th and 10th-graders when additional help is provided. No effects were
found for 7th-graders. We discuss the implications of our findings for instructional
design.
Keywords. Erroneous examples, learning from errors, empirical studies, fractions
misconceptions, adaptive learning, conceptual learning, metacognition, learner support
1 Introduction
There is growing interest in and a substantial body of knowledge regarding worked examples (correct solutions), with considerable evidence of their effectiveness as an instructional method in mathematics and science education (Catrambone, 1994; 1998; McLaren, Lim & Koedinger, 2008; Paas, 1992; Renkl, 1997; Sweller & Cooper, 1985; Trafton & Reiser, 1993; Van Gog, Paas & van Merriënboer, 2006). The benefits of worked examples are especially discussed in connection with cognitive load theory (Paas & van Merriënboer, 1994; Sweller, 1988; Sweller et al., 1998), which emphasises their ability to reduce cognitive load in comparison to standard problem solving. Moreover, in the context of informal learning, which is rapidly gaining ground, learning from errors, with its inherent metacognitive skills of spotting and correcting errors, may be an important competence for warranting the validity of informally acquired knowledge.
Therefore, erroneous examples are a potential teaching strategy for promoting such skills.
Erroneous examples are counterparts of worked examples that include one or more errors.
Although there has been some interest in investigating the use of erroneous examples in
conjunction with worked examples, erroneous examples have been scarcely investigated in
their own right. Moreover, erroneous examples are rarely used in mathematics teaching,
because many mathematics teachers are sceptical about discussing errors in the classroom
(Tsamir & Tirosh, 2003). Teachers are cautious about exposing students to errors for fear that it could lead to incorrect solutions being assimilated by students, in behaviourist fashion
(Skinner, 1938). As a consequence, it remains open (1) if and when erroneous examples are
beneficial for learning and (2) what form of erroneous examples is more beneficial.
In particular, the question of what form or what type of erroneous examples presentation is
beneficial can be carefully explored in the context of learning technologies, where erroneous
examples can be implemented in an interactive fashion, thus opening new possibilities for
adaptive instruction. The presentation of erroneous examples can vary by the kind and amount
of feedback provided, diverse tutorial strategies can be used, and the choice and sequencing of
the learning material can be decided on the fly (e.g., erroneous examples provided in
conjunction with, for instance, standard problem-solving exercises, or worked examples).
Adaptation to the needs of individual students has two main advantages. First, it can shed light
on learning research, as it facilitates testing how students learn under different manipulations.
Second, it may contribute to better learning outcomes in formal education (in or after the
classroom).
We focus on fractions as a core topic in middle school math curricula around the world.
Fractions are a good target for adaptive, web-based instruction. There is evidence that
students, and even preservice teachers, do not have the expected level of understanding of
fractions (Jones Newton, 2008). Persistent misconceptions lead to poor performance in solving
fraction problems (Stafylidou & Vosniadou, 2004). Since fractions are also essential to other
key subjects, such as physics and chemistry problems, they represent a “gateway” topic to
success for any student of science and mathematics. Thus, new, successful forms of teaching
fractions could have a profound impact on science and math learning.
Theoretical and empirical work provides some support for studying errors that can promote
student learning of mathematics (Borasi, 1994; Müller, 2003; Oser & Hascher, 1997; Seidel &
Prenzel, 2003; Strecker, 1999). For example, Borasi argues that mathematics education could
benefit from the discussion of errors by encouraging critical thinking about mathematical
concepts, by providing new problem solving opportunities, and by motivating reflection and
inquiry.
Siegler (2002) and Siegler and Chen (2008) conducted controlled comparisons of correct and incorrect examples for mathematical equality problems. They found that when students studied and self-
explained both correct and incorrect examples they learned better than when students studied
and self-explained only correct examples. They hypothesised that self-explanation of correct
and erroneous examples strengthened correct strategies and weakened incorrect problem
solving strategies, respectively.
Grosse and Renkl (2007) studied whether explaining both correct and incorrect examples of
probability problems makes a difference to learning and whether highlighting errors helps
students learn from those errors. Their empirical studies (in which no help or feedback was
provided) showed some learning benefit of erroneous examples, but unlike the results of
Siegler and colleagues (2002; 2008), the benefit they uncovered was only for learners with
strong prior knowledge and for far transfer.
Both Siegler (2002) and Grosse and Renkl (2007) concluded that in order for students to
benefit from incorrect solutions, they have to be able to explain “why” the solutions are
incorrect. In particular, a later study by Grosse and Renkl (2007) analysed think-alouds on
self-explanation strategies. The analysis revealed that spontaneous self-explanations of errors
are very important for learning, but that they inhibit principle-based explanations
(explanations based on principles of the domain) that are normally produced when self-
explaining worked examples, for instance. However, such principle-based self-explanations
are crucial to learning.
Durkin and Rittle-Johnson (2008, 2012) and Rittle-Johnson and Wagner Alibali (2001)
tested whether comparing incorrect and correct examples of decimal problems promotes
greater learning than comparing two correct decimals examples. They hypothesized that
comparing incorrect examples to correct examples may be particularly effective for
emphasizing the critical attributes of correct examples as suggested by Grosse and Renkl
(2007). They found that students in the incorrect condition had higher procedural posttest
scores, as well as higher conceptual posttest scores on a delayed posttest two weeks later, than
students in the correct condition.
In the domain of medical education, research on erroneous examples has demonstrated the
benefits of erroneous examples in combination with elaborate feedback in the acquisition of
problem-solving schemata. This was compared to the use of erroneous examples without
feedback (Kopp, Stark & Fischer, 2008) and with knowledge-of-correct-solution feedback (Stark, Kopp & Fischer, 2011). The diagnostic knowledge, which included conceptual, strategic
and teleological knowledge, increased more for students who worked with erroneous
examples and elaborate feedback on “why” the step was wrong and “which” step would be
correct. The effects of elaborate feedback were replicated for a more complex domain that
imposed additional cognitive load, but the effects of erroneous examples or their interaction
were not replicated (Stark, Kopp & Fischer, 2011). Erroneous examples had a significantly
better effect on cognitive skills in a delayed posttest. This effect was persistent regardless of
prior knowledge.
Finally, in the domain of decimal numbers, internet-based interactive erroneous examples with feedback on the correctness of the solution and on the error explanation were compared to problem solving with feedback on correctness (McLaren et al., 2012). The authors found that middle school students who worked with erroneous examples did better on a delayed posttest than the students who worked with standard problems, and attributed this finding to “desirable difficulties” (Schmidt & Bjork, 1992). In particular, they hypothesized that challenging students with difficult problems, as erroneous examples can be described, did not lead to immediate learning benefits, but did lead to delayed learning benefits.
These findings are also consistent with the results of the highly publicised TIMSS studies (OECD, 2001), which showed that Japanese math students outperformed their counterparts in most of the western world. The key curriculum difference cited was that Japanese educators present and discuss incorrect solutions and ask students to locate and correct errors.
1.1 Contribution of Our Studies
We take the earlier controlled studies further by investigating erroneous examples
decoupled from worked examples in the context of technology enhanced learning with
ActiveMath, a web-based system for mathematics (Melis, Goguadze, Homik, Libbrecht,
Ullrich, & Winterstein, 2006). Our ultimate goal is to develop micro and macroadaptation for
the presentation of erroneous examples for individual students since the benefit of erroneous
examples may depend on individual skills, grade level, etc. By microadaptation we mean the
teaching strategy, or step-by-step feedback, inside an erroneous example based on the
student’s performance. By macroadaptation we mean the choice of task for the student, as well
as the frequency and sequence of the presentation of erroneous examples.
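To make the distinction concrete, the following minimal sketch (our illustration, not ActiveMath code; all names and thresholds are hypothetical) contrasts the two levels of adaptation:

def microadapt(step_correctness):
    # Microadaptation: pick the step-by-step feedback inside an erroneous
    # example, based on the student's performance on the current step.
    return "elaborate_error_help" if step_correctness < 0.5 else "flag_feedback"

def macroadapt(student_model):
    # Macroadaptation: pick the next task and control how often and where
    # erroneous examples appear in the exercise sequence.
    if student_model.get("suspected_misconception"):
        return "erroneous_example"
    return "standard_exercise"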
We focus on the empirical results that inform our work on the adaptive technology. In
contrast to the Siegler studies (Siegler, 2002; Siegler & Chen, 2008), we are interested in students' interaction with erroneous examples and in how situational and learner characteristics impact that interaction. Extending the work of Grosse and Renkl (Grosse & Renkl, 2007; Renkl, 1997), we investigate interactive erroneous examples with adaptive error-
detection and error-correction help. This novel design relies on the intelligent technology of
ActiveMath. Our primary rationale for including error detection and correction help in the
empirical studies is that students are not accustomed to working with and learning from
erroneous examples in mathematics. Thus, they may not have the required skills to review,
analyse, and reflect upon such examples, as Grosse and Renkl (2007) have hypothesised based on their results; thus, additional help may be necessary. Taking up this strand and providing additional elaborate help, we also extend the work of Kopp and colleagues
(Kopp, Stark & Fischer, 2008) in medical education to the domain of mathematics education.
Moreover, we include feedback that emphasises conceptual principle-based knowledge in
order to counter-balance the effect reported by Grosse and Renkl (2007). They found that
such reflections were missing in the students’ spontaneous self-explanations of errors and
hypothesised that, due to this lack of more conceptual explanations, learning opportunities
created by errors were not exploited. Providing such help in an adaptive fashion to students of
different knowledge levels might eliminate the aptitude-treatment effect for transfer, which
was one of their main findings. Additionally, we conducted studies with students at lower and higher school levels, to test whether the benefits reported by Grosse and Renkl (2007) transfer to different school levels, and to which grades in particular.
With regard to the possible drawbacks of erroneous examples, we hypothesise that a
student is less likely to exhibit the feared 'conditioned response' of behaviourist theory
(Skinner, 1938) when studying errors that the student has not made him/herself and thus has
not (necessarily) internalised. On the contrary, students may benefit from erroneous examples
when they encounter them at the right time and in the right way. For example, rewarding a
student for error detection may lead to memory annotation such that errors will be avoided in
subsequent retrieval. At the same time, a student is unlikely to be demotivated by studying
common errors in the domain, made by others, as when emphasizing errors the student has
made him/herself. In fact, some of our own work has already demonstrated the motivational
potential of erroneous examples (Melis, 2004).
In summary, we believe that learning from errors can help students develop (or enhance)
their critical thinking, error detection, and error awareness skills, something that is not
possible with correct examples and difficult with unsupported problem solving (Borasi, 1994).
Moreover, erroneous examples may weaken students’ incorrect strategies, as opposed to
worked examples that strengthen correct strategies (Siegler, 2002). Additionally, similar to
worked examples, erroneous examples do not ask students to perform as in problem solving,
but instead provide a worked-out solution that includes one or more errors. Thus, they could reduce extraneous cognitive load in comparison to problem solving (Paas, Renkl & Sweller, 2003), while increasing germane cognitive load in the sense of creating cognitive-conflict
situations. Adaptive help, in particular, might support deeper reflection on errors and help
induce such cognitive conflict. Especially the kind of adaptive help that elaborates on
conceptual understanding of errors may catalyse the creation and exploitation of such learning
opportunities. Furthermore, erroneous examples may guide learners toward learning
orientation rather than performance orientation; specifically in combination with help that
increases students' involvement in the learning process and in more conceptual understanding
(Siegler, 2002).
In the course of our investigation of erroneous examples, we aim to answer the following
research questions:
When
1. Do advanced students, in terms of grade level, gain more from erroneous examples
than less advanced students?
How
2. Can students' cognitive skills, conceptual understanding, and transfer abilities
improve through the study of erroneous examples?
3. Does work with erroneous examples help to improve the metacognitive competencies
of error detection, error awareness and error correction?
4. Does adaptive help play a role in whether and how students learn from erroneous
examples?
Based on these considerations and research questions, our primary hypotheses are:
Hypothesis 1: Presenting erroneous examples to students will improve:
H1a: their cognitive skills,
H1b: conceptual knowledge,
H1c: transfer skills, and
H1d: metacognitive skills
Cognitive skills refer to solving standard fraction addition and subtraction exercises.
Conceptual knowledge refers to understanding the domain concepts necessary for solving
each specific problem, for instance “addition as increasing”. Transfer refers to solving more
difficult problems using the same concept, e.g. three-fraction addition as opposed to two-
fraction addition, or solving problems using a theoretically related concept. Metacognitive
skills refer to error detection and error correction.
A control group learning through partially supported problem solving is compared to
the erroneous examples groups on the dependent variables: cognitive skills, metacognitive
skills, conceptual learning, and transfer.
Hypothesis 2: The learning effect of erroneous examples is stronger when students are
supported in finding and correcting the error with additional help. Two experimental
groups were used, one with help and one without help, to test this hypothesis.
Hypothesis 3: The effect of erroneous examples with adaptive help will be independent of
grade level. Three levels of students are tested spanning five grade levels.
Moreover, we explore the following supplementary conjectures:
1. The learning effect of erroneous examples depends on when they are presented to the
students. The order of presentation of erroneous examples is varied between studies, to
allow drawing some conclusions.
2. The cognitive load of students will be reduced through working with erroneous examples,
as opposed to standard problem solving, and students will be more motivated to learn and
understand the materials, which results from a shift to learning orientation. Self-reports
were analysed to test these conjectures.
To assess the learning effects of erroneous examples at different grade levels and settings,
we conducted lab studies with 6th, 7th and 8th-graders and classroom studies with 9th and 10th-
graders. The participants came from both urban and suburban German schools from two
states. In a previous article (Tsovaltzi, Melis, McLaren, Meyer, Dietrich & Goguadze, 2010),
we presented results of the first two studies and preliminary results of the third study. Here we
present the analysis of the third study with additional data that we collected to account for
group size differences. We also present the new analysis of the questionnaires of all three
studies and discuss the relevance of these results with regard to the learning gains analysis. In
view of the new analyses, we further present implications that can be drawn from our results.
2 Study 1: 6th-Grade Lab Study
2.1 Methods
2.1.1 Design
Fig. 1. A standard exercise in ActiveMath (with English translations in the legends; prompt: "Please write all individual thinking steps as if you were thinking aloud. Add more steps whenever you need to.")
One control group and two experimental groups were used. The control condition, No-
Erroneous-Examples (NOEE), trained with partially supported standard fraction exercises
(Figure 1), but no erroneous examples. The experimental condition Erroneous-Examples-
With-Help (EEWH) trained with standard exercises, but also with erroneous examples (Figure
2) and provision of additional help within the erroneous examples for explaining the error. The
condition Erroneous-Examples-Without-Help (EEWOH) trained with standard exercises, and
erroneous examples but without additional help. The participants completed the experiment on
a single day in approximately 2 hours and 40 minutes with three breaks of between five and
ten minutes. Breaks were not obligatory, so participants could choose to skip them.
Participants sat together in a computer room, but all parts of the study were completed
individually on separate computers. All sessions were completed over the course of three
weeks and were supervised by the experimenter (first author) and her assistant (fourth author).
Fig. 2. Interactive erroneous example in ActiveMath on the typical error of adding numerators and denominators of fractions with unlike denominators. Problem statement: "Two groups of students get a pizza each. In the first group there are 3 students, 2 of whom are girls. In the second group there are 5 students, 4 of whom are girls. The pizza is split equally within every group. Karl is trying to calculate what part of the pizza the girls of both groups got together. His result is 3/4 of a pizza. Karl has made an error. Find the error in Karl's calculation. Pick the first erroneous step."
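For orientation, a worked version of the Karl task follows (our own sketch, reconstructed from the problem statement; the fractions 2/3 and 4/5 follow from the group sizes given above):

\[ \text{correct: } \frac{2}{3} + \frac{4}{5} = \frac{10}{15} + \frac{12}{15} = \frac{22}{15} = 1\tfrac{7}{15}, \qquad \text{Karl: } \frac{2+4}{3+5} = \frac{6}{8} = \frac{3}{4}. \]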
2.1.2 Participants
Twenty-three volunteers from the 6th-grade at German schools participated in this study,
which took place in a lab at the DFKI (German Research Center for Artificial Intelligence) in
Saarbrücken, Germany. The participants were recruited through a press release announcing the
study that was described as software testing that gives students a possibility to practice
mathematics. All students who expressed interest were accepted for participation based on
availability criteria during the time planned for the studies. Their parents signed a letter of
consent informing them that the participants were free to drop out at any point during the
study. Participants came from different urban and suburban schools in Germany (Saarland).
They received a payment of ten Euro at the end of the session, irrespective of whether they
completed all parts. They were randomly distributed to the groups by the experimenter and her
assistant as follows: NOEE=8, EEWH=8, EEWOH=7. The experimenter’s assistant was also
mainly responsible for the communication with the participants prior to the experiment. All
participants had just completed a course on fractions at school. The mean of their term-grade
in mathematics across conditions was 2.04 (SD=.88) (best=1 vs. fail=6), so the participants
were generally good students. There was no significant difference in the means of the pretest
among conditions (F(2,20)=0.23, p=.79, η²=0.02).
2.1.3 Materials
The design included a pre-questionnaire, a familiarisation, a pretest, an intervention, a
posttest and a post-questionnaire, which were presented in this order to all students in the
ActiveMath software environment.
Familiarisation. The familiarisation in ActiveMath allowed students to train with the
system. All conditions trained in writing fractions in the system using a specialised input
editor and in interacting with the system in general. The exercises used in this phase asked
students to order the following fractions from smallest to largest: 1, 1/6, 7/6. This skill was not
trained during the intervention or tested in the pre and posttest. Correct and incorrect feedback
as well as the correct worked out solution were presented to all conditions. The EEWH
condition received additional help to get familiar with how help is presented in ActiveMath.
No erroneous examples were used during the familiarization.
Standard Fraction Exercises. Standard fraction
exercises included addition and subtraction of
fractions represented in ActiveMath. A simple
exercise of fraction subtraction with unlike
denominators is shown in Figure 1. We asked the
students to write all thinking steps, as if they were
thinking aloud, so that the system could more
accurately assess the students’ performance on an
exercise. After entering their result, students got
feedback from ActiveMath to indicate whether their
result was correct or wrong and the correct worked
out solution was presented.
Interactive Erroneous Examples. The presentation of
erroneous examples in ActiveMath is done through a
tutorial strategy, which defines when and how to
provide help, signal correct and incorrect answers,
give answers away, show previous steps of the
students, etc. Previous steps are folded and hidden
automatically, to allow students to concentrate on the current step. Students can choose to
unfold previous steps if they want to refer back to them. Erroneous examples include
instances of typical errors students made in rule application and errors that address common fractions misconceptions.
Fig. 3. Error-correction phase (prompt: "Step 1: Correct Karl's first erroneous step.")
Figure 2 displays the task presented in the first phase. Each step of
the erroneous solution is presented as a choice in a multiple-choice question
(MCQ) and students have to select the
erroneous step. After completing this
phase, students are prompted to
correct the error, as shown in Figure 3.
Feedback Design. Based on pilot
studies (Tsovaltzi, Melis, McLaren,
Dietrich, Goguadze & Meyer, 2009),
we designed feedback for helping
students understand and correct the
errors. There are four types of
unsolicited feedback: standard
feedback, error-awareness and error
detection (EAD) feedback, self-explanation feedback and error-correction scaffolds.
Standard feedback consists of flag feedback (checks for correct and crosses for incorrect
answers) along with a text indication. It also consists of the correct answer or correct worked
solution, which is presented to the student at the conclusion of an attempt.
EAD feedback (Figure 4) focuses on supporting the metacognitive skills of error detection
and awareness that may trigger cognitive conflict. It appears on the screen after the student has
indicated having read the problem statement.
Self-explanation feedback (Figure 5) is presented in the form of MCQs. It aims to help
students understand and reason about the error through “why” questions (Figure 5, top).
“Why” questions are asked to further prompt reflection that can lead to cognitive conflict,
elaboration on errors, and conceptual understanding of errors. After a choice, the system
indicates whether the response was correct or not and provides additional conceptual
explanation of the error and of what the right thing to do would be (Figure 5, top right).
Error-correction scaffolds prepare the student for correcting the error in the second phase
and also have the form of MCQs. They start with “how” questions that concentrate more on
procedural skills and attempt to facilitate the acquisition of practical knowledge. Additional
conceptual explanations are provided depending on the student’s response. The incorrect
choices in the MCQs correspond to typical misconceptions or performance errors. For
example, the second choice at the top part of Figure 5, "Karl may add the denominators 3 and 5, but not the numerators", tests whether students understand that both numerators and denominators have to be transformed when making fractions like (i.e., bringing them to a common denominator). By addressing such
misconceptions and errors, MCQs are meant to prepare the students for correcting the error in
Phase 2. Students receive correct and incorrect feedback on their choices, and eventually the
correct answer. The "how" question at the bottom of Figure 5, which follows the "why" question, asks the student how thirds and fifths can be transformed. Its second choice, "By using 5 as the common denominator, because it is larger", is an over-generalisation error that students make by analogy to cases such as adding 1/5+1/15, where the larger denominator is the common one. A student who picks this choice gets the feedback that the answer is wrong, together with additional help (Figure 5, bottom right).
Fig. 4. EAD feedback with additional visual example ("The result, 6/8, cannot be correct, because the girls should get more than 1 pizza.")
MCQs are nested (2 to 5 layers). If a student chooses the right answer at the two top-level
MCQs (the “why” and “how” questions), then the next levels, the error-correction MCQs, are
skipped, under the assumption that the student probably knows how to correct the error. This avoids providing unneeded help, which might frustrate students or interfere with existing problem-solving schemata that would have to be extended (Kalyuga, Ayres, Chandler, & Sweller, 2003).
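The following minimal sketch (our reading of the tutorial strategy described above, not ActiveMath code; all names are hypothetical) shows the skip logic:

def run_feedback_mcqs(why_mcq, how_mcq, correction_mcqs, ask):
    # ask(mcq) presents an MCQ and returns True if the student
    # selects the correct choice.
    why_ok = ask(why_mcq)   # top-level "why is the step wrong?" question
    how_ok = ask(how_mcq)   # top-level "how can it be corrected?" question
    if why_ok and how_ok:
        # Student probably knows how to correct the error: skip the
        # nested error-correction MCQs to avoid unneeded help.
        return
    for mcq in correction_mcqs:  # the remaining nested layers (up to 5 total)
        ask(mcq)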
In the second, error-correction, phase the chosen step is crossed out, and an additional
editable box is provided for correcting the error (cf. Figure 2). After that, error-specific feedback is provided, e.g., "You forgot to expand the numerators", along with the correct solution. Here, we allow students a single attempt to correct the mistake, so that this process does not become too much like problem solving.
In the intervention, all groups solved six sequences of three exercises. The control group
solved only standard exercises. The sequences for the experimental groups included: standard
exercise - standard exercise - erroneous example. In the EEWH group, erroneous examples
were presented with additional help (EAD, error detection/correction MCQs, and error-
specific help). The condition Erroneous-Examples-Without-Help (EEWOH) included standard
exercises and erroneous examples, but without additional help.
These sequences trained skills that are typical fraction topics taught at school, e.g. fraction
addition/subtraction with like denominators and with unlike denominators, addition of whole
numbers with fractions, as well as word problems that did not include complex modelling tasks, which would have required students to use fraction operators to represent the word problems.
Fig. 5. "Why" and "How" MCQs with choices and conceptual explanations. The "why" MCQ asks: "Why is the 2nd step wrong?" Choices: (1) "Because Karl may not add the numerators directly." (2) "Karl may add the denominators 3 and 5, but not the numerators." (3) "I don't know." Feedback to the correct choice: "Right! If Karl adds the denominators 3 and 5 he gets 8ths, which cannot be broken into thirds and fifths. The fractions have to be transformed, like 2 dollars and 4 euros have to be transformed to be added." The "how" MCQ asks: "How can one transform 3rds and 5ths?" Choices: (1) "Find the least common multiple of 3 and 5, that is 15." (2) "Use 5 as the common denominator, as it is the largest." (3) "I don't know." Feedback to the second choice: "Not quite. Think, for instance, how you calculate 1/2+1/4, namely 1/2+1/4=3/4."
Pretest and Posttest. The pretest and posttest were the same for all three conditions, were counter-balanced, and consisted of problems similar to those used in the intervention plus a transfer problem (a four-fraction addition, as opposed to the maximum of three in the
intervention). However, there was no feedback or additional help provided in the pretest and
posttest. Finally, three erroneous examples were part of the posttest only, as we did not want
the control group to see any erroneous examples before the intervention. The posttest
erroneous examples consisted of two phases, similar to the intervention erroneous examples,
but instead of feedback they included three open conceptual questions on error detection and
awareness. The questions were of the kind “Why cannot Oliver’s solution be correct?”, “What
mistake did Oliver make?”, “Why did Oliver make this mistake? What does he not understand
about fractions?” These questions were designed to test students’ error detection skills as well
as their understanding of basic fraction principles. For example, the mistake Oliver made was
that he added the denominators 6 and 8 in the exercise 7/6+5/8. The answer to the question
about what Oliver did not understand would be “That if one adds the denominators 6 and 8,
one gets 14ths, which one can break into neither 6ths nor 8ths", which refers to the basic
concept of common denominators.
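For concreteness, Oliver's erroneous solution (the steps reappear in the posttest item "Oliver" discussed below) and the correct calculation are, as a worked sketch:

\[ \text{Oliver: } \frac{7}{6} + \frac{5}{8} = \frac{7+5}{6+8} = \frac{12}{14} = \frac{6}{7}, \qquad \text{correct: } \frac{7}{6} + \frac{5}{8} = \frac{28}{24} + \frac{15}{24} = \frac{43}{24}. \]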
Questionnaires. The pre- and post-questionnaires used in all studies were based on the MSLQ (Motivated Strategies for Learning Questionnaire; Pintrich, Smith, Garcia, & McKeachie, 1991) and on the CAQ (Computer Attitude Questionnaire; Knezek & Christensen, 1996), which
contain six-point Likert scale questions for self-report. The items were adjusted and translated
into German. The questionnaires consisted of six constructs each: motivation, error-awareness,
critical thinking, cognitive load, learning orientation, and self-efficacy. There were eighteen
items in total per questionnaire. The greatest number of items were dedicated to motivation (5)
and the least to self-efficacy, error-awareness and critical thinking (2). The pre- and post-
questionnaires were designed to have equivalent constructs and items. For example, a pre-
motivation item was: “I know that computers give me the opportunity to learn many new
things” (German: “Ich weiss, dass Computer mir die Möglichkeit geben, viele neue Dinge zu
lernen.”). The equivalent post-motivation item was: “I learned many new things through the
learning software” (German: “Durch das Lernprogramm habe ich viele neue Sachen gelernt”).
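The tables below report construct scores as mean(sd) percentages. As an illustration only (the paper does not spell out the mapping, so this scoring scheme is our assumption), a construct score can be computed as the mean of its six-point Likert items expressed as a percentage of the scale maximum:

def construct_score(item_responses, scale_max=6):
    # Hypothetical scoring sketch: mean response as a percentage of the
    # maximum scale value; e.g. [5, 5] -> 83.33.
    return 100.0 * (sum(item_responses) / len(item_responses)) / scale_max

print(construct_score([5, 5]))  # 83.333..., comparable in form to the table entries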
2.1.4 Results: 6th-Grade Lab Study
Table 1. Descriptive Statistics: Lab Study 6th-Grade

Score | Subscore | EEWH N=8 mean(sd)% | EEWOH N=7 mean(sd)% | NOEE N=8 mean(sd)%
Cognitive Skills | Pretest | 80.2(26.7) | 85.7(17.8) | 86.5(12.5)
Cognitive Skills | Post-pre-diff | -2.1(33.6) | 1.2(21.7)^ | 2.1(23.9)+
Metacognitive Skills (EE) | EE-find | 91.7(15.4)+ | 76.2(31.7)^ | 66.5(35.6)
Metacognitive Skills (EE) | EE-correct | 80.2(12.5)+ | 75.0(21.0)^ | 68.7(25.9)
Metacognitive Skills (EE) | EE-ConQuest* | 64.6(25.5)+ | 60.2(33.3)^ | 41.7(21.2)
Metacognitive Skills (EE) | EE-total | 75.3(16.8)+ | 67.9(27.5)^ | 54.7(23.0)
Metacognitive Skills (EE) | Total-time-on-postEE | 16.9(6.2)^ | 13.8(5.5)+ | 18.0(5.1)
Transfer | Transfer | 75.0(46.2)+ | 71.4(48.8) | 75.0(46.3)^
Note: + = best, ^ = middle learning gains, * = also conceptual skill
ANOVA Results. The results for the erroneous examples scores follow our hypotheses, although they were mostly not significant (cf. Table 1). The EEWH condition scored highest in almost all scores; for all these scores, EEWOH came second, followed by NOEE. The differences in variances between conditions (cf. Figure 6) were only significant for correcting the error (EE-correct) in the erroneous examples. Nevertheless, we ran an ANOVA for that score, since the group sizes are almost the same across conditions. Although condition showed no significant main effect in the ANOVA, there was a significant difference between EEWH and NOEE for finding the error in the planned contrasts (Helmert) (t(20)=2.14, p<.05, d=1.29, r=.54). Another sizeable difference was between EEWH and NOEE for the total erroneous example score (t(20)=1.95, p=.065, d=1.02, r=.46), which includes correcting the error and answering conceptual questions. These learning gains related to erroneous examples did not transfer to the cognitive skills, where the differences between pretest and posttest are minimal in either direction for all conditions and there was a ceiling effect both in the pretest (M=84.1, SD=19.3) and the posttest (M=84.4, SD=15.8). This was probably due to the high prior knowledge level of the participants.
Fig. 6. Descriptive statistics for 6th-grade
ANCOVA Results. As we did not have access to the term grades of the participants before the experiment, the conditions were not balanced in that respect. Therefore, we analysed the data with the term grade and the pretest score as covariates, to capture the possible influence of previous math and fraction knowledge, respectively, on the learning effects. With this analysis, there is a main effect for erroneous examples in answering conceptual questions (t(20)=2.25, p<.05, d=1.01, r=.45) and in the total erroneous examples score (t(20)=2.34, p<.05, d=1.04, r=.46), when comparing the two erroneous example conditions with the control. The same scores were also significantly higher for EEWH vs. NOEE (conceptual questions: t(20)=2.48, p<.05, d=1.11, r=.49; erroneous examples: t(20)=2.96, p<.05, d=1.32, r=.55). Additionally, the difference for finding the error was significantly higher for EEWH vs. NOEE (t(20)=2.37, p<.05, d=1.06, r=.47).
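For readers who want to reproduce this kind of analysis, a minimal sketch follows (not the authors' actual analysis code; the data file and the column names condition, term_grade, pretest and ee_total are hypothetical):

import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

df = pd.read_csv("study1_scores.csv")  # hypothetical data file

# ANCOVA: condition as a between-subjects factor, term grade and
# pretest score as covariates, as described in the text above.
model = ols("ee_total ~ C(condition) + term_grade + pretest", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))  # F-tests for the factor and covariates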
Questionnaires Results. The questionnaires of sixteen participants were evaluated: EEWH=5, EEWOH=5, NOEE=6. For technical reasons, some pre- and post-questionnaire data were lost. Paired-sample t-tests revealed that most self-reports were worse in the post-questionnaires than in the pre-questionnaires (cf. Table 2); however, these results were significant only for two constructs: motivation (t(14)=2.66, p<.05, d=0.92, r=0.42) and error-awareness (t(14)=2.95, p<.05, d=1.05, r=0.47). Exceptions were the self-reports on cognitive load (for EEWOH and NOEE), learning orientation (for EEWOH and NOEE), and self-efficacy (for EEWOH), which were better in the post-questionnaire.
Table 2. Self-reports in pre- and post-questionnaires for 6th-grade

Construct | Pre/Post | EEWH N=5 mean(sd)% | EEWOH N=6 mean(sd)% | NOEE N=5 mean(sd)%
Motivation | Pre | 78.00(10.95) | 83.33(6.24)+ | 80.56(13.24)^
Motivation | Post | 67.33(13.21) | 75.33(7.67)+ | 72.22(5.44)+
Err-awareness | Pre | 76.67(19.00) | 78.33(22.52)+ | 77.78(13.61)^
Err-awareness | Post | 63.33(9.50)+ | 61.67(18.26)^ | 54.17(22.82)
Crit-thinking | Pre | 71.67(12.64)+ | 68.33(12.36) | 70.83(14.67)^
Crit-thinking | Post | 68.33(10.87) | 48.33(25.95)+ | 62.50(21.57)^
Cognitive-load | Pre | 42.22(15.01)+ | 27.78(8.78) | 38.89(17.57)^
Cognitive-load | Post | 47.78(9.30) | 25.56(8.43)^ | 35.19(9.07)+
Learn-orient. | Pre | 73.33(7.57)+ | 65.83(6.18) | 71.53(13.29)^
Learn-orient. | Post | 70.83(13.82)^ | 69.17(12.36) | 72.22(9.00)+
Self-efficacy | Pre | 80.00(12.64)^ | 80.00(12.64) | 83.33(14.91)+
Self-efficacy | Post | 75.00(10.21) | 93.33(10.87)+ | 76.39(17.01)^
Note: + = best, ^ = middle
When comparing the conditions with ANOVA and planned contrasts, the reported cognitive load in the post-questionnaire is significantly better for NOEE than for the two experimental conditions (F(2,13)=7.76, p=.006, η²=0.54). The individual group differences were also significant: EEWH vs. NOEE (t(8)=2.32, p<.05, d=1.29, r=.57) and EEWH vs. EEWOH (t(9)=3.93, p<.05, d=2.18, r=.78). The ANCOVA and planned contrasts with the pretest score and the term grade as covariates also revealed that EEWOH reported significantly more self-efficacy than EEWH (t(9)=3.05, p<.05, d=2.15, r=.73).
2.2 Discussion: 6th-Grade
We found significant differences in the scores for erroneous examples, which show that erroneous examples in general, and the additional help in particular, better supported the metacognitive skills of error detection and error correction. The higher performance in the conceptual questions related to understanding the error also indicates better conceptual understanding for the erroneous examples conditions, and for the help condition in particular. To illustrate this, the erroneous example "Oliver must calculate how much 7/6+5/8 is. His result is 6/7." was followed by the conceptual question "Why cannot Oliver's result be correct?". An example of a good answer in the NOEE condition is "Because the common denominator is not 7 and it cannot be reduced to 7." This is correct, but it does not explain why the denominator cannot be reduced to 7, and therefore does not get to the reasoning necessary for spotting the error. An answer from the EEWH condition is "Because the first summand is greater than his result", which gets to the point of the error recognition, indicating that the sum in Oliver's addition is even smaller than one of the added fractions. Recognising that, which was trained in the erroneous example conditions, is the skill necessary for spotting errors.
The better performance found on metacognitive skills is not in line with the self-reports on self-efficacy. This scale focused on understanding complex fraction problems and basic concepts of fractions. EEWOH reported more self-efficacy in comparison to EEWH, which performed better. Furthermore, we had no evidence that studying erroneous examples had an effect on standard cognitive skills, where the level was very high to begin with. Interestingly, the term grade was not a significant covariate of the cognitive load self-reports. However, our hypothesis that erroneous examples and the additional help would cause less cognitive load does not seem to be supported by the comparison of the conditions' reports on post cognitive load.
3 Study 2: Lab Study 7th and 8th Grade
3.1 Methods
3.1.1 Design
The design in this study was the same as in Study 1.
3.1.2 Participants
Twenty-four paid volunteers in the 7th and 8th-grade participated in the study, eight in each of
the three conditions. They were recruited and assigned to groups in the same way as
participants in Study 1. 7th and 8th-graders are similarly advanced beyond 6th-graders in their
understanding of fractions, according to our expert teachers. They have had more opportunity
to practice, but often retain their misconceptions in fractions. The mean of their term grade in mathematics was again in the upper half of the grading scale, although numerically higher (i.e., somewhat worse) than in the 6th-grade study (M=2.8, SD=1.2; best=1 vs. fail=6). The pretest mean difference was not significant between conditions (F(2,21)=0.23, p=.80, η²=0.02). Consistent with the judgments of the expert teachers, there was no significant difference in the scores of the 7th compared to the 8th grade (t(22)=0.71, p>.05, η²=0.02, d=0.29, r=.14).
3.1.3 Materials
The materials overlapped to a large degree with those of Study 1, but participants in Study 2 also solved word problems, which were not used in the 6th-grade study because such exercises are not typically encountered in German schools at that grade level, and teachers advised us against using them. An example of a word problem is: "Eva invited her friends to her birthday party. They
drank 8 3/7 bottles of apple juice as well as 1 5/6 bottles of lemonade. How many bottles did
they drink all together?”
(The problem entails the usual assumption that apple juice and lemonade bottles have the same volume; there was no evidence that students did not understand this assumption.)
The expected transformation into a mathematical expression in this
exercise is: 8 3/7 + 1 5/6. In total, there were seven sequences of exercises in this study. A
word problem also testing transfer was added to the posttest. By including such fraction
modelling, we aimed to induce and measure conceptual understanding.
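For reference, a full worked solution of the Eva problem (our own sketch; only the transformation 8 3/7 + 1 5/6 is given in the materials):

\[ 8\tfrac{3}{7} + 1\tfrac{5}{6} = \frac{59}{7} + \frac{11}{6} = \frac{354}{42} + \frac{77}{42} = \frac{431}{42} = 10\tfrac{11}{42}. \]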
3.1.4 Results: 7th-8th-Grade Lab Study
As a whole, the results do not support our hypotheses for the 7th and 8th-grade (cf. Table 3),
although differences in scores are small and not significant. NOEE scored better in almost all
scores, apart from the conceptual questions, where EEWOH did best. EEWOH was also
second best in finding the error and in the total erroneous examples score. EEWH came
second in the cognitive skills, correcting the error, transfer exercises, and modelling. The
standard deviation for all scores except for improvement on cognitive skills was highest for
EEWH (cf. Figure 8).
ANOVA Results. Since the group size is the same across conditions, the results of the ANOVA
can be considered robust although Levene’s test was significant for finding the error (p=.018),
conceptual questions (p<.001) and for the total score on erroneous examples (p<.001). The
only statistically significant score in the ANOVA test was the time spent on the posttest
erroneous examples (F(2,21)=5.59, p=.011, η²=.35), where NOEE spent significantly more
time than the erroneous-examples conditions together (t(22)=2.88, p<.05, d=1.23, r=.52) and
EEWH alone (t(22)=3.04, p<.05, d=1.63, r=.63).
ANCOVA Results. The ANCOVA with the term grade and the pretest score as covariates showed that only the term grade is a significant covariate for answering conceptual questions (F(1,21)=4.49, p=.047, η²=.18); it also has a marginal covariate effect on the total erroneous examples score (F(1,21)=4.03, p=.059, η²=.17). In both cases, considering the covariate effect decreases the difference between the control and the erroneous example conditions, which originally scored worse. There is also a significant effect of condition on the time spent on erroneous examples (F(2,21)=5.28, p=.014, η²=.59) when the term grade is considered as a covariate.
Other Results. An important result in this study is the significant difference in the scores for finding and correcting the error (t(23)=4.89, p<.001, d=0.59, r=.28). The standard deviation for the two metacognitive competencies is comparable, but the mean for correcting is more than 0.5 point lower than for finding the error (M=3.12, SD=.95 for finding; M=2.54, SD=.99 for correcting), which means that a significant number of participants were able to find the error but not to correct it. This also holds when comparing separate conditions: the difference between finding and correcting the error is significant for EEWOH and for NOEE (EEWOH: t(7)=4.33, p<.05, d=1.15, r=.49; NOEE: t(7)=4.32, p<.05, d=1.44, r=.58), but not for EEWH (t(7)=2.19, p>.05, d=1.64, r=.63). The same phenomenon occurred even
with students who could solve exercises. Most students could add fractions with unlike denominators, but could not correct related errors. For example, they could solve the addition 1/6 + 3/8 = 4/24 + 9/24 = (4+9)/24 = 13/24 correctly, and in the erroneous example "Oliver" (Step 1: 7/6 + 5/8, Step 2: (7+5)/(6+8), Step 3: 12/14, Step 4: 6/7) they identified Step 2 as wrong; but when asked to correct it, they often forgot to expand the numerators after calculating the common denominator, probably because they concentrated on expanding the denominators. The MCQs for the conceptual questions after spotting the error are shown in Figure 7, with the correct answers marked.
In other words, not finding the least common multiple was accepted as the first occurring problem, without noticing that the numerators also had to be expanded. This gave the following erroneous correction: 7/6+5/8=12/24.
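In worked form, the typical partial correction contrasts with the full correction as follows (our sketch of the two calculations described above):

\[ \text{partial: } \frac{7}{6} + \frac{5}{8} = \frac{12}{24} \;\text{(denominators expanded, numerators not)}, \qquad \text{full: } \frac{28}{24} + \frac{15}{24} = \frac{43}{24}. \]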
Table 3. Descriptive Statistics: Lab Study 7th-8th-Grade

Score | Subscore | EEWH N=8 mean(sd)% | EEWOH N=8 mean(sd)% | NOEE N=8 mean(sd)%
Cognitive Skills | Pretest | 73.7(26.7) | 71.2(19.7) | 77.9(12.4)
Cognitive Skills | Post-pre-diff | 2.4(24.4)^ | -4.3(26.6) | 6.9(17.9)+
Metacognitive Skills (EE) | EE-find | 68.7(34.7) | 75.0(13.4)^ | 90.6(12.9)+
Metacognitive Skills (EE) | EE-correct | 57.8(26.7)^ | 54.7(21.1) | 65.6(20.8)+
Metacognitive Skills (EE) | EE-ConQuest* | 55.2(46.5) | 62.5(12.6)+ | 61.5(19.4)^
Metacognitive Skills (EE) | EE-total | 59.3(37.1) | 63.7(11.9)^ | 69.8(15.0)+
Metacognitive Skills (EE) | Total-time-on-postEE | 8.1(4.3)+ | 11.5(4.2)^ | 15.5(4.8)
Transfer | Transfer | 45.2(45.8)^ | 38.0(36.0) | 67.3(28.5)+
Conc. Underst. | Modelling | 36.4(42.2)^ | 19.8(35.0) | 40.8(48.6)+
Note: + = best, ^ = middle learning gains, * = also conceptual skill
What did Oliver do wrong in the step?
1. All steps are actually correct.
2. He added numerator with numerator and denominator with denominator. (Correct)
3. He simplified wrongly.
4. His common denominator is wrong.
5. I don't know.

Why did Oliver make this error? What did he not understand about fractions?
1. He actually understood everything.
2. That he cannot add whole numbers directly with fractions.
3. That he has to expand the numerators because he now has one denominator.
4. That he has to find the least common multiple of 6 and 8, because he cannot make 8ths or 6ths out of 14ths (6+8=14). (Correct)
5. I don't know.

Fig. 7. MCQs for the posttest erroneous example "Oliver".
Fig. 8. Descriptive Statistics 7th-8th-Grade with standard deviations
Questionnaires Results. The questionnaires of fifteen participants were evaluated: EEWH=6, EEWOH=6, NOEE=3. Unfortunately, some data was lost for technical reasons, which led to a very small N in the NOEE condition; therefore, the results reported can only be considered indicative. As in the 6th-grade study, most self-reports were worse in the post-questionnaires than in the pre-questionnaires (cf. Table 4), as measured by paired-sample t-tests. However, none of these differences were significant. Self-reports that improved in the post-questionnaire include the ones on cognitive load (for NOEE), on learning orientation (for EEWOH and NOEE), and on self-efficacy (for EEWOH and NOEE).
There were some significant differences when comparing self-reports between conditions. The reports on self-efficacy were significantly better for NOEE vs. EEWH (t(7)=2.69, p<.05, d=2.03, r=.71). In ANCOVA contrasts with the pretest and term grade as covariates, the reported cognitive load also became significantly better for NOEE vs. EEWOH (t(13)=2.52, p<.05, d=1.9, r=.69).
Table 4. Self-reports in pre- and post-questionnaires for 7th-8th-grade

Construct | Pre/Post | EEWH N=6 mean(sd)% | EEWOH N=6 mean(sd)% | NOEE N=3 mean(sd)%
Motivation | Pre | 69.45(13.07) | 72.78(12.72)^ | 84.45(8.39)+
Motivation | Post | 58.33(6.12) | 63.33(26.25)^ | 83.33(17.64)+
Err-awareness | Pre | 68.06(18.57) | 73.61(19.31)^ | 80.56(12.73)+
Err-awareness | Post | 52.78(14.59) | 62.50(25.14)^ | 69.45(26.79)+
Crit-thinking | Pre | 70.83(15.59) | 70.83(13.69)^ | 75.00(16.67)+
Crit-thinking | Post | 62.50(21.57)+ | 59.72(20.69)^ | 58.33(8.33)
Cognitive-load | Pre | 48.15(25.50)+ | 45.37(14.24)^ | 42.59(22.45)
Cognitive-load | Post | 53.70(14.34)^ | 58.33(29.76)+ | 22.22(5.56)
Learn-orient. | Pre | 65.97(12.48) | 66.67(7.45)^ | 72.22(2.41)+
Learn-orient. | Post | 61.11(7.76) | 68.06(19.84)^ | 87.50(18.16)+
Self-efficacy | Pre | 73.61(12.27)^ | 76.39(16.17)+ | 69.45(9.62)
Self-efficacy | Post | 68.06(22.00) | 80.56(13.61)^ | 94.45(9.62)+
Note: + = best, ^ = middle
3.2 Discussion: 7th-8th-Grade
One explanation for the fact that the erroneous examples conditions, and especially the EEWH condition, did not perform better in the metacognitive skills tested through erroneous examples is the little time students spent on erroneous examples in the posttest. Moreover, the long session might have overloaded the students, especially the ones in the EEWH condition, whose sessions lasted longest (over two and a half hours) because of the help provided. The possible resulting fatigue might be the reason why they did not spend more time on erroneous examples in the posttest. The self-reports on cognitive load are consistent with this hypothesis. Moreover, the high self-reports of NOEE on self-efficacy, especially in comparison to EEWH, might also mean that NOEE was more motivated in the posttest.
A plausible interpretation for the fact that the term grade is a significant covariate for answering conceptual questions, but not for cognitive skills, is that a higher level of prior math knowledge is required to process new conceptual knowledge. This high-level knowledge is not necessary for dealing with trained (almost automated) cognitive skills, which can be mastered by using well-practiced solution steps (algorithmically). The difference between finding and correcting the error may mean that although students know the correct rules for performing operations on fractions and can recognise errors that violate these rules, they still have knowledge gaps that surface when asked to correct the error. A simpler explanation, that it is easier to find (recognise) an error than to correct it, is plausible, but it does not elucidate the reasons behind this difference. Moreover, the inability to correct erroneous steps (for example, not expanding the numerators when adding unlike fractions), which we observed with the same students who could otherwise solve standard exercises with unlike fractions, reveals that students do not understand the principle behind expanding numerators. Rather, they expand numerators automatically (algorithmically) and easily forget to do so when they do not set their algorithmic procedure in motion from the beginning. One can think of this phenomenon as analogous to being able to recite a whole poem when the first line is provided, but not without it.
4 Study 3: Classroom Study 9th-10th-Grade
To test the use of erroneous examples outside the lab, we conducted classroom studies. Apart from the general ecological validity, this decision was also motivated by an attempt to avoid another ceiling effect, which is unlikely to occur in standard mixed-level classes. We previously reported results from our classroom studies for this level (Tsovaltzi, Melis, McLaren, Meyer, Dietrich & Goguadze, 2010), which were not reliable due to a combination of large variances and the unequal group sizes that resulted from participant dropout. In order to raise the reliability of our results, we collected additional data. Moreover, the additional data come from a different school, making the sample more representative. The results reported here and the corresponding discussion refer to a new analysis with the additional data. Moreover, we report and discuss the questionnaire analyses, which were not included in the past report.
4.1 Methods
4.1.1 Design
The design was similar to that of Study 1 and Study 2. Differences include that the students were not strictly volunteers: they agreed to take part in the studies in coordination with their mathematics teacher, and their parents signed a consent form. They did not receive payment.
Participants were informed that the study was not going to be assessed as part of their course-
work. Another important difference is that in this study we were able to run the experiments
on two different days, which was not possible in the lab studies. We were thus hoping to
reduce the possibility of fatigue. This difference adds to the ecological validity of the results,
in terms of the time students spent working with mathematics. Each session lasted two
classroom hours with standard school breaks. The sessions took place in the computer labs of
the schools, where students often work as part of their mathematics course.
4.1.2 Participants
Seventy-seven students in the 9th and 10th-grade participated in the study. Fifty-seven students completed the study successfully; fourteen did not attend school on the second day of the experiment, and six either did not complete the intervention or entered values that indicated non-attempts on more than 50% of the exercises (for instance, only "1" and "2" instead of fractions) and were screened out. These classroom studies tested students from two different
schools, one urban and one suburban, of yet a higher level (9th and 10th-grade). Our expert
teachers advised that students of these levels typically still exhibit common fractions
misconceptions. Moreover, 9th and 10th-graders have, on average, higher math knowledge.
Since we found that the level of math knowledge has a covariating effect on conceptual
understanding, we wanted to test if erroneous examples would have a stronger effect with
these higher grade students.
Participants were semi-randomly distributed to conditions, but the conditions were
balanced so that the mean term-grade was about the same in each condition. The final
distribution to conditions of the participants who completed all sessions was as follows:
EEWH=18, EEWOH=20, NOEE=19. The difference in the pretest was not significant either between the 9th and 10th grade (F(2,54)=3.03, p=.057, η²=.33) or between conditions (F(2,54)=1.24, p=.29, η²=.053).
4.1.3 Materials
Fig. 9. Interactive erroneous example on the concept "part of a whole" with Error-Awareness and Error-Detection (EAD) feedback. Problem statement: "Jan rides his bike for 1/6 of the path to school, then goes by tram for 4/5 of the path, and finally walks the rest of the path. He wants to know what fraction of the path he walks. He calculates: Step 1: walking distance = path − 1/6 of path − 4/5 of path; Step 2: …" EAD feedback: "The result, walking distance = 5 1/30, cannot be correct. The tram ride is already 4/5 of the total distance, so the walking distance must be less than 1/5."
Taking into account teachers’ emphasis on fractions misconceptions as the common
problem at this level we shifted from the traditional school fraction curriculum and included
more conceptual exercises to address the basic principles of fractions, and common
misconceptions. For instance, the exercises used the principles of “addition as increasing”,
“subtraction as decreasing”, and “part of a whole” (Malle, 2004). In effect, we reorganised our
sequences to reflect this shift. Capturing this structure in the presentation of the sequences
(although it was not explicitly indicated) was intended to raise awareness of these underlying
principles. We added one sequence to train the basic concept "part of a whole", in order to
explicitly include conceptual errors on top of the rule-application errors, which had been the
focus of the previous lab studies. In total, there were seven sequences. Figure 9 displays a
task that trained the concept "part of a whole". The EAD feedback for this task is at the
bottom of Figure 9.
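For reference, the correct computation behind the Figure 9 task (our reconstruction from the task statement; the figure itself shows only the erroneous steps):

\[ \text{walking distance} = 1 - \tfrac{1}{6} - \tfrac{4}{5} = \tfrac{30}{30} - \tfrac{5}{30} - \tfrac{24}{30} = \tfrac{1}{30}, \]

which is indeed less than 1/5, as the EAD feedback points out.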
Moreover, we changed the order of presentation of the erroneous examples in the
intervention; a sequence here consisted of standard exercise → erroneous examples → standard
exercise, to test whether allowing students to train a bit after the erroneous examples would
make a difference in learning outcomes. Furthermore, we adjusted the pretest and posttest
exercises to test these concepts by adding word problems on them, and we also added two
transfer exercises: one for fraction subtraction and one for the basic concept "relative part
of" (Malle, 2004).
Two more new exercises asked students to transform a fraction operation represented by
pizzas into a numerical fraction representation. For example, the task in Figure 10 had to be
represented as 3/5+1/4. This type of exercise is commonly used at school and was meant to give
us a better assessment of the students' standard fraction competencies.
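The exercise only required the transformation into 3/5 + 1/4; for reference, the sum evaluates over the common denominator 20 as

\[ \tfrac{3}{5} + \tfrac{1}{4} = \tfrac{12}{20} + \tfrac{5}{20} = \tfrac{17}{20}. \]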
5.1.4 Results: Classroom Study 9th- and 10th-Grade
Table 5. Descriptive statistics, classroom studies, 9th and 10th grade (values are mean(sd)%)

Type of score        | Type of subscore                 | EEWH N=18    | EEWOH N=20   | NOEE N=19
---------------------|----------------------------------|--------------|--------------|-------------
Time-on-task         | Total-interv-duration            | 32.5(8.8)    | 26.4(6.9)    | 21.7(6.2)
                     | EE-or-equiv-duration             | 16.2(4.5)    | 10.6(3.8)    | 6.0(2.4)
Cognitive skills     | Pretest                          | 74.5(14.2)   | 66.4(21.1)   | 64.9(17.2)
                     | Transform                        | 16.2(23.0)+  | 4.9(33.2)^   | -10.2(45.4)
                     | Diff-post-pre-total              | 8.9(12.8)+   | 1.4(23.5)    | 4.9(18.8)^
Metacognitive        | EE-find                          | 61.1(28.7)+  | 50.0(28.1)   | 60.5(28.0)^
skills (EE)          | EE-correct                       | 40.3(28.0)+  | 21.3(30.6)   | 30.3(33.9)^
                     | EE-ConQuest*                     | 50.9(20.7)+  | 50.4(24.9)^  | 47.8(25.1)
                     | EE-total                         | 50.8(22.1)+  | 44.5(24.0)   | 46.8(24.7)^
                     | Total-time-on-EE                 | 5.9(3.2)+    | 4.1(3.1)     | 5.9(3.9)+
Transfer             | Add-subtr-total (cog. transfer)  | 32.0(30.1)+  | 20.0(34.3)   | 29.0(34.6)^
                     | Conc-transf-total*               | 46.8(34.7)+  | 30.4(29.3)^  | 29.5(30.3)
                     | Transfer-total                   | 39.4(20.3)+  | 25.2(25.8)   | 29.2(26.8)^
Conceptual           | Part-of-whole                    | 11.1(47.3)+  | -5.0(59.4)^  | -9.9(44.6)
understanding        | Addition-as-incr                 | 65.3(44.7)+  | 56.3(48.6)^  | 30.5(46.4)
                     | Subtr-as-decreas                 | 52.9(49.9)+  | 27.5(44.4)   | 34.2(47.3)^
                     | Rel-part-of                      | 22.2(42.8)^  | 7.5(24.5)    | 23.7(42.1)+
                     | Modelling-total                  | 54.5(30.5)+  | 33.1(24.6)   | 35.6(27.4)^

Note: + = best, ^ = middle learning gains, * = also conceptual skill
Fig. 10. Pizza representation of the fraction problem 3/5 + 1/4.
The results of the classroom studies supported our hypothesis. The participants in the EEWH
condition scored higher on all four scores for learning (cognitive skills: Diff-post-pre-total;
metacognitive skills: EE-total; transfer: Transfer-total; and conceptual understanding:
Modelling-total), and on all subscores except for modelling the concept "relative part of".
NOEE comes second for the four main scores, but this varies for individual subscores. The
variances tend to be high for all variables (cf. Figures 11-14), but they are comparable
between conditions, which allows an analysis of variance, except for transformation (p=.002)
and "relative part of" (p=.003), for which we report contrasts assuming unequal variance.
Fig. 11. Descriptive statistics with standard deviation for cognitive skills (9th-10th-grade)
Fig. 12. Descriptive statistics with standard deviation for metacognitive skills (9th-10th-grade)
Fig. 13. Descriptive statistics with standard deviation for transfer (9th-10th-grade)
Fig. 14. Descriptive statistics with standard deviation for conceptual understanding (9th-10th-grade)
ANOVA Results
The differences in favour of EEWH for time-on-task were significant, both for the total
intervention duration (F(2,54)=10.1, p<.001, η²=.29) and for the time spent on erroneous
examples or equivalent standard exercises, the latter applying to NOEE (F(2,54)=35.45, p<.001,
η²=.57). The biggest non-significant differences were also in favour of EEWH, for conceptual
knowledge (word problems on basic concepts) in total (F(2,54)=3.03, p=.057, η²=.11) and for
modelling the basic concept "addition as increasing" (F(2,54)=2.81, p=.067, η²=.09) (cf. also
Table 5).
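As a reading aid, the reported η² values can be approximately recovered from the corresponding F ratios via the conventional relation

\[ \eta^2 \approx \frac{F \cdot df_{\text{effect}}}{F \cdot df_{\text{effect}} + df_{\text{error}}}; \]

for instance, F(2,54)=35.45 gives (35.45·2)/(35.45·2+54) ≈ .57, as reported. The values in the text were computed directly in the analysis and may deviate slightly from this approximation.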
Moreover, the cognitive skills in the exercises increased more for EEWH, which also had a
lower variance than the other two conditions, although the difference was not significant in
the analysis of variance. EEWH reached a mean of 83.4 (SD=14.1) in the posttest and
surpassed the other two conditions by about 15% (EEWOH: M=67.9, SD=21.1 and NOEE:
M=69.9, SD=17.2), although they had started with a higher pretest (cf. Table 5). This
difference in the posttest was also significant (F(2,56)=3.49, p=.038, η²=.13).
ANOVA Planned Contrasts:
Main Effects. In the ANOVA planned contrasts there were main effects of erroneous examples
for time-on-task (intervention duration: t(53)=4.03, p<.001, d=0.86, r=.40 / EE or equivalent:
t(49.72)=8.45, p<.001, d=2.4, r=.77, unequal variance assumed) and for the transformation
subscore (t(54)=2.09, p<.05, d=0.57, r=.27), but not for cognitive skills in general, as well
as for the subscore "addition as increasing" (t(54)=2.31, p<.05, d=0.63, r=.30), but not for
conceptual understanding as a whole. However, NOEE spent significantly more time on the
standard exercises common to all conditions in comparison to EEWH and EEWOH together
(t(53)=3.22, p<.05, d=0.88, r=.40).
EEWH vs. NOEE. EEWH had more time-on-task (intervention duration: t(23)=4.67, p<.001,
d=1.28, r=.45 / EE or equivalent: t(23)=8.43, p<.001, d=3.46, r=.86, unequal variance
assumed) than NOEE. EEWH was better in transformation (subscore for cognitive skills)
(t(23)=2.24, p<.05, d=0.86, r=.40, unequal variance assumed), in conceptual understanding
(t(23)=2.09, p<.05, d=0.57, r=.27), and in its subscore "addition as increasing" (t(23)=2.27,
p<.05, d=0.62, r=.30).
EEWH vs. EEWOH. EEWH was better than EEWOH in conceptual understanding in general
(t(30)=2.54, p<.05, d=0.69, r=.33) and in transfer (t(30)=2.54, p<.05, d=0.69, r=.33), and
EEWH also had more time-on-task (intervention duration: t(30)=2.54, p<.05, d=0.7, r=.33 / EE
or equivalent: t(30)=4.06, p<.001, d=1.43, r=.58, unequal variance assumed).
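The reported effect sizes for the contrasts follow from t and its degrees of freedom under the standard conversions (a reading aid; equal group sizes are assumed for d):

\[ r = \sqrt{\frac{t^2}{t^2 + df}}, \qquad d = \frac{2t}{\sqrt{df}}. \]

For example, t(54)=2.09 gives r ≈ .27 and d ≈ 0.57, matching the transformation subscore above.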
ANCOVA Results
We tested the possible covarying effect of the pretest score, which captures differences in
prior fraction knowledge. The results show that it has a covarying effect on learning for the
cognitive skills (F(1,54)=12.88, p=.001, η²=.50) and separately for the transformation
subscore (F(1,54)=6.60, p=.013, η²=.34), as well as for the cognitive transfer (F(1,54)=5.16,
p=.027, η²=.30). It also had a covarying effect on the metacognitive scores (total score on
erroneous examples) (F(1,54)=4.36, p=.042, η²=.28), as well as on correcting the error
separately (F(1,54)=5.09, p=.028, η²=.29). Taking these into account, the time-on-task remains
significantly longer for EEWH (F(2,54)=9.64, p<.001, η²=.49), the effect for the
transformation subscore becomes significant (F(2,54)=3.52, p=.037, η²=.34), whereas the effect
for the conceptual knowledge subscore "addition as increasing" is stronger but remains
non-significant (F(2,54)=2.89, p=.064, η²=.32), always in favour of EEWH.
ANCOVA Planned Contrasts:
Main Effects. The ANCOVA planned contrasts showed the same main effects of erroneous
examples as the ANOVA contrasts. More specifically, there are main effects of erroneous
examples for time-on-task (t(54)=3.56, p=.001, d=0.97, r=.43), for the transformation
subscore (t(54)=2.42, p<.05, d=0.66, r=.31), and for the concept "addition as increasing"
(t(54)=2.32, p<.05, d=0.63, r=.30).
EEWH vs. NOEE. Differences between EEWH and NOEE are significant for time-on-task
(t(23)=4.23, p<.001, d=0.78, r=.36), for transformation (t(23)=2.87, p<.05, d=0.97, r=.43),
and for the subscore "addition as increasing" (t(23)=2.35, p<.05, d=0.64, r=.30), but not for
conceptual understanding as a whole (t(23)=1.74, p=.09, d=0.47, r=.25).
EEWH vs. EEWOH. The significant differences between EEWH and EEWOH include the
scores for cognitive skills (t(30)=2.13, p<.05, d=0.58, r=.27) and for conceptual
understanding (t(30)=2.10, p<.05, d=0.58, r=.27).
Other Results. Although we did not find any significant difference between conditions in
metacognitive skills, we again found that significantly more students across conditions could
find the error in the posttest erroneous examples than could correct it (t(56)=8.94, p<.001,
d=0.87, r=.397). This difference was also significant for the individual conditions when
comparing finding vs. correcting the error (EEWH: t(17)=3.83, p<.05, d=0.66, r=.31;
EEWOH: t(19)=5.88, p<.001, d=0.98, r=.44; NOEE: t(18)=5.75, p<.001, d=0.97, r=.44).
However, the effect is less strong for EEWH.
Questionnaire Results. Forty-eight participants completed both the pre- and the post-
questionnaire: EEWH=18, EEWOH=16, NOEE=14. Some students from the EEWOH and the
NOEE conditions chose not to fill in the post-questionnaire. The students who did not fill in
the questionnaires were students who had struggled throughout the experimental sessions, which
probably led to their lack of motivation to fill in the post-questionnaire. This probably makes
the results more conservative for the EEWH condition, whose participants, including the ones
who struggled, all filled in the questionnaires.
In paired-sample t-tests, all self-reports were significantly worse in the post-questionnaire,
apart from cognitive load, which was better, but not significantly so. However, there were no
significant differences between conditions when comparing the drop between pre- and post-
questionnaire.
There were no interesting results in the analysis of variance; however, as expected, the
term-grade had a covarying effect on cognitive load (F(1,45)=8.15, p=.007, η²=0.16), unlike in
the 6th grade. This makes the drop in reported cognitive load significantly larger for EEWH
than for NOEE (t(30)=2.22, p<.05, d=0.24, r=.012), whereas the difference between EEWH and
EEWOH just missed significance (t(28)=2.05, p=.05, d=0.14, r=.07). However, the effect sizes
are small in both cases.
Table 6. Descriptive statistics of questionnaires, 9th and 10th grade (values are mean(sd)%)

Construct       | pre vs. post | EEWH N=18      | EEWOH N=16     | NOEE N=14
----------------|--------------|----------------|----------------|---------------
Motivation      | Pre          | 52.93(14.84)^  | 49.38(15.59)   | 61.43(14.73)+
                | Post         | 35.00(18.26)^  | 29.69(17.37)   | 42.14(18.26)+
Err-awareness   | Pre          | 57.89(34.57)^  | 66.25(32.43)+  | 52.86(24.32)
                | Post         | 37.89(27.40)^  | 26.25(21.56)   | 45.71(35.46)+
Crit-thinking   | Pre          | 50.53(22.23)+  | 45.63(24.21)^  | 39.29(12.69)
                | Post         | 33.16(17.34)^  | 32.50(22.06)   | 42.86(27.01)+
Cognitive-load  | Pre          | 36.49(20.05)+  | 39.58(16.77)   | 38.57(21.59)^
                | Post         | 30.53(16.67)+  | 36.67(20.37)^  | 37.62(19.67)
Learn-orient.   | Pre          | 50.00(14.81)^  | 50.94(11.72)+  | 49.64(12.93)
                | Post         | 42.63(20.51)^  | 31.56(20.79)   | 43.93(17.34)+
Self-efficacy   | Pre          | 71.05(16.29)+  | 61.88(14.71)   | 67.14(18.58)^
                | Post         | 52.63(24.00)^  | 50.00(25.29)   | 57.14(29.20)+

Note: + = best, ^ = middle
Another interesting result is that there is a significant negative correlation between the
reports of self-efficacy in the pre-questionnaire and both the amount of help (r(46)=-.71,
p=.001) and the amount of time spent on erroneous examples (r(46)=-.49, p=.045) during the
intervention. This possibly means that the more students felt able to tackle fractions, the
less help they received and the less time they needed to work through erroneous examples, thus
confirming their self-reports.
With regard to the students' self-reports on motivation, these did not correlate with the
time they spent on the erroneous examples (r(46)=-.21, p=.43). This means that they did not
apply themselves as expected from their self-reports, which is also reflected in the rather low
learning effects. The motivation (b=.13, t(45)=.65, p>.05) and self-efficacy (b=-.08,
t(45)=-.43, p>.05) reported in the posttest were also not good predictors of the time spent on
the posttest.
Additionally, we found that students' self-reports on error-awareness (b=.17, t(45)=1.21,
p>.05) and critical thinking (b=.002, t(45)=.012, p>.05) in the pre-questionnaire were probably
not accurate estimations, as they could not predict performance on the relevant metacognitive
skills in the posttest: finding the error, correcting it, and answering conceptual questions.
5.2 Discussion: 9th and 10th-Grade Classroom Study
The most striking result is that erroneous examples with help had a significant effect
on cognitive skills as compared to erroneous examples without help. This was not the case
in the comparison to no erroneous examples. The reason might be that the NOEE condition,
unlike the erroneous-examples conditions, spent significantly more time on standard exercises
practicing cognitive skills, as evidenced by the ANOVA contrasts (main effect for NOEE for
standard-exercises duration: t(53)=3.22, p<.05, d=0.88, r=.40). Despite that, there were main
effects of erroneous examples on the transformation subscore of cognitive skills. One should
be careful with the interpretation of that finding, as the EEWH condition saw a few pizza
representations as part of some EAD feedback (cf. Figure 4), which bore similarities to the
representations they were later asked to transform in order to make the
calculations. Still, two facts make this finding interesting: first, that EEWOH, who did not
see any such representations, also scored significantly higher in this kind of exercise than
NOEE; second, that there was no significant difference in transformation skills between EEWH
and EEWOH.
Moreover, the main effect on the conceptual understanding subscore "addition as
increasing" shows, at least partially, that the erroneous-example conditions did benefit
from the conceptual focus of the erroneous examples. This focus was even stronger in the
error-detection and error-correction help (see Sections 2.1.3 and 5.1.3), which is also
reflected in the significant differences in conceptual understanding between EEWH and EEWOH,
and the large (though non-significant) difference between EEWH and NOEE.
The effects of erroneous examples, especially in combination with help, become more
interesting if one considers that EEWH also reported a greater reduction in cognitive load in
the post-questionnaire in comparison to the pre-questionnaire. Although the effect size is
small, this is a good indication that, for students of a higher grade, working with erroneous
examples makes it easier to understand and deal with fraction problems, including erroneous
examples. This is not true for erroneous examples without help.
A puzzling result at first sight is the high variances and very low means observed in
modelling the basic concepts tested in this experiment. This is an indication that some
students could understand the principle behind them and had no problem applying them, whereas
others were simply confused. This effect is particularly strong for EEWOH in modelling "part
of a whole", as well as in modelling "relative part of", which was not taught at all during
the intervention but was meant to test transfer from the more general concept "part of a
whole". Both of these concepts seem to have been particularly confusing for EEWOH and NOEE.
The explanation for NOEE seems obvious, namely that they did not receive training with
erroneous examples, which, based on our hypothesis, would have increased their conceptual
understanding. By contrast, the cause of the higher variance and the negative learning effect
in modelling "part of a whole" for EEWOH is not that clear. It may mean that this condition
was confused by being asked to represent the difficult concept "part of a whole" explicitly
and conceptually, as opposed to the standard school algorithmic approach. Since they received
no help, they could not recover from the confusion at all, unlike EEWH, and scored badly both
on this trained concept ("part of a whole") and on the transfer concept ("relative part of").
By contrast, the somewhat higher learning effect of EEWH can be attributed to the extra
help they had in dealing with the new approach to this concept. This resulted in scoring
better on the relevant exercise, as well as in transferring from the concept "part of a whole"
to "relative part of". The high variances in the EEWH condition are an indication that some
students remained confused and did not grasp the underlying concept. Looking at the data,
students who did not solve the exercise correctly often did not make an attempt at the first
step, which supports the view that they did not grasp the underlying concept necessary for the
first modelling step. These might be students who rely on purely procedural/algorithmic
solutions and would need more practice than the one exercise they trained with. Further
supporting evidence for the students' confusion is the fact that many students in the NOEE
condition used the standard algorithmic solution learned at school to solve modelling
problems. For example,
in the posttest they had to calculate the part of the square that is not shaded in Figure 15.
The expected, conceptually adequate answer was 1-7/16, indicating that the students had
understood that they have to find the part of the whole and that the whole is represented by
1. The solution a lot of students in the NOEE condition provided was 16/16-7/16. This solution
is correct and was counted as correct, but it does not make clear that the students have
understood the underlying concept. Similarly, NOEE managed to score better than the
erroneous-example conditions in modelling the untaught "relative part of" (although not
significantly) by simply using the standard algorithmic strategy taught at school.
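Both solution paths yield the same value,

\[ 1 - \tfrac{7}{16} = \tfrac{16}{16} - \tfrac{7}{16} = \tfrac{9}{16}, \]

but only the first form makes explicit that the whole is represented by 1.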
A simple explanation for the lack of the expected transfer between the concepts "part of a
whole" and "relative part of" is that the participants never mastered the taught concept well
enough to be able to transfer from it while, at the same time, their original algorithmic
strategy had been destabilised through the experimental intervention. However, one cannot
exclude the possibility that the theoretically subordinate category "relative part of" is
actually not cognitively subordinate, which is a prerequisite for transfer to occur.
It is intriguing that there were no effects of erroneous examples with regard to
metacognitive skills. Although there is no clear explanation for this, it is possible that
students, and especially the more competent ones, did not spend the necessary time on the
erroneous examples in the posttest, which measured these competencies. The fact that the
students' reports on motivation did not correlate with the time spent on erroneous examples
during the intervention, and the negative correlation between the reports on self-efficacy
and the amount of help received during the intervention, imply that the more competent
students, who could spot the error and directly choose the right explanation, might actually
have needed more help on correcting the error to improve their metacognitive skills.
The students' inability to assess their error-awareness and critical thinking, which did not
predict their performance on finding and correcting the error in the posttest, could be an
indication that erroneous examples in fact fine-tuned their self-assessment. That is, students
who worked with erroneous examples during the intervention were made aware of their lack of
error-awareness and critical thinking, which they then reported in the post-questionnaire. It
is quite interesting that these self-reports in the post-questionnaire are actually closer to
their scores in correcting the error. Especially for NOEE, the students' perception did not
change, as they did not get any feedback on the relevant abilities. This interpretation would
explain the unexpected, although not significant, results for the error-awareness and
critical-thinking constructs (cf. Table 6).
Moreover, the fact that the term-grade has a covarying effect on the cognitive load reported
by the students of the 9th and 10th grades in the questionnaires could mean that the erroneous
examples with help imposed less cognitive load on the students who were more competent in
mathematics. This is in line with work on how automated schemata can explain differences
between novices and experts (Chi, Feltovich & Glaser, 1981; Reimann & Chi, 1989), as well as
with the findings of Grosse and Renkl (Grosse & Renkl, 2007; Renkl, 1997). In fact, it could
be the explanation of why more competent students benefit more from erroneous examples.
Fig. 15. Posttest exercise on the concept "part of a whole".
6 General Discussion and Implications for Cognitive Modelling
In general, we obtained some results that support the use of erroneous examples with
additional help in teaching fractions, and some that reveal different tendencies depending on
the grade level. In the following, we discuss these results thematically based on our
hypotheses, and in every section we also review the influence of grade level. We compare the
different grades even though the two lower grade levels (6th and 7th-8th) were tested in the
lab, because the two differences in setting arguably counter-balance each other. These
differences are the presence of the teacher in the classroom studies, which could add
motivation for the 9th and 10th grades, and the payment received by the 6th, 7th and 8th
grades for their participation in the lab studies, which could also motivate students to work
harder. Other differences, for example in the materials used, are taken into consideration in
the relevant discussion sections. Still, when comparing the results of the 6th, 7th and 8th
grades with those of the 9th and 10th grades, one must keep in mind that the ecological
validity of the former is lower, as they came from lab studies.
6.1 Hypothesis 1
6.1.1 Cognitive Skills (H1a), Conceptual Understanding (H1b), and Transfer (H1c)
In our studies, we found that more advanced students (9th and 10th grade) benefit from
erroneous examples with help in terms of cognitive skills (including standard problem solving)
in general, as opposed to erroneous examples without help, and partially as opposed to no use
of erroneous examples. Although this was not the case for either of the two less advanced
levels that we tested, this might have been an artefact of the very high prior fraction
knowledge of the particular participants (6th, 7th, and 8th grade). For the middle grade level
(7th and 8th grade) in particular, it is possible that the problems they face with fractions
are also more conceptual than procedural, and that they might benefit more from the conceptual
material. Moreover, we had some evidence that deep conceptual understanding is supported by
erroneous examples with additional error-detection and error-correction help. Such evidence
includes the better performance of EEWH over NOEE on the conceptual questions in the 6th
grade, as well as the main effects in modelling "addition as increasing" for EEWH vs. NOEE and
in modelling in general for EEWH vs. NOEE (large but not significant) and EEWH vs. EEWOH
(significant) for the 9th and 10th grades. The higher grades (9th, 10th) are the ones that
received more intervention materials aiming at conceptual understanding. The difference in
conceptual understanding between EEWH and EEWOH for these grade levels might also have
instigated the respective difference in cognitive skills.
Our results do not show a benefit of using erroneous examples, with or without help, for
increasing cognitive skills or conceptual knowledge in the 7th and 8th grades. For these grade
levels, prior knowledge seems to play a crucial role. A reason might be the combination of the
high grade level with the high competence (term grade and pretest scores) of the participants.
Students of the 9th and 10th grades shared the high grade level, but not the level of
competency. They were an average school class, and hence a more representative sample.
The higher transfer scores of EEWH in the 9th and 10th grades are promising, but little
transfer occurred in the 7th and 8th grades, and there were no significant differences in any
of the grades 6, 7, or 8. The transfer scores for 6th-graders are high across conditions,
which is probably the result of the corresponding high metacognitive learning gains observed
at this level. Similarly, the low cognitive and metacognitive gains in the 7th and 8th grades
explain the low transfer scores. The 9th and 10th grades, by contrast, scored rather low
because transfer was also measured on modelling exercises that were far more demanding than
standard fraction exercises. Together, these results probably mean that the conceptual
categorisation of problems inside a sequence, which was done for the 9th and 10th grades, is a
step in the right direction for transfer to occur and for the learning potential of erroneous
examples to unfold. The 6th grade, like the 7th and 8th grades, did not receive concept-related
sequences during the intervention, which might be one reason for the lack of differences in
transfer scores between conditions. A more explicit representation of the concept dealt with
in the sequences that were used for the 9th and 10th grades might be necessary for students to
assign a problem-solving schema to a concept, as suggested by Catrambone and Holyoak (1989),
and to be able to retrieve it later for application. Research on conceptual chunks by
Koedinger and Anderson (1990) points in the same direction for improving transfer skills.
As a whole, our results from the more advanced 9th and 10th grades show clear indications
that fostering conceptual understanding through the use of erroneous examples with additional
help can produce significant learning effects for conceptual knowledge, but also for cognitive
skills. Moreover, although standard cognitive skills are also fostered through extensive
practice with standard exercises, such practice does not suffice to improve all kinds of
cognitive skills, or conceptual knowledge. In our results this is especially true for the
well-practiced fraction addition, where students learned or reminded themselves of the
algorithmic steps, but could not improve significantly either in transformation skills, which
also addressed fraction addition, or in conceptual understanding of fraction addition. We
consider the results from the 9th and 10th grades particularly important: first, because the
turn to the more conceptual learning material was made in this study; second, because there
was no ceiling effect; and third, because the setting was more ecologically valid.
6.1.2 Metacognitive Competencies (H1d): Error Detection vs. Error Correction
We had evidence that erroneous examples can influence the metacognitive skill of error
detection for lower-grade (6th grade) but highly competent students. There is a possible
twofold explanation for this. First, these students, who have just learned fractions, can
handle the demanding erroneous examples because the cognitive skills and domain knowledge that
erroneous examples presuppose are readily available to them. Second, there is room for
improving their error detection significantly, as they have not yet applied much of what they
have learned, made errors of their own, and thus practiced error detection on their own
errors.
There were no significant differences in students' metacognitive skills for the other grade
levels. Nonetheless, it was interesting to find that students of the higher grade level
(9th-10th grade) often did not judge their ability for critical thinking and error-awareness
correctly. The results indicated that dealing with erroneous examples made their judgement
more accurate.
An interesting mismatch between the competencies of finding and correcting the error across
conditions is evident in our results. This mismatch persisted in all our studies, independent
of student level or material design, and it was significant in our studies with the two groups
of higher-grade students (7th-8th and 9th-10th grades). Ohlsson (1996) has described this
phenomenon as a dissociation between declarative and practical knowledge. Here, declarative
knowledge means rule definitions, which relates to recognising the violation of rules and
hence spotting the resulting errors. Practical knowledge means rule applications, which
relates to applying the correct rule after spotting the error in order to correct it. It is
intriguing that in our classroom studies with 9th and 10th-graders, students' cognitive skills
did improve through erroneous examples, despite the fact that their ability to find errors
developed significantly more than their ability to correct them. This might show that the
competence of correcting typical errors is not necessary for monitoring, correcting, or
avoiding one's own errors. That is consistent with Ohlsson's (1996) argument that when the
competency for finding errors is active, it functions as a self-correction mechanism that,
given enough learning opportunities, can lead to a reduction of performance errors. However,
it is a new finding in comparison to previous research on erroneous examples, which has not
differentiated between the competencies of finding and correcting errors.
6.2 Hypothesis 2: Erroneous Examples with or without Help
The choice between help and no help pertains to the microadaptation of erroneous examples.
Although we found some main effects of erroneous examples for the less advanced 6th grade
(metacognitive skills) and the more advanced 9th and 10th grades (conceptual understanding),
most effects were for erroneous examples with help. This is consistent with the results of
Kopp and colleagues (2008) in the medical domain in terms of the benefit of erroneous examples
with help, although the domains differ a lot and a comparison is therefore tenuous.
We also found that the use of erroneous examples without help might be worse than no use of
erroneous examples for conceptual and transfer skills, which is not reliably true for
metacognitive skills. As a whole, the inconsistent performance observed in the classroom study
with regard to modelling might mean that there was a conflict between the standard procedural
way in which teachers normally teach fractions at school and the conceptual way in which our
erroneous examples deal with fractions. This effect might be stronger for EEWOH, who were left
confused due to the lack of guidance. However, more familiarity with erroneous examples and
the conceptual strategy might counter-balance this confusion, especially when combined with
the provision of help. Siegler (2002) suggested that requests for explanation of correct and
incorrect strategies lead to a period of "cognitive ferment" (p. 51) following cognitive
conflict, and only later do they cause the development of correct strategies and the ability
to self-explain. He attributes this delay to a state of increased uncertainty and variability.
For medium-advanced students (7th and 8th grade), no difference was found between erroneous
examples with and without help.
In general, to continue Ohlsson's (1996) argument, it seems that erroneous examples with
error-detection and error-correction help, which specifically train finding errors and
explaining them, might offer the required learning opportunities without the need to develop
error-correction skills, which we observed only to a moderate degree in our data. The help we
provided assisted students in explaining errors conceptually, but also in understanding the
practical/procedural implications of these conceptual explanations in terms of problem
solving. The contribution of such help is also in line with the theoretical work of van Gog
and her colleagues (Van Gog, Paas & van Merriënboer, 2004), who have advocated its use in the
context of worked examples as a way of promoting conceptual understanding.
6.3 Hypothesis 3: Grade Level
We have already discussed differences in grade level in the previous sections. In summary,
we have found more support for the use of erroneous examples as an instructional method for
the more advanced students of the 9th and 10th grades, who had taken fraction courses in
previous years.
For students just learning fractions, namely 6th-grade students, we found that their
metacognitive abilities were enhanced. These metacognitive gains for erroneous examples with
help did not give rise to enhanced cognitive skills. One could suspect that the cognitive load
might have been too high to allow the passage from metacognitive skills to schema creation and
hence to cognitive skills. In fact, cognitive load was experienced as high by students of this
level independent of their previous mathematical knowledge, as we found no significant
covarying effect of the term grade on the cognitive-load self-reports, contrary to what we
expected. The possibility remains, however, that the existing high level of cognitive skills
(ceiling effect) did not allow learning effects to occur.
We did not find supportive evidence for the use of erroneous examples with students of
medium level (7th and 8th grade). As mentioned above, the reason might be that the materials
used were not appropriate to induce learning at this level.
Moreover, contrary to what we expected due to the use of adaptive help, the grade level
appears to play a role in whether students learn from erroneous examples with help. This can
be an indication that the more conceptual adaptive help triggered germane cognitive load for
students of a higher grade level (and hence higher prior knowledge). For students of a lower
grade level, for whom the material was less conceptual, the adaptive help was not enough to
cause the required germane cognitive load in the form of cognitive conflict. This difference
could have led to the comparatively higher learning gains.
6.4 Supplementary Conjectures
6.4.1 Presentation of Erroneous Examples
Regarding the presentation of erroneous examples, which relates to macroadaptation, we have
at least a first indication that they are more beneficial when presented after the students
have been confronted with standard exercises and are followed again by standard exercises,
since we only found a significant improvement on tasks other than erroneous examples when this
order of presentation was used. A potential explanation is that this order gives students the
opportunity to review the material before working with erroneous examples, which might also
increase the perceived relevance of erroneous examples, as well as to practice what they have
learned after the presentation of the erroneous examples. However, this might be different for
students who are just learning fraction operations, or for students of lower competency and
self-regulation skills, who might need more practice with standard fraction problems before
being confronted with erroneous examples. This could allow them, first, to practice the
problems at all and, second, to become more aware of the difficulties involved, before they
can understand and work with erroneous examples.
6.4.2 Motivation, Cognitive Load, and Learning Orientation
There were no significant differences between conditions for measures of motivation;
moreover, neither motivation nor self-efficacy seems to be a good criterion for whether
students learn from erroneous examples and for whether help is effective.
Self-reports of higher-grade students (9th and 10th grade) show that working with erroneous
examples and additional adaptive help reduces the perceived cognitive load caused by solving
fraction problems together with erroneous examples. This is consistent with our hypothesis.
However, the results were not the same for the other grades. Since we used more conceptual
materials for the 9th and 10th grades and intended to induce germane cognitive load through
the use of conceptual help ("why" and "how" questions), this might be an indication that these
students experienced the required cognitive conflict but were also assisted by the help in
resolving it. By contrast, the materials for the other levels were possibly too easy for
cognitive conflict to occur, so that the additional help was perceived as extraneous cognitive
load.
We did not have any clear indications that learning orientation is fostered through the use
of erroneous examples.
6.5 Open Questions
Two main questions remain open: first, how interactive erroneous examples can be improved
in general; second, if and how medium-advanced students (7th and 8th grade) can be assisted in
learning with erroneous examples so as to profit from them. In the following, we discuss these
questions from different perspectives and suggest possible solutions.
6.5.1 Design of Interactive Erroneous Examples
A practical measure, in terms of the design of interactive erroneous examples, may be to
allow students to explicitly request more help, which would amount to more help on procedural
"how" knowledge. Students are likely to use this extra feature if they feel uncertain about
their answer, thus overcoming a possible shortcoming of our design of interactive erroneous
examples, which assumes that if students can answer the basic "why" and "how" process-oriented
MCQs, they do not need error-detection and error-correction help. Currently, the MCQs
providing such additional error-detection and error-correction help are skipped once the
student has answered the first two MCQs correctly, in an attempt to avoid a possible
"expertise reversal effect" (Kalyuga et al., 2003). Following Kalyuga and his colleagues, we
tried to track the existence of knowledge and avoid providing students with redundant help.
For that reason, we considered answering the top-level self-explanation MCQs as evidence that
the students also possess the knowledge dealt with by the following MCQs. However, this might
be too coarse an indicator of when and how much help is needed.
Moreover, it underestimates the difficulty students have with applying rules (practical
knowledge), as opposed to recognising them (declarative knowledge), and the respective benefit
of explaining detailed "how" questions in combination with "why" questions. Support for this
reasoning is the fact that the students of the 9th and 10th grades who felt able to cope with
fractions, based on their self-reports, and received less help did not score as well as one
would expect. Had they received some additional help on the errors, they might have learned
more.
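A minimal sketch of the current help-gating policy and the proposed refinement, in Python (all names are hypothetical; the actual ActiveMath implementation differs):

    def support_steps(first_two_mcqs_correct, student_requests_help=False):
        """Return the sequence of support steps for one erroneous example."""
        # The top-level self-explanation MCQs are always presented.
        steps = ["why_mcq", "how_mcq"]
        if not first_two_mcqs_correct:
            # Current policy: detection/correction help only after a wrong
            # answer, to avoid an expertise reversal effect (Kalyuga et al.,
            # 2003).
            steps += ["error_detection_help", "error_correction_help"]
        elif student_requests_help:
            # Proposed refinement: competent but uncertain students may still
            # request additional procedural "how" help explicitly.
            steps += ["error_correction_help"]
        return steps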
6.5.2 Materials and Instructional Design
The materials and instructional design might also need modifications. For instance, the
results might be clearer if we enriched our conceptual exercises and tested instead whether
errors that reveal a lack of conceptual understanding are committed. We want to elaborate more
on such conceptual exercises, since the standard fraction exercises practiced at school might
be too simple for process-oriented ("how") help alone to influence students' performance, as
we observed in our studies with the less advanced and medium-advanced students. This is
hypothesised from a theoretical perspective by Ohlsson (1996) and van Gog and colleagues (Van
Gog, Paas & van Merriënboer, 2004) and was empirically tested in the medical domain with
positive results for erroneous examples with help (Stark, Kopp & Fischer, 2011). A good start
would be to try to replicate our results for the advanced students (9th and 10th grade) using
the new, more conceptual materials with the other grade levels, and especially with the 7th
and 8th grades. A more representative sample in terms of prior math and fraction knowledge is
also a prerequisite for this test. Furthermore, replication of the results would help rule out
the possibility that the materials alone, and not the grade level, made the difference in our
results.
6.5.3 Presentation
We plan to test whether the order of presentation really plays a significant role by using
the more conceptual material and varying the order of presentation between different
conditions. Moreover, it could be the case that explicitly making students aware of the basic
concept handled in each sequence would further increase awareness of such concepts and of the
related errors that indicate a lack of awareness of these principles. This might also
contribute to better transfer, as students would be trained in categorising problem types
based on their basic concepts.
7 Conclusions and Implications for Instructional Design
As a whole, our studies reveal good potential for erroneous examples as an instructional
method that can help students in the demanding domain of fractions, although they also show
room for further improvement. The overall finding that working with erroneous examples with
help produces better learning effects than working without help replicates the results of Kopp
and colleagues (Kopp, Stark & Fischer, 2008; Stark, Kopp & Fischer, 2011). The studies also
indicate that previous results on the benefits of self-explaining correct and incorrect
examples by Siegler and colleagues in water displacement and mathematical equality problems
(Siegler, 2002; Siegler & Chen, 2008) and by Grosse and Renkl (2007) in probability problems
are transferable, first, to using interactive erroneous examples alone and, second, to the
fraction domain.
domain. Analogous to the aptitude-treatment effect that Grosse and Renkl (2007) observed
with regard to transfer, and despite our expectation that help might counter-balance such an
effect, we found that the students’ grade level may be important for potential benefit from
erroneous examples in general. However, we did not find transfer effects for erroneous
examples. Overall, the fact that erroneous examples with help caused less cognitive load to
students of higher grade levels who received conceptual materials suggests a potential similar
effect to worked examples (correct solutions), as often discussed in the relevant work (Pass,
1994; Renkl, 1997; Trafton, 1993). Stark, Kopp and Fischer (2011) have looked at cognitive
load as a covariate of learning from erroneous examples. A more detailed investigation of
cognitive load to differentiate between the kinds of cognitive load induced through erroneous
examples with help would be even more interesting in view of the desired cognitive conflict,
which would constitute germane cognitive load in the case of erroneous examples.
The work presented generated interesting research questions that remain to be answered. As
an outcome of this work, first implications for instructional design can be formulated.
In general, erroneous examples are recommended as an instructional method rather for higher
grade levels if the aim is to enhance both cognitive skills and conceptual knowledge. They
should, however, be used with additional help, which should be elaborate when erroneous
examples first start being presented for learning; this is consistent with the findings of
Stark, Kopp and Fischer (2011).
Our current results indicate that erroneous examples should concentrate on finding the
error and explaining it, rather than on correcting it. The competency of correcting common
errors or misconceptions in the domain does not seem to be necessary for avoiding making
errors, and training it has the disadvantage of being time-consuming. This is particularly
important for educational technologies, as concentrating on error detection and explanation
reduces the cost of developing software, including the domain reasoners that are necessary to
provide error-specific feedback, and the feedback modules or authoring tools for designing or
authoring this feedback.
Moreover, erroneous examples seem to be more effective when addressing conceptual knowledge
directly, as compared to only dealing with practical errors commonly committed by students.
This is true even though practical errors are often indications of missing knowledge or
misconceptions. In our next steps, we will test this finding, and the influence of grade
level, further.
Furthermore, when basic concepts are addressed by erroneous examples, care should be taken
that the inconsistencies with the standard algorithmic approaches are addressed and resolved.
The aim of such caution is not just to avoid confusion, but rather to take advantage of the
cognitive conflict induced by the erroneous examples and to reveal the common underlying
principle of both approaches. Specifically, in relation to the cognitive conflict caused by
erroneous examples, the delayed effects of erroneous examples should also be tested, to
replicate effects from previous studies (McLaren et al., 2012; Stark, Kopp & Fischer, 2011).
Self-efficacy seems to be a decisive learner characteristic that influences whether
students learn from erroneous examples or not.
In conclusion, these first directions for instructional design must be further tested and
elaborated. In addition, a cognitive model of how erroneous examples with help advance
learning should be sketched, based on empirical results and relevant theoretical viewpoints.
This will allow the formulation and testing of hypotheses in a coordinated attempt.
Such testing should also involve, for instance, the examination of cognitive processes
through the collection and analysis of think-alouds.
Beyond learning in the classroom, learning from errors in general, and acquiring the
metacognitive skills of detecting and fixing errors, can prove to be a key 21st century
competence, especially in the context of informal learning. For instance, it can be a crucial
supplement to information validation.
8 Acknowledgements
In memory of Erica Melis, who pioneered the research on erroneous examples in the context
of technology-enhanced learning.
This work was supported by the DFG - Deutsche Forschungsgemeinschaft under the ALoE project
ME 1136/7.
9 References
Borasi, R. (1994). Capitalising on errors as "springboards for inquiry": A teaching
experiment. Journal for Research in Mathematics Education, 25(2), 166–208.
Catrambone, R. & Holyoak, K.J. (1989). Overcoming contextual limitations on problem-solving
transfer. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15(6),
1147–1156.
Catrambone, R. (1994). Improving examples to improve transfer to novel problems. Memory &
Cognition, 22, 606–615.
Catrambone, R. (1998). The subgoal learning model: Creating better examples so that students
can solve novel problems. Journal of Experimental Psychology: General, 127(4), 355–376.
Chi, M.T.H., Feltovich, P. & Glaser, R. (1981). Categorization and representation of physics
problems by experts and novices. Cognitive Science, 5, 121–152.
Durkin, K. & Rittle-Johnson, B. (2012). The effectiveness of using correct and incorrect
examples to support learning about decimal magnitude. Learning and Instruction, 22, 206–214.
Durkin, K.L. & Rittle-Johnson, B. (2008). Comparison of incorrect examples in math learning.
Poster presented at the IES annual research conference, Washington, D.C.
Grosse, C.S. & Renkl, A. (2007). Finding and fixing errors in worked examples: Can this foster
learning outcomes? Learning and Instruction, 17(6), 612–634.
Kalyuga, S., Ayres, P., Chandler, P. & Sweller, J. (2003). The expertise reversal effect.
Educational Psychologist, 38(1), 23–31.
Knezek, G. & Christensen, R. (1996). Validating the Computer Attitude Questionnaire (CAQ).
Paper presented at the Annual Meeting of the Southwest Educational Research Association, New
Orleans, LA, January.
Koedinger, K.R. & Anderson, J.R. (1990). Abstract planning and perceptual chunks: Elements of
expertise in geometry. Cognitive Science, 14, 511–550.
Kopp, V., Stark, R. & Fischer, M.R. (2008). Fostering diagnostic knowledge through computer-
supported, case-based worked examples: Effects of erroneous examples and feedback. Medical
Education, 42, 823–829.
Malle, G. (2004). Grundvorstellungen zu Bruchzahlen. Mathematik Lehren, 123, 4–8.
McLaren, B.M., Lim, S.J. & Koedinger, K.R. (2008). When and how often should worked examples
be given to students? New results and a summary of the current state of research. In B.C.
Love, K. McRae & V.M. Sloutsky (Eds.), Proceedings of the 30th Annual Conference of the
Cognitive Science Society, 2176–2181. Cognitive Science Society.
McLaren, B.M., Adams, D., Durkin, K., Goguadze, G., Mayer, R.E., Rittle-Johnson, B.,
Sosnovsky, S., Isotani, S. & Van Velsen, M. (2012). To err is human, to explain and correct is
divine: A study of interactive erroneous examples with middle school math students. In A.
Ravenscroft, S. Lindstaedt, C. Delgado Kloos & D. Hernández-Leo (Eds.), Proceedings of EC-TEL
2012: Seventh European Conference on Technology Enhanced Learning, LNCS 7563, 222–235.
Springer, Berlin.
Melis, E. (2005). Design of erroneous examples for ActiveMath. In C.-K. Looi, G. McCalla, B.
Bredeweg & J. Breuker (Eds.), Artificial Intelligence in Education: Supporting Learning
Through Intelligent and Socially Informed Technology. 12th International Conference (AIED
2005), 125, 451–458. IOS Press.
Melis, E., Goguadze, G., Homik, M., Libbrecht, P., Ullrich, C. & Winterstein, S. (2006).
Semantic-aware components and services in ActiveMath. British Journal of Educational
Technology, Special Issue: Semantic Web for E-learning, 37(3), 405–423.
Müller, A. (2003). Aus eigenen und fremden Fehlern lernen. Praxis der Naturwissenschaften,
52(1), 18–21.
Newton, K.J. (2008). An extensive analysis of preservice teachers' knowledge of fractions.
American Educational Research Journal, 45(4), 1080–1110.
OECD (2001). International report: PISA plus.
Ohlsson, S. (1996). Learning from performance errors. Psychological Review, 103(2), 241–262.
Oser, F. & Hascher, T. (1997). Lernen aus Fehlern - Zur Psychologie des negativen Wissens.
Schriftenreihe zum Projekt: Lernen Menschen aus Fehlern? Zur Entwicklung einer Fehlerkultur in
der Schule. Pädagogisches Institut der Universität Freiburg.
Paas, F.G., Renkl, A. & Sweller, J. (2003). Cognitive load theory and instructional design:
Recent developments. Educational Psychologist, 38(1), 1–4.
Paas, F. (1992). Training strategies for attaining transfer of problem-solving skill in
statistics: A cognitive load approach. Journal of Educational Psychology, 84, 429–434.
Paas, F.G. & van Merriënboer, J.J.G. (1994). Variability of worked examples and transfer of
geometrical problem-solving skills: A cognitive-load approach. Journal of Educational
Psychology, 86(1), 122–133.
Pintrich, P.R., Smith, D.A.F., Garcia, T. & McKeachie, W.J. (1991). A Manual for the Use of
the Motivated Strategies for Learning Questionnaire (MSLQ). Ann Arbor, MI: National Center for
Research to Improve Postsecondary Teaching and Learning, University of Michigan.
Reimann, P. & Chi, M.T.H. (1989). Human expertise. In K.J. Gilhooly (Ed.), Human and Machine
Problem Solving, 161–191. New York: Plenum.
Renkl, A. (1997). Learning from worked-out examples: A study on individual differences.
Cognitive Science, 21, 1–29.
Rittle-Johnson, B. & Wagner Alibali, M. (2001). Conceptual and procedural knowledge of
mathematics: Does one lead to the other? Journal of Educational Psychology, 91(1), 175–189.
Schmidt, R.A. & Bjork, R.A. (1992). New conceptualizations of practice: Common principles in
three paradigms suggest new concepts for training. Psychological Science, 3(4), 207–217.
Seidel, T. & Prenzel, M. (2003). Mit Fehlern umgehen - Zum Lernen motivieren. Praxis der
Naturwissenschaften, 52(1), 30–34.
Siegler, R.S. (2002). Microgenetic studies of self-explanation. In N. Granott & J. Parziale
(Eds.), Microdevelopment: Transition Processes in Development and Learning, 31–58. Cambridge
University Press.
Siegler, R.S. & Chen, Z. (2008). Differentiation and integration: Guiding principles for
analyzing cognitive change. Developmental Science, 11, 433–448.
Skinner, B.F. (1938). The Behavior of Organisms: An Experimental Analysis. Appleton-Century,
New York, US.
Stafylidou, S. & Vosniadou, S. (2004). The development of students' understanding of the
numerical values of fractions. Learning and Instruction, 14, 503–518.
Stark, R., Kopp, V. & Fischer, M.R. (2011). Case-based learning with worked examples in
complex domains: Two experimental studies in undergraduate medical education. Learning and
Instruction, 21, 22–33.
Strecker, C. (1999). Aus Fehlern lernen und verwandte Themen. http://www.blk.mat.uni-
bayreuth.de/material/db/33/fehler.pdf. Retrieved September 20, 2010.
Sweller, J. (1988). Cognitive load during problem solving: Effects on learning. Cognitive
Science, 12, 257–285.
Sweller, J., Van Merriënboer, J.J.G. & Paas, F. (1998). Cognitive architecture and
instructional design. Educational Psychology Review, 10, 251–295.
Sweller, J. & Cooper, G.A. (1985). The use of worked examples as a substitute for problem
solving in learning algebra. Cognition and Instruction, 2, 59–89.
Trafton, J.G. & Reiser, B.J. (1993). The contribution of studying examples and solving
problems. In Proceedings of the Fifteenth Annual Conference of the Cognitive Science Society.
Tsamir, P. & Tirosh, D. (2003). In-service mathematics teachers' views on errors in the
classroom. In International Symposium: Elementary Mathematics Teaching, Prague.
Tsovaltzi, D., Melis, E., McLaren, B.M., Dietrich, M., Goguadze, G. & Meyer, A-K. (2009).
Erroneous examples: A preliminary investigation into learning benefits. In U. Cress, V.
Dimitrova & M. Specht (Eds.), Proceedings of the Fourth EC-TEL 2009, LNCS 5794, 688–693.
Springer-Verlag, Berlin, Heidelberg.
Tsovaltzi, D., Melis, E., McLaren, B.M., Meyer, A-K., Dietrich, M. & Goguadze, G. (2010).
Learning from erroneous examples: When and how do students benefit from them? In M. Wolpers,
P.A. Kirschner, M. Scheffel, S. Lindstaedt & V. Dimitrova (Eds.), Proceedings of the 5th
European Conference on Technology Enhanced Learning (EC-TEL 2010), Sustaining TEL: From
Innovation to Learning and Practice, LNCS 6383, September/October, Barcelona, Spain, 357–373.
Springer-Verlag, Berlin, Heidelberg.
Van Gog, T., Paas, F. & van Merriënboer, J.J.G. (2004). Process-oriented worked examples:
Improving transfer performance through enhanced understanding. Instructional Science, 32,
83–98.
Van Gog, T., Paas, F. & Van Merriënboer, J.J.G. (2006). Effects of process-oriented worked
examples on troubleshooting transfer performance. Learning and Instruction, 16, 154–164.
... In a variety of studies, it became evident that showing errors in worked examples had benefits for learning (e.g., Booth et al., 2013;Durkin & Rittle-Johnson, 2012;Gadgil et al., 2012;Große & Renkl, 2007;McLaren et al., 2015;Tsovaltzi et al., 2013;Wagner et al., 2018). Showing hypothetical errors fostered reflection processes and made learners aware of their errors. ...
Article
Learning from erroneous worked examples could enhance learning in contrast to problem-solving tasks. The type of error was hypothesized to be a moderator and accuracy of error detection and correction a mediator of this effect. This study examines the influence of simple syntactic (the structure of the code) and complex semantic (the logic or content of the code) errors in a programming scenario. Overall, 128 students were assigned to a two (syntactic errors: yes vs. no) × two (semantic errors: yes vs. no) factorial between-subjects design. Students’ accuracy in error detection and correction, learning performance, mental load, and mental effort were measured. Results showed that learners receiving syntactic errors detected and corrected errors with higher accuracy which leads to higher learning performance. Semantic errors did not influence learning-related variables since semantic errors were too difficult for novice learners to detect and fix. The postulated moderation and mediation could be supported.
... However, Patchan et al. (2013) showed that writers with high general verbal ability also received similar feedback from low-ability peers, whereas low-ability writers of initial solutions received more valuable feedback from high-ability peers. Therefore, as a leverage point, it might be recommendable to assign initial solutions of more competent learners to less competent learners and initial solutions of less competent learners to more competent learners (comparable to positive vs. erroneous worked examples; Große & Renkl, 2007;Tsovaltzi et al., 2012). ...
Article
Full-text available
Advancements in artificial intelligence are rapidly increasing. The new‐generation large language models, such as ChatGPT and GPT‐4, bear the potential to transform educational approaches, such as peer‐feedback. To investigate peer‐feedback at the intersection of natural language processing (NLP) and educational research, this paper suggests a cross‐disciplinary framework that aims to facilitate the development of NLP‐based adaptive measures for supporting peer‐feedback processes in digital learning environments. To conceptualize this process, we introduce a peer‐feedback process model, which describes learners' activities and textual products. Further, we introduce a terminological and procedural scheme that facilitates systematically deriving measures to foster the peer‐feedback process and how NLP may enhance the adaptivity of such learning support. Building on prior research on education and NLP, we apply this scheme to all learner activities of the peer‐feedback process model to exemplify a range of NLP‐based adaptive support measures. We also discuss the current challenges and suggest directions for future cross‐disciplinary research on the effectiveness and other dimensions of NLP‐based adaptive support for peer‐feedback. Building on our suggested framework, future research and collaborations at the intersection of education and NLP can innovate peer‐feedback in digital learning environments. Practitioner notes What is already known about this topic There is considerable research in educational science on peer‐feedback processes. Natural language processing facilitates the analysis of students' textual data. There is a lack of systematic orientation regarding which NLP techniques can be applied to which data to effectively support the peer‐feedback process. What this paper adds A comprehensive overview model that describes the relevant activities and products in the peer‐feedback process. A terminological and procedural scheme for designing NLP‐based adaptive support measures. An application of this scheme to the peer‐feedback process results in exemplifying the use cases of how NLP may be employed to support each learner activity during peer‐feedback. Implications for practice and/or policy To boost the effectiveness of their peer‐feedback scenarios, instructors and instructional designers should identify relevant leverage points, corresponding support measures, adaptation targets and automation goals based on theory and empirical findings. Management and IT departments of higher education institutions should strive to provide digital tools based on modern NLP models and integrate them into the respective learning management systems; those tools should help in translating the automation goals requested by their instructors into prediction targets, take relevant data as input and allow for evaluating the predictions.
... By performing elaboration processes, learners acquire knowledge of what to do and what not to do in certain situations (negative knowledge; Oser et al., 2012). A cognitive conflict (Piaget, 1985) is assumed to be a key mechanism of why dysfunctional examples support learning (Melis, 2005;Tsovaltzi et al., 2012;Booth et al., 2013Booth et al., , 2015. During learning, cognitive conflicts occur when learners are confronted with situations in which the pre-knowledge-based expectation of what happens is inconsistent or in contradiction with the actual outcome of the situation; thus, learners are encouraged to resolve the inconsistency, and learning processes are fostered (Maharani and Subanji, 2018). ...
Article
Full-text available
Everyday teaching requires teachers to deal with a variety of pedagogical issues, such as classroom disruptions. Against the background of on-going calls for an evidence-informed practice, teachers should ground their pedagogical decisions not only on subjective theories or experience-based knowledge but also on educational theories and empirical findings. However, research suggests that pre- and in-service teachers rather refer to experiential knowledge than to educational knowledge when addressing practical, pedagogical issues. One reason for the infrequent use of educational knowledge is that acquired knowledge has remained inert and cannot be applied to complex situations in practice. Therefore, implementing learning with contrastive (i.e., functional and dysfunctional) video examples in teacher education seems promising to promote pre-service teachers’ acquisition of educational knowledge. The 2×2-intervention study (N = 220) investigated the effects of the video sequence (dysfunctional-functional/functional-dysfunctional) and of video analysis prompts (with/without) on learning outcomes (concept knowledge, application knowledge) and on learning processes (written video analyses). Results revealed that the sequence dysfunctional-functional led to higher application knowledge in the post-test. There was no sequencing effect on concept knowledge. Prompted groups showed higher concept knowledge and application knowledge in the post-test. Furthermore, both experimental factors affected learning processes, which resulted in higher learning outcomes. In conclusion, learning with contrastive video examples in teacher education seems to be more effective if the video examples are presented in the sequence dysfunctional-functional and if instructional prompts guide the video analysis. The results substantiate the relevance of instructional guidance in learning with video examples and broaden the scope of validity of the concept of learning from errors.
Article
Learning from problem solving, worked examples, and Erroneous Examples (ErrEx) have all proven to be effective learning strategies. However, what kind of learning material should be provided to students with different level of prior knowledge within Intelligent Tutoring Systems (ITSs) is still an open question. Recently, alternating worked examples and problem solving (AEP) has been shown to benefit students compared to problems only or worked examples only in SQL-Tutor (Najar & Mitrovic, 2013). However, how students with different prior knowledge learn from ErrEx in SQL-Tutor is unknown. In this paper, we compared AEP to a new instructional strategy (WPEP) which provides ErrEx in addition to worked examples and problem solving to students. The results show that that both novices and advanced students improved their post-test scores significantly in either condition. Our findings also show that novices acquired significantly more debugging knowledge when erroneous examples were presented (WPEP) in comparison to the AEP condition. Moreover, both novices and advanced students benefitted from ErrEx. In particular, advanced students who studied with erroneous examples showed better performance on problem solving as measured by the number of attempts per problem.
Chapter
The McLearn Lab at Carnegie Mellon University (CMU) first designed and developed the artificial intelligence (AI) in education learning game, Decimal Point, in 2013 and 2014 to support middle school children learning decimals and decimal operations. Over a period of 10 years, the McLearn Lab has run a series of classroom experiments with the game, involving over 1,500 elementary and middle school students. In these studies, we have explored a variety of game-based learning and learning science principles and issues, such as whether the game leads to better learning—demonstrated learning gains from a pretest to a posttest and/or a delayed posttest—than a more traditional online instructional approach; whether giving students more agency leads to more learning and enjoyment; whether students benefit from hints and error messages provided during game play; and what types of prompted self-explanation lead to the best learning and enjoyment outcomes. A fascinating finding also emerged during the variety of experiments we conducted: the game consistently led to a gender effect in which girls learned more from the game than boys. In this chapter I will discuss the current state of digital learning games, how we designed and developed Decimal Point, the technology it is built upon—including AI techniques—and the key results of the various experiments we’ve conducted over the years. I conclude by discussing the important game-based learning take-aways from our studies, what we have learned about using a digital learning game as a research platform for exploring learning science principles and issues; and exciting future directions for this line of research.
Article
Full-text available
The Productive Failure (PF) approach prompts students to attempt to solve a problem prior to instruction – at which point they typically fail. Yet, research on PF shows that students who are involved in problem solving prior to instruction gain more conceptual knowledge from the subsequent instruction compared to students who receive the instruction first. So far, there is no conclusive evidence, however, that the beneficial effects of PF are explained by the attempt to generate one’s own solutions prior to instruction. The literature on example-based learning suggests that observing someone else engaging in problem-solving attempts may be an equally effective means to prepare students for instruction. In an experimental study, we compared a PF condition, in which students were actively engaged in problem solving prior to instruction, to two example conditions, in which students either observed the complete problem-solving-and-failing process of another student engaging in PF or looked at the outcome of this process (i.e., another student’s failed solution attempts). Rather than worked examples of the correct solution procedure, the students observed examples of failed solution attempts. We found that students’ own problem solving was not superior to the two example conditions. In fact, students who observed the complete PF process even outperformed students who engaged in PF themselves. Additional analyses revealed that the students’ prior knowledge moderated this effect: While students who observed the complete PF process were able to take advantage of their prior knowledge to gain more conceptual knowledge from the subsequent instruction, prior knowledge did not affect students’ post-test performance in the PF condition.
Chapter
Full-text available
Immer wieder heißt es, daß der Mensch aus Fehlern nichts lerne. Umgekehrt ist das Sprichwort bekannt, daß man durch Schaden klug werde. Wie steht es um das Lernen aus Fehlern? Unter welchen Bedingungen lernen Menschen im Alltag, lernen Schülerinnen und Schüler aus Falschem? Mit diesen Fragen setzt sich der vorliegende Beitrag auseinander. Bedingungen und Voraussetzungen des Lernens aus Fehlern werden diskutiert, und es sollen Grundsteine für eine Theorie des Fehlerwissen und der Fehlerkultur in der Schule gelegt werden.
Article
Full-text available
This study examined relations between children's conceptual understanding of mathematical equivalence and their procedures for solving equivalence problems (e.g., 3 + 4 + 5 = 3 + 9). Students in 4th and 5th grades completed assessments of their conceptual and procedural knowledge of equivalence, both before and after a brief lesson. The instruction focused either on the concept of equivalence or on a correct procedure for solving equivalence problems. Conceptual instruction led to increased conceptual understanding and to generation and transfer of a correct procedure. Procedural instruction led to increased conceptual understanding and to adoption, but only limited transfer, of the instructed procedure. These findings highlight the causal relations between conceptual and procedural knowledge and suggest that conceptual knowledge may have a greater influence on procedural knowledge than the reverse. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Conference Paper
Full-text available
Erroneous examples are an instructional technique that hold promise to help children learn. In the study reported in this paper, sixth and seventh grade math students were presented with erroneous examples of decimal problems and were asked to explain and correct those examples. The problems were presented as interactive exercises on the Internet, with feedback provided on correctness of the student explanations and corrections. A second (control) group of students were given problems to solve, also with feedback on correctness. With over 100 students per condition, an erroneous example effect was found: students who worked with the interactive erroneous examples did significantly better than the problem solving students on a delayed posttest. While this finding is highly encouraging, our ultimate research question is this: how can erroneous examples be adaptively presented to students, targeted at their most deeply held misconceptions, to best leverage their effectiveness? This paper discusses how the results of the present study will lead us to an adaptive version of the erroneous examples material.
Article
This study examined relations between children's conceptual understanding of mathematical equivalence and their procedures for solving equivalence problems (e.g., 3 + 4 + 5 = 3 + -). Students in 4th and 5th grades completed assessments of their conceptual and procedural knowledge of equivalence, both before and after a brief lesson. The instruction focused either on the concept of equivalence or on a correct procedure for solving equivalence problems. Conceptual instruction led to increased conceptual understanding and to generation and transfer of a correct procedure. Procedural instruction led to increased conceptual understanding and to adoption, but only limited transfer, of the instructed procedure. These findings highlight the causal relations between conceptual and procedural knowledge and suggest that conceptual knowledge may have a greater influence on procedural knowledge than the reverse.
Chapter
The intention of this chapter is to provide the reader with a glimpse of the kind of questions and research that have been investigated on the nature of expertise in problem solving. For more detailed descriptions of the actual research results, the reader is referred to an edited volume on the nature of expertise by Chi, Glaser, and Farr (1988). This chapter is not meant to be an integrated interpretation of problem solving theories. Such a review may be seen in a chapter by VanLehn (in press). Instead, we view this chapter as an updated version of the review of the expertise literature in problem solving as provided in Chi, Glaser, and Rees (1982), and Chi and Glaser (1985). Descriptions of our own research in the context of problem solving are discussed more extensively in Chi, Bassok, Lewis, Reimann, and Glaser (in press), and Chi and Bassok (in press).
Conference Paper
Abstract: This paper is a report on the validation of a new version of the Computer Attitude Questionnaire (CAQ) which is an instrument for measurement of student attitudes toward computers (comfort and learning with), empathy, creativity, and school. This new version, called CAQ N/I, was developed in response to a request for a more brief instrument to be used for a National Science Foundations’ Innovative Technology Experience for Students and Teachers (NSF ITEST) program research project. The CAQ N/I was found to have strong internal consistency reliability, content validity, and criterion-related validity. Cronbach’s Alpha for the instrument scales ranged from .71 to .87 with two samples of data spanning grades 6 to12. The CAQ N/I was judged to be “acceptable” to “very good” for measurement of student attitudes toward learning with computers in middle and high school years.
Article
Although teachers and researchers have long recognized the value of analyzing student errors for diagnosis and remediation, students have not been encouraged to take advantage of errors as learning opportunities in mathematics instruction. The study reported here was designed to explore how secondary school students could be enabled to capitalize on the potential of errors to stimulate and support mathematical inquiry. The article provides a case study of the proposed strategy of "using errors as springboards for inquiry" in action, identifies some important variations within the strategy, and discusses its potential contributions to mathematics instruction.