2021. In Inprasitha, M., Changsri, N., & Boonsena, N. (Eds.), Proceedings of the 44th Conference of the International Group for the Psychology of Mathematics Education, Vol. 3, pp. 431-440. Khon Kaen, Thailand: PME.
THE INTERPLAY BETWEEN DIGITAL AUTOMATIC-
ASSESSMENT AND SELF-ASSESSMENT
Shai Olsher1 and Daniel Thurm2
1University of Haifa, Israel
2University of Duisburg-Essen, Germany
It is well established that digital formative assessment can support student
learning, for example by means of digital automatic-assessment of students' work
in rich digital environments. However, at the same time self-assessment is
regarded as important in order to foster meta-cognitive skills and to
put learners in a key position where they develop responsibility and ownership of
their learning. Yet, little is known about combining automatic- and self-
assessment. In this pioneering research study, we investigate the interplay
between automatic-assessment and self-assessment in the context of Example-
Eliciting-Tasks. Based on quantitative and qualitative data we demonstrate the
potentials of combining self- and automatic-assessment and outline obstacles that
can inform design principles for combining both forms of assessment.
INTRODUCTION & THEORETICAL BACKGROUND
The Interplay between Self-assessment and Automatic Digital Assessment
project (ISAA) is a design-based research project that aims at scrutinizing how
self-assessment and digital automatic-assessment can be combined in
order to support student learning in the context of formative assessment. This paper reports on results from the first design-cycle of the project.
Formative assessment
Formative assessment encompasses "all those activities undertaken by teachers, and/or by their students, which provide information to be used as feedback to modify the teaching and learning activities in which they are engaged" (Black & Wiliam, 1998, p. 8). With respect to formative
assessment, it is well established that feedback is essential, and that the
effectiveness of formative assessment will therefore depend to a great extent on
the nature of the feedback (Hattie & Timperley, 2007). Feedback can range from
simple verification feedback, which merely provides information about whether
or not the student's answer is correct, to more elaborated forms of feedback. For example, attribute isolation elaborated feedback (AIEF) presents information regarding central mathematical attributes of the student solution. The meta-analysis by Van der
Kleij et al. (2015) shows that elaborated forms of feedback are more effective for
higher-order learning outcomes in mathematics. We view feedback not only as a
singular event but rather as a more holistic, bi-directional ongoing process of
interaction between the student and various parts of the activity in which both
sides of the interaction (e.g., student, technology) are affected and modified by
actions of the other side.
Digital formative assessment with example-eliciting-tasks
Technological tools can support formative assessment by providing immediate
and automatic feedback to students. For example, Harel et al. (2019) use
example-eliciting-tasks (EETs), formative assessment tasks that are built on the notion that the examples that students generate are indicative of their mathematical understanding (Zaslavsky & Zodik, 2014). The EET depicted in figure 1, for instance, was
designed by Harel et al. (2019) to support the students in raising conjectures, as
part of a guided inquiry activity. In this EET students investigate the relations
between two non-constant linear functions and their product function in a
multiple linked representation (MLR) interactive diagram. For this, students can
dynamically drag points on the linear functions to create multiple examples and
can display the product function by pressing a button. Then, they are asked to
formulate a conjecture about which types of product functions can be obtained
and to construct three examples that support their conjectures. Subsequently
students receive AIEF that provides feedback on whether certain predefined
characteristics are fulfilled in their examples (see figure 1, characteristics that are
fulfilled are highlighted in yellow). Harel et al. (2019) show that iteratively
working with the EET task and the AIEF can support students in improving their
conjecture.
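To make the idea of automatically checked characteristics concrete, the following minimal Python sketch evaluates a few illustrative characteristics of a submitted example consisting of two non-constant linear functions and their product. The characteristic names, the function signature and the numerical tolerance are our own assumptions for illustration; they are not the actual implementation or characteristic set of the STEP platform.

    def characteristics(m1, b1, m2, b2, eps=1e-9):
        """Check illustrative properties of f(x) = m1*x + b1, g(x) = m2*x + b2 and their product."""
        # product p(x) = (m1*x + b1)(m2*x + b2) = a*x^2 + b*x + c
        a = m1 * m2
        b = m1 * b2 + m2 * b1
        c = b1 * b2
        disc = b * b - 4 * a * c  # the discriminant decides the number of zero points
        return {
            "both factors are non-constant": abs(m1) > eps and abs(m2) > eps,
            "product opens upwards": a > 0,
            "product has exactly one zero point": abs(disc) < eps,
            "product has two different zero points": disc > eps,
            "both linear functions have the same slope": abs(m1 - m2) < eps,
        }

    # e.g. the pair f(x) = -2.01x + 10 and g(x) = -0.4x + 2 discussed in the results section
    print(characteristics(-2.01, 10, -0.4, 2))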
Figure 1: Example of an EET with attribute isolation elaborated feedback
(characteristics that are fulfilled by the examples are highlighted in yellow)
Self-assessment
Besides automatic-assessment, self-assessment has intensively been researched as an essential element of formative assessment (Black & Wiliam, 1998; Cizek, 2010). For example, Cizek (2010) highlights that current usage of the term equally, if not to a greater extent, highlights the notions of student engagement and responsibility for learning, student self-assessment, and self-regulation. This points to approaches to formative assessment in which assessment is not something done to students, but rather something in which they participate and have some element of ownership, with self-assessment being part of an ongoing feedback process. In particular, self-
assessment is regarded as important as it can support students to develop meta-
cognitive skills, which are urgently needed if students are to become self-directed learners in a fast-changing and complex world. Despite this, self-assessment is rarely implemented in classrooms (e.g., Kippers et al., 2018).
RESEARCH QUESTIONS & METHODOLOGY
Research questions
While self- and automatic-assessment both carry much potential to enable rich formative assessment, there is a striking lack of integrated research that investigates the combination of both forms of assessment in order to support student learning. Little is known about how to best integrate both forms of assessment in
a learning environment and which opportunities and obstacles arise from an
amalgam of both forms. This study addresses this research gap by exploring the
interplay between self- and automatic-assessment in the context of an EET
learning environment. In particular the study addresses the following research
questions (RQ):
RQ1: To what extent are students able to self-assess the EET characteristics?
RQ2: Does the self-assessment of the students improve when combining self-
and automatic-assessment with EETs?
RQ3: What potentials and obstacles can be identified with respect to the interplay
of self- and automatic-assessment of EET characteristics?
Methodology
To answer these questions, we used the Seeing the Entire Picture (STEP) platform
(Olsher, Yerushalmy & Chazan, 2016) which is a digital environment that
supports example-eliciting tasks. We combined self- and automatic-assessment
using the EET described before (see figure 1) in the following way:
A) Creating examples and formulating conjectures: Students first created three
examples and elaborated about the types and characteristics of quadratic
functions that can be obtained from multiplying two non-constant linear
functions.
B) Self-assessment: Instead of receiving the automatic-assessment right away,
students were now asked to self-assess whether the characteristics depicted in figure 1 were fulfilled in their examples. The students were provided with a pre-structured paper sheet
where they could mark which characteristics were fulfilled for each of their three
examples.
C) Comparing self-assessment and automatic-assessment: After finishing the
self-assessment students received the automatic-assessment report. The report
indicated which characteristics were fulfilled in the examples that were submitted
by the students (part A), by highlighting them in yellow (see figure 1). Students
were now asked to compare their self-assessment (part B) with the automatic-
assessment.
In order to gain more insight into how students experienced parts A-C, students
subsequently answered various multiple-choice questions about their experience.
For example, one question asked students whether the comparison between self-
and automatic-assessment brought new insights or was surprising. Another
question captured whether students preferred the self-assessment, the automatic-
assessment or a mixture of both. After finishing these questions students entered
a second cycle working through the previously described parts once more.
Figure 2: Self- and automatic-assessment when working with EET
Nine pairs of students from grade 9 of a German upper secondary school were
video-recorded. Data was analysed in two ways. To answer research questions 1
and 2 we captured how often students' self-assessment was correct, and how often
students chose the options "unsure" or "don't understand" (see table 1).
Students' self-assessment was rated correct if a characteristic was fulfilled in the
submitted example and the student realized this in their self-assessment or if a
characteristic was not fulfilled in the submitted example and the student realized
this in their self-assessment (see figure 2). To identify potentials and obstacles
with respect to the interplay between self- and automatic-assessment (research
question 3), we analysed the parts of the video-recordings where students self-assessed their work and where they compared their self-assessment with the
automatic-assessment. Analysis was done in an exploratory manner with a focus
on how students engaged in self-assessment and comparing both types of
assessment.
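The rating of the self-assessments can be summarised in a small sketch. The labels and the toy data below are illustrative assumptions, not the coding tool actually used in the study; the sketch only shows how each self-assessed characteristic falls into one of the categories reported in table 1.

    from collections import Counter

    def rate(self_mark, auto_fulfilled):
        """self_mark: 'fulfilled', 'not fulfilled', 'unsure', 'do not understand' or None;
        auto_fulfilled: True if the characteristic holds in the submitted example."""
        if self_mark is None:
            return "missing"
        if self_mark in ("unsure", "do not understand"):
            return self_mark  # neither correct nor incorrect
        agrees = (self_mark == "fulfilled") == auto_fulfilled
        return "SA correct" if agrees else "SA incorrect"

    # toy data: three (self-assessment, automatic-assessment) pairs
    entries = [("fulfilled", True), ("not fulfilled", True), ("unsure", False)]
    print(Counter(rate(s, a) for s, a in entries))
    # e.g. Counter({'SA correct': 1, 'SA incorrect': 1, 'unsure': 1})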
RESULTS
Quantitative results
Table 1 shows how often the students' self-assessment (SA) was correct across the two cycles. With nine pairs of students, nine characteristics per example and three examples submitted per cycle, there were in total 486 (= 9·9·3·2) characteristics across both cycles that the students had to self-assess.
                   1st cycle   2nd cycle   Difference (%)
                   (n = 243)   (n = 243)
SA correct            150         194         +29.33
SA incorrect           43          31         -27.91
Unsure                 20           6         -70.00
Don't understand       24          12         -50.00
Missing                 6           0        -100.00
Table 1: Students' self-assessment (SA) across the two cycles
With respect to research question 1, it can be seen that in the first cycle roughly
only 61% (150/243) of the self-assessments were correct, indicating that self-
assessment was not an easy endeavor for students. However, the self-assessment improved remarkably (research question 2): the number of correct self-assessments increased to 194 in the second cycle, which corresponds to an increase of roughly 30 percent. These trends will also be statistically
analyzed for significant differences.
Pair            S1   S2   S3   S4   S5   S6   S7   S8   S9
Cycle 1 (C1)    20   20   14   22   17   20    6    6   25
Cycle 2 (C2)    27   21   20   21   24   26   14   16   25
Table 2: Number of correct self-assessments in cycle 1 (C1) and cycle 2 (C2) for each pair of students (S1-S9)
Table 2 shows that almost all pairs of students improved with respect to the
number of correct self-assessments. The increase in correct self-assessments was
not only due to a reduction of incorrect self-assessments: in addition, students reported in fewer cases that they were unsure whether a characteristic was fulfilled or that they did not understand a characteristic.
In line with these results, the students' answers to the questions asked after the completion of each cycle indicate a great appreciation of the combination of self- and automatic-assessment. Eight out of nine
pairs stated that comparing the self- and automatic-assessment brought new
insights or encouraged them to think differently and that they prefer a mix
between self- and automatic-assessment. Interestingly, five groups stated that they
were surprised that there were any differences between their self-assessment and
the automatic-assessment. These findings can be further elaborated using the
qualitative results presented in the following section.
Qualitative results
The analysis of the segments where students self-assessed their work (part B) and
where they compared their self-assessment with the automatic-assessment (part
C) brought to the forefront the following four central aspects related to the
potentials and obstacles of combining self- and automatic-assessment
(research question 3).
High cognitive activation during self-assessment
During the self-assessment (part B) students often discussed whether a certain
characteristic was fulfilled or not. As mentioned before, students often did not self-assess the characteristics correctly (see table 1), which was frequently due to students' limited concept images. For example, one pair of students held the concept image that a quadratic function always opens upwards and therefore concluded that the parabola in their example, which opened downwards, could not be a quadratic function. Another pair of students noted that the two linear functions that they had produced had slightly different slopes; however, they argued that
the slopes are sufficiently similar to say that they are actually the same
(characteristic 4).
Comparing self- and automatic-assessment can lead to new insights
The combination of self- and automatic-assessment (part C) has the potential to
lead to new insights as illustrated by the following example from the first cycle.
Two students had generated the linear functions f(x)=-2.01x+10 and g(x)=-
0.4x+2. In their self-assessment they had marked a characteristic concerning the zero points of the product function as fulfilled. When comparing their self-assessment with the automatic-assessment they realized that the automatic-assessment had marked this characteristic as not fulfilled. They looked at the graph and were surprised because the product function appeared to have only one zero point (see figure 1, second example). Then one student started to investigate the graph, zoomed into the zero point and concluded that they probably had not looked close enough. Hence the students encountered a cognitive conflict and resolved it by
reanalyzing their example.
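A quick check (our own calculation, consistent with the episode above) shows why the two zero points were so easy to miss:

    # zeros of the two linear factors, and hence of the product function
    f_zero = 10 / 2.01   # zero of f(x) = -2.01x + 10  ->  ~4.9751
    g_zero = 2 / 0.4     # zero of g(x) = -0.4x + 2    ->  5.0
    print(f_zero, g_zero, abs(f_zero - g_zero))  # the two zeros differ by only ~0.025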
High cognitive load when comparing self- and automatic-assessment
Even though there were some cases where students investigated conflicts between self- and automatic-assessment (part C), these investigations were quite rare. A
possible reason for this was that before investigating possible differences between
self- and automatic-assessment students had to identify whether their self-
assessment was aligned with the automatic-assessment or not. Figure 2 highlights
the complexity of this evaluation process as students have to distinguish between
four cases. While most students managed to identify their self-assessment as
aligned if a characteristic was fulfilled in self- and automatic-assessment (upper
left square in figure 2), the other cases were considerably more difficult for
students to evaluate and led to high cognitive load just to manage the evaluation
of the self-assessment. This cognitive demand was additionally increased by the fact that the automatic-assessment was displayed on the tablet screen while the self-assessment was done on paper. Students had to constantly move back and
forth between screen and paper which made the comparison between self- and
automatic-assessment quite tedious.
Making graphs and algebraic expressions easily accessible
Another obstacle with respect to comparing self- and automatic-assessment was that the tablet which students used could not
display the automatic-assessment and the submitted examples of the students
(e.g., the graphs) at the same time. Most students scrolled down to get an overview of the yellow highlighted characteristics but did not bother to scroll back and forth between the characteristics and the graphs of their examples.
SUMMARY AND DISCUSSION
The Interplay between Self-assessment and Automatic Digital Assessment project (ISAA) scrutinizes how self-assessment and digital automatic-assessment can be combined in order to support student learning. The results of the first design-cycle show that students' self-assessment did improve remarkably throughout the two cycles (tables 1 and 2). This is particularly striking since this
improvement was not moderated or scaffolded by any teacher intervention.
Rather, one of the likely reasons for the improvements was the fact that students
worked in pairs which allowed interactions and discussions between the group
members. The qualitative analysis revealed how the comparison of differences
between self- and automatic-assessment can create cognitive conflict that can
lead to new insights. However, we also identified several challenges that can
inform the design of learning environments that combine self- and digital
automatic-assessment. First, evaluating the self-assessment with respect to the
automatic-assessment was not easy for students and created high cognitive load, which impeded a deeper engagement with the differences between self- and
automatic-assessment. A possible way to reduce cognitive load would be to
embed the self-assessment into the digital environment, and automatically
highlight differences between self-assessment and automatic-assessment within
the digital environment. This would allow students to immediately investigate the
difference between the two forms of assessment. Another aspect would be to
increase the simultaneous accessibility to all relevant information for example by
presenting the interactive diagram on the same screen as the report without having
to scroll between them.
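As a rough sketch of this design idea (the data structures below are hypothetical and not part of the current STEP environment), a digitally captured self-assessment could be compared against the automatic-assessment so that only the characteristics on which the two disagree are flagged for students to investigate:

    def disagreements(self_assessment, automatic_assessment):
        """Both arguments map characteristic names to True (fulfilled) / False (not fulfilled)."""
        return [name for name, auto in automatic_assessment.items()
                if name in self_assessment and self_assessment[name] != auto]

    # toy example: one characteristic is assessed differently by student and system
    auto = {"product opens upwards": True, "two different zero points": True}
    self_marks = {"product opens upwards": True, "two different zero points": False}
    print(disagreements(self_marks, auto))  # -> ['two different zero points']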
Self- and automatic-assessment both carry tremendous potential to support formative assessment. We have shown that combining self-assessment and automatic-assessment has the potential to enhance students' learning and outlined
important design considerations. However, while we gained many important
insights, the results of this study are somewhat limited by the small number of
students that were investigated. The next design cycle of the ISAA project will comprise a larger group of students and a further development of the technological environment, with the goal of supporting an easier and deeper engagement with the differences between self- and automatic-assessment. Furthermore, we will
investigate whether the combination of self- and automatic-assessment increases
the quality of students' conjectures in the EET.
References
Black, P., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education, 5(1), 7-74.
Bull, J., & McKenna, C. (2004). Blueprint for computer-assisted assessment. London: Routledge-Falmer.
Cizek, G. J. (2010). An introduction to formative assessment: History, characteristics, and challenges. In G. J. Cizek & H. L. Andrade (Eds.), Handbook of formative assessment (pp. 3-17). New York: Routledge.
Harel, R., Olsher, S., & Yerushalmy, M. (2019). Designing online formative assessment. In Pohl, H. Ruchniewicz, F. Schacht & D. Thurm (Eds.), Proceedings of the 14th International Conference on Technology in Mathematics Teaching (pp. 181-188). Essen: University of Duisburg-Essen.
Hattie, J., & Timperley, H. (2007). The power of feedback. Review of Educational Research, 77(1), 81-112.
Kippers, W. B., Wolterinck, C. H., Schildkamp, K., Poortman, C. L., & Visscher, A. J. (2018). Teachers' views on the use of assessment for learning and data-based decision making in classroom practice. Teaching and Teacher Education, 75, 199-213.
Olsher, S., Yerushalmy, M., & Chazan, D. (2016). How might the use of technology in formative assessment support changes in mathematics teaching? For the Learning of Mathematics, 36(3), 11-18.
Van der Kleij, F. M., Feskens, R. C., & Eggen, T. J. (2015). Effects of feedback in a computer-based learning environment on students' learning outcomes: A meta-analysis. Review of Educational Research, 85(4), 475-511.
Zaslavsky, O., & Zodik, I. (2014). Example-generation as indicator and catalyst of mathematical and pedagogical understandings. In Y. Li, E. A. Silver & S. Li (Eds.), Transforming mathematics instruction: Multiple approaches and practices (pp. 525-546). Cham: Springer.