
Learning by evaluating (LbE) through adaptive comparative judgment


Abstract

Traditional efforts around improving assessment often center on the teacher as the evaluator of work rather than the students. These assessment efforts typically focus on measuring learning rather than stimulating, promoting, or producing learning in students. This paper summarizes a study of a large sample of undergraduate students (n = 550) in an entry-level design-thinking course who engaged with Adaptive Comparative Judgment (ACJ), a form of assessment, as a learning mechanism. Following random assignment into control and treatment sections, students engaged in identical activities with the exception of a 20-minute intervention we call learning by evaluating (LbE). Prior to engaging in a Point Of View (POV) creation activity, treatment group students engaged in LbE by viewing pairs of previously-collected POV statements through ACJ; in each case they viewed two POV statements side-by-side and selected the POV statement they believed was better. Following this experience, students created their own POV statements and then the final POV statements, from both the control and treatment students, were collected and evaluated by instructors using ACJ. In addition, qualitative data consisting of student comments, collected during ACJ comparisons, were coded by the researchers to further explore the potential for the students to use class knowledge while engaging in the LbE review of peer work. Both the quantitative and qualitative data sets were analyzed to investigate the impact of the LbE activity. Consistent with other ACJ research findings, significant positive learning gains were found for students who engaged in the intervention. Researchers also noted that these findings did not indicate the actual quality of the assignments, meaning that while students who engaged in the LbE intervention were better than their peers, they were not necessarily “good” at the assignment themselves. Discussion of these findings and areas for further inquiry are presented.
International Journal of Technology and Design Education (2022) 32:1191–1205
https://doi.org/10.1007/s10798-020-09639-1
ScottR.Bartholomew1 · NathanMentzer2· MatthewJones3· DerekSherman4·
SwetaBaniya5
Accepted: 11 November 2020 / Published online: 21 November 2020
© Springer Nature B.V. 2020
Keywords: Adaptive comparative judgment · Assessment · Design-thinking · Learning by evaluating · Peer assessment
* Scott R. Bartholomew
scottbartholomew@byu.edu
... Along with the written justification for the decisions, studies found that ACJ can be implemented as a meaningful assessment and feedback tool, in which students can improve learning and achievement (Bartholomew & Jones, 2022; Bartholomew et al., 2019). In this case, our research team has named the priming process Learning by Evaluating (LbE), which stimulates and promotes meaningful learning for students (Bartholomew, 2021; Bartholomew & Yauney, 2022; Bartholomew et al., 2020). Beginning student efforts through LbE builds foundational understanding and has been shown to produce meaningful learning for students (Bartholomew, 2021; Bartholomew & Yauney, 2022; Bartholomew et al., 2020). ...
... Several recent studies have incorporated adaptive comparative judgment (ACJ) into STEM education settings, especially in project-based design thinking processes (Bartholomew et al., 2018a, 2018b, 2019, 2020; Dewit et al., 2021; Strimel et al., 2021). Bartholomew et al. (2018a, 2018b) examined ACJ to evaluate middle school students' learning through an open-ended problem assigned in a technology and engineering education course. ...
Article
Adaptive comparative judgment (ACJ) has been widely used to evaluate classroom artifacts with reliability and validity. In the ACJ experience we examined, students were provided a pair of images related to backpack design. For each pair, students were required to select which image could help them ideate better. Then, they were prompted to provide a justification for their decision. Data were from 15 high school students taking engineering design courses. The current study investigated how students’ reasoning differed based on selection. Researchers analyzed the comments in two ways: (1) computer-aided quantitative content analysis and (2) qualitative content analysis. In the first analysis, we performed sentiment analysis and word frequency analysis using natural language processing. Based on the findings, we explored how the design thinking process was embedded in student reasoning, and if the reasoning varied depending on the claim. Results from sentiment analysis showed that students tended to express stronger positive sentiment, in shorter comments, when providing reasoning for the selected design. In contrast, when providing reasoning for those items not chosen, results showed a weaker negative sentiment with more detailed reasons. Findings from word frequency analysis showed that students valued the function of design as well as the user perspective, specifically, convenience. Additionally, students took aesthetic features of each design into consideration when identifying the better design in each pair. Within the engineering design thinking context, we found students empathize by identifying themselves as users, define user’s needs, and ideate products from provided examples.
... Further, this assessment becomes especially difficult in the context of collaborative, project-based design thinking assignments which demand a high level of creativity (Mahboub et al., 2004), especially in terms of organizing the content and structure of the rubric (Chapman and Inman, 2009). Bartholomew et al. have also noted that traditional teacher-centric assessment models (e.g., rubrics) are not always effective at facilitating students' learning in a meaningful way (Bartholomew et al., 2020a), and other studies have raised questions about the reliability and validity of rubric-based assessment, such as subjectivity bias of the graders (Hoge and Butcher, 1984), a grader's leniency or severity (Lunz and Stahl, 1990; Lunz et al., 1990; Spooren, 2010), and a halo effect due to the broader knowledge of some students (Wilson and Wright, 1993). ...
... In contrast to rubrics, adaptive comparative judgement (ACJ) has been implemented as an efficient and statistically sound measure to assess the relative quality of each student's work (Bartholomew et al., 2019; Bartholomew et al., 2020a). In ACJ, an individual compares and evaluates pairs of items (e.g., the POV statements) and chooses the better of the two; this process is repeated, with different pairings of items, until a rank order of all items is created (Thurstone, 1927). ...
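The pairwise process described in this excerpt can be illustrated with a small sketch. It is only an approximation under stated assumptions: it fits a Bradley-Terry model with simple iterative updates to a fixed list of judgments, whereas ACJ tools also choose the next pairing adaptively and may use a different estimation method. The function name `rank_from_judgments` and the item labels are hypothetical.

```python
from collections import defaultdict

def rank_from_judgments(judgments, n_iter=200):
    """Return items ordered best-first from a list of (winner, loser) pairs.

    Assumption: a Bradley-Terry model fitted with iterative updates;
    this is not the adaptive pairing used by ACJ software.
    """
    items = sorted({item for pair in judgments for item in pair})
    wins = defaultdict(int)   # comparisons won by each item
    met = defaultdict(int)    # times each unordered pair was compared
    for winner, loser in judgments:
        wins[winner] += 1
        met[frozenset((winner, loser))] += 1

    strength = {item: 1.0 for item in items}
    for _ in range(n_iter):
        updated = {}
        for i in items:
            # Sum over every opponent j that item i was actually compared with.
            denom = sum(
                met[frozenset((i, j))] / (strength[i] + strength[j])
                for j in items
                if j != i and frozenset((i, j)) in met
            )
            updated[i] = wins[i] / denom if denom > 0 else strength[i]
        # Rescale so the strengths keep a fixed total across iterations.
        total = sum(updated.values())
        strength = {i: s * len(items) / total for i, s in updated.items()}
    return sorted(items, key=lambda i: strength[i], reverse=True)

# Hypothetical judgments on three POV statements labelled A, B, and C.
example = [("A", "B"), ("A", "B"), ("B", "A"),
           ("A", "C"), ("A", "C"),
           ("B", "C"), ("B", "C"), ("C", "B")]
print(rank_from_judgments(example))
```

Each judgment only records which item of a pair was preferred; the estimated strengths turn many such choices into a single ordering of all items.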
... To achieve a level of consensus in ACJ, professionally trained judges with collective expertise are often considered ideal; however, studies have also demonstrated that students, with less preparation and/or expertise, can also be proficient judges with levels of reliability and validity similar to professionals (Jones and Alcock, 2014). For example, studies investigating concurrent validity of peer-evaluated ACJ showed that the results generated by peer-evaluated ACJ had a high correlation with the results of experts (e.g., professionally trained instructors, graders) (Jones and Alcock, 2014; Bartholomew et al., 2020a). Jones and Alcock (2014) conducted peer-evaluated ACJ in the field of mathematics to assess conceptual understanding of multivariable calculus. ...
Article
Adaptive comparative judgment (ACJ) is a holistic judgment approach used to evaluate the quality of something (e.g., student work) in which individuals are presented with pairs of work and select the better item from each pair. This approach has demonstrated high levels of reliability with less bias than other approaches, hence providing accurate values in summative and formative assessment in educational settings. Though ACJ itself has demonstrated significantly high reliability levels, relatively few studies have investigated the validity of peer-evaluated ACJ in the context of design thinking. This study explored peer-evaluation, facilitated through ACJ, in terms of construct validity and criterion validity (concurrent validity and predictive validity) in the context of a design thinking course. Using ACJ, undergraduate students (n = 597) who took a design thinking course during Spring 2019 were invited to evaluate design point-of-view (POV) statements written by their peers. As a result of this ACJ exercise, each POV statement attained a specific parameter value, which reflects the quality of the POV statement. In order to examine the construct validity, researchers conducted a content analysis, comparing the contents of the 10 POV statements with the highest scores (parameter values) and the 10 POV statements with the lowest scores (parameter values), as derived from the ACJ session. For the criterion validity, we studied the relationship between peer-evaluated ACJ and graders’ rubric-based grading. To study the concurrent validity, we investigated the correlation between peer-evaluated ACJ parameter values and grades assigned by course instructors for the same POV writing task. Then, predictive validity was studied by exploring if peer-evaluated ACJ of POV statements were predictive of students’ grades on the final project.
Results showed that the contents of the statements with the highest parameter values were of better quality compared to the statements with the lowest parameter values. Therefore, peer-evaluated ACJ showed construct validity. Also, though peer-evaluated ACJ did not show concurrent validity, it did show moderate predictive validity.
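As a rough illustration of the concurrent-validity check described in this abstract, the sketch below correlates ACJ parameter values with instructor grades for the same statements. The abstract does not say which correlation coefficient was used, so the Pearson coefficient here is an assumption, and the function name and all numbers are hypothetical rather than data from the study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values for five POV statements (not data from the study).
acj_parameters = [-1.2, -0.4, 0.1, 0.9, 1.6]
instructor_grades = [70, 85, 72, 88, 80]
print(round(pearson_r(acj_parameters, instructor_grades), 2))
```

A coefficient near zero would correspond to the kind of result reported here, where peer-evaluated ACJ did not show concurrent validity with instructor grades.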
... T. Bartholomew et al. (2019a) divided students in a large undergraduate course focused on design thinking into control and experimental groups. All students were given an assignment to develop point-of-view (POV) statements while designing; these POV statements focus on describing a need, user, and an insightful solution to a problem. ...
... The authors noted no significant difference in student achievement between groups. This finding, which contradicts the findings from other research (e.g., Bartholomew et al. 2019a), highlights the need for additional research into the potential for ACJ, a tool originally designed for assessment, to be used as a tool for learning. How often, when, and with whom ACJ should be used for learning are all questions for additional research. ...
Article
Adaptive Comparative Judgment (ACJ), an approach to the assessment of open-ended problems which utilizes a series of comparisons to produce a standardized score, rank order, and a variety of other statistical measures, has demonstrated high levels of reliability and validity and the potential for application in a wide variety of areas. Further, research into using ACJ, both as a formative and summative assessment tool, has been conducted in multiple contexts across higher education. This systematized review of ACJ research outlines our approach to identifying, classifying, and organizing findings from research with ACJ in higher education settings as well as overarching themes and questions that remain. The intent of this work is to provide readers with an understanding of the current state of the field and several areas of potential further inquiry related to ACJ in higher education.
... Through academic exchanges and interactions, students can broaden their horizons and deepen their knowledge, and focus on measuring learning rather than stimulating, promoting, or producing learning in students. [27] ...
... According to Bandura's social learning theory, learners can gain new skills, strategies and perspectives by observing others. In addition, evaluating the designs of their peers as well as observing them also contributes to their learning (Groenendijk et al. 2013) because playing an active role in the assessment process helps students to gain clear information about the evaluation criteria, thus helping them to perform their tasks better (Bartholomew et al. 2022). Therefore, increasing students' knowledge about evaluating designs has a key role in improving their creative skills (Lai & Hwang 2015). ...
Article
This study aimed to investigate the views of students enrolled in a desktop publishing course on the flipped classroom model, adapted to a design course conducted in an online learning environment. The model was implemented over one semester, and at the end, semi-structured interviews were conducted with 65 volunteer students. Content analysis was used to analyse the students' views. It was determined that delivering course content through instructor-created videos had a positive effect on student views of the course. In addition, the students stated that doing assignments outside the classroom and evaluating them during the course contributed significantly to their learning of design. Finally, student views on the feasibility of conducting the course through traditional design teaching methods in an online learning environment were examined. The students stated that delivering the course in live online classes may have both positive and negative aspects.
... The effect, for example, could have come from the experimental group simply being exposed at the mid-way point to a greater volume of examples (an exposure effect), to having to make judgments on quality or critique peer work (a judgment effect), or as only the experimental group assessed all work at the end, they may have judged in favor of familiar work (a recognition effect). Subsequent work addressed many of these limitations by mitigating the possible recognition effect (Bartholomew et al., 2020a), and the on-going "Learning by Evaluating" project (Bartholomew and Mentzer, 2021) is actively pursuing the qualification of explicit effects which can stem from ACJ, a need commented on further in Theme 2. A related issue comes from Newhouse (2014) where a cohort of judges noted that the digitized work presented in the ACJ session was a poor representation of the actual student work. One assessor commented on how the poor quality of some photographs made it more difficult to see faults which were easier to see in real life. ...
Article
There is a continuing rise in studies examining the impact that adaptive comparative judgment (ACJ) can have on practice in technology education. This appears to stem from ACJ being seen to offer a solution to the difficulties faced in the assessment of designerly activity which is prominent in contemporary technology education internationally. Central research questions to date have focused on whether ACJ was feasible, reliable, and offered broad educational merit. With exploratory evidence indicating this to be the case, there is now a need to progress this research agenda in a more systematic fashion. To support this, a critical review of how ACJ has been used and studied in prior work was conducted. The findings are presented thematically and suggest the existence of internal validity threats in prior research, the need for a theoretical framework and the consideration of falsifiability, and the need to justify and make transparent methodological and analytical procedures. Research questions now of pertinent importance are presented, and it is envisioned that the observations made through this review will support the design of future inquiry.
Article
Adaptive Comparative Judgment (ACJ) is an assessment method that facilitates holistic, flexible judgments of student work in place of more quantitative or rubric-based methods. This method “requires little training, and has proved very popular with assessors and teachers in several subjects, and in several countries” (Pollitt 2012, p. 281). This research explores ACJ as a holistic, flexible, interdisciplinary assessment and research tool in the context of an integrated program that combines Design, English Composition, and Communications courses. All technology students at Polytechnic Institute at Purdue University are required to take each of these three core courses. Considering the interdisciplinary nature of the program’s curriculum, this research first explored whether three judges from differing backgrounds could reach an acceptable level of reliability in assessment using only ACJ, without the prerequisites of similar disciplinary backgrounds or significant assessment experience, and without extensive negotiation or other calibration efforts. After establishing acceptable reliability among interdisciplinary judges, analysis was also conducted to investigate differences in student learning between integrated (i.e., interdisciplinary) and non-integrated learning environments. These results suggest evaluators from various backgrounds can establish acceptable levels of reliability using ACJ as an alternative assessment tool to more traditional measures of student learning. This research also suggests technology students in the integrated/interdisciplinary environment may have demonstrated higher learning gains than their peers and that further research should control for student differences to add confidence to these findings.
Conference Paper
Understanding the best practices of providing, receiving, and improving the formative feedback process in design is critical to improving student creative graphics education. Situated in a university-level computer graphics course this research studied the impacts on student performance of students engaged in adaptive comparative judgment (ACJ), as a formative learning and assessment tool, during several open-ended design problems. Students participated in ACJ, acting as judges of peer work and providing and receiving feedback to, and from, their peers. This paper will examine the relationships between using ACJ and student achievement and will specifically visit the implications of situating ACJ in the midst of an open-ended graphic design project. Further, this paper will explore the potential of using ACJ as a formative assessment and feedback tool.
Article
While research into the effectiveness of open-ended problems has made strides in recent years, less has been done around the assessment of these problems. The large number of potentially-correct answers makes this assessment difficult. Adaptive Comparative Judgment (ACJ), an approach based on assessors/judges working through a series of paired comparisons and selecting the better of two items, has demonstrated high levels of reliability and effectiveness with these problems. Research into using ACJ, both formative and summative, has been conducted at all grade levels within K-16 education (ages 5-18), with a myriad of findings. This paper outlines a systematic review process used to identify articles and synthesizes the findings from the included research around ACJ in K-16 education settings. The intent of this systematic review is to inform decision-makers weighing the potential for ACJ integration in educational settings with research-based findings around ACJ in K-16 educational settings. Further, this review will also uncover potential areas for future researchers to investigate further into ACJ and its implications in educational settings.
Article
Since our 1998 review of research on classroom assessment and learning was published, we have contributed to theorising formative assessment, but recognise that this work is incomplete. In this paper, we take up a suggestion by Perrenoud that any theory of formative assessment must be embedded within a wider theoretical field, specifically, within a theory of pedagogy. We propose a model whereby the design of educational activities and associated assessments is influenced by the theories of pedagogy, instruction and learning, and by the subject discipline, together with the wider context of education. We explore how teachers may develop productive relationships between the formative and summative functions of classroom assessment, so that their judgements may inform the formal external assessment of students, thus increasing the validity of those assessments. We also show how the model informs the development of theories that give appropriate weight to the role of assessment as part of pedagogy.
Article
While design-based pedagogies have increasingly been emphasized, the assessment of design projects remains difficult due to the large number of potentially “correct” solutions. Adaptive comparative judgment (ACJ), an approach based on assessors/judges working through a series of paired comparisons and selecting the better of two items, has demonstrated high levels of inter-rater reliability with design projects. Efforts towards using ACJ for assessing design have largely centered on summative assessment. However, evidence suggests that ACJ may be a powerful tool for formative assessment and design learning when undertaken by students. Therefore, this study investigated middle school students who participated in ACJ at the midpoint and conclusion of a design project, both receiving and providing feedback to/from their peers through the ACJ process. Findings demonstrated promise for using ACJ, as a formative assessment and feedback tool, to improve student learning and achievement.
Article
Considering the challenges associated with the teaching of engineering design, and recognizing potential differences in design values of individuals from various backgrounds, this study investigated the utility of adaptive comparative judgment (ACJ) as a method for informing the teaching and practice of engineering design. The authors investigated a series of research questions to examine the use of ACJ as a formative method for influencing an engineering student’s design decision-making as well as identifying/comparing design values of different engineering education stakeholders. The study results indicate that the ACJ process enabled students to gain insights for enhancing their designs through the critique of peer-work and the receipt of feedback on their own projects. Also, the results revealed similarities/differences between the ways instructors, students, and practicing engineers judged design projects. In light of these findings, it appears ACJ can be a valuable formative assessment tool for informing the practice/processes of education related to engineering design.
Article
Comparative Judgement (CJ) aims to improve the quality of performance-based assessments by letting multiple assessors judge pairs of performances. CJ is generally associated with high levels of reliability, but there is also a large variation in reliability between assessments. This study investigates which assessment characteristics influence the level of reliability. A meta-analysis was performed on the results of 49 CJ assessments. Results show that there was an effect of the number of comparisons on the level of reliability. In addition, the probability of reaching an asymptote in the reliability, i.e., the point where large effort is needed to only slightly increase the reliability, was larger for experts and peers than for novices. For a reliability level of .70, between 10 and 14 comparisons per performance are needed. This rises to 26 to 37 comparisons per performance for a reliability of .90.
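To make these figures concrete, the sketch below converts a target number of comparisons per performance into a total judging workload. It assumes that each judgment involves two performances and therefore counts toward both, so the total is roughly half the product of the two numbers; the abstract does not state this counting convention. The function name, the cohort size, and the number of judges are hypothetical.

```python
import math

def judgments_needed(n_performances, comparisons_per_performance, n_judges=1):
    """Return (total judgments, judgments per judge), rounded up.

    Assumption: every judgment compares two performances, so it adds one
    comparison to each of them.
    """
    total = math.ceil(n_performances * comparisons_per_performance / 2)
    per_judge = math.ceil(total / n_judges)
    return total, per_judge

# Hypothetical session: 100 performances shared among 5 judges.
for target in (10, 14, 26, 37):
    total, per_judge = judgments_needed(100, target, n_judges=5)
    print(f"{target} comparisons per performance: {total} judgments, {per_judge} per judge")
```

Under this assumption, moving from the lower bound for .70 to the upper bound for .90 multiplies the workload by 3.7, which is the practical cost behind the reliability gain.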
Article
Improving graphics education may begin with understanding best practices for providing, receiving, and improving formative feedback. Challenges related to anonymity, efficiency, and validity in peer critique settings all contribute to a difficult-to-implement process. This research investigates university-level computer graphics students while engaged in adaptive comparative judgement (ACJ), as a formative learning, assessment, and feedback tool, during several open-ended graphics design projects. A control group of students wrote feedback on papers in small group critiques while the experimental group students participated in ACJ, acting as judges of peer work and providing and receiving feedback to, and from, their peers. Relationships between the paper-based group approach and the ACJ approach and student achievement were explored. Further, this paper discusses the potential benefits, and challenges, of using ACJ as a formative assessment and peer feedback tool as well as student impressions of both approaches toward peer formative assessment and feedback.
Book
This book examines the history of formative assessment in the US and explores its potential for changing the landscape of teaching and learning to meet the needs of twenty-first century learners. The author uses case studies to illuminate the complexity of teaching and the externally imposed and internally constructed contextual elements that affect assessment decision-making. In this book, Box argues effectively for a renewed vision for teacher professional development that centers around the needs of students in a knowledge economy. Finally, Box offers an overview of systemic changes that are needed in order for progressive teaching and relevant learning to take place.
Book
This book goes back to the basic purpose of assessment to show teachers what your students know and are able to do. The 22 activities in this book will help your students become active, engaged, responsible, and caring learners. This “how to” book is filled with activities which will enable you to keep your students active and engaged, facilitate cooperative group projects without losing control, raise academic achievement, apply multiple intelligences in your classroom, and teach your students how to think.