Utah State University
DigitalCommons@USU
All Graduate Theses and Dissertations, Graduate Studies
8-2023
The Impact of Formative Assessment Cycles on Students’ Attitudes and Achievement in a
Large-Enrollment Undergraduate Introductory Statistics Course
KimberLeigh Felix Hadfield
Utah State University
Follow this and additional works at: https://digitalcommons.usu.edu/etd
Part of the Curriculum and Instruction Commons
Recommended Citation
Hadfield, KimberLeigh Felix, "The Impact of Formative Assessment Cycles on Students' Attitudes and
Achievement in a Large-Enrollment Undergraduate Introductory Statistics Course" (2023). All Graduate
Theses and Dissertations. 8856.
https://digitalcommons.usu.edu/etd/8856
This Dissertation is brought to you for free and open access by the Graduate Studies at
DigitalCommons@USU. It has been accepted for inclusion in All Graduate Theses and Dissertations
by an authorized administrator of DigitalCommons@USU. For more information, please contact
digitalcommons@usu.edu.
THE IMPACT OF FORMATIVE ASSESSMENT CYCLES ON STUDENTS’
ATTITUDES AND ACHIEVEMENT IN A LARGE-ENROLLMENT
UNDERGRADUATE INTRODUCTORY STATISTICS COURSE
by
KimberLeigh Felix Hadfield
A dissertation submitted in partial fulfillment
of the requirements for the degree
of
DOCTOR OF PHILOSOPHY
in
Education
Approved:
__________________________________ __________________________________
Katherine Vela, Ph.D. Patricia S. Moyer-Packenham, Ph.D.
Major Professor Committee Member
__________________________________ __________________________________
Jessica F. Shumway, Ph.D. Tyson S. Barrett, Ph.D.
Committee Member Committee Member
__________________________________ __________________________________
Kady Schneiter, Ph.D. D. Richard Cutler, Ph.D.
Committee Member Vice Provost for Graduate Studies
UTAH STATE UNIVERSITY
Logan, Utah
2023
Copyright © KimberLeigh Felix Hadfield 2023
All Rights Reserved
ABSTRACT
The Impact of Formative Assessment Cycles on Students’ Attitudes and Achievement in
a Large-Enrollment Undergraduate Introductory Statistics Course
by
KimberLeigh Felix Hadfield, Doctor of Philosophy
Utah State University, 2023
Major Professor: Katherine Vela, Ph.D.
Department: School of Teacher Education and Leadership
Although there is much research on the importance of formative assessments and
feedback in the education literature, the implementation of these practices is
largely missing in large-enrollment undergraduate introductory statistics courses.
Additionally, large-enrollment introductory statistics courses tend to use a few high-stakes
tests rather than frequent formative assessment with reassessment opportunities.
Therefore, this study aimed to investigate the impact on student attitudes toward statistics
and student achievement of engaging in a large-enrollment introductory statistics course
curriculum using continuous formative assessments with feedback and reassessment
opportunities. Using regression discontinuity to investigate course achievement in
semesters with and without formative assessment cycles (FACs) in large-enrollment
introductory statistics, meaningful differences in course achievement were evident,
suggesting co-requisite courses with FACs could help students successfully navigate this
quantitative requirement. Using the SATS-36 pre- and post-survey scores from students
in semesters of large-enrollment introductory statistics courses implementing FACs,
students’ attitudes in affect, cognitive competence, and difficulty showed 4-5 times greater
improvement than in past studies on attitudes, with past studies showing 1-2 times greater
decreases in effort, interest, and value than in this study. Students’ final grades
significantly moderated these changes in attitudes from pre- to post-survey despite the
students filling out the surveys 4 weeks before the end of the course. Students with higher
final course grades showed greater improvements in attitudes that increased on average.
For those attitudes which decreased on average, students with higher course achievement
showed little to no declines. These findings suggest that large-enrollment introductory
statistics courses implementing FACs can improve student achievement, which moderates
attitudes, improving students’ enjoyment of the course, beliefs of their computational
abilities, and feelings that the course is less complicated or difficult than at the beginning
of the course.
These findings have the potential to contribute towards more intentional policy
development for statistics programs, better course designs, and additional pathways for
student success in these courses. As students experience increased opportunities in
introductory statistics through self-assessment from formative feedback and repeating
assessment opportunities, FACs could improve students’ attitudes towards statistics and
increase student achievement, preparing a new generation of statistically literate citizens
for a data-driven world.
(250 pages)
PUBLIC ABSTRACT
The Impact of Formative Assessment Cycles on Students’ Attitudes and Achievement in
a Large-Enrollment Undergraduate Introductory Statistics Course
KimberLeigh Felix Hadfield
This study aimed to investigate the impact on student attitudes toward statistics
and student achievement of engaging in a large-enrollment introductory statistics course
curriculum using continuous formative assessments with feedback and reassessment
opportunities. This framework, called Formative Assessment Cycles (FACs), was
implemented, providing students formative assessments both in and out of the classroom,
with feedback and reassessment. A quasi-experimental, quantitative research design
allowed for the investigation of course achievement from pre-FACs to FACs semesters
using regression discontinuity methodology. Changes in attitudes from pre- to post-
survey in semesters using a curriculum with FACs were analyzed by multilevel
regression techniques. Course achievement improved in the co-requisite introductory
statistics course using FACs for those who have less mathematical knowledge, suggesting
the need for co-requisite courses and formative feedback and reassessment to provide
students successful pathways to achieve their quantitative literacy requirement.
Additionally, students with higher course achievement had significantly better attitudes
towards statistics than their peers with lower final course grades. These students
experienced more appreciation for the course and the science of statistics in their field of
study, improved feelings of competence to do statistical calculations, and believed the
course was less confusing and easier than they first believed at pre-survey. The
attitudes exhibited in this study were more positive than those reported in previous studies
of students’ attitudes toward statistics, suggesting that students who have opportunities to
learn from their mistakes enjoy their introductory statistics class more and feel empowered by their
newfound understanding.
ACKNOWLEDGMENTS
First, I must thank my committee chair, Dr. Vela, who took me on less than a
year ago and who has shouldered the weight of the most important part of this process,
for the amazing input, feedback, assistance, and time spent reading my edits: thank you. Dr.
MacDonald, thank you for teaching me to write a published literature review and a
dissertation proposal, and helping me find my conceptual framework for my study. Dr.
Moss, thank you for teaching and leading me to my first publication. Dr. Moyer-
Packenham, you pushed me through this program with expectations of excellence. I loved
learning from you. Thank you for molding me into a research writer. Dr. Shumway, thank
you for starting me out on the right foot with my very first doctoral class. Your patient
feedback to this novice writer after 25 years since getting my master’s degree was the gas
I needed to get going. Dr. Schneiter, how lucky I am to be your colleague. I am so
grateful for your constant encouragement. Dr. Schwartz, your kindness and patience
helped me through my coding issues and statistical understanding. Thank you. And, Dr.
Barrett, thank you for saying the words “regression discontinuity” which had been
brewing in my mind since my first year of graduate studies. Thank you so much, Dr.
Barrett, for your compassion and encouragement throughout this process. You all have
left an indelible mark on my future scholarship and service on student committees.
Thank you to my cohort who have academically travelled the past five years with
me and who have become dear friends. Amy, Carrie, Danielle, Joey, Kent, Will, Allison,
and finally, my writing buddy and my shoulder to cry on, Lise. Thank you. Thank you.
Thank you.
I am so grateful for my colleagues in the Mathematics and Statistics Department,
who all were so supportive of me taking on this goal to complete my education. Your
patience, your help, and your encouragement have been priceless. Adele and Jenny, thank
you two, most of all. You two carried me when I thought I was drowning. I couldn’t have
survived without you both. Hannah, the conversation we had where you helped me
develop the Formative Assessment Cycle was monumental. Thank you.
I have so many friends, true friends, who kept inviting me to parties and brunches,
knowing full well I would say no. You knew I would come through this—but thank you
for sticking around despite my absence from your lives. Aimee—thank you for being my
friend/therapist and treadmill buddy. Shelley—thank you for your cheerleading and love.
I have missed spending time with my family, immediate and extended, through
the past five years. I haven’t been able to be there for my extended family—all the
birthdays and celebrations, pinochle nights, dinners, and get-togethers. Yet, you never
complained. None of you. Ever. Rather, you kept me going. Sydney, thanks for all the
reels and texts to keep me laughing through this. You’ve truly been such a supportive
niece and friend.
Mom and dad, seriously, how can I ever thank you for all your support? Almost
50 years of nothing but unconditional love, patience, and understanding. You two are
truly the best, and everything I have accomplished in my life is because I had you two.
You supported me, held me up, let me vent, and kept pushing me forward. You taught me
to be a fierce, strong, independent but loving woman from day one. I love you so much.
Thank you, times infinity.
Now, to my sweet boys. Emerson, I started this journey when you were just 9
years old. You have seen me cry a lot—because hard things are hard and exhausting. But
I hope you have witnessed grit and perseverance from my example. Tennyson, who
reminds me that “we aren’t average, we are excellent,” meaning we sacrifice to achieve
our dreams, thank you for your example of pushing through your own trials in life, and
finding that consistent effort leads to results. I am so proud of you, my sons. You two are
my world. You both keep me laughing. You are the best.
Bretton, thank you for your partnership and love. Who knew that my best friend
and crush in eighth grade would still be my best friend and crush 37 years later? Thank
you for letting me pursue this crazy adventure. I could not have done this without your
patience and support. You never once doubted my abilities, even though I constantly
doubted myself. You told me what I was doing would make a difference for students.
(Fingers crossed.) Thank you for everything. We have the best little family, that is the
reality of my happiest dreams. Thank you, Mr. Dr. Hadfield, Esq. (Inside joke.)
I love you all. Thank you.
Finally, I dedicate this effort to my brother, Kaden Lloyd Felix, who passed away
February 19, 2021, in the middle of this journey, and Becky Y. Hadfield, my angelic
mother-in-law, who I know would be proud. Thank you, angels.
KimberLeigh Felix Hadfield
CONTENTS
Page
ABSTRACT .................................................................................................................. iii
PUBLIC ABSTRACT .................................................................................................. v
ACKNOWLEDGMENTS ............................................................................................ vii
LIST OF TABLES ........................................................................................................ xiii
LIST OF FIGURES ...................................................................................................... xv
CHAPTER I: INTRODUCTION .................................................................................. 1
Background of the Problem .............................................................................. 3
Purpose of the Study ......................................................................................... 6
Significance of the Study .................................................................................. 7
Research Questions ........................................................................................... 8
Summary of Research Design ........................................................................... 8
Summary ........................................................................................................... 9
Definition of Terms........................................................................................... 9
CHAPTER II: LITERATURE REVIEW ..................................................................... 12
Introduction ....................................................................................................... 12
Formative Assessment and Student Attitudes Toward Statistics and
Student Achievement: A Conceptual Framework ................................... 21
Formative Assessment Cycle for Large-Enrollment Courses ........................... 29
Three Elements of the Formative Assessment Cycle and Student Attitudes
and Achievement ..................................................................................... 31
Conclusion ........................................................................................................ 41
CHAPTER III: METHODOLOGY ............................................................................. 44
Overview ........................................................................................................... 44
Research Questions and Hypotheses ................................................................ 44
Research Design................................................................................................ 45
Participants and Setting..................................................................................... 48
Existing Data Set............................................................................................... 50
Instruments and Data Sources ........................................................................... 51
Data Analysis .................................................................................................... 58
Summary ........................................................................................................... 69
CHAPTER IV: RESULTS ............................................................................................ 71
Analysis for Research Question 1 ..................................................................... 72
Analysis for Research Question 2 ..................................................................... 90
Summary ........................................................................................................... 115
CHAPTER V: DISCUSSION ....................................................................................... 116
Overview of the Study ...................................................................................... 116
Discussion of the Analysis of Research Question 1 ......................................... 117
Discussion of the Analysis of Research Question 2 ......................................... 123
Limitations ........................................................................................................ 138
Generalizability ................................................................................................. 141
Implementing the Formative Assessment Cycle in Large-Enrollment
Introductory Statistics Courses ................................................................ 142
Future Research ................................................................................................ 145
Conclusions ....................................................................................................... 146
REFERENCES ............................................................................................................. 150
APPENDICES .............................................................................................................. 161
Appendix A: Data Sources, Deidentified for Proposed Study ....................... 162
Appendix B: Permission to Use SATS-36 ..................................................... 163
Appendix C: SATS-36 Pre-Survey ................................................................ 164
Appendix D: SATS-36 Post-Survey ............................................................... 169
Appendix E: Additional Questions Asked on SATS-36 Pre- and
Post-Surveys ............................................................................. 174
Appendix F: Introductory Statistics Course Objectives ................................. 176
Appendix G: Learning Mastery Gradebook Examples: Teacher and
Student View ............................................................................. 184
Appendix H: Units Covered in the Stat 1040/1045 Courses .......................... 186
Appendix I: Numerical and Graphical Summaries with Crossovers
Removed ................................................................................... 188
Appendix J: Numerical and Graphical Summaries with Extreme Cases
Removed ................................................................................... 191
Appendix K: Numerical and Graphical Summaries with Crossovers and
Extreme Cases Removed .......................................................... 194
Appendix L: Simple Slopes Analysis............................................................. 197
Appendix M: LOESS Curve on Model 1 ........................................................ 198
Appendix N: Side by Side Plots of Final Grade Percentage at the Cutoff,
Pre-FACs to FACs Semester .................................................... 199
Appendix O: Descriptive and Summary Statistics by Prerequisite ................ 200
Appendix P: Descriptive and Summary Statistics of Students who Took
the ALEKS Math Placement Exam .......................................... 201
Appendix Q: Additional Descriptive and Summary Statistics ....................... 203
Appendix R: Summary Statistics of All Students by Recitation TA .............. 204
Appendix S: Histograms of Attitude Components ........................................ 206
Appendix T: Person-Profile Violin Plots by Course for Attitude
Components .............................................................................. 212
Appendix U: Estimated Marginal Means for Female Students’ Changes in
Attitude Components from Pre- to Post-Survey for Final
Course Grades of 75%, 85%, and 95% ..................................... 217
Appendix V: Estimated Marginal Mean Plots by Sex on Attitude
Components Change from Pre- to Post-Survey for Final
Course Grades of 75%, 85%, and 95% ..................................... 218
Appendix W: MLM Models for ALEKS Subset by Attitude Component ...... 220
Appendix X: Permission to Reprint from Statistics Education Research
Journal ...................................................................................... 222
CURRICULUM VITAE ............................................................................................... 223
LIST OF TABLES
Page
Table 1 Research Questions, Data Source and Instruments, Participants, and
Analysis....................................................................................................... 46
Table 2 Pre-FACs and FACs Cohorts for Proposed Research Question 1 .............. 47
Table 3 Introductory Statistics Courses ................................................................... 48
Table 4 Number of Participants by Research Question ........................................... 50
Table 5 SATS-36 Attitude Components, Definitions, Examples, and Number
of Items Per Component with Cronbach Alpha Ranges per Component ... 52
Table 6 Descriptive and Summary Statistics of Participants for Research
Question 1 ................................................................................................... 74
Table 7 Regression Estimates for Baseline Models at the Cutoff............................ 78
Table 8 Models Fit to EQ(full) ................................................................................ 80
Table 9 Three-Way Interaction Models ................................................................... 81
Table 10 Standardized Mean Differences and Pairwise t Tests for Change in
Final Grade Before and After the Cutoff (Less than 30 – At least 30)
for all Four Models ..................................................................................... 85
Table 11 Standardized Mean Differences (Effect Sizes) and Pairwise t Tests for
the Difference in Final Grade Change from Pre-FACs to FACs
(Pre-FACs – FACs) for all Four Models .................................................... 88
Table 12 Students’ Participation in the SATS-36 Survey: Total and After
Removing Student Responses Who Withdrew or Failed ............................ 92
Table 13 Descriptive and Summary Statistics of the Participation in the Surveys .... 93
Table 14 Cronbach’s Alphas for the SATS-36 Attitude Survey by Attitude
Component .................................................................................................. 94
Table 15 Summary of Dependent Variables .............................................................. 102
Table 16 Baseline Models for Multilevel Models for Attitude Components ............ 103
Table 17 Final Multilevel Models for Attitude Components ..................................... 106
Table 18 Standardized Mean Differences (Effect Sizes) and Pairwise t tests for
Change in Pre- to Post-Survey Attitudes by Final Grade ........................... 110
Table I.1 Descriptive and Summary Statistics of Participants for Research
Question 1, Crossovers Removed ............................................................... 188
Table J.1 Descriptive and Summary Statistics of Participants for Research
Question 1, Extreme Cases Removed ......................................................... 191
Table K.1 Descriptive and Summary Statistics of Participants for Research
Question 1, Crossovers and Extreme Cases Removed ............................... 194
Table L.1 Simple Slopes Analysis............................................................................... 197
Table O.1 Descriptive and Summary Statistics by Prerequisite .................................. 200
Table P.1 Descriptive and Summary Statistics of Students who Took the ALEKS
Math Placement Exam ................................................................................ 202
Table Q.1 Descriptive and Summary Statistics of Participants by Major ................... 203
Table R.1 Student Participation by Recitation Teacher (TA)...................................... 204
Table R.2 Participants in the Analysis by Recitation TA ............................................ 205
Table W.1 MLM Models for ALEKS Subset by Attitude Component ........................ 221
LIST OF FIGURES
Page
Figure 1 Situating Formative Assessment in Theoretical Perspectives in Higher
Education ................................................................................................. 20
Figure 2 Conceptual Framework for Formative Assessment Cycles in Large-
Enrollment Introductory Statistics Courses ............................................. 22
Figure 3 Assessment Continuum ............................................................................ 24
Figure 4 Simulated Data Representing Observed Data Points Along a Running
Variable Below and Above Some Binding Cutoff .................................. 61
Figure 5 Graphical Depiction of the Nested Nature of the Data for Research
Question 1 ................................................................................................ 62
Figure 6 Nesting Structure for Analyzing Attitudes as a Repeated Measure for
Research Question 2 ................................................................................ 67
Figure 7 Histograms of ALEKS Placement Scores by Course .............................. 75
Figure 8 Histogram of the Final Grade Percentages by Course ............................. 76
Figure 9 Estimated Means of Final Grade Percentage by ALEKS Scores Around
the Cutoff, Pre-FACs and FACs .............................................................. 82
Figure 10 Side by Side Plots of Final Grade Percentage at the Cutoff, Pre-FACs
to FACs Semester for Model 4 ................................................................ 87
Figure 11 Estimated Means of Final Grade Percentage by ALEKS Scores
Around the Cutoff for Model 4 ................................................................ 89
Figure 12 Affect Person-Profile Plot by TA ............................................................ 96
Figure 13 Cognitive Competence Person-Profile Plot by TA .................................. 97
Figure 14 Difficulty Person-Profile Plot by TA ....................................................... 98
Figure 15 Value Person-Profile Plot by TA ............................................................. 99
Figure 16 Interest Person-Profile Plot by TA ........................................................... 100
Figure 17 Effort Person-Profile Plot by TA ............................................................. 101
Figure 18 Person-Profile Violin Plot by Course for Cognitive Competence ........... 104
Figure 19 Estimated Marginal Means for Males’ Changes in Attitude
Components from Pre- to Post-Survey for Final Course Grades of
75%, 85%, and 95% ................................................................................. 107
Figure G.1 Teacher View of the LMG for the Course ............................................... 184
Figure G.2 Test Student View of the LMG ................................................................ 185
Figure I.1 Histogram of ALEKS Scores in Pre-FACs and FACs Semesters,
Crossovers Removed ............................................................................... 189
Figure I.2 Histogram of Final Grades in Pre-FACs and FACs Semesters,
Crossovers Removed .............................................................................. 190
Figure J.1 Histogram of ALEKS Scores in Pre-FACs and FACs Semesters,
Extreme Cases Removed ......................................................................... 192
Figure J.2 Histogram of Final Grades in Pre-FACs and FACs Semesters,
Extreme Cases Removed ............................................................................ 193
Figure K.1 Histogram of ALEKS Scores in Pre-FACs and FACs Semesters,
Crossovers and Extreme Cases Removed ................................................ 195
Figure K.2 Histogram of Final Grades in Pre-FACs and FACs Semesters for
Students with Crossovers and Extreme Cases Removed ......................... 196
Figure M.1 LOESS Curve on Model 1 ....................................................................... 198
Figure N.1 Side by Side Plots of Final Grade Percentage at the Cutoff,
Pre-FACs to FACs Semester ................................................................... 199
Figure S.1 Histograms of Students’ Pre- and Post-Survey Affect Scores .................. 206
Figure S.2 Histograms of Students’ Pre- and Post-Survey Cognitive
Competence Scores .................................................................................. 207
Figure S.3 Histograms of Students’ Pre- and Post-Survey Difficulty Scores ............ 208
Figure S.4 Histograms of Students’ Pre- and Post-Survey Effort Scores .................. 209
Figure S.5 Histograms of Students’ Pre- and Post-Survey Interest Scores ................ 210
Figure S.6 Histograms of Students’ Pre- and Post-Survey Value Scores .................. 211
Figure T.1 Person-Profile Violin Plot by Course for Affect...................................... 212
Figure T.2 Person-Profile Violin Plot by Course for Difficulty ................................ 213
Figure T.3 Person-Profile Violin Plot by Course for Effort ...................................... 214
Figure T.4 Person-Profile Violin Plot by Course for Interest .................................... 215
Figure T.5 Person-Profile Violin Plot by Course for Value ...................................... 216
Figure U.1 Estimated Marginal Means for Female Students’ Changes in Attitude
Components from Pre- to Post-Survey for Final Course Grades of
75%, 85%, and 95% ................................................................................. 217
Figure V.1 Estimated Marginal Mean Plots by Sex on Attitude Components
Change from Pre- to Post-Survey for Final Course Grades of
75%, 85%, and 95% ................................................................................. 219
CHAPTER I
INTRODUCTION1
“…given the present strength of the evidence for the effectiveness of formative
assessment, or assessment for learning, it is somehow surprising that the
implementation of better classroom practises has not been more evident.”
(Hopfenbeck, 2018, p. 548, paraphrasing D. Wiliam, 2018)
1 Portions and sections of Chapters I, II, and V have been submitted and published in the Statistics
Education Research Journal, 2023, vol. 22 (see Appendix X for permission to reprint in part or whole).
The American Statistical Association (ASA) and the Mathematical Association of
America (MAA) delineate several recommendations for improving assessments in
undergraduate introductory statistics courses. One recommendation by the ASA in the
Guidelines for Assessment and Instruction in Statistics Education (GAISE) College
Report (American Statistical Association Revision Committee, 2016) states that
assessments should be used both formatively to improve learning and summatively to
evaluate learning continually. Formative assessments are those that inform the teacher as
to their teaching and instructional practices and inform the student of where they are in
their understanding by using feedback from the assessment process (Black & Wiliam,
1998; Cowie & Bell, 1999; Ghaicha, 2016; Harlen, 2012; Shute, 2008). In contrast,
evidence obtained from summative assessments provides judgments about student
understanding and reports on achievement with no feedback cycle (Harlen,
2012). Both the ASA and the MAA have published recommendations and guidelines to
incorporate formative and summative assessments in the undergraduate introductory
statistics curriculum.
The GAISE College Report (ASA, 2016) stresses the need for introductory
statistics courses to provide feedback to students regarding their learning by utilizing
frequent formative assessments. In particular, the MAA Instructional Practices Guide
(Abell et al., 2018) provides vignettes with examples of different formative assessment
strategies for instructors and course designers to implement in order to improve
curriculum and instruction in undergraduate quantitative courses. This guide also
emphasizes the need for evidence-based assessment practices in large-enrollment courses
to enhance various cognitive and performance-based student outcomes (Abell et al.,
2018). Additionally, several principles for assessments outlined in the MAA Instructional
Practices Guide echo the GAISE College Report. These reports recommend that courses
integrate assessments, not as stand-alone events, but rather as a “continuous cycle” of
assessment throughout the course (Abell et al., 2018; ASA Committee, 2016). The
GAISE College Report and Instructional Practices Guide call for updating assessments in
quantitative courses to motivate student learning through frequent formative assessments
with feedback rather than a heavy focus on summative examinations.
Despite the calls for improved assessment practices from the ASA and MAA,
the transformation of assessment practices has been slow in large-enrollment
undergraduate introductory statistics courses due to the volume of students and the time-
intensive nature of the assessment process. For instance, a large study of university
courses found that students noticed a stark difference in the types of assessments assigned
in large- and small-enrollment classes (Cash et al., 2017). Unsurprisingly, the findings
revealed that assessments used in large-enrollment courses were less frequent and did not
often vary in their design; rather, they utilized summative assessments (Cash et al., 2017).
In fact, summative high-stakes assessments accounted for more than 95% of the
assessments in large-enrollment classes (Cash et al., 2017). In addition, the findings
suggested the students’ assessments consisted of one to two midterm exams and one
final exam in these large courses, a glaring departure from the MAA and ASA
recommendations to utilize a continuous cycle of formative assessment with ongoing
feedback (Abell et al., 2018; ASA Committee, 2016). Given this, the focus of this study
was to investigate the impact of formative assessment practices in large-enrollment
introductory statistics courses to improve student achievement and student attitudes
towards statistics, providing students a pathway to complete their quantitative literacy
requirement successfully.
Background of the Problem
Introductory statistics is one of the most critical quantitative courses in a student’s
university experience. Moreover, the 21st century requires students to critically navigate
statistical information in a data-driven world (Rumsey, 2002; Tishkovskaya & Lancaster,
2012). Additionally, the introductory statistics course is quickly becoming the course that
satisfies the quantitative literacy requirement in higher education, replacing college
algebra (Hoang et al., 2017). The increased enrollment of undergraduate students in
introductory statistics courses brings diverse student interests, majors, and mathematical
background knowledge (ASA Revision Committee, 2016; Blair et al., 2018). These
students often do not major in science, technology, engineering, or mathematics (STEM)
fields, and the introductory statistics course becomes their only quantitative literacy
course in their program of study. Due to this growth and diversity of students, Fong et al.
(2015) found that students enrolled in introductory mathematics courses lack the
fundamental mathematical knowledge needed for mathematical thinking. This deficiency
in mathematical knowledge results in a financial burden for many students as they
attempt multiple semesters to successfully pass their introductory statistics class. These
multiple attempts create a “bottleneck” by slowing the students’ path toward further
studies (Complete College America, 2012; Peck, 2019). To further exacerbate these
problems, introductory statistics courses tend to be large-enrollment courses in large
universities, denying students individualized academic attention (Blair et al., 2018;
Cash et al., 2017). Additionally, research on class size suggests that as a developmental
mathematics class size increases, the probability of students successfully passing
decreases (Fong et al., 2015). Instructors of large-enrollment introductory statistics
courses find it nearly impossible to provide individualized student attention and feedback
on students’ assessments with the added challenge of attending to the students’ different
mathematical preparedness for statistics (Cash et al., 2017). As universities address the
increased enrollment and demand for introductory statistics courses, large-enrollment
courses become the answer.
Students’ challenges in navigating large-enrollment statistics courses impact their
experiences, affect whether they value statistics, and influence their attitudes toward
statistics throughout their adult lives (Ramirez et al., 2012; Tishkovskaya & Lancaster,
2012). Evidence suggests that students who are not STEM majors experience greater
statistics anxieties and tend to avoid courses in statistics, which negatively affects their
attitudes toward statistics and their achievement in the course (Chew & Dillon, 2014;
Chiesi & Primi, 2010; Lavidas et al., 2020; Onwuegbuzie & Wilson, 2003; Williams,
2015). These negative attitudes toward statistics stick with students long after they
experience their introductory course, affecting their motivation and appreciation for
statistical literacy (Ramirez et al., 2012; Xu & Schau, 2019). Additionally, negative
attitudes are associated with decreased achievement (Chiesi & Primi, 2010; Emmioğlu &
Capa-Aydin, 2012). These negative attitudes, linked to testing and assessment
performance in students’ introductory statistics courses, can be attributed to their lack of
mathematical understanding and preparation from prior mathematics courses (Chiesi &
Primi, 2010; Harlow et al., 2002; Malik, 2015; Onwuegbuzie & Wilson, 2003). With the
large-enrollment course designs, high degrees of negative attitudes, and use of high-
stakes examinations with minimal, if any, feedback provided, many students struggle to
be successful in these statistics courses (Cash et al., 2017; Onwuegbuzie & Wilson, 2003;
Peck, 2019). Where large-enrollment classes answer the need to support increased
enrollment and the aforementioned bottleneck by many non-STEM students, many of
these statistics courses aggravate students’ negative attitudes toward statistics, affecting
their achievement.
Conversely, teachers using innovative assessments in higher education courses
can better influence their students’ attitudes, persistence, and achievement in introductory
statistics (Abell et al., 2018; ASA Revision Committee, 2016). Specifically, students’
positive attitudes toward statistics are associated with students’ self-reported degrees of
motivation and higher achievement scores (Chiesi & Primi, 2010; Ramirez et al., 2012).
Furthermore, when classrooms use formative assessment with feedback focused on
learning rather than performance, students experience a change in their mindset and
relationship with their learning (Boaler & Confer, 2017). Finally, recent research asserts
that a “comprehensive approach to designing a successful statistics pathway” is needed to
provide support structures in introductory statistics for underprepared students to
“effectively complete their college-level statistics course” (Peck, 2019, p. 35). Heeding
this call, this study posits that introductory statistics courses utilizing formative
assessments with feedback can improve the important student outcomes of attitudes
toward statistics and achievement.
Purpose of the Study
Although there is much research on the importance of formative assessments and
feedback in the education literature, the implementation of these practices is
absent in most large-enrollment undergraduate introductory statistics courses. Large-
enrollment introductory statistics courses tend to use a few high-stakes tests rather than the
recommended frequent formative assessments with feedback. The many challenges of
large-enrollment courses, including less individualized attention and feedback, burden
students financially and academically as repeated attempts at the course may be required
for successful completion. As enrollments in introductory statistics continue to climb,
pathways for successful completion of those courses must be prioritized (Peck, 2019).
Therefore, the purpose of the study was to investigate the impact of an embedded cycle of
formative assessment with feedback and reassessment opportunities in the curriculum of
large-enrollment introductory statistics courses on student attitudes toward statistics and
student achievement scores.
Significance of the Study
The effects of this study have far-reaching implications for mathematics and
statistics departments, instructors, and students. This study provided important
information as a feasibility study for a future large-scale endeavor. Regression
discontinuity served as a viable way to determine the impact of formative assessment
cycles (FACs) on course achievement. Additionally, this study found that course
achievement moderated student attitudes toward statistics. These findings can
significantly contribute to mathematics and statistics departments’ introductory statistics
curriculum design. Departments must allocate resources to assist instructors and
curriculum creators of these large-enrollment courses to transform their assessments with
smaller, more frequent formative assessment practices, with feedback and reassessment
opportunities. Creating corequisite courses can benefit students of different mathematical
backgrounds. Moreover, with future research, introductory statistics courses may more readily
employ FACs in the curriculum. The findings of this study suggested that FACs assisted
students with successful completion of their quantitative literacy requirement, especially
students with lower math placement scores. Additionally, this study suggested that
students’ attitudes toward statistics improved with greater student achievement,
empowering non-STEM majors to find STEM fields more accessible. Thus, these
findings have the potential to contribute toward more intentional policy development for
statistics programs, better course designs in statistics, and additional pathways for
students to succeed who are often disempowered in STEM fields.
Research Questions
This study sought to answer the following two research questions regarding the
effects of FACs on student attitudes toward statistics and statistics achievement.
Research Question 1
How do formative assessment cycles (FACs) affect student achievement in large-
enrollment introductory statistics courses for different mathematically prepared students?
Research Question 2
After allowing for student-to-student variability, which student attitude
components change after a semester of a large-enrollment introductory statistics course
with FACs? Also, how do demographic factors impact attitude, and do these effects
change over time?
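Answering this second question requires separating within-student change from between-student differences. As a purely illustrative sketch of the multilevel approach described in Chapter III (not the study's actual code, and with hypothetical column names such as affect, time, final_grade, and student_id), such a model might be specified in Python as follows.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Long-format data: one row per student per survey occasion.
# Hypothetical columns: student_id; time (0 = pre-survey, 1 = post-survey);
# affect (one SATS-36 attitude component score); final_grade (course percentage).
df = pd.read_csv("sats36_long.csv")

# A random intercept per student absorbs student-to-student variability;
# the time x final_grade interaction asks whether final course grade
# moderates the pre- to post-survey change in the attitude component.
model = smf.mixedlm(
    "affect ~ time * final_grade",
    data=df,
    groups=df["student_id"],
)
print(model.fit().summary())
```

The same specification would be refit for each of the six SATS-36 components (affect, cognitive competence, difficulty, value, interest, and effort).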
Summary of Research Design
To quantitatively analyze the impact of FACs on students’ attitudes and
achievement scores, this dissertation research used a quasi-experimental research design
(Creswell & Creswell, 2018). To investigate these research questions, I utilized
surveys, exam scores, course grades, and demographic information from existing student
data to investigate changes in students’ attitudes and student achievement in large-
enrollment introductory statistics courses. A quasi-experimental study was most
appropriate because it was impossible to randomize the participants into treatment and
control groups (Scher et al., 2015). Through these approaches, this research design
allowed for effective quantitative analysis of the research questions.
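Because students were placed into Stat 1045 or Stat 1040 by a math placement score, achievement effects could be estimated with regression discontinuity at that placement cutoff. The sketch below is a minimal illustration only, assuming an ALEKS placement cutoff of 30 (as in the analyses reported in Chapter IV) and hypothetical column names; it is not the study's actual analysis code.

```python
import pandas as pd
import statsmodels.formula.api as smf

CUTOFF = 30  # ALEKS placement cutoff (below 30 -> co-requisite Stat 1045)

df = pd.read_csv("achievement.csv")  # hypothetical extract of the existing data set
df["centered"] = df["aleks"] - CUTOFF             # running variable, centered at the cutoff
df["below"] = (df["aleks"] < CUTOFF).astype(int)  # placed below the cutoff

# Sharp regression discontinuity: the `below` coefficient estimates the jump
# in final grade at the cutoff, with separate slopes allowed on each side.
rd = smf.ols("final_grade ~ below * centered", data=df).fit()
print(rd.summary())

# The pre-FACs vs. FACs comparison would extend this baseline with a
# semester indicator, e.g. "final_grade ~ below * centered * facs".
```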
Summary
Because more and more students are choosing introductory statistics to fulfill a
quantitative literacy requirement, student enrollments in introductory statistics courses
are ever-increasing. Large class sizes have become the norm, shaping both the student’s
educational trajectory and how instructors teach statistical thinking. As such, the need
for students to successfully complete their introductory statistics course, mitigating the
bottleneck in their educational progress, is ever more paramount. In particular, assessment
practices that benefit students, such as formative assessment and feedback, are lacking in
large-enrollment courses, affecting student attitudes towards statistics and their
achievement. Thus, this study investigated FACs and their impact on student attitudes
and achievement, using a quasi-experimental research design to analyze the research
questions. A list of key terms and definitions completes this chapter.
Definition of Terms
Terms that are key to this study are defined below.
Assessment for learning: “Any assessment for which the first priority in its
design and practice is to serve the purpose of promoting students’ learning” (Black et al.,
2004, p. 10).
Assessment of learning: Assessment is carried out only for the purposes of
grading and reporting (Assessment Reform Group [ARG], 2002).
Corequisite course: A corequisite course concomitantly teaches the prerequisite
knowledge needed for the current course to eliminate the need for an extra semester for
course preparation (Complete College America, 2012).
Feedback: “Information communicated to the learner that is intended to modify
their thinking or behavior to improve learning” (Shute, 2008, p. 154).
Formative assessment: Provides evidence about student achievement, which is
obtained and utilized by the teacher and/or the student to make decisions about the next
steps in the learning process, intending to improve and elucidate student understanding
(Black & Wiliam, 2009).
Formative assessment cycles (FACs): A cyclical formative assessment process
using formative assessments with immediate feedback for students’ self-assessment to
learn from their mistakes, with reassessment opportunities offered for the student to show
an increased understanding.
Introductory statistics (Stat 1040): Quantitative literacy course requirement for
most non-scientific majors. The course covers descriptive and inferential statistical
methods emphasizing conceptual understanding and statistical thinking. This is a 3-credit
hour, one-semester course (https://catalog.usu.edu/).
Introductory statistics with algebra (Stat 1045): Co-requisite course integrating
elements of algebra with Introductory Statistics Stat 1040 for students whose math
placement score is below what is required for Stat 1040. This is a 5-credit hour, one-
semester course (https://catalog.usu.edu/).
Large-enrollment: The number of students in a class or course is greater than
what one instructor can teach on one’s own, requiring multiple smaller recitation or lab
sections to facilitate both student and teaching needs (Hornsby & Osman, 2014).
Summative assessment: Assessment in which the evidence obtained is used to
judge and report on achievement, with no cycle of feedback to instruction (Harlen,
2012). Most summative assessments are considered
Assessments of Learning, but when coupled with corrective feedback can be used by
students as Assessments for Learning (Harlen, 2012).
Reassessment: When learning tasks are aligned to a learning criterion, multiple
attempts at demonstrating that criterion are allowed so that learners can gauge their
progress through a curriculum (Abell et al., 2018). A schematic sketch of this cycle follows.
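To make the FAC and reassessment definitions concrete, the following is a schematic sketch only, not the course's actual grading software, and all names in it are hypothetical: each attempt returns immediate, objective-keyed feedback for self-assessment, and the best attempt counts.

```python
from dataclasses import dataclass, field


@dataclass
class Attempt:
    score: float   # percentage earned on this attempt
    feedback: str  # immediate, objective-keyed feedback for self-assessment


@dataclass
class FormativeAssessmentCycle:
    """Schematic FAC: assess -> immediate feedback -> self-assess -> reassess."""
    objective: str
    max_attempts: int = 3
    attempts: list[Attempt] = field(default_factory=list)

    def record(self, score: float, feedback: str) -> None:
        if len(self.attempts) >= self.max_attempts:
            raise ValueError("no reassessment attempts remaining")
        self.attempts.append(Attempt(score, feedback))

    @property
    def grade(self) -> float:
        # Reassessment policy: the best attempt stands, so a student who
        # learns from feedback can show increased understanding.
        return max((a.score for a in self.attempts), default=0.0)
```

For example, under this policy a student scoring 60 on a first attempt and 90 after studying the feedback would earn 90 for that objective.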
CHAPTER II
LITERATURE REVIEW
“Assessment is operationally defined as part of the educational process where
instructors appraise student achievements by collecting, measuring, analyzing,
synthesizing and interpreting relevant information…under controlled conditions in
relation to curricula objectives set for their levels” (Ghaicha, 2016, p. 211).
Introduction
Recommendations for improving assessments are detailed in reports from both the
ASA and the MAA. For instance, the GAISE College Report (ASA Committee, 2016)
explicitly addressed how instructors should employ assessments in introductory statistics
courses. Specifically, to “[u]se assessments to improve and evaluate student learning” (p.
3), the GAISE authors stressed that (a) students should receive timely feedback
throughout the course, (b) assessments should align with learning outcomes, and (c)
instructors should maximize the use of varying types of formative assessments in addition
to summative examinations. Summative assessments are designed to judge achievement
with no cycle of feedback to instruction (Harlen, 2012). Summative assessments are
usually considered high-stakes exams because they contribute to a large portion of the
overall grade, such as midterm and final exams. In contrast, formative assessments are
lower-stakes assessments that provide evidence about achievement through feedback
utilized by the teacher or the student to inform the learning process and subsequent
actions (Black & Wiliam, 2009; Harlen, 2012). These reports from the ASA and MAA
described guidelines and recommendations for undergraduate quantitative courses for
formative assessment practices.
The MAA Instructional Practices Guide (Abell et al., 2018) provided an
assessment framework, gleaning from research about the benefits of evidence-based
assessment practices, ASA recommendations, and the National Council of Teachers of
Mathematics (NCTM) standards (ASA Committee, 2016; Gold et al., 1999; NCTM,
2000; Steen, 2006). The assessment framework is based on the following six principles.
1. Assessment is not a single event but a continuous cycle.
2. Assessment must be an open process.
3. Assessment must promote valid inferences.
4. Assessment that matters should always employ multiple measures of
performance.
5. Assessment should measure what is worth learning, not just what is easy to
measure.
6. Assessment should support every student’s opportunity to learn important
mathematics (Steen, 1999, p. 5-6).
Several of MAA’s principles for assessments overlapped with the GAISE College
Report’s (ASA Committee, 2016) recommendations, specifically that assessments are not
singular events but should be considered cycles, where feedback cycles back to the
teacher and student throughout the course (Steen, 1999). Additionally, the MAA
Instructional Practices Guide stressed that teachers and students must use learning goals
and course objectives in the feedback process (Abell et al., 2018). The MAA
Instructional Practices Guide devoted a section to assessment in large-enrollment
courses, suggesting online response systems, online homework systems, and the use of
technology to support instructors in providing timely feedback to students (Abell et al.,
2018). Research must address how instructors can take these recommendations to create
meaningful assessments in large-enrollment courses.
With the focus on contributing to the field of statistics education and heeding the
call to create successful student pathways to complete the introductory statistics course,
the purpose of this literature review is to examine formative assessment concepts and
their relationships in the literature to student attitudes and achievement. Before presenting
the conceptual framework, this introduction continues with a broad synthesis of the
research on student attitudes toward statistics and student achievement in statistics. I
introduce the important variable of mathematical preparedness, providing frameworks in
the research where this variable was significantly associated with student attitudes toward
statistics and student achievement. Then, I provide the theoretical context for the
conceptual framework, in which the lens of andragogy informs formative assessment
theory in higher education. The second section of the chapter presents the conceptual
framework for formative assessment cycles (FACs). I clarify the importance of the three
main elements of the FAC in formative assessment literature: formative assessments as
Assessments for Learning, feedback and self-assessment, and reassessment. The third
section of the chapter summarizes the statistics education research on utilizing
technology and computer-based assessments to incorporate the FAC into large-enrollment
introductory statistics courses. Finally, the fourth section expounds on the relations
between the three formative assessment elements and students’ attitudes and achievement
from the literature. In conclusion, I discuss how the conceptual framework informs the
proposed research as a curricular intervention to impact student attitudes toward statistics
and achievement in the large-enrollment introductory statistics course.
The Association Between Student Attitudes
Toward Statistics and Student Achievement
Research has confirmed the existence of associations between student attitudes
toward statistics and their achievement in undergraduate introductory statistics courses.
Several studies have demonstrated a positive association between students’ attitudes and
achievement in introductory statistics (Chiesi & Primi, 2010; Emmioğlu & Capa-Aydin,
2012; Evans, 2007; Maure & Marimon, 2014; Ramirez et al., 2012; Schau, 2003).
Additionally, the research suggested that high levels of statistics anxiety were associated
with negative attitudes toward statistics (Chiesi & Primi, 2010; Kesici et al., 2011; Malik,
2015; Slootmaeckers et al., 2014; Williams, 2015). These high levels of statistics anxiety
were also associated with decreased student achievement in introductory statistics courses
(Chew & Dillon, 2014; Chiesi & Primi, 2010). Specifically, students’ self-concept and
perceived worth of statistics predicted their final exam scores in online and face-to-face
large student sections of introductory statistics courses (Zimmerman & Austin, 2020).
Moreover, high levels of anxiety evidenced an inverse relationship with attitudes, which
negatively affected achievement (Chiesi & Primi, 2010). These associations demonstrate
an important connection between students’ attitudes and their achievement in
introductory statistics.
The authors of the studies on attitudes and achievement provided suggestions and
recommendations for further research to explore ways to improve student attitudes
toward statistics due to the connections between student attitudes and achievement.
Chiesi and Primi (2010) encouraged implementing interventions like formative
assessments to lower anxiety and increase positive attitudes in introductory statistics
students with various mathematical preparedness. Malik (2015), in a phenomenological
study of undergraduate introductory statistics, interviewed students and found that
students had high levels of statistics anxiety related to being assessed in statistics.
These students overwhelmingly stated that testing situations caused the most significant
anxiety response. As per the ASA and MAA recommendations, using formative
assessments with feedback in a lower-stakes environment is a possible pedagogical
intervention that could decrease anxiety for students, improve attitudes, and lead to
greater student achievement (Abell et al., 2018; ASA Revision Committee, 2016;
Ramirez et al., 2012). Given these findings, it is apparent that students’ attitudes can be a
gateway or a gatekeeper to achievement in statistics or other quantitative courses,
especially in less mathematically prepared students.
Using the Survey of Attitudes Toward Statistics (SATS), Schau (2003) found
students’ attitudes toward statistics negatively changed over the course; however, positive
attitudes correlated with increased achievement. Thus, Schau believed that changes in
attitudes impacted students’ achievement in introductory statistics. This association was
further confirmed by an analysis completed by Ramirez et al. (2012). The authors found
17 studies that evaluated the relationships between SATS attitudes and achievement,
where 15 had significant positive associations. A meta-analysis of the research on student
attitudes using the SATS instrument and student achievement conducted by Emmioğlu
and Capa-Aydin (2012) looked at effect sizes and components of the SATS instrument
across different countries. Across all countries, attitudes toward statistics and students’
achievement were positively associated; however, larger effect sizes were evidenced in
studies on United States’ students regarding the two components of the SATS: affect and
cognitive competence. Thus, research has demonstrated the effectiveness of measuring
student attitudes toward statistics using the SATS instrument and its association with
achievement.
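For reference, the effect sizes aggregated in meta-analyses such as Emmioğlu and Capa-Aydin (2012) are standardized mean differences; a common form is Cohen's d, computed from the two group means and a pooled standard deviation (the exact estimator used in any given meta-analysis may differ):

$$d = \frac{\bar{x}_1 - \bar{x}_2}{s_p}, \qquad s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$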
The Associations Between Mathematics Preparedness
and Student Attitudes Toward Statistics and
Student Achievement
Approaches to measuring mathematical preparedness for introductory statistics differ
across the research: a mathematics placement exam, a survey of prior mathematics
courses, or a mathematics assessment given before the course to measure a student’s
mathematical knowledge. All research presented in this section utilized correlational
studies that measured students’ mathematical understanding in one of these ways before
the introductory course and related that preparedness to course attitudes and achievement.
Several studies developed frameworks relating students’ mathematical
preparedness prior to introductory statistics and student attitudes as measured by the
SATS instrument and student achievement. Schau (2003) identified that students’ prior
achievement in mathematics influenced students’ course achievement in introductory
statistics: her model related prior mathematics achievement with pre-course attitudes and,
subsequently, post-course attitudes. In addition to Schau, Ramirez et al. (2012) developed
a model relating students’ prior achievement, attitude, and course outcomes from a meta-
analysis. They identified ten studies that examined the relationship between past
mathematical experience and statistics attitudes. All ten studies showed significant
positive associations between prior mathematical achievement and course attitudes.
Those studies which utilized the SATS instruments to measure attitudes showed higher
positive associations with the attitude components of affect, cognitive competence, and
difficulty when the students had more prior experience with mathematics. Additionally,
Ramirez et al. found six studies that examined the relationship between prior
mathematics experience and course achievement, and all six showed a significant positive
correlation. These studies confirm that prior student experience with mathematics is an
important variable that impacts both student achievement and student attitudes toward
statistics.
Several studies specifically measured students’ pre-course mathematical abilities
to determine the effect on their attitudes toward introductory statistics and achievement.
In a large-scale study in Italy, Chiesi and Primi (2010) utilized a pre- and posttest design
measuring introductory statistics students’ (80% female) anxiety, attitudes using the
SATS instrument, and achievement. A unique aspect of this study was the varied
experiences and degrees of mathematical preparedness and statistics experience these
students had before their enrollment. Findings indicated that students who were less
mathematically prepared had less confidence, more negative attitudes, and thought
statistics was relatively more complicated than students who were more mathematically
prepared. At the end of the course, attitudes improved for both the more and the less
mathematically prepared students, but the less prepared students gained more in
self-confidence and in feeling that statistics had value (Chiesi & Primi, 2010). Using structural equation
modeling, Harlow et al. (2002) found a significant link between the students’ pre-course
mathematical ability and their post-course achievement: 38% of the variation in post-
course achievement was due to the students’ mathematical preparedness. Additionally,
students' pre-course attitudes significantly predicted their mathematics pre-course ability
(Harlow et al., 2002). Taken together, these studies provide important evidence that current
research must measure the students’ prior mathematical knowledge as it relates to
students’ attitudes and achievement in introductory statistics.
Theoretical Perspectives to Examine Formative
Assessment in Higher Education
Andragogy is a theoretical perspective explaining how adult learners engage in
their learning environments (Knowles, 1978). Viewing learning theories through the lens
of andragogy provides unique insight and understanding as to how adult learners learn
and process information. It posits five central tenets.
1. Adult learners are motivated and desire to learn.
2. Adult learners want to apply information to life situations directly.
3. Adults’ life experiences provide a valuable resource for their learning.
4. Adult learners are self-directed.
5. As age increases, differences across individuals are vast and contextualized
(Knowles, 1978; Lindeman, 1926; Merriam, 2001).
For nearly a century, these tenets have critically informed adult education worldwide
(Merriam, 2001). As stated by Knowles (1978), “adult education is an attempt to discover
a new method and create a new incentive for learning” (p. 11). Additionally, andragogy is
learner-centered, with educators given the charge to “involve learners in as many aspects
of their education as possible,” such that the educational climate fosters positive learning
(Houle, 1996, p. 30). Using these tenets, higher education can more effectively value and
address adult learners’ differences and unique abilities that they bring to their educational
environments. Viewing learning theories through the lens of andragogy provides context
for understanding undergraduate students and their experience in introductory statistics.
These learning theories include sociocultural theories that situate formative
assessment in undergraduate education (see Figure 1). Through an andragogical lens,
the tenets of andragogy can be leveraged to support adult learners’ self-efficacy and
self-regulation through assessment design in large-enrollment introductory statistics
classes.
Figure 1
Situating Formative Assessment in Theoretical Perspectives in Higher Education
To consider the relations between assessment and adult learning, self-efficacy
theory is used to explain the motivation underlying both constructs. Andragogy posits
that adult learners are internally motivated and more self-directed in their learning
(Knowles, 1978; Merriam, 2001; Sosibo, 2019). Self-efficacy theory concerns beliefs
about whether one’s actions will produce desired results or thwart damaging ones
(Bandura, 2001). Without these beliefs and motivation, adult learners would have “little
incentive to act or to persevere in the face of difficulties” (Bandura, 2001, p. 10).
Self-regulation, in turn, describes how learners use those beliefs and motivation to
actively engage in learning. The actions of adult
learners are rooted in their core beliefs that their efforts will produce the desired
outcomes in their education (Bandura, 2001). Thus, if adult learners believe in their
abilities to achieve their goals, their actions will stem from those beliefs (Mangels et al.,
2006). Together, these theoretical perspectives lay the foundation for the conceptual
framework of formative assessment cycles (FACs) and the outcomes of student attitudes
toward statistics and student achievement.
Formative Assessment and Student Attitudes Toward Statistics and Student
Achievement: A Conceptual Framework
The review of the assessment literature resulted in a three-pronged approach to
formative assessment in large-enrollment introductory statistics courses: frequent lower-
stakes assessments (i.e., Assessments for Learning), feedback, and reassessment
opportunities. Additionally, the research on assessment in undergraduate quantitative
literacy courses revealed empirical studies implementing at least one of the three prongs
of formative assessment and measuring student attitudes and/or student achievement.
Thus, the conceptual framework for this study displays the three prongs as a
formative assessment cycle (FAC) embedded in the introductory statistics curriculum to
improve both attitudes and achievement. The dotted blue arrows in Figure 2 indicate
those relationships representing the research questions this study investigated.
Figure 2
Conceptual Framework for Formative Assessment Cycles in Large-Enrollment
Introductory Statistics Courses
The formative assessment cycle employs frequent low-stakes
assessments, feedback, and reassessment. These three elements, embedded into a large-
enrollment course, create the continuous, open assessments suggested by the MAA
recommendations (Abell et al., 2018; Steen, 1999). First, frequent, low-stakes
assessments give students more assessment opportunities than a few high-stakes
midterm exams that account for a large
portion of their grade. The MAA described assessment as more than a few high-stakes
tests, but rather, as a “wider set of measures,” in which varied assessments measure
students’ progress on learning outcomes (Abell et al., 2018, p. 50). Second, instructors
must connect timely feedback from the formative assessments to learning goals. The
learning goals allow students to self-assess their level of understanding and create the
next actions for improving their knowledge. Third, when students take advantage of
reassessment opportunities of the learning goals not yet mastered, students see their
learning and understanding improve. The literature explicated formative assessment as
Assessments for Learning, where students employed self-assessment by using feedback
to improve their understanding and used reassessment to reevaluate their knowledge.
Formative Assessments as Assessments for Learning
Assessment for Learning, defined by Black et al. (2004), is “any assessment for
which the first priority in its design and practice is to serve the purpose of promoting
students’ learning” (p. 10). The Assessment Reform Group (2002) defines Assessment
of Learning as summative assessments that are only used for the purposes of grading and
reporting. Harlen (2012) created a framework for assessments which places assessments
on a continuum, ranging from formative to summative assessments and broadly as
Assessment for Learning to Assessment of Learning (see Figure 3).
According to Harlen (2012), assessments can be categorized as informal
formative, formal formative, informal summative, or formal summative. Formative
assessments are those that inform the teacher as to their teaching and instructional
practices and, additionally, inform the student of where they are in their understanding
using feedback from the assessment process (Black & Wiliam, 1998; Cowie & Bell,
1999; Ghaicha, 2016; Harlen, 2012; Shute, 2008). This definition of formative
assessment has been used extensively over the past twenty years and is currently used by
both the ASA and MAA in their recommendations for assessment practices. Formative
assessments inform Assessment for Learning, but informal summative assessments can
be used formatively if they incorporate feedback for students and teachers (Harlen, 2012).
Figure 3 shows this overlap of Assessment for Learning and Assessment of Learning
depending on the instructor’s use of informal summative assessments.
Figure 3
Assessment Continuum
Note. Adapted from Harlen (1998, 2012). Assessment for Learning overlaps
Assessment of Learning when informal summative assessments use feedback
that cycles back to the learner to inform them of their understanding.
Assessments can range from in-class questions (informal formative) to final
examinations (formal summative). Formal formative assessments, such as homework and
quizzes, are graded with feedback cycling back to the student and instructor. Quizzes and
midterms can be informal summative when their feedback does not inform the teaching
process but is still used formatively to inform the student of their learning (Davies &
Marriott, 2010; Harlen, 2012). Thus, these classifications of assessments can be seen as
a continuum from “Assessment for Learning” to “Assessment of Learning,” as depicted
in Figure 3 (Harlen, 2012). Additionally, Stiggins (2002) and Black and Wiliam (2009)
stressed the importance of moving more assessments from the Assessments of Learning
category to those in the Assessments for Learning category. Together, these definitions
situate Assessment for Learning and encompass the types of assessments utilized in a
FAC.
Learning outcomes are a necessary component of Assessments for Learning.
Black and Wiliam (2009) posited that to implement Assessment for Learning, the
instructor must employ three critical processes: (1) clarify learning outcomes, (2) create
tasks consistent with those outcomes to provide evidence of student learning, and (3)
provide feedback (p. 8). Choosing the tasks for formative assessments must be “justified
in terms of the learning aims that they serve” (Black & Wiliam, 1998, p. 143). Thus,
learning outcomes must be clearly defined and communicated for formative feedback to
be utilized (Stiggins, 2002). Stiggins suggested that instructors must know and
communicate those learning outcomes in the syllabus before students engage in the
course material covering those objectives. This communication allows adult learners to recognize
the learning goals or outcomes expected in the course curriculum (Yorke, 2003). Ghaicha
(2016) also expressed the importance of learning objectives in assessments. Specifically,
Ghaicha defined assessments as an integral part of the learning process where instructors
evaluate student achievements by “collecting, measuring, analyzing, synthesizing, and
interpreting relevant information…under controlled conditions in relation to curricula
objectives set for their levels” (p. 211). Indeed, specifying the course's learning outcomes
assists the instructor in creating tasks relevant to the course material.
The tasks in formative assessments provide evidence of the learning outcomes
and allow the adult learner to act upon the feedback from the assessments (Davies &
Marriott, 2010). Thus, formative assessments allow students to progress through the
course’s learning objectives by receiving feedback and taking ownership of their learning
(Black & Wiliam, 2009). Assessment for Learning encourages adult learners by engaging
with their learning directly through assessments that reflect the course’s content. As self-
directed learners, the students act upon formative feedback to increase their
understanding.
Feedback and Self-Assessment
Formative assessments comprise low- to no-stakes assessments with feedback.
Simple yet effective feedback is a necessary condition for an assessment to be formative.
Black and Wiliam (1998) made a case for formative assessment in classroom practice
through a meta-analysis of over nine years of empirical research. In a meta-analysis by
Hattie and Timperley (2007), feedback was most effective when it related to learning
goals. Hattie and Timperley described feedback as “one of the most powerful influences
on learning and achievement” (p. 81) because it answers students’ questions about their
progress and direction in their learning. Steen (1999) maintained that assessment must be
an open process through learning goals to inform both the teacher and the student of the
student’s progress. Thus, the onus is on the adult learner to use that feedback to improve
their understanding (Sosibo, 2019). Feedback that aligns with learning outcomes can help
identify gaps in student knowledge and assist adult learners in seeing where they can
improve.
Adult learners are motivated to learn and are self-directed. Thus, assessments that
prompt self-assessment based on feedback allow students to learn from that feedback.
Students who engage in self-assessment become more motivated to act on feedback to
improve their learning (Ghaicha, 2016). “Indeed, unless linked to an effective process of
reflection, assessment can easily become what many faculty fear: a waste of time and
effort” (Steen, 1999, p. 5). Black and Wiliam (2009) drew on a partnership between
teachers and students in their formative assessment framework, stating that students must
be resources for themselves by taking ownership of their learning. Black and Wiliam
(1998) cautioned that formative assessment cannot be productive without students being
able and willing to self-assess to further their learning goals. Students’ self-assessment is
the “essential component of formative assessment” (p. 143). Additionally, Stiggins
(2002) underscored critical features of formative assessment by explaining that when
students employ continuous self-assessment aligned with learning outcomes, they can
better reflect on their knowledge developed over time. By incorporating regular self-
assessment and feedback as part of the assessment cycle, students watch themselves grow
as learners (Black & Wiliam, 2009; Stiggins, 2002; Wride, 2017). One of the outcomes of
Assessment for Learning, which Boaler and Confer (2017) found in their work, is that
students changed perceptions of who they were as learners, thinkers, and problem-solvers
after engaging in formative assessment procedures. Thus, Assessment for Learning can
change the landscape of mathematics and statistics education by improving quantitative
learners’ self-efficacy—their hope and belief in successfully meeting their educational
pursuits (Boaler & Confer, 2017). Taken together, the body of research provides evidence
that feedback with self-assessment assists adult learners with the desire and motivation to
learn from their mistakes.
Reassessment
Reassessment allows for multiple attempts at learning tasks to progress through a
curriculum. After students have used feedback to self-assess, reassessment gives them an
additional opportunity to show evidence of learning. Because one purpose of formative
assessment concerns the learning process, reassessment can foster self-motivation, goal
orientation, and positive motivational beliefs of persistence and confidence; specifically,
the ability to persevere despite an initial poor performance (Duckworth et al., 2007;
Dweck, 2008; Yin et al., 2008). Steen (1999) asserted, “assessment that matters should
always employ multiple measures of performance” (p. 4). One way of creating multiple
measures of performance is by setting up assessments with opportunities for numerous
attempts, retakes, or reassessments, giving students additional chances to demonstrate
learning from the formative feedback (Abell et al., 2018; Dweck, 2008; Grant & Dweck,
2003). The opportunity to learn from their mistakes gives adult learners ownership of
their learning and promotes self-efficacy and self-regulation.
Cognitive psychology has consistently found that errors on tests can spark
significant learning and retention, but only if the feedback is immediate, not delayed
(Brame & Biel, 2015; Hays et al., 2013). Brame and Biel (2015) reported that low- and
no-stakes testing environments offer the greatest benefit from “test-enhanced learning”
(p. 9), in which testing itself becomes a learning opportunity for students.
Additionally, Brame and Biel (2015) found that incorporating frequent testing
opportunities such as reassessments or retakes increased student learning in
undergraduate science courses. The increased frequency of formative feedback allowed
students to view formative assessments as “learning events,” and thus, students began to
evaluate their errors based on learning outcomes for the course, creating the potential for
greater recall on the reassessment (p. 10). This research offered evidence that Assessment
for Learning provides many benefits to the learner when reassessment is allowed, such as
retention of material, self-efficacy, and self-regulation. Together, these elements form a
formative assessment cycle. Guidance on how these elements of the FAC can be
employed in a large-enrollment course is needed if instructors are to import FACs into
the curriculum successfully.
Formative Assessment Cycle for Large-Enrollment Courses
Although emphasized as important for student learning, formative assessment
cycles are not yet fully incorporated in large-enrollment courses. Cash et al. (2017)
discovered that the majority of large-enrollment courses use high-stakes summative
assessments. To rectify this, the MAA suggested that large-enrollment courses utilize
technology to create continuous assessments with feedback and reassessments to achieve
course objectives and learning goals (Abell et al., 2018). Computerized testing and
automatic feedback are vehicles for timely, concise feedback and consistent grading in
large-enrollment courses (Hattie & Timperley, 2007; Shute, 2008; Stirling, 2010). In fact,
the use of technology in mathematics and statistics courses has been studied for over 35
years. An overwhelming body of evidence supports its use, such that both the MAA and
ASA recommend technology be used throughout the introductory courses (Abell et al.,
2018; ASA Revision Committee, 2016). The benefits of technology were further
illuminated in a study by the American Mathematical Society (2009). They stated that
students and instructors reported positive experiences using technology for online
homework and quizzes. Having surveyed over 1,200 mathematics and statistics
departments in universities in the U.S., they found three main advantages to online
homework and learning systems: immediate feedback, multiple attempts allowed for
incorrect problems, and less grading. Years of research on computerized assessments
with automated feedback indicates that using technology is not just beneficial for
students but recommended for introductory quantitative courses, as it allows for
reassessment without an additional burden on the instructor and offers individualized
attention to the student in large-enrollment courses.
Implementing formative assessments with feedback and self-assessment with
reassessment opportunities creates a continuous cycle rather than isolated assessment
events in large-enrollment courses. Furthermore, the use of technology can aid instructors
and course designers in providing timely and consistent feedback that students and
instructors can use to determine the achievement of course objectives and learning goals.
Finally, given the opportunity to reassess their understanding, adult learners in large-
enrollment courses can better retain their learning and improve their self-efficacy and
self-regulation, promoting greater achievement.
Three Elements of the Formative Assessment Cycle and Student
Attitudes and Achievement
The three elements of formative assessment cycles are frequent formative
assessments, feedback, and the opportunity to reassess. Large-enrollment courses have
lagged in implementing formative assessments (Cash et al., 2017), yet formative
assessments are recommended in the curriculum to help students improve both their
attitudes and achievement (Abell et al., 2018; ASA Revision Committee, 2016). Implementing
FACs in large-enrollment introductory statistics courses could be a catalyst to successful
student experiences, especially for students who are non-STEM majors. The literature on
these three elements of the FAC elucidates the benefits to both students’ achievement and
attitudes; however, the entirety of the cycle has yet to be incorporated as a framework for
the large-enrollment introductory statistics course curriculum.
Frequent, Low-Stakes Assessments and
Attitudes and Achievement
Evidence suggests that formative assessments as frequent, smaller assessments
assist students’ attitudes and achievement in large-enrollment undergraduate courses by
increasing motivation. Broadbent et al. (2018) suggested breaking up larger summative
assessments into smaller, lower-stakes assessments to assist large-enrollment courses by
increasing the number of formative assessments. Although this was tested in a large-
enrollment introductory psychology course and not a quantitative one, this research found
that students reported increased motivation, improved ability to self-assess, and greater
learning in the course (Broadbent et al., 2018). Breaking up larger summative
assessments into more frequent formative assessments heeds the call to move more
assessments from Assessments of Learning to Assessments for Learning.
Formative assessments can be employed both in-class and out of class. For
example, in-class assessment strategies can utilize an online student response system
(OSRS) or polling systems. Several studies suggested these in-class, no-stakes
assessments effectively elicited student motivation and activated learning (Freeman et al.,
2014; Gundlach et al., 2015; Muir et al., 2020). Through these in-class low- and
no-stakes formative assessments, the curriculum embedded active-learning approaches,
which positively affected many aspects of the educational climate in large-enrollment
courses (i.e., student achievement, attitudes, motivation, engagement, and perceived
achievement; Freeman et al., 2014; Gundlach et al., 2015; Muir et al., 2020). For
example, in a meta-analysis of 225 studies in STEM
courses, online student response systems promoted active learning in large-enrollment
course lectures. In these activated lectures, exam scores increased by 6% over traditional
lectures (Freeman et al., 2014). These studies suggest that informal formative
assessments positively impact important student outcomes in undergraduate STEM
courses.
Using online formative activities as formative assessments also improved the
student outcomes of attitudes and achievement in large-enrollment introductory statistics
courses. Gundlach et al. (2015) used both an OSRS and online homework in a large face-
to-face section of introductory statistics. The authors measured student attitudes using the
SATS (Schau, 2003) instrument and found that student attitudes improved in both the
affect and cognitive competence subscales. In addition, these students’ summative exam
scores were higher than in the flipped and online introductory statistics sections
(Gundlach et al., 2015). In another large-enrollment introductory statistics study for non-
statistics majors by Hodgson and Pang (2012), online formative activities (OFAs)
improved self-regulation. More than 60% of students reported increased motivation, and
over 70% stated that the OFAs helped their learning and understanding of statistics
(Hodgson & Pang, 2012). These studies provide evidence that student attitudes toward
statistics and student achievement are associated with and impacted by online formative
activities.
Other quantitative introductory courses have also seen increased course
performance using computer-assisted assessments such as online homework. For
example, in a large study of both freshman mathematics and statistics students, computer-
based formative assessments helped identify underperforming students and improved
final exam scores in learning mathematics and statistics (Tempelaar et al., 2014). Another
large-scale study evaluated the use of web-based homework for a calculus course. The
grades of the freshman students who utilized the web-based homework improved on
average by two letter grades over those who did not (Hirsch & Weibel, 2003). Other
studies with web-based homework showed advantages to online homework systems.
These web-based programs administer assessments and provide automatically generated feedback to
both the student and the instructor. Physiology students reported that using OFAs helped
them prepare for and improve their scores on summative exams by using feedback from
the OFAs to identify the gaps in their knowledge (de Kleijn et al., 2013). In a study of
introductory statistics, the online homework system helped students increase exam
performance significantly. The authors attributed this to the immediate feedback from the
online homework (Balta & Güvercin, 2016). Thus, there is evidence that online formative
activities can improve summative exam scores and final grades in large-enrollment
undergraduate courses.
Another example of these smaller, lower-stakes assessments in large-enrollment
quantitative literacy courses is the Just in Time Teaching (JiTT) model developed in 1996
to aid undergraduate science and mathematics students in using out-of-class time more
effectively (Novak et al., 1999). Web-enhanced learning has made it simpler to use JiTT
quizzes before class to inform teachers of student understanding relative to learning
outcomes (Abell et al., 2018). Natarajan and Bennett (2014) used
modified JiTT quizzes in their study of calculus courses. The students made significant
academic gains in calculus topics when formative quizzes on review material before class
were implemented. This study provided evidence that altering the just-in-time teaching
assessment protocol still improved student learning outcomes (Natarajan & Bennett,
2014). The implementation of JiTT has also been successful in introductory statistics
courses at major universities. Testing JiTT’s effectiveness by analyzing pre- and posttest
scores resulted in higher average posttest scores than in semesters where JiTT was not
implemented (McGee et al., 2016). Implementing these formative assessments before
class time can improve student participation and ownership of class material (McGee et
al., 2016; Natarajan & Bennett, 2014). These studies provide evidence that the JiTT
model, which utilizes web-based formative assessments, improved several important
student outcomes, from student attitudes to student achievement.
Therefore, the body of research suggests that computer-assisted formative
assessments as frequent, smaller assessments improve students’ attitudes and
achievement in large-enrollment undergraduate courses. Online or computer-assisted
formative assessments, before, during, or after class, benefited students’ attitudes and
academic achievement. Several studies have attributed the academic improvement to the
automatic feedback that the online or computer-assisted formative assessments provided,
allowing students to utilize self-assessment for greater understanding and performance on
summative assessments.
Formative Feedback and Attitudes and
Achievement
Students in large-enrollment courses have reported that communication with the
professor lacks one-on-one, personable interaction (Cash et al., 2017), making feedback
that students receive from these classes even more critical for student learning. Bayerlein
(2014) investigated undergraduate students’ perceptions of feedback in terms of both its
timeliness and its constructiveness, where constructiveness compared automatically
generated feedback with handwritten feedback. Interestingly, undergraduate students
found the automatically generated feedback to be substantially
more constructive than the manually written feedback (Bayerlein, 2014). Simple,
automatic feedback with non-judgmental wording aligned with learning outcomes is all
that is needed to create constructive feedback (Stiggins, 2002). Complex feedback is
unnecessary for students to gain information about correctness and learning goals;
instead, productive, concise feedback improves learning and student outcomes (Abell et
al., 2018; Shute, 2008). Simply stated, “provide feedback that moves learning forward”
(Black & Wiliam, 2009, p. 8). These studies illustrate the benefit of automated feedback
in providing succinct information regarding the students' understanding of the learning
goals for the course.
Although complex feedback is unnecessary, the importance of feedback cannot be
overstated, as it is the component of formative assessment that is, in fact, “formative”
(Black & Wiliam, 2009; Hattie & Timperley, 2007; Shute, 2008). Feedback is often
studied apart from formative assessment: students report motivation and competence
with self-assessment and find that formative feedback assists them in improving their
performance on assessments (Abell et al., 2018; ASA Revision Committee, 2016; Shute,
2008). Two meta-analyses conducted on formative feedback and achievement found that
providing feedback positively affected student achievement (Hattie & Timperley, 2007;
Wisniewski et al., 2020). Specifically, computer-generated and corrective feedback
benefited student learning (d = 0.79; Hattie & Timperley, 2007). Wisniewski et al. (2020)
found that feedback is most effective when it includes several factors: timeliness,
strategy, and self-regulation (d = 0.48). Additionally, feedback is associated with
increased performance and learning when it maintains timeliness, concise elaboration on
the learning outcomes, and computer delivery (Shute, 2008). As the adult learner is self-
directed, these meta-analyses provide important evidence that concise, automated
feedback is sufficient to elicit self-regulation and affect student achievement.
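For reference, the effect sizes quoted in these meta-analyses are standardized mean differences (Cohen's d), which can be read as the difference between two group means in pooled standard deviation units:

$$ d = \frac{\bar{x}_1 - \bar{x}_2}{s_p}, \qquad s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} $$

By common benchmarks, d = 0.48 is a moderate effect and d = 0.79 approaches a large one.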
Feedback can also improve self-efficacy in adult learners. Several studies have
reported that the automated feedback in the formative assessments contributed to
students’ attitudes regarding their learning experience (Balta & Güvercin, 2016; Beemer
et al., 2018; Hodgson & Pang, 2012; Krause et al., 2009; Posner, 2011). For example,
Krause et al. (2009) found that the online learning environment provided university
students with perceived competence in statistics. Additionally, the feedback helped
students’ achievement with little prior statistics knowledge or experience, which is
particularly notable as students experience anxiety toward statistics and lack self-efficacy
regarding their quantitative abilities in introductory statistics courses (Onwuegbuzie &
Wilson, 2003; Williams, 2015). When Broadbent et al. (2018) studied a large-enrollment
undergraduate course (N ~ 1,500) that used formative assessments, they found over 83%
of the students agreed that the online formative feedback on assessments motivated them
to learn, improved their understanding, and increased their learning. Moreover, as the
instructors refined the feedback in subsequent semesters, students’ average achievement
increased by over 10% (Broadbent et al., 2018). Thus, feedback can provide essential information to
the adult learner in large-enrollment courses, resulting in improved attitudes and
achievement.
Large-enrollment introductory statistics courses employing formative assessments
with feedback have reported improved learning outcomes as well. Massing et al. (2018)
studied computer-assisted assessments with automated feedback in a large-enrollment
statistics course. They found three significant increases in academic outcomes due to
using computer assessments with automatic feedback: student effort, student success in
achieving learning outcomes, and final grades (Massing et al., 2018). Additionally, in
another study on computer-assisted formative assessments in large-enrollment
introductory statistics courses, the authors attributed the learning gains students made to
the formative feedback that the online assessments promptly provided (Balta & Güvercin,
2016). Together, these studies support formative assessment with feedback as
instructional interventions to improve student outcomes in large-enrollment introductory
statistics courses. However, to make the cycle complete, allowing adult learners to retest
their knowledge and learn from their mistakes is needed.
Reassessment and Student Attitudes and
Achievement
An often-overlooked aspect of the formative assessment cycle is reassessment.
With formative feedback being a necessary condition of formative assessment, allowing
students to learn from their mistakes is a natural next step to creating a formative
assessment cycle (FAC). Unfortunately, there is less literature studying reassessment
opportunities in large-section introductory quantitative courses. However, as andragogy
posits that adult learners are motivated and desire to learn through self-assessment,
reassessment is a valuable element in the FAC for creating successful student pathways
in the introductory statistics curriculum. For instance, Grant and Dweck (2003) studied
pre-med majors in a large chemistry class. They found that students’ ability to
recover from a poor initial attempt by using achievement goals was the key to
making retakes work: students coped better, increased their motivation, and performed
better on future exams (Grant & Dweck, 2003). Reassessment allows adult learners to
persist in their educational goals by learning from mistakes using the feedback related to
learning outcomes for the course.
A study of undergraduate mathematics students specifically investigated
reassessment opportunities through mastery-based testing (Collins et al., 2019). The
assessments allowed multiple attempts with credit only for mastery. Over 80% of the
students felt that the assessments and reassessment opportunities helped them understand
the material, prepared them for problem-solving, and reflected their knowledge. Students
reported feeling less pressure during examinations due to the reassessment opportunities
and believed their attitudes improved because the reassessments added additional
opportunities for success in the course (Collins et al., 2019). In addition, students felt
motivated to utilize the feedback, which stated simply whether the student had
“mastered,” was “progressing,” or was “insufficient” in a concept ahead of the next
assessment attempt. Because the learning outcomes were
directly related to assessment material, the study suggested that students revisited course
material, developing further understanding (Collins et al., 2019). Thus, the reassessment
opportunities can provide students with the desire to self-assess their learning from the
assessment feedback to progress in their mathematics course.
In a large study of undergraduate mathematics students (N ~ 1,200) using web-
based homework with the opportunity for students to revise and resubmit answers, Hirsch
and Weibel (2003) found a high correlation (r = .944) between attempts made at
homework problems and the percentage of problems solved in a calculus course. This
study suggested that students persisted with the web-based homework system: they kept
working on a problem until they achieved the correct solution, despite their prior
mathematical ability for the course. Thus, students’ persistence was more a function of
“effort rather than ability” (Hirsch & Weibel, 2003, p. 14). And in another study of
mathematics undergraduates, Lenz (2010) found students’ homework scores improved
due to the use of web-based homework, which utilized both feedback and the opportunity
to resubmit homework multiple times. These studies suggest that students of varying
mathematical backgrounds, when given an opportunity to rework missed homework
problems, show persistence and greater achievement by working through their mistakes.
Two studies that evaluated reassessment specifically in introductory statistics
found that attitudes toward statistics improved when students were provided opportunities
to resubmit work. Posner (2011) found that students who chose to resubmit work
increased their proficiency in introductory statistics concepts to levels comparable to
those who achieved proficiency with a single submission. Hodgson and Pang (2012) noted that
computer-assisted assessments are robust for creating favorable attitudes when online
formative activities used feedback and multiple attempts. This study found that students
were highly satisfied with the online approach and noted that their abilities to self-assess
increased as they actively worked on finding solutions. Thus, providing opportunities for
students to resubmit work allows students to gain a deeper understanding of the content
and improve their attitudes and success in the course.
Conclusion
With the current focus on improving student achievement and creating successful
experiences for all students in large-enrollment introductory statistics courses, this
chapter offered a conceptual framework for a formative assessment cycle. The FAC
comprises three elements: frequent low-stakes assessments aligned with learning
outcomes, computer-generated feedback, and reassessment. Research has suggested that
each of these elements of the formative assessment cycle is integral for assessments to be
formative and improve student learning through adult learners’ self-efficacy and self-
regulation. When an undergraduate quantitative course curriculum incorporates at least
one of the three elements of the formative assessment cycle and measures student
attitudes and achievement outcomes, the research indicates that students make positive
gains in their understanding with an increased motivation to learn. Thus, the literature
review has provided evidence that the three elements of formative assessment cycles are
associated with positive attitudes and increased achievement. However, there is a lack of
studies in statistics education research utilizing all three elements of the formative
assessment cycle together to provide a comprehensive approach to introductory statistics
courses to impact students’ attitudes toward statistics and student achievement.
Findings from the research on student attitudes and achievement emphasized that
both attitudes and achievement are essential outcomes in introductory statistics courses
(Emmioğlu & Capa-Aydin, 2012; Ramirez et al., 2012; Xu & Schau, 2019). Student
attitudes toward statistics were most often measured by the Survey of Attitudes Towards
Statistics (SATS; Schau, 2003) in the research. The quantitative analyses of these studies
provide a backbone for analyzing students' attitudes using the SATS instrument.
The focus of this study was to implement formative assessment cycles (FACs) in
the introductory statistics curriculum to benefit both student attitudes and student
achievement for students of different mathematical backgrounds. The body of assessment
research provided evidence for formative assessments to be viewed as evaluative and
transformative for student learning when accompanied by immediate feedback and the
opportunity to reassess. The review also found that using technology aids instructors of
large-enrollment courses to import formative assessment cycles into the curriculum.
Individualized student attention can be achieved through computerized testing with
automatic feedback. When feedback is concise, automated, and
linked to the course’s learning outcomes, students know whether they are achieving the
course objectives. Additionally, reassessing students is done simply and efficiently
through computerized test banks over course objectives. The reduction of time spent
grading is also a benefit to instructors of these large-enrollment courses.
The formative assessment process has changed students’ beliefs about testing
situations and their abilities by directing the learner’s goals from a focus on performance
to a focus on learning (Grant & Dweck, 2003). Feedback accomplished this shift in goal
orientation by helping students view learning as a skill advanced by practice, effort, and
mistakes (Grant & Dweck, 2003; Shute, 2008). As students build confidence, they
shift responsibility for their learning to themselves, leading to life-long learning and
success in future classes (Ghaicha, 2016; Hassi & Laursen, 2015; Shute, 2008; Wride,
2017). The benefits of formative assessment on student learning and students’ self-
efficacy are evident. Still, many large-enrollment courses have yet to adopt formative
assessment practices to improve upon the important outcomes of attitudes and
achievement. Taken together, embedding formative assessment cycles (FACs) in the
introductory statistics curriculum could transform the experience of the introductory
statistics student, encourage improved student attitudes toward statistics, and inspire
greater academic achievement.
CHAPTER III
METHODOLOGY
Overview
Formative assessment, with feedback and multiple attempts for learning,
embedded in a large-enrollment introductory statistics course can be a comprehensive
pathway for students to achieve their quantitative literacy requirement in higher
education. FACs provide introductory statistics students with frequent assessment
opportunities, computer-generated feedback on their assessments, and multiple
assessment attempts. This study implemented a quasi-experimental design to
quantitatively evaluate the impact of FACs on both student achievement and student
attitudes toward statistics.
This chapter describes the methodology by which the research questions were
analyzed. The research design and the description of the participants and setting follow,
after which the existing data set and instruments are discussed. The chapter concludes
with the methods of data analyses for each research question.
Research Questions and Hypotheses
The purpose of this study was to quantitatively analyze the impact of FACs on
student achievement and student attitudes toward statistics in large-enrollment
introductory statistics courses. To meet the study’s purpose, the following research
questions were offered.
Research Question 1
How do formative assessment cycles (FACs) affect student achievement in large-
enrollment introductory statistics courses for different mathematically prepared students?
The hypotheses for Research Question 1 were as follows.
H0: The difference in average achievement (average test scores) between students
below an ALEKS score of 30 and those above a score of 30 before using FACs in
large-enrollment introductory statistics courses is unchanged after FACs.
HA: The difference in average achievement (average test scores) between students
in large-enrollment introductory statistics courses below an ALEKS score of 30
and those above a score of 30 before using FACs in introductory statistics is
reduced after FACs are implemented. In other words, FACs reduce the
achievement gap between students below a placement score of 30 and those
above.
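Purely as an illustration of the regression discontinuity approach behind these hypotheses (not the study's actual analysis code), a minimal sketch follows. It assumes a deidentified data frame with hypothetical column names: aleks (placement score), final_pct (course percentage), and fac (cohort indicator).

```python
# Sketch of a sharp regression discontinuity at the ALEKS = 30 cutoff,
# interacted with cohort to test whether FACs change the discontinuity.
# Column names (aleks, final_pct, fac) are hypothetical placeholders.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("deidentified_scores.csv")  # hypothetical file name

df["centered"] = df["aleks"] - 30              # running variable, centered at the cutoff
df["above"] = (df["aleks"] >= 30).astype(int)  # 1 = placed at or above the Stat 1040 cutoff
# fac: 1 = FACs cohort (Fall 2019), 0 = pre-FACs cohorts (Fall 2017, Spring 2018)

# Separate slopes on each side of the cutoff; the above:fac coefficient
# estimates how the jump at ALEKS = 30 changes once FACs are introduced,
# i.e., whether the achievement gap narrows (the alternative hypothesis).
model = smf.ols("final_pct ~ centered * above * fac", data=df).fit()
print(model.summary())
```

Fitted slopes on each side of the cutoff, plotted for the pre-FACs and FACs cohorts, correspond to the slope plots listed as an analysis in Table 1.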
Research Question 2
After allowing for student-to-student variability, which student attitude
components change after a semester of a large-enrollment introductory statistics course
with FACs? Also, how do demographic factors impact attitude, and do these effects
change over time?
The hypotheses for Research Question 2 were as follows.
H0: Student attitudes remain unchanged after a semester of introductory statistics
with FACs.
HA: Student attitudes change after a semester of introductory statistics with FACs.
Research Design
To examine these phenomena, I utilized a quasi-experimental research design
with existing and deidentified data based on cohorts described in Table 1 to analyze the
research questions. A quasi-experimental design is utilized when a randomized,
controlled trial is not feasible (Creswell & Creswell, 2018). A quasi-experimental
design is most appropriate for this study because it is impossible to randomize the
participants into treatment and control groups (Scher et al., 2015).
Table 1

Research Questions, Data Source and Instruments, Participants, and Analysis

RQ 1: How do formative assessment cycles affect student achievement in large-enrollment introductory statistics courses for different mathematically prepared students?
    Instruments or data sources: Final exam scores; course percentage grade; ALEKS placement scores
    Participants: Pre-FACs cohorts (Fall 2017 and Spring 2018) and FACs cohort (Fall 2019)
    Proposed data analysis: Regression discontinuity; slope plots; descriptive statistics

RQ 2: After allowing for student-to-student variability, which student attitude components change after a semester of a large-enrollment introductory statistics course with FACs?
    Instruments or data sources: SATS-36 attitudes instrument (Schau, 2003) pre- and post-surveys; ALEKS placement scores; demographic data obtained in the pre-survey
    Participants: FACs cohorts (Fa 2021, Sp 2022)
    Proposed data analysis: Descriptive statistics; multilevel modelling (MLM)

RQ 2 (continued): How do demographic factors impact attitude, and do these effects change over time?
    Proposed data analysis: Effect sizes of significant main effects or covariates in the model
This research took place in two phases. The first phase addressed Research
Question 1. This quantitative, quasi-experimental design used existing and deidentified
data from students in Fall 2017 through Fall 2019. The variables collected were students’
mathematics placement exam scores and achievement scores, described in Table 2. A
quantitative, quasi-experimental design was most appropriate because it allowed the
regression discontinuity method to be used to analyze Research Question 1
(Cunningham, 2021).
Table 2

Pre-FACs and FACs Cohorts for Proposed Research Question 1

Pre-FACs cohorts (Fa 2017 and Sp 2018)
    Courses and instructors: Stat 1040-001 with T1 (Sp 2018) and T2 (Fa 2017); Stat 1045-001 with T1
    Exam type: two identical paper-pencil midterms and a final exam

FACs cohort (Fa 2019)
    Courses and instructors: Stat 1040-001 (T2); Stat 1045-001 (T1)
    Exam type: six (T1) or seven (T2) computer exams with questions from identical question banks, two attempts allowed, and a final exam

Note. T1 and T2 are two different lecturers.
The second phase of this quantitative, quasi-experimental research design was a
pre- and post-survey methodology (Creswell & Creswell, 2018) to explore students’
attitudes toward statistics in semesters of large-enrollment introductory statistics courses
implementing FACs. A quantitative design using a pre- and post-survey method was
most appropriate to explore Research Question 2 using multilevel modeling (hierarchical
regression) techniques as the students were nested in recitation sections and further
nested in lecture sections within semesters (Hox et al., 2018). The survey data were
obtained from existing and deidentified student surveys from the Fall 2021 and
Spring 2022 semesters of the large-enrollment introductory statistics sections.
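As a rough illustration of the multilevel approach described here (not the study's actual model specification), the sketch below fits a mixed-effects model in which a random intercept per student absorbs the student-to-student variability named in Research Question 2. The column names (student_id, score, time, component) are hypothetical placeholders for a long-format survey file.

```python
# Sketch of a multilevel model for the pre- and post-survey SATS data:
# repeated measures within students, so each student gets a random
# intercept. Column names are hypothetical placeholders.
import pandas as pd
import statsmodels.formula.api as smf

sats = pd.read_csv("sats_long.csv")  # one row per student per survey administration

# time: 0 = pre-survey, 1 = post-survey; component: affect, value, etc.
model = smf.mixedlm(
    "score ~ time * C(component)",  # fixed effects: pre-to-post change, by component
    data=sats,
    groups=sats["student_id"],      # random intercept for each student
).fit()
print(model.summary())
# Further levels (recitation sections nested in lecture sections within
# semesters) could be added as variance components via vc_formula, and
# demographic covariates could enter the fixed-effects formula.
```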
Participants and Setting
This study was set in an R1 university located in the western portion of the United
States. The university population is primarily Caucasian, with an average undergraduate
age of 22, and approximately 55% of the students are female
(https://www.usu.edu/aaa/enroll_infographic.cfm). For the purposes of this study, the
participants were undergraduate students enrolled in one of two introductory statistics
courses to satisfy their quantitative literacy general education requirement (see Table 3).
Table 3

Introductory Statistics Courses

Course number   Semester credits   Minimum ALEKS     Lecture hours per week   Recitation hours per week
                                   placement score   (max. class size)        (max. class size)
Stat 1045       5                  14                4 (350)                  1.67 (30)
Stat 1040       3                  30                2.5 (170)                1.67 (30)
Non-STEM majors primarily take the introductory statistics courses (Stat 1040
and Stat 1045). The two courses both emphasize conceptual understanding and statistical
reasoning and cover the same statistical concepts: “types of studies, summarizing data,
probability, [and] hypothesis testing” (http://catalog.usu.edu/). Additionally, these
courses require students to complete the same final examination. However, the courses
differ regarding the ALEKS mathematics placement exam score required for registration.
Introduction to Statistics (Stat 1040) requires a score of 30 on the ALEKS mathematics
placement exam. Introduction to Statistics with Elements of Algebra (Stat 1045) requires
a 14 on the ALEKS mathematics placement exam. Stat 1045 is considered a co-requisite
course as it covers the algebra skills needed for the statistical topics, allowing less
mathematically prepared students entry into the course. Due to the co-requisite nature of
Stat 1045, it is a 5-credit semester-long course to include foundational algebra topics,
while Stat 1040 is a 3-credit course. Both courses meet in a large-student lecture format
twice a week: Stat 1040 has 2.5 student contact hours per week, and Stat 1045 meets for
4 student contact hours per week. Also, students in both courses register for a recitation
section with enrollments of 20-30 students that meets twice a week for 50 minutes per session.
Recitation leaders, who are students employed by the Mathematics and Statistics
Department, lead the recitation sections. The recitation leaders are usually in their senior
year of undergraduate study in a mathematics or statistics major or are mathematics or
statistics graduate students. The recitation leaders work closely with the course instructor
to provide similar instruction across the recitations, consistent with the content for the
week outlined in the syllabus. The Stat 1040 and Stat 1045 weekly statistical content is
identical.
It should be noted that the instructors (T1 and T2 in Appendix A) of the Stat 1045
and Stat 1040 courses for this study have a combined 55 years of experience
teaching introductory statistics. These teachers are also award-winning instructors whom
students rate above the institutional and departmental averages in the university teaching
evaluations each semester.
The USU registration website provided the number of participants enrolled in the
Stat 1040 and 1045 courses used in this study, detailed by semester in Appendix A. The
aggregated numbers of participants are displayed in Table 4 by research question.
Table 4

Number of Participants by Research Question

Research question     Course      Participants
Research Question 1   Stat 1045   n = 527
                      Stat 1040   n = 905
Research Question 2   Stat 1045   n = 194
                      Stat 1040   n = 347
Existing Data Set
I obtained Institutional Review Board (IRB) permission to receive the existing
and deidentified data set from the semesters listed in Appendix A. The Center for Student
Analytics and the Registrar’s Office prepared and deidentified the data before I obtained
it. Furthermore, my permission to access any of the student information in the
Canvas course management system for the course sections listed in Appendix A was
removed.
The existing data set was obtained from both large-enrollment introductory course
sections of Stat 1045 and Stat 1040 detailed in Appendix A. For the semesters described
in Table 2, from Fall 2017 through Fall 2019, these data were used to investigate
Research Question 1. Due to the pandemic affecting Spring 2020, Fall 2020, and Spring
2021 semesters, course data from these semesters were not used for the study. The Fall
2021 semester was entirely webcast, and the Fall 2021 and Spring 2022 semesters used
the homework and final exam in a different format than Fall 2019. Thus, the Fall
2021 and Spring 2022 semesters were not included in the analysis of Research Question
1. To analyze Research Question 2, the SATS-36 (Schau, 2003) pre- and post-survey data
were obtained from the large-enrollment sections of Stat 1045 and Stat 1040 in Fall 2021
and Spring 2022 semesters which have been embedded with FACs in the curriculum. The
Registrar’s Office provided ALEKS placement score data, matching the students to their
ALEKS scores before I accessed the deidentified data.
Instruments and Data Sources
Appendix A, “Data Sources, Deidentified for Proposed Study,” displays the data
sources per course and semester that the Center for Student Analytics obtained and
deidentified for this study. The data sources and instruments are explained as follows.
Survey of Attitudes Toward Statistics: SATS-36
In the Fall 2021 and Spring 2022 semesters, all students were assigned the SATS-
36 (Schau, 2003) pre- and post-surveys. Candace Schau (Schau, C., personal
communication, October 27, 2020) granted permission to use the SATS pre- and post-
surveys through email correspondence (see Appendix B). These surveys are designed to
measure students’ attitudes towards statistics both at the beginning and end of the
semester. The pre- and post-surveys are found in Appendices C and D, respectively.
The pre-survey was assigned for students to complete during the first two weeks
of class. Rather than at the end of the semester, the post-survey was assigned during
weeks 12 and 13 of the 15-week course in both the Fall 2021 and Spring 2022 semesters
so that it did not coincide with a unit test or final examination preparation.
The 36-item SATS pre- and post-surveys measure six attitude components: affect,
cognitive competence, value, difficulty, interest, and effort. See the definitions of these
components in Table 5 and see Appendices E and F for the full surveys. The 36-item
surveys use a seven-point Likert scale (1 = “strongly disagree,” 2 = “disagree,” 3 =
“somewhat disagree,” 4 = “neither agree nor disagree,” 5 = “somewhat agree,” 6 =
“agree,” and 7 = “strongly agree”; Schau, 2003).
Table 5

SATS-36 Attitude Components, Definitions, Examples, and Number of Items Per Component with Cronbach Alpha Ranges per Component

Affect (6 items; Cronbach's alpha .80-.89)
    Definition: students' feelings concerning statistics
    Example items: "I am scared by statistics." (a); "I will like statistics."

Cognitive competence (6 items; .77-.88)
    Definition: students' attitudes about their intellectual knowledge and skills when applied to statistics
    Example items: "I can learn statistics."; "I will make a lot of math errors in statistics." (a)

Value (9 items; .74-.90)
    Definition: students' attitudes about the usefulness, relevance, and worth of statistics in life
    Example items: "I use statistics in my everyday life."; "Statistics is not useful to the typical professional." (a)

Difficulty (7 items; .64-.81)
    Definition: students' attitudes about the difficulty of statistics as a subject
    Example items: "Most people have to learn a new way of thinking to do statistics." (a); "Statistics formulas are easy to understand."

Interest (4 items; .85 (b))
    Definition: students' level of interest in statistics
    Example item: "I am interested in using statistics."

Effort (4 items; .79 (c))
    Definition: amount of work the student expends to learn statistics
    Example item: "I plan to work hard in my statistics course."

Note. The table is taken from https://www.evaluationandstatistics.com/.
(a) Negatively worded items.
(b), (c) Interest and Effort components are pooled and reported from Nolan et al. (2012).
53
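To make the scoring concrete, the following is a minimal R sketch of reverse-scoring the negatively worded items and forming a component score. The item numbers and the affect item set shown here are illustrative placeholders, not the instrument's actual item numbering.

set.seed(1)
# Hypothetical responses: 36 items on the 7-point scale described above
sats <- as.data.frame(matrix(sample(1:7, 20 * 36, replace = TRUE), nrow = 20))
names(sats) <- paste0("item", 1:36)

# Negatively worded items are reverse-scored so that a higher value always
# means a more positive attitude; on a 1-7 scale the reversal is 8 - response
neg_items <- paste0("item", c(3, 4, 15))   # illustrative item numbers only
sats[neg_items] <- 8 - sats[neg_items]

# A component score is the mean of that component's items (illustrative set)
sats$affect <- rowMeans(sats[paste0("item", c(3, 4, 15, 18, 19, 28))])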
Additionally, the SATS-36 pre- and post-surveys ask global attitude items, and
the post-survey also contains a global effort item. Additional items ask for relevant
demographic and academic background information. Appendices C and D contain these
specific questions for the pre- and post-survey, respectively.
Validity of the SATS-36 Instrument
Nolan et al. (2012) assessed the structural validity of the SATS (Schau, 2003) survey
using confirmatory factor analysis, resulting in six factors congruent with the survey's six
attitude components. Cronbach's alpha coefficients measured the internal consistency of
the SATS instrument's pre- and post-surveys (pooled administration of the surveys).
Regarding the reliability of the SATS-36, the Cronbach's alpha coefficients ranged
between .79 and .90 for the six components, indicating internal consistency (Nolan et al.,
2012). Tempelaar et al. (2014) found predictive validity with the SATS-36 instrument.
The cognitive competence component accounted for as much as 14% of the variability in
student academic achievement. Generally, cognitive competence accounted for 2-14% of
the variability in student academic outcomes (Nolan et al., 2012).
Additional Questions to Assess Students' Experience of FACs
Appendix E displays the additional questions I asked students in both the pre- and
post-surveys to rate their expected and end-of-course experience with FACs.
Achievement Scores
Achievement scores for the pre-FACs Fall 2017 and Spring 2018 semesters of
Stat 1040 and Stat 1045 include two midterm test scores, the final examination score, and
the final course percentage for each student. For the FAC semester, Fall 2019,
achievement scores include all unit test scores for all attempts, the final examination
score, and the final course percentage for each student.
Midterm and Unit Test Scores
Students took two midterm exams in Stat 1040 and Stat 1045 during class on
paper in the pre-FACs semesters of Fall 2017 and Spring 2018. During the Fall 2018 and
Spring 2019 semesters, test banks aligned to the course learning objectives were created
and used to generate the midterms for both Stat 1040 and Stat 1045. Thus, the midterms
for the Fall 2018 and Spring 2019 semesters were computer-generated tests from the test
banks, which students took at the university's testing center. By the Fall 2019 semester,
the test banks were fully created and aligned to learning objectives in the LMS, and the
FACs were embedded in the courses as unit tests, rather than two midterms, given in the
testing center over a 4-day testing window with retakes allowed.
In March 2020, during the Spring 2020 semester, students were sent home to
study remotely due to the beginning of the pandemic. Fall 2020 and Spring 2021
semesters were also disrupted by physical distancing requirements and a hybrid teaching
model: half the class attended while the other half joined via Zoom. For the Fall 2021
semester, all large-enrollment courses were required to teach via webcast. During the
Spring 2022 semester, the introductory statistics courses were as similar as possible to a
pre-pandemic teaching and learning environment; however, the final examination had
changed from pencil and paper to computer-graded, and the homework assignments were
improved. Additionally, student attendance was low due to the high numbers of Covid-19
illnesses, suspected illnesses, or contacts with someone with the illness. Thus, because
regression discontinuity requires consistency across the pre-FACs and FACs semesters,
the student achievement scores collected for the analysis of Research Question 1 come
from the Fall 2017, Spring 2018, and Fall 2019 semesters of Stat 1040 and Stat 1045
students.
After the year spent creating computer-generated examinations from test banks
aligned to learning objectives (see Appendix F for the list of learning objectives for the
introductory statistics course), FACs were fully implemented in the large-
enrollment sections of Stat 1040 and Stat 1045 beginning the Fall 2019 semester. Thus,
instead of two midterms, students were assessed by smaller and more frequent unit tests
using the FAC framework. Tests were computer-generated by randomly selecting
questions from test banks on course objectives, automatic feedback was given
immediately on each test question regarding correctness and the tested learning objective,
and each unit test allowed two attempts. Both examination attempts were completed at
the university’s testing center over a 4-day testing window. Upon completing the test, the
university’s learning management system (LMS) automatically graded the exam and
provided immediate feedback. The student could review their completed test to see what
they did and did not get correct before leaving the testing center. Because the questions
come from test banks linked to learning objectives, the LMS provides a Learning Mastery
Gradebook (LMG) for students and instructors to view the learning objectives and
whether they were “mastered.” Appendix G displays pictures of the instructor’s view of
the LMG (Figure G1) and a test student’s view of the LMG in their gradebook (Figure
G2). The LMG requires at least an 80% level of correctness on each learning objective
for mastery. Students can take the unit test again within the testing window; however, the
test would not be identical to the first attempt because the computer randomly selects
questions from the test banks. The student’s gradebook reflects the best of the two
examination attempts. Thus, for the Fall 2019 FACs semester, the course data includes
the achievement scores of all unit tests for all attempts.
The Final Examination
The final examination is a formal summative assessment given to students during
the final examination week at the end of each semester in all the introductory statistics
courses included in this study. Since the Fall 2016 semester, the final examination has been a
departmental examination. Thus, students in Stat 1040 and Stat 1045 all receive the same
final examination in their respective semesters. The final exam questions change from
semester to semester but cover the same topics. The examination is comprehensive,
covering all seven units of the course, detailed in Appendix H, with approximately 33%
of the final material from Units 1-3, 33% covering Units 4-6, and 33% covering Unit 7.
A team of instructors assigned to teach the course writes the final examination
each semester. The team also checks the final for accuracy and coverage. Students took
the final examination on paper until the Spring 2020 semester when the pandemic closed
campus. Beginning the Spring 2020 semester, the LMS administered, proctored, and
graded the final examination, changing the final examination format to a digital version
of the paper and pencil version. Prior to Spring 2020, the instructor and the recitation
leaders graded the final exam, with each page specifically graded by only one recitation
leader to minimize grading variability.
Final Course Percentage
In addition to the midterm scores (pre-FACs), unit test scores (FACs), and final
examination scores, the data set included the overall final course percentage for each
student.
ALEKS Math Placement Examination Score
Beginning Fall 2017 semester, the USU Mathematics and Statistics Department
used the ALEKS math placement examination (http://www.aleks.com) to serve as an
indicator of mathematical readiness for quantitative coursework at the university. The
scores on the ALEKS test are integer values between 0 and 100. The points come from
correctly responding to prompts that assess prealgebra to trigonometric content. For
example, if a student scores 100 on the ALEKS test, their performance is judged to be a
perfect understanding of the content. A score of 30 means that the student understands
30% of the content. Students' ALEKS placement scores are one way to determine
placement into either Stat 1040 or Stat 1045. As a reference, students must score a
minimum of 14 on the ALEKS test to take Stat 1045, the co-requisite course that teaches
the same curriculum as Stat 1040 but with added elements of algebra. A minimum of
30 is required for enrollment into introductory statistics without the algebra co-requisite
(Stat 1040), and a minimum of 46 to take pre-calculus (https://www.usu.edu/mathprep/aleks-ppl).
The dataset included the ALEKS math placement score for students who used
the ALEKS examination pre-requisite for registration into Stat 1040 and 1045.
Data Analysis
Multilevel modeling (MLM), sometimes referred to as Hierarchical Linear
Models or Linear Mixed Effects Models, was used to determine significant “relationships
between variables that are measured at a number of different hierarchical levels” (Hox et
al., 2018, p. 3). This allows the use of regression for situations where there are clusters
(e.g., students within a class), which otherwise violates assumptions of linear regression.
The intraclass correlation (ICC) quantifies the proportion of the total variance attributable
to clustering, with higher values indicating greater dependency within clusters. More than
one level of clustering is also
straightforward to manage in this framework. For instance, analysis of students nested
within recitation sections and classes is possible with the incorporation of additional
random effects in MLM and can be investigated.
Each research question employs MLM analyses but with different goals. All data
analysis was conducted with R 4.2.2 (R Core Team, 2022), and MLM was implemented
with the lme4 package (Bates et al., 2015). Unless otherwise stated, a significance level
of .05 was utilized.
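As a brief illustration of the clustering logic, the following R sketch fits an intercept-only model with lme4 and computes the ICC from its variance components. The data and column names (final_pct, section) are toy placeholders, not the study's dataset.

library(lme4)

set.seed(1)
# Toy stand-in data: 600 students nested in 30 recitation sections
section_effect <- rnorm(30, sd = 3)
dat0 <- data.frame(section = factor(rep(1:30, each = 20)))
dat0$final_pct <- 82 + section_effect[dat0$section] + rnorm(600, sd = 9)

# Intercept-only (null) model with a random intercept per section
null_model <- lmer(final_pct ~ 1 + (1 | section), data = dat0)

# ICC = between-section variance / (between-section + residual variance)
vc  <- as.data.frame(VarCorr(null_model))
icc <- vc$vcov[vc$grp == "section"] / sum(vc$vcov)
icc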
Analyses for Research Question 1
The data set captured the following variables on each student for each semester
listed in Appendix A to analyze Research Question 1. For the pre-FACs and FACs
semesters, the following qualitative and quantitative variables were obtained for each
student. Variables for descriptive statistics (DS) and exclusionary factors (EV) are noted
below. For the MLM analysis of Research Question 1, independent variables (IV),
possible nesting variables (NV), and dependent variables (DV) are also noted. The
qualitative variables were:
• sex (DS)
• class rank: freshman, sophomore, junior, senior, graduate (DS)
• attempt at the class (first attempt, second, third, etc.) (EV)
• whether the student completed or withdrew from the course (EV)
• pre-requisite utilized: ACT score, previous math class, ALEKS (EV)
• semester: fall or spring (NV)
• year of course (NV)
• type of introductory statistics taken: Stat 1040, Stat 1045 (IV)
• recitation section (TA) (NV)
• instructor (coded as T1 or T2) (NV)
The quantitative variables were:
• age (DS)
• ALEKS math placement exam score (IV)
• final exam percentage (DV)
• course final grade percentage (DV)
• each unit or midterm test percentage (DV)
• the reassessed unit test scores (if applicable, DV)
To answer Research Question 1, regression discontinuity methodology was used.
Regression discontinuity investigates causation using a quasi-experimental design by
providing unbiased regression estimates on a treatment effect without randomization of
the subjects to treatment (Thistlewaite & Campbell, 1960). This is accomplished by using
an exogenous cutoff score from a continuous variable measured prior to the treatment.
Thus, using the mathematics placement exam score, which measures the mathematical
preparedness of a student before enrolling in introductory statistics, is, as Boylan et al.
(1999) describe, an appropriate way to create groups in a regression discontinuity design.
Moreover, by using the cutoff to assign students to the two groups who are otherwise
similar with respect to all variables, controlling for covariates is not needed in order to
obtain the regression estimates on the treatment effect (Lesik, 2006). Even more
importantly, regression discontinuity is unique in that its methodology provides causal
inferences by using a sample of participants on either side of the cutoff (Cunningham,
2021). The key to providing causal inferences in the regression discontinuity
methodology is its ability to eliminate selection bias (Cunningham, 2021). Using the
cutoff as the determining factor to place students into treatment and control groups, the
methodology requires that these groups are otherwise similar. Due to the pandemic
affecting the teaching modality and the way the final exam was administered, only the
Fall 2019 semester is the FACs semester for which all other variables are as similar as
possible to the pre-FACs semesters of Fall 2017 and Spring 2018.
Simulated data are shown in Figure 4 to provide an example of regression
discontinuity using the ALEKS placement cutoff score of 30 and the achievement score
as the dependent variable. Students scoring below 30 are placed into the corequisite
introductory statistics course, Stat 1045, and students who score at least 30 are placed
into Stat 1040.
Because students below a mathematics placement score of 30 placed into Stat
1045 and those who scored at least 30 placed into Stat 1040, I compared the achievement
scores of these students around the cutoff (D = 30) to find a baseline difference between
the achievement of each group. The university began using the ALEKS math placement
exam beginning Fall 2017. From Fall 2017 through Spring 2018, both Stat 1040 and 1045
received the same two midterms and final examinations in their respective semesters. The
difference between the slopes of the achievement scores around the cutoff provided a
baseline estimate of the achievement gap between the two courses.

Figure 4

Simulated Data Representing Observed Data Points Along a Running Variable Below
and Above Some Binding Cutoff (D = 30)

Note. Figure recreated from Causal Inference: A Mixtape (Cunningham, 2021, pp. 255-260).
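A rough R sketch of how such simulated data can be generated follows; every parameter value here is made up purely to reproduce the shape of Figure 4 (a running variable, a binding cutoff at 30, and a jump in the outcome at the cutoff).

set.seed(42)
aleks <- round(runif(500, 5, 60))        # running variable (placement score)
D     <- as.numeric(aleks >= 30)         # binding cutoff indicator
final <- 75 + 0.3 * (aleks - 30) - 5 * D + rnorm(500, sd = 5)

plot(aleks, final, pch = 16, col = ifelse(D == 1, "grey40", "grey75"),
     xlab = "ALEKS placement score", ylab = "Achievement score")
abline(v = 30, lty = 2)                  # the cutoff, D = 30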
Statistical inference was used to detect changes at the discontinuity via MLM due
to the nested nature of the data. Figure 5 shows the nested structure of the data. As
shown, students are nested within the recitation section in the class and the semester.
Random intercepts by class and semester helped to control for section-to-section
variability, accounting for the teaching effect of the teaching assistant.
Figure 5
Graphical Depiction of the Nested Nature of the Data for Research Question 1
Using random intercepts to control for teacher-to-teacher variability of the
recitation section and centering the ALEKS scores at the cutoff of 30, the baseline model
was as follows:

achievement_i ~ (aleks_i − 30) + (1 | section), fit separately for each value of D_i,
where D_i = 1 if aleks_i ≥ 30 and D_i = 0 if aleks_i < 30.   (1)

Combining the two models in Equation 1 for the slopes before and after the cutoff
yielded Equation 2, which was used for the baseline and FACs comparisons separately:

achievement_i = β0 + β1(aleks_i − 30) + β2 D_i + β3((aleks_i − 30) × D_i) + u_j + ε_i,   (2)

where ε_i ~ N(0, σ²_ε) and u_j ~ N(0, σ²_u).
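In lme4 syntax, Equation 2 corresponds to a single interaction formula. The sketch below fits it to toy data; the variable names (final_pct, aleks, section) and all generating values are placeholders, not the study's data.

library(lme4)

set.seed(7)
dat <- data.frame(
  aleks   = round(runif(600, 5, 60)),
  section = factor(rep(1:30, each = 20))
)
dat$D       <- as.numeric(dat$aleks >= 30)   # cutoff indicator
dat$aleks_c <- dat$aleks - 30                # center at the cutoff
dat$final_pct <- 80 + 0.2 * dat$aleks_c - 3 * dat$D +
  rnorm(30, sd = 2)[dat$section] + rnorm(600, sd = 8)

# aleks_c * D expands to aleks_c + D + aleks_c:D, mirroring Equation 2
m_base <- lmer(final_pct ~ aleks_c * D + (1 | section), data = dat)
summary(m_base)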
The analyses were investigated at the baseline and again for the FACs semester (Fall
2019). The coefficients of interest were β2 (whether there is a difference in achievement
right below and right above the cutoff) and β3 (whether the slope of ALEKS differs for
those below and above the cutoff). First, I investigated whether there is a gap in the
achievement scores. Second, I determined the significance of the relationship of the
interaction between the student being above or below the cutoff and the math placement
score. This first analysis helped determine the overall patterns for baseline and FACs
separately.
A second analysis determined whether the performance gap depended on FACs,
as depicted in Equation 3:

achievement_i = β0 + β1(aleks_i − 30) + β2 D_i + β3((aleks_i − 30) × D_i) + β4 T_i + β5(D_i × T_i) + u_j + ε_i,   (3)

where ε_i ~ N(0, σ²_ε), u_j ~ N(0, σ²_u), and T_i = 0 for baseline and 1 for FACs.

The coefficient of interest was β5 (whether there is a difference in achievement right
below and right above the cutoff and whether that difference depends on whether students
had FACs or baseline). A non-significant result for this estimate would indicate that the
difference in performance did not depend on FACs. Notably, this model specification assumes
that the slopes did not significantly differ in the first analyses (the coefficient labeled β3
in that model). Then a three-way interaction was analyzed between the ALEKS scores, the
cutoff indicator (D), and T (the indicator for FACs or baseline) within this model to allow
the slopes to vary by both indicators. This three-way interaction allowed interaction
plots and simple slopes analyses to be conducted and regression estimates to be calculated.
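Continuing the sketch above, Equation 3 and the follow-up three-way interaction can be written as lme4 formulas; the column fac stands in for the indicator T and is assigned at random here purely for illustration.

dat$fac <- rbinom(600, 1, 0.33)   # stands in for T: 0 = baseline, 1 = FACs

# Equation 3: gap (D), FACs main effect (fac), and their interaction
m_eq3 <- lmer(final_pct ~ aleks_c + D + aleks_c:D + fac + D:fac +
                (1 | section), data = dat)

# Allowing the slopes to vary by both indicators (three-way interaction)
m_3way <- lmer(final_pct ~ aleks_c * D * fac + (1 | section), data = dat)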
Assumptions of MLM were visually assessed with residual diagnostics with no
evident violations.
Threats to Validity
Three main threats to validity could affect the regression estimates in the
regression discontinuity analysis in this quasi-experimental design (Lesik, 2006): (a)
scores from students who take introductory statistics more than once; (b) scores from
students who do not take the ALEKS placement exam but place into the class via their
ACT or prior math course grade; and (c) scores from students who took Stat 1045 even
though they placed in Stat 1040 based on their ALEKS score.
Taking the course multiple times. The deidentified data provided a unique ID
number for each student. Students who dropped or retook the course appeared in the
data set multiple times and were therefore dropped from the analysis. After dropping
these students, no students from the pre-FACs semesters appeared in the
FACs semester; the students in these groups were mutually exclusive. Naturally, care was
taken that all students in the analysis received the same treatment amount to obtain
unbiased estimates of the treatment effect. I utilized the method from Lesik (2006) to
verify that each student received one and only one full semester of introductory statistics.
That is, students who took the final exam and received a D- or better grade received the
full semester, and those who withdrew or received an F grade did not and were dropped
from the analysis.
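A minimal dplyr sketch of this exclusion step, assuming placeholder columns id, withdrew, and final_pct (a D- or better is treated as a final percentage of at least 60, as in the cleaning described later in Chapter IV):

library(dplyr)

roster <- data.frame(                       # toy roster, not the study data
  id        = c(1, 2, 2, 3, 4),
  withdrew  = c(FALSE, FALSE, FALSE, TRUE, FALSE),
  final_pct = c(88, 52, 71, NA, 64)
)

kept <- roster %>%
  group_by(id) %>%
  filter(n() == 1) %>%                      # drop students appearing twice
  ungroup() %>%
  filter(!withdrew, final_pct >= 60)        # keep D- or better completers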
Not taking the placement exam. A necessary condition of regression
discontinuity is the cutoff, or the math placement exam score. Because the university
allows students to place into math classes using methods other than ALEKS scores, only
students with ALEKS scores were included in the regression discontinuity analysis.
Crossovers. Regression discontinuity relies on “perfect compliance. . . based
solely on the student's placement test score” (Lesik, 2006, p. 11). Students who
“crossover” and choose to take Stat 1045 even though they placed in Stat 1040, if
random, can be dropped from the analysis to ensure compliance. Thus, care was taken to
ensure that students in the analysis were not crossovers. Additionally, the estimates and
standard errors were computed with and without crossovers to verify that the crossovers
were not affecting the estimates.
Limitations
As a quasi-experimental study, there are limitations to the results. Generalizability
beyond this university is cautioned, as other settings, students, course curriculum,
teaching practices, and assessments can impact the achievement outcomes (Creswell &
Creswell, 2018). Second, the students used in the analysis were from semesters before
the pandemic. The pandemic and its subsequent stresses could impact
students in various ways, mentally and physically, preventing them from experiencing a
semester similar to a pre-pandemic one.
Analysis for Research Question 2
The purpose of Research Question 2 was to investigate changes in student
attitudes toward statistics over a semester of Introductory Statistics with FACs. To
analyze this research question, variables obtained on students in the Fall 2021 and Spring
2022 semesters include those used for descriptive statistics, covariates, nesting variables
and independent and dependent variables. For the MLM analysis, independent variables
(IV), nesting variables (NV), and dependent variables (DV) are noted. They are as
follows.
• introductory statistics course taken: Stat 1040, Stat 1045 (qualitative, NV)
• ALEKS math placement exam score (quantitative, IV)
• SATS-36 survey scores (quantitative, DV)
• semester: fall or spring (qualitative, NV/IV)
• recitation section (qualitative, TA) (NV)
• Other survey information and demographics listed in Appendix E
(quantitative and qualitative, IV)
After controlling for student-to-student variability, again using MLM, statistical
inference detected whether there was improvement in each of the six attitude components
from pre- to post-survey. This approach allows the investigation of any longitudinal
changes similar to repeated measures analysis of variance (RM ANOVA) or multivariate
analysis of variance (MANOVA) but offers additional advantages. Specific to this
research, MLM allowed for the incorporation of incomplete data while accounting for the
dependence among observations (Hox et al., 2018). Additionally, both categorical and
continuous covariates were explored to control for potential confounding or moderating
effects. As before, further nesting of students within the recitation section and lecture is
possible with the incorporation of additional random intercepts in MLM and were
investigated in the analysis of Research Question 2. See Figure 6 for the nesting structure
of the data for this research question.
The following explains the models and equations used for the analysis of
Research Question 2.
Figure 6

Nesting Structure for Analyzing Attitudes as a Repeated Measure for Research Question 2
• Model0: attitude ~ time + (1|id)
• EQ0: attitude_ti = β0 + β1 time_ti + u_i + ε_ti,
  where ε_ti ~ N(0, σ²_ε) and u_i ~ N(0, σ²_id)
1. After allowing for student-to-student variability using random intercepts for
the nesting, how do student attitude components (dependent variables, DV)
change after a semester (repeated measure at two time points) of large-
enrollment introductory statistics courses with FACs?
2. Can further nesting allow for recitation section or recitation leader (TA)
variability (random effects for section and TA)?
• Model1: attitude ~ time + (1|ta/id)
• EQ1: attitude_ti = β0 + β1 time_ti + u_i + v_ta + ε_ti,
  where ε_ti ~ N(0, σ²_ε), u_i ~ N(0, σ²_id), and v_ta ~ N(0, σ²_ta)
3. Also, how do demographic factors impact attitude, and do these factors affect
change over time? (The models below depend on whether the nesting of students
by recitation leader is found significant in Model 1; for ease, the nesting is not
included in the following models, and error terms are defined as in EQ1.) A
minimal lme4 sketch of these models and their comparison follows this list.
• Model2: attitude ~ time + covariate1 + … + (1|id)
• EQ2: attitude_ti = β0 + β1 time_ti + β2 covariate_i + … + u_i + ε_ti
• Model3: attitude ~ time*covariate + (1|id)
• EQ3: attitude_ti = β0 + β1 time_ti + β2 covariate_i + β3 (time_ti × covariate_i) + u_i + ε_ti
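As referenced above, here is a minimal lme4 sketch of these nested models and the likelihood ratio tests between them; the long-format data frame and its columns (attitude, time, id, ta, female) are toy placeholders, not the study's variables.

library(lme4)

set.seed(3)
att <- data.frame(
  id   = factor(rep(1:200, each = 2)),
  ta   = factor(rep(rep(1:10, each = 20), each = 2)),
  time = rep(0:1, times = 200)              # 0 = pre-survey, 1 = post-survey
)
att$attitude <- 4 + 0.3 * att$time + rnorm(10, sd = 0.3)[att$ta] +
  rnorm(200, sd = 0.5)[att$id] + rnorm(400, sd = 0.7)

# Model0: random intercept per student; Model1 adds the TA level
m0 <- lmer(attitude ~ time + (1 | id),      data = att, REML = FALSE)
m1 <- lmer(attitude ~ time + (1 | ta / id), data = att, REML = FALSE)
anova(m0, m1)            # LRT for the added TA-level random intercept

# Model3-style moderation with a hypothetical demographic covariate
att$female <- rbinom(200, 1, 0.7)[att$id]
m3 <- lmer(attitude ~ time * female + (1 | id), data = att, REML = FALSE)
anova(m0, m3)            # LRT for the covariate and its moderation of change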
Potential confounding and moderating effects of covariates were investigated by
incorporating fixed effects and associated likelihood ratio tests (LRT) between nested
MLM models optimized with maximum likelihood (ML). Final MLM models were
reoptimized with restricted maximum likelihood (REML), parameter estimates tabulated,
and estimated marginal means visualized for attitude components in which a statistically
significant change was found. Post hoc pairwise t-tests were computed, and effect sizes
were calculated as standardized mean differences (a Cohen’s d-like measure), in which
point differences are divided by the pooled standard deviation. The pooled standard
deviation was estimated by pooling the variance components of the best fitting MLM.
Assumptions of MLM were visually assessed with residual diagnostics, and no
evident violations were found. All data analyses were conducted with R 4.2.2 (R Core
Team, 2022), and MLM was implemented with the lme4 package (Bates et al., 2015).
Unless otherwise stated, a significance level of .05 was utilized.
Limitations
During the Fall 2021 semester, all large-enrollment courses were required to be
taught webcast, different from Spring 2022. Thus, the teaching modality could be a major
predictor of student attitudes; if so, controlling for this variable would be needed in the
MLM.
As with all quasi-experimental research, generalization of the results beyond this
university is cautioned. The data for this study came from a
nonrandomized group of students in two semesters using the FACs curriculum. Also, the
ability to model the nested structure of recitation sections with random slopes could be
limited if the sample size is too small (sparsity).
Summary
Using existing data, a quasi-experimental quantitative research design in two
phases was conducted to analyze the impact of FACs in undergraduate large-enrollment
introductory statistics course curricula on student attitudes toward statistics and student
achievement. Research Question 1 used regression discontinuity to identify changes in
student achievement around the ALEKS placement exam score of 30 between pre-FACs
semesters and the semester where FACs was implemented. The ALEKS placement score
determined whether the student placed in Stat 1045 or Stat 1040. The results and analysis
of Research Question 1 provide evidence of feasibility: the successful implementation of
FACs under ideal circumstances with experienced instructors.
Research Question 2 investigated student attitudes across the Fall 2021 and
Spring 2022 semesters using the SATS-36 (Schau, 2003) instrument. To analyze these
data, MLM was used to address the nesting structure of students within sections and the
dependence among observations. Additionally, MLM measured the longitudinal change
from pre- and post-survey responses. This study contributes to the field of statistical
education research by providing empirical evidence for utilizing formative assessment
research in large-enrollment courses to improve the important student outcomes of
students’ attitudes towards statistics and student achievement. Additionally, these results
provide a foundation for other colleges and universities to study how formative
assessment cycles can be employed in their large-enrollment courses. Thus, this research
exemplifies a successful pathway for students to complete their undergraduate
quantitative requirements, creating a positive experience that transcends the classroom to
their citizenship in a data-centric world.
CHAPTER IV
RESULTS
“True scientists do not collect evidence in order to prove what they want to be true
or want others to believe. That is a form of deception and manipulation called
propaganda, and propaganda is not science.” (Cunningham, 2021, p. 10)
The purpose of this study was to investigate the impact of an embedded cycle of
formative assessment with feedback and reassessment opportunities in the curriculum of
large-enrollment introductory statistics courses on student attitudes toward statistics and
student achievement scores. Using a quasi-experimental quantitative research design, this
study sought to answer the following two research questions regarding the effects of
FACs on student attitudes toward statistics and statistics achievement.
1. How do formative assessment cycles (FACs) affect student achievement in
large-enrollment introductory statistics courses for different mathematically
prepared students?
2. After allowing for student-to-student variability, which student attitude
components change after a semester of a large-enrollment introductory
statistics course with FACs? Also, how do demographic factors impact
attitude, and do these effects change over time?
This chapter details the results of this analysis by research question. The Center
for Student Analytics at Utah State University gathered and deidentified the pre-existing
dataset for this analysis after IRB approval. Appendix A details the student data I
obtained on Stat 1040 and Stat 1045 students from the Fall 2017, Spring 2018, Fall
2019, Fall 2021, and Spring 2022 semesters. The pre-existing dataset included
achievement data gathered from all semesters. Student responses for the SATS-36
(Schau, 2003) pre-and post-surveys in Fall 2021 and Spring 2022 semesters measured the
students’ attitudes toward statistics. The Center for Student Analytics also provided
demographic data such as sex, class rank, recitation section, and recitation teacher.
Additionally, ALEKS math placement scores were matched to students who used the
ALEKS exam for their prerequisite requirement. The next two sections detail the
analyses for each research question. The chapter ends with a summary of these results.
Analysis for Research Question 1
Research Question 1 examined students’ achievement in semesters before and
after FACs were implemented in large-enrollment introductory statistics courses. MLM
was used to regress overall course achievement (final grade percentage, DV) on students'
math placement exam scores (ALEKS scores, IV) using the methodology of regression
discontinuity. This methodology allowed me to explore the students' achievement at the
exogenous cutoff, which places students below the cutoff in Stat 1045 and students at or
above it in Stat 1040. In this section, I provide the demographic and summary statistics,
exploratory data analysis, and MLM results and conclusions. The complete analysis in an
R markdown file is available upon request.
Descriptive and Summary Statistics
Students' achievement scores from the Fall 2017, Spring 2018, and Fall 2019 semesters
were collected, including each student's prerequisite math course or placement exam score.
Demographic data, such as sex, class rank, age, and the number of previous attempts at
the course, were also obtained on these students. A full list of the variables collected on
these students by semester is found in Appendix A. After I obtained the data, I removed
the 549 students who did not use the ALEKS math placement exam for their prerequisite
from this dataset, leaving 730 students. Then I removed 50 students who had taken the
class previously. Finally, I removed 104 students with a final grade percentage below
60% or who withdrew from the course during these semesters, leaving a sample of 576
students. This sample comprised the students who had an ALEKS math placement
score and achieved a D- or better their first time taking introductory statistics in one
of the Fall 2017, Spring 2018, or Fall 2019 semesters. Students in the Fall 2017 and
Spring 2018 semesters were assessed with two midterms, a final, and no formative
assessments (pre-FACs). Students in Fall 2019 experienced a semester with a FAC-based
curriculum, including formative assessments, feedback, and reassessments throughout the
course. Table 6 shows the summary statistics of the 576 unique students.
Figure 7 displays the histograms of all ALEKS scores from the Fall 2017, Spring
2018, and Fall 2019 semesters of Stat 1040 and Stat 1045. Figure 7 shows that some
students who scored over 30 chose to take Stat 1045 rather than Stat 1040. Additionally,
there are extremely high ALEKS placement scores. Students can choose to take Stat 1040
or Stat 1045 even if their ALEKS scores place them into a higher introductory statistics
course; some students' majors only require Stat 1040 or Stat 1045, and therefore
that is what they register for despite having a strong mathematical background.
The distribution of the final grades of the 576 students in the sample is shown in
Figure 8. The distributions appear similar between the courses.
Crossovers and Extreme Cases
An ALEKS score of 14 places students into the co-requisite introductory statistics
course, Stat 1045, and a score of 30 places students in Stat 1040, although students are
not forced to take Stat 1040 even if they place into it. If students want to take Stat 1045,
they are allowed; these students are crossovers. There were 59 crossovers in the sample
(10.2%). Figure 7 shows the extremely large ALEKS math placement scores and the
number of crossovers above a score of 30 in Stat 1045.

Table 6

Descriptive and Summary Statistics of Participants for Research Question 1

                    Total (N = 576)    Pre-FACs (N = 383)   FACs (N = 193)
Variable            n     %            n     %              n     %          p
Sex                                                                          .233
  Female            361   62.7         233   60.8           128   66.3
  Male              215   37.3         150   39.2           65    33.7
Course                                                                       .986
  Stat 1040         345   59.9         230   60.1           115   59.6
  Stat 1045         231   40.1         153   39.9           78    40.4
Class rank                                                                   .338
  Freshman          272   47.2         173   45.2           99    51.3
  Sophomore         166   28.8         113   29.5           53    27.5
  Junior            91    15.8         61    15.9           30    15.5
  Senior            47    8.2          36    9.4            11    5.7

                    M       SD         M       SD           M       SD       p
Age                 20.70   3.94       20.69   3.18         20.72   5.14     .947
Final grade %       82.19   9.82       81.78   9.60         82.99   10.22    .162
ALEKS score         34.27   11.42      34.32   10.81        34.17   12.58    .876

Note. Pre-FACs semesters were the Fall 2017 and Spring 2018 students enrolled in Stat 1040 or Stat 1045.
Fall 2019 was the FACs semester. No differences between the pre-FACs and FACs groups are statistically
significant.
Appendix I shows the summary and descriptive statistics when the 59 crossovers
in Stat 1045 are removed. Table I.1 shows the descriptive statistics with crossovers
removed, with no significant differences between the pre-FACs and FACs groups. Figure
I.1 shows the distribution of ALEKS scores among pre-FACs and FACs semesters with
crossovers removed, and Figure I.2 shows the distribution of the final grade percentages.
Figure 7
Histograms of ALEKS Placement Scores by Course
Note. The black vertical line shows an ALEKS score of 30. N = 576.
Students with a minimum score of 46 on the ALEKS placement exam are placed
into a higher-level introductory statistics course (Stat 2000) that requires a pre-calculus
mathematical understanding. However, students are not required to skip Stat 1040 if they
place in a course beyond this introductory level. Thus, even with a higher level of
mathematical preparedness, students may take Stat 1040. I will call these students
"extreme cases." Seventy-nine students (13.7%) in the sample scored 46 or higher on the
ALEKS math placement exam. Further restricting the ALEKS scores to those below 46
provided a window of 15 points around the cutoff. Cunningham (2021) suggested that the
window around the cutoff must be narrowed because the regression discontinuity design
only provides causality around the cutoff. Limiting the ALEKS scores to 15 points on
either side of the cutoff localized the ALEKS scores around the cutoff. Figures J.1 and J.2
in Appendix J display the distributions of ALEKS scores and final grade percentages by
pre-FACs and FACs groups. Table J.1 displays the descriptive and summary statistics of
the students' demographics. There were no significant differences between the pre-FACs
and FACs groups with extreme cases removed.

Figure 8

Histogram of the Final Grade Percentages by Course

Note. N = 576.
Twenty students were both crossovers and extreme cases (3.5%), which accounts
for 33.9% of the crossovers and 25.3% of the extreme cases. With the crossovers and
extreme cases removed (n = 123), 453 students remained in the sample. Appendix K
contains the descriptive statistics for the data with crossovers and extreme cases removed.
Table K.1 shows the descriptive and summary statistics, histograms of ALEKS scores are
provided in Figure K.1, and final grade percentages by Pre-FACs and FACs groups are
found in Figure K.2. There were no significant differences between the Pre-FACs and
FACs groups with the crossovers and extreme cases removed.
MLM Analyses for Regression Discontinuity
Using random intercepts to control for recitation section-to-section variability and
centering the ALEKS scores at the cutoff of 30, the baseline model was fit to determine
whether there was a change in the slopes from the pre-FACs to FACs achievement.
Because there was no significant interaction between the ALEKS score and whether the
student's score was at least 30, I performed an analysis of variance (ANOVA) using Type
III Sums of Squares with Satterthwaite's method on each model to sequentially test the
fixed effects for the significance of the main effects (see Table 7).
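A sketch of how such tests can be run, assuming the toy dat and model variables from the Chapter III sketches; lmerTest supplies Type III F tests with Satterthwaite's degrees of freedom for lmer fits.

library(lmerTest)   # masks lme4::lmer so the fit carries df methods

m <- lmer(final_pct ~ aleks_c * D + (1 | section), data = dat)
anova(m, type = "III", ddf = "Satterthwaite")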
The estimate at the discontinuity for the FACs group was significant (b = -6.45, p =
.033). Thus, there was a significant difference in achievement right below and right above
the cutoff. The slope of the ALEKS scores was significant for the FACs group (b = .45, p
= .003). Using the regression estimates in Table 7, the final grade at the discontinuity can
be calculated for the pre-FACs and FACs groups. For the pre-FACs group, at an ALEKS
score of 30, the average Stat 1040 student earned 80.04% in the class. Students who
scored a 29 on the ALEKS placement exam were placed into Stat 1045 and averaged
81.04% in the class. However, in the FACs group, students with an ALEKS score of 30
Table 7

Regression Estimates for Baseline Models at the Cutoff (ALEKS = 30)

                                   Pre-FACs              FACs
Effect                             b         SE          b         SE
Fixed effects
  Intercept                        81.15***  1.86        86.28***  2.76
  ALEKS score a                    0.11      0.19        0.45**    0.26
  30 & above b                     -1.11     2.04        -6.45*    2.99
  ALEKS score a × 30 & above b     0.16      0.20        -0.10     0.27

                                   Variance              Variance
Random effects
  Recitation section               6.45                  2.39
  Residual                         83.33                 92.52

                                   n                     n
Sample size
  Recitation sections              30                    15
  Participants                     383                   193

Note. Significance was determined by performing a Type III Sums of Squares ANOVA with Satterthwaite's
method.
a ALEKS score was centered at 30.
b 0 = false, 1 = true.
*p < .05. **p < .01. ***p < .001.
averaged 79.83% in the class, and students with an ALEKS score of 29 averaged 85.83%
in the class.
Because the estimate at the discontinuity was significant for the FACs group in
Table 7 but the interaction between the ALEKS score and whether the student's ALEKS
score was 30 and above was not, two-way interaction models were fit. Equation 3, the
full equation in Chapter III, shows the model. The first interaction in the model is
between the ALEKS score and whether the student's ALEKS score was 30 and above,
and the second interaction is between whether the student's ALEKS score was 30 and
above and whether the student was in the FACs semester. Neither interaction was
significant in the model, and a Type III Sums of Squares ANOVA with Satterthwaite's
method was performed to sequentially test the main effects for significance. Model 1 of
Table 8 shows there was statistical significance for the slope of the ALEKS scores (b =
0.24, p < .001).
To visualize what is happening at the cutoff, I fit the data to a three-way
interaction model between the variables of the students’ ALEKS scores, whether the
scores are above 30, and whether the students were in the FACs semester (see Model 1 of
Table 9). However, the three-way interaction was not significant, so a Type III Sums
of Squares ANOVA with Satterthwaite's method was performed to test the fixed effects
sequentially. Only the main effects of the ALEKS score (b = .10, p < .001) and whether
the student's ALEKS score was 30 and above (b = -1.29, p = .031) were statistically
significant.
Then, I performed simple slopes analysis on the three-way interaction model.
Although the three-way interaction was not significant, simple slopes allowed me to
investigate the slopes at the cutoff within the Pre-FACs and FACs groups. Model 1 in
Appendix L shows that the slopes significantly differ from 0 in pre-FACs and FACs
groups for those who scored at least 30 on the ALEKS math placement exam.
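One way to run such a simple slopes analysis is with emmeans::emtrends, estimating the ALEKS slope in each cutoff-by-FACs cell of the three-way model; m_3way and its placeholder variables are from the earlier sketches.

library(emmeans)

emtrends(m_3way, ~ D * fac, var = "aleks_c",
         at = list(D = 0:1, fac = 0:1))    # slope of aleks_c in each cell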
The three-way interaction model allowed me to plot the student achievement on
the ALEKS scores before and after the discontinuity for the pre-FACs and FACs groups.
Figure 9 provides a visual estimate of the slopes of the achievement around the cutoff,
suggesting that there are differences between the final course percentages at an ALEKS
score of 30 despite the three-way interaction not being significant. It is important to note
that three-way interactions are often underpowered to achieve significance (Moonseong &
Andrew, 2010). Recall that crossovers and extreme cases can bias the regression
estimates in regression discontinuity. Therefore, further limiting the data offered insight
into the differences between slopes at the discontinuity.
Table 8

Models Fit to Equation 3

                                 Model 1          Model 2          Model 3          Model 4
Effect                           b        SE      b        SE      b        SE      b        SE
Fixed effects
  Intercept                      82.33*** 1.62    82.36*** 1.63    82.86*** 1.65    82.77*** 1.63
  ALEKS score a                  0.24***  0.15    0.24***  0.15    0.23***  0.15    0.23***  0.15
  30 & above b                   -2.72    1.77    -3.59*   1.89    -4.41*   1.90    -5.29**  1.98
  FACs c                         1.90     1.90    1.64     1.91    1.60     1.95    1.63     1.88
  ALEKS score a × 30 & above b   0.06     0.16    0.20     0.20    0.12     0.16    0.30     0.20
  30 & above b × FACs c          -1.17    2.13    -1.40    2.21    -0.48    2.38    -1.28    2.34

                                 Variance         Variance         Variance         Variance
Random effects
  Recitation section             4.73             4.56             4.52             3.46
  Residual                       86.60            88.27            88.91            90.20

                                 n                n                n                n
Sample size
  Recitation sections            45               45               45               45
  Participants                   576              497              517              453

Note. Significance of fixed effects was determined by performing a Type III Sums of Squares ANOVA with
Satterthwaite's method.
a ALEKS score was centered at 30.
b 0 = false, 1 = true.
c pre-FACs = 0, FACs = 1.
*p < .05. **p < .01. ***p < .001.
Table 9

Three-Way Interaction Models

                                         Model 1          Model 2          Model 3          Model 4
Effect                                   b        SE      b        SE      b        SE      b        SE
Fixed effects
  Intercept                              81.24*** 1.86    81.25*** 1.88    81.81*** 1.90    81.74*** 1.88
  ALEKS score a                          0.10***  0.19    0.10***  0.19    0.09***  0.19    0.09***  0.19
  30 & above b                           -1.29*   2.04    -2.53*   2.18    -2.72**  2.16    -4.13**  2.27
  FACs c                                 5.17     3.33    4.95     3.36    4.80     3.39    4.79     3.37
  ALEKS score a × 30 & above b           0.16     0.20    0.35     0.25    0.19     0.21    0.42     0.25
  ALEKS score a × FACs c                 0.37     0.31    0.38     0.32    0.37     0.32    0.36     0.32
  30 & above b × FACs c                  -5.25    3.61    -4.54    3.84    -5.21    3.82    -4.74    3.99
  ALEKS score a × 30 & above b × FACs c  -0.29    0.33    -0.41    0.40    -0.21    0.34    -0.31    0.42

                                         Variance         Variance         Variance         Variance
Random effects
  Recitation section                     4.88             4.78             4.69             3.69
  Residual                               86.51            88.24            88.58            90.19

                                         n                n                n                n
Sample size
  Recitation sections                    45               45               45               45
  Participants                           576              497              517              453

Note. Significance was determined by performing a Type III ANOVA with Satterthwaite's method.
a ALEKS score was centered at 30.
b 0 = false, 1 = true.
c pre-FACs = 0, FACs = 1.
*p < .05. **p < .01. ***p < .001.
Figure 9

Estimated Means of Final Grade Percentage by ALEKS Scores Around the Cutoff,
Pre-FACs and FACs

Note. Model 1 was fit on 576 participants in 45 recitation sections. Bands are ± 1 SE from the estimated
marginal mean. The vertical dotted line represents an ALEKS score of 29.5.
Crossovers and Extreme Cases
To provide unbiased regression estimates, regression discontinuity requires that
the difference between students above an ALEKS score of 30 and those below it is due to
the math placement score and nothing else. Thus, the crossovers seen in Figure 7 had to
be eliminated from the data, as they can bias the estimates (Cunningham, 2021).
Additionally, it is important to view the data close to the cutoff score—as narrowly as
possible—while preserving as much data as needed to model it appropriately (Lesik,
2006). Plotting a LOESS (locally estimated scatterplot smoothing) curve on the data
showed that the slopes of the regression estimates were influenced by the extreme
ALEKS scores (see Appendix M). Thus, I fit additional models with crossovers and
extreme cases removed. Model 1 refers to the models fit to the full dataset with
crossovers and extreme cases included. I removed the 79 extreme cases from the dataset
and refer to subsequent models fit to this dataset as Model 2. I removed the 59 crossovers
from the dataset and refer to subsequent models fit to this dataset as Model 3. Lastly, for
the fourth dataset, I removed the 123 students who were crossovers and/or extreme cases
and refer to subsequent models fit to this dataset as Model 4. The dataset best meeting the
regression discontinuity specifications was the one fit as Model 4 because both
crossovers and extreme cases were removed. However, the calculations made on all four
datasets allow for comparisons.
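A sketch of that LOESS check with ggplot2, fitting separate smoothers on either side of the cutoff so the pull of extreme scores is visible (toy dat as in the earlier sketches):

library(ggplot2)

ggplot(dat, aes(aleks, final_pct, group = aleks >= 30)) +
  geom_point(alpha = 0.3) +
  geom_smooth(method = "loess", se = FALSE) +
  geom_vline(xintercept = 29.5, linetype = "dashed")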
I followed the same process for these datasets as for the full dataset to provide
comparisons across the regression parameters. Thus, these datasets were fit to Equation 3,
the full equation described in Chapter III, with the two-way interactions. The regression
parameters for Models 2-4 are shown with Model 1 in Table 8. Like Model 1, none of the
interactions in Models 2-4 were significant. I then used the Type III Sums of Squares
ANOVA with Satterthwaite's method to sequentially test the fixed effects on each model
for significance. Models 2, 3, and 4 all had significance for the slope of the ALEKS
scores (b = 0.24, p < .001; b = 0.23, p < .001; b = 0.23, p < .001) and at the discontinuity
(b = -3.59, p = .021; b = -4.41, p = .013; b = -5.29, p = .002), but no significance was
found for either of the two-way interactions.
The three-way interaction model was fit to these datasets, and again, no
interactions were significant (see Table 9). Similar to Model 1, the slopes of the ALEKS
scores for Models 2-4 were significant (p < .001), and the discontinuity was significant
for Models 2-4 (b = -2.53, p = .013; b = -2.72, p = .006; b = -4.13, p = .001). After this, I
ran simple slopes analyses for Models 2-4, shown in Appendix L, in which the outputs
mimicked the output for Model 1. I continued to use the three-way interaction model for
each of Models 2-4 to plot the marginal means for the final grade percentage on the
ALEKS scores for pre-FACs and FACs groups. The plots for all four models are found in
Appendix P. The change in slopes at the cutoff for the pre-FACs and FACs semesters
warranted post hoc analyses.
Post Hoc Analyses for the Change in Final
Grade Percentage at the Cutoff
Noting the significance of the main effects of the two variables (the ALEKS score
and whether students scored at least 30 on the ALEKS), I ran post hoc analyses. I
calculated standardized mean differences (SMDs) to measure the size of the change in the
average final course percentages at the cutoff for all four models. The SMDs give
Cohen's d-like effect sizes by dividing point differences by the pooled standard deviation.
The pooled standard deviation was estimated by pooling the variance components of the
MLM model (Brysbaert & Stevens, 2018). Pairwise comparisons used Fisher's least
significant difference (LSD) procedure, utilizing the Kenward-Roger method for degrees
of freedom (Luke, 2017). Post hoc analyses revealed statistically significant differences
in the average final grade percentages for the FACs semester for Models 2-4 before and
after the discontinuity (see Table 10).
Table 10

Standardized Mean Differences and Pairwise t Tests for Change in Final Grade Before
and After the Cutoff (Less than 30 – At least 30) for All Four Models

Model   Group      SMD    t      df     p
1       Pre-FACs   .06    0.22   512    .825
        FACs       .74    1.85   538    .064
2       Pre-FACs   .22    0.93   374    .351
        FACs       .74    0.21   160    .033
3       Pre-FACs   .21    0.76   281    .446
        FACs       .83    2.09   299    .037
4       Pre-FACs   .40    1.42   166    .102
        FACs       .91    0.48   125    .010

Note. Model 1 was fit on 576 participants in 45 recitation sections. Model 2 was fit to the
data with 79 extreme cases removed on 497 observations nested in 45 recitation sections.
Model 3 was fit to the data with 59 crossovers removed on 517 participants nested in 45
recitation sections. Model 4 was fit to the data with crossovers and extreme cases removed
on 453 participants nested in 45 recitation sections. Pairwise t tests utilize Fisher's LSD
and are unadjusted p values. Degrees of freedom utilize the Kenward-Roger method.
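A sketch of how such contrasts can be computed with emmeans, evaluating the marginal means at the cutoff with Kenward-Roger degrees of freedom and leaving the p values unadjusted (Fisher's LSD); m_3way and its placeholder variables are from the earlier sketches, and the Kenward-Roger method additionally requires the pbkrtest package.

library(emmeans)

emm <- emmeans(m_3way, ~ D | fac,
               at = list(aleks_c = 0, D = 0:1, fac = 0:1),
               lmer.df = "kenward-roger")
pairs(emm, adjust = "none")                # unadjusted = Fisher's LSD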
The pre-FACs and FACs semesters saw changes in course achievement across the
cutoff. For the pre-FACs semesters, the students who scored just below 30 on the ALEKS
math placement exam performed better in the course on average than those who scored
30 on the ALEKS. For Model 4, this difference is shown by the SMD of .40 in Table 10,
which is not significant (p = .102). Using the parameter estimates in Table 9, this
difference in means for pre-FACs semesters across the cutoff can be quantified. For the
pre-FACs students who scored an ALEKS of 29, the average final grade percentage was
81.65%, and for those who scored an ALEKS of 30, the average final grade percentage
was 77.61%, a difference of 4.04%.
For the FACs semester, Model 4 in Table 10 shows that the mean difference of
the final grade percentages across the cutoff was statistically significant (SMD = 0.91, p =
.010). Thus, the SMD of .91 implies that the average standardized difference in final
grade percentages for students experiencing FACs was nearly one standard deviation
higher for those students scoring 29 on the ALEKS placement exam than those who
scored 30. Although pre-FACs semesters’ difference in course achievement across the
cutoff was not significant, it was statistically significant for the FACs semester. Figure 10
is a visualization of these changes at the cutoff for Model 4.
The change in course achievement from pre-FACs to FACs semesters was not
significant. Table 11 shows the effect sizes from pre-FACs to FACs semesters at the
cutoff. ALEKS scores below 30 represent Stat 1045 students, and scores of at least 30
represent those in Stat 1040. None of the pairwise t tests were significant. Although the
difference in Stat 1045 students' course grades from pre-FACs to FACs semesters had a
medium effect size for Model 4, it was not statistically significant (SMD = .52, p =
.159).
Figure 11 visualizes these differences in final grade percentages in Stat 1040 and
Stat 1045 from pre-FACs to FACs semesters at the cutoff for Model 4. These differences
can also be quantified. FACs made no significant change in the final course percentage
for Stat 1040 students at an ALEKS score of 30, as the difference in achievement between
pre-FACs (red line) and FACs (blue line) at an ALEKS score of 30 is 0.04%. This
difference is verified by the SMD of .01 (p = .968) for the "At least 30" group of Model 4
in Table 11. However, the "Below 30" group shows a potential difference in slopes from
pre-FACs (red line) to FACs (blue line). From Table 11, the SMD shows this difference
was about .52 standard deviations in the final grade percentage, but it was not significant
(SMD = .52, p = .159). Although this difference is not statistically significant, it can also
be realized using the parameter estimates in Table 9. In the pre-FACs semesters, for an
ALEKS score of 29, the average final grade percentage was 81.65%. However, students
with the same ALEKS score of 29 who experienced FACs scored 86.08% in the class, a
difference of about 4.43%. This difference is about one-half of a standard deviation of the
overall final grade percentage (SD = 9.83%; see Table K.1). Indeed, final grade
percentages were not statistically significantly different at the cutoff from pre-FACs to
FACs semesters. Still, the difference in average final grade percentages between
pre-FACs and FACs semesters for those who scored below 30 on the ALEKS could be
meaningful.

Figure 10

Side-by-Side Plots of Final Grade Percentage at the Cutoff, Pre-FACs to FACs Semester,
for Model 4

Note. Model 4 was fit to the data with crossovers and extreme cases removed on 453 participants nested in
45 recitation sections. Bands are ± 1 SE from the estimated marginal mean. The vertical dotted line
represents an ALEKS score of 29.5.

Table 11

Standardized Mean Differences (Effect Sizes) and Pairwise t Tests for the Difference in
Final Grade Change from Pre-FACs to FACs (Pre-FACs – FACs) for All Four Models

Model   ALEKS group   SMD    t      df     p
1       Below 30      .71    1.50   495    .134
        At least 30   .03    0.20   77     .840
2       Below 30      .56    1.45   330    .145
        At least 30   .04    0.21   156    .834
3       Below 30      .63    1.40   360    .164
        At least 30   .02    0.12   60     .910
4       Below 30      .52    1.41   227    .159
        At least 30   .01    0.04   135    .968

Note. Model 1 was fit on 576 participants in 45 recitation sections. Model 2 was fit to the
data with 79 extreme cases removed on 497 observations nested in 45 recitation sections.
Model 3 was fit to the data with 59 crossovers removed on 517 participants nested in 45
recitation sections. Model 4 was fit to the data with crossovers and extreme cases removed
on 453 participants nested in 45 recitation sections. Pairwise t tests utilize Fisher's LSD
and are unadjusted p values. Degrees of freedom utilize the Kenward-Roger method. No p
values were significant at α = .05.

Figure 11

Estimated Means of Final Grade Percentage by ALEKS Scores Around the Cutoff for
Model 4

Note. Model 4 was fit to the data with crossovers and extreme cases removed on 453 participants nested in
45 recitation sections. Bands are ± 1 SE from the estimated marginal mean. The vertical dotted line
represents an ALEKS score of 29.5.
Summary
Multilevel modeling was appropriate for modeling the change in final grade
percentages from the pre-FACs to FACs semesters across different ALEKS math
placement exam scores using regression discontinuity methodology. The three-way
interaction model was fit to the data after deleting crossovers and extreme scores,
restricting the range of ALEKS scores to about 15 points on either side of the exogenous
cutoff score, which places students scoring below 30 into Stat 1045 and all others into
Stat 1040. Model 4
found no interaction between the variables of the ALEKS score, the discontinuity, and
whether the student was in the FACs semester. However, potential differences in average
final grade percentages were seen before and after the cutoff and from pre-FACs to FACs
semesters. Although there is an average improvement of 4.5% in final grade percentages
from pre-FACs to FACs semesters for students who scored 29 on the ALEKS placement
exam, it was not statistically significant (SMD = .52, p = .159). Significant differences in
average final grade percentages were discovered at the cutoff in the FACs group. For an
ALEKS score of 29, the Stat 1045 FACs group saw a significant average improvement of
8.5% in their final grade percentage compared to the average FACs Stat 1040 student
with an ALEKS of 30 (SMD = .91, p = .01). The only significant difference found in
course achievement was at the cutoff for the FACs semester. Although there was no
statistical significance for average final grade percentages at the cutoff from pre-FACs to
FACs semesters, the difference may still be meaningful and is discussed in the next
chapter.
Analysis for Research Question 2
Research Question 2 investigated potential improvement in students’ attitudes
toward statistics in large-enrollment introductory statistics courses with embedded
Formative Assessment Cycles (FACs). I explain and provide the participants’ descriptive
and summary statistics in this section. Then, I report the Cronbach’s alpha values that
provided a measure of the internal consistency before interpreting the scores from my
sample, followed by a visual assessment of the data through histograms and
person-profile plots. Next, using multilevel modeling (MLM), I discuss the analysis of each
attitude component of the SATS-36 (Schau, 2003). Subsequently, I provide plots of the
change in attitude scores from pre-to post-survey and discuss the practical and statistical
significance of the effect sizes calculated. Then, I devote a section to the analysis of
students who took the ALEKS math placement exam before the semester to determine
whether their mathematical knowledge before the class impacted their attitudes. Finally, a
summary of the findings for the second research question completes this section. The
complete analysis in an R markdown file is available upon request.
Descriptive and Summary Statistics
I taught the two large-enrollment introductory statistics sections of Stat 1040 and
Stat 1045 in Fall 2021 and Spring 2022 semesters. Students in both sections experienced
FACs in their homework, quizzes, and examinations. These course assessments were
identical in both sections and semesters. All students had the opportunity to take the
SATS-36 pre-survey during the first two weeks of the semester and the SATS-36 post-
survey during weeks 12 and 13 of the 15-week semester. Of the 531 enrolled students, 66
withdrew or failed, leaving 465 students (88%) who completed the semester and received
the full FACs intervention. Approximately 95% of the 465 students completed at least
one of the two surveys, and only 5% (n = 24) failed to respond, leaving 441 participants
for analysis (see Table 12).
Table 13 displays the descriptive and summary statistics for the students who
received the full FACs intervention. Of the 465 students, the majority were female
(76.8%), which was significantly associated with participation in the surveys (p = .001).
Table 12

Student Participation in the SATS-36 Surveys: Total and After Removing Students Who
Withdrew or Failed

Survey response            n     % (N = 531)   % (N = 465)
Pre only                   57    10.75         12.26
Pre & post                 356   67.04         76.56
Post only                  28    5.27          6.02
Neither                    24    4.52          5.16
Student withdrew/failed    66    12.43         --
Total                      531   100           100
Most students who received the FACs intervention were from the fall semester (59.1%)
and Stat 1040 (65.6%). Neither the semester nor the course was associated with
participation in the surveys (p = .390; p = .597).

Freshmen accounted for 44.3% of students, while sophomores accounted for
34.6%. Juniors and seniors together accounted for 21% of the students. The average age
of the students was 19.79 (SD = 2.66). Neither class rank nor age was significantly
associated with survey participation (p = .248; p = .125). The final grade in the course
averaged 83.86% (SD = 9.46%), which differed significantly by survey participation
(p < .001). The two significant demographic and summary variables, sex and final grade,
were important variables in the multilevel model analyses as a main factor and an
interaction, respectively.
More descriptive and summary statistics are found in Appendices O and P.
Appendix O shows the descriptive and summary statistics of the breakdown of the
prerequisites the students used for placement into Stat 1040 and Stat 1045. The pre-
requisite was not significantly associated with survey participation (p = .670). Appendix
Table 13
Descriptive and Summary Statistics of the Participation in the Surveys