Step Tutor: Supporting Students through Step-by-Step
Example-Based Feedback
Wengran Wang, Yudong Rao, Rui Zhi, Samiha Marwan, Ge Gao, Thomas W. Price
North Carolina State University
Raleigh, NC
ABSTRACT
Students often get stuck when programming independently, and need help to progress. Existing, automated feedback can help students progress, but it is unclear whether it ultimately leads to learning. We present Step Tutor, which helps struggling students during programming by presenting them with relevant, step-by-step examples. The goal of Step Tutor is to help students progress, and engage them in comparison, reflection, and learning. When a student requests help, Step Tutor adaptively selects an example to demonstrate the next meaningful step in the solution. It engages the student in comparing "before" and "after" code snapshots, and their corresponding visual output, and guides them to reflect on the changes. Step Tutor is a novel form of help that combines effective aspects of existing support features, such as hints and Worked Examples, to help students both progress and learn. To understand how students use Step Tutor, we asked nine undergraduate students to complete two programming tasks with its help, and interviewed them about their experience. We present our qualitative analysis of students' experience, which shows us why and how they seek help from Step Tutor, and Step Tutor's affordances. These initial results suggest that students perceived that Step Tutor accomplished its goals of helping them to progress and learn.
ACM Reference Format:
Wengran Wang, Yudong Rao, Rui Zhi, Samiha Marwan, Ge Gao, Thomas W.
Price. 2020. Step Tutor: Supporting Students through Step-by-Step Example-
Based Feedback. In Proceedings of the 2020 ACM Conference on Innovation
and Technology in Computer Science Education (ITiCSE ’20), June 15–19, 2020,
Trondheim, Norway. ACM, New York, NY, USA, 7 pages.
1 INTRODUCTION
Students get stuck in many ways while programming [ ], leading to frustration [ ]. Ideally, a student can ask for instructor help, but this may be difficult in today's growing CS classrooms [ ], where instructor availability is limited. And the student may see asking for instructor help as a threat to their competence and independence [ ]. To solve this problem, researchers have developed various kinds of automated, adaptive programming feedback to help students [ ]. Like instructor feedback, adaptive feedback is context-dependent and personalized to address students' current code. Unlike instructor feedback, adaptive feedback is also scalable: it can be generated automatically using data-driven approaches (e.g., [ ]). This feedback can reduce teacher workload, and does not pose a social threat to students.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
ITiCSE '20, June 15–19, 2020, Trondheim, Norway
© 2020 Copyright held by the owner/author(s). Publication rights licensed to ACM.
ACM ISBN 978-1-4503-6874-2/20/06...$15.00
Adaptive feedback takes many forms, such as highlighting possibly erroneous code [ ], reporting completed and uncompleted objectives [ ], and correcting likely misconceptions [ ]. Perhaps the most common form of adaptive feedback is on-demand, next-step hints, which suggest a small change a student can make to progress or fix an error [ ]. These hints can help students progress when stuck [ ], and sometimes achieve better learning [ ], especially when students engage in self-explanation of the hint [33].

However, next-step hints also have limitations: they present a single edit, with little context. The hints may be difficult to interpret or fail to address students' immediate goals, leading students to ignore or avoid them [ ]. They do not always lead to learning [ ], and students can abuse the help to complete the problem without effort [ ]. This suggests a need to design improved forms of adaptive feedback that offer context, reflection, and learning.
In this paper, we present Step Tutor, an extension of the Snap! block-based programming environment, which adds on-demand, example-based feedback. The goal of Step Tutor is 1) to help struggling students progress by demonstrating a correct solution step, and 2) to help them learn why the step works and how to apply it in the future. It does so by demonstrating an example step from the current problem, allowing the student to compare and run code representing before and after the step is completed, and reflect on the difference. Step Tutor incorporates prior work on Worked Examples, which are an effective learning support [ ]. It shows examples step-by-step, to support students during problem-solving. By combining the immediate usefulness of next-step hints and the learning effectiveness of Worked Examples, Step Tutor aims to help struggling students progress, and guide them to explore and reflect on its suggested changes, so they can learn to apply them in the future.

To understand how students use Step Tutor, we asked nine undergraduate students from introductory programming classes to use it to solve two programming problems. We conducted interviews with these students after each problem, and analyzed their responses using thematic analysis [ ]. Our results revealed diverse needs and activities students had using the system. They also presented rich data on students' experience with Step Tutor, showing various help-seeking [ ] and sense-making [ ] behaviors that may help students progress and learn.
2 RELATED WORK
Next-step hints and their limitations in promoting learning:
Hints have traditionally been valued in education theory as a tutorial tactic, which provides students with the information needed to progress or prompts them to reflect on their knowledge and problem-solving status [ ]. Adaptive help systems commonly use on-demand, next-step hints that suggest the next step a student should take [ ]. In the domain of programming, next-step hints can help students program more efficiently. For example, undergraduate students spent significantly less time completing Lisp programming tasks with three different forms of next-step hints, compared to those without [ ]. Additionally, in block-based Snap! programming, next-step hints with explanations or self-explanation prompts allowed crowd-workers to complete more programming objectives in the same amount of time than those without hints [ ]. But these next-step hints rarely lead to improved learning, for example, as Rivers found in an evaluation of the ITAP Python tutor [ ]. This missed learning can occur because hints fail to address students' needs [ ], for hints show only one edit at a time, and may lack context and details needed to interpret the suggestion [ ]. Additionally, students often misuse hints (e.g., hint abuse) [ ], allowing them to progress without understanding each step. Aleven et al. reviewed hints from various domains, and concluded that learning from hints only happens occasionally, under certain circumstances, and its effect is small [2].
How do students learn programming strategies? Schön explained the cognitive process of learning procedural knowledge [ ] through reflection-in-action, emphasizing that new knowledge is gained through self-reflection, during which the learner repeatedly questions herself while actively working on and testing the learning material [ ]. This emphasis on self-reflection and activity is echoed by several learning theories, such as the sense-making process highlighted by the KLI framework [ ], and the emphasis on active interaction with the learning material suggested by the ICAP framework [ ]. For a next-step hint to offer students meaningful content to promote sense-making and self-reflection, its "next step" may involve more than one single edit. And for the next-step hint to enable an active learning experience, its feedback window should go beyond just allowing students to passively view the content. These observations indicate ways to improve upon next-step hints: to offer feedback that gives a clear step, and enables students to interact with the feedback and reflect on why and how the step works.
Worked Examples:
Worked Examples (WEs) are a form of instructional support which gives students a demonstration of how to solve a problem [ ]. The effectiveness of WEs is grounded in Cognitive Load Theory, which argues that learners have a finite amount of mental resources during problem-solving (called cognitive load), and when problems impose an unnecessary burden on those resources (intrinsic load), the student has fewer resources left for processing and learning the material (germane load) [ ]. WEs help learning by providing support for "borrowing" knowledge, reducing the unnecessary intrinsic load [ ]. Unlike next-step hints, WEs are traditionally offered in lieu of problem-solving, usually "before" or "after" a student solves another related programming task [ ]. WEs have been implemented in a few programming environments, such as WebEx [ ]. With the help of WEs, students can learn the problem-solving schema [ ] and transfer it to another task [ ]. For example, Trafton et al. evaluated 40 undergraduate students' post-test scores after programming in BATBook, a Lisp programming learning environment, and found that those with alternating WE and problem-solving (PS) pairs performed better than those with only PS pairs [48].
Worked Examples during problem-solving:
WEs are an effective learning support, but students still need help during programming when they are stuck. Looking Glass provides students with annotated WEs from another similar task during block-based programming [ ]. However, learners had difficulties understanding these examples in Looking Glass, encountering "example comprehension hurdles" while trying to connect example code to their own code [25, 26].

Novice students are usually not able to spontaneously transfer knowledge they learn from one problem to another isomorphic problem [ ], so they can benefit more if WEs are offered from the same programming problem. Peer Code Helper offers such step-by-step WEs from the same task, during block-based programming [ ]. An evaluation with 22 high school novice students showed that students using these WEs solved tasks more quickly than those without, without hindering their learning [ ]. The FIT Java Tutor [ ] provides such step-by-step WEs for Java programming. An investigation of five students' programming experience showed that students occasionally followed the feedback and improved their programs over time [ ]. But example steps in these programming problems are non-adaptive [ ], or coarse-grained [ ], and lack the necessary scaffolding for students to make sense of them. In a study evaluating 23 students' experience with step-by-step WEs offered during Java programming, students barely followed the examples, reporting them as "unspecific and misleading" [ ]. Therefore, more work is needed to design new forms of example feedback, to promote reflection through fine-grained example steps, and to enable progression through an interactive experience [13].
3 STEP TUTOR
Step Tutor is an extension of the Snap! programming environment. Its goal is to help students progress and learn when stuck, by teaching them a meaningful step when requested. Step Tutor helps students by showing them a concrete example of how the step could be completed, including both the changes in the code and the corresponding changes in the program's output, and prompting the student to reflect on these differences. In our context, an "example step" should be self-contained, and large enough to meaningfully change the program's output, but small enough to be easily digested. This feedback serves the dual purpose of helping the student: 1) progress when stuck (as with a hint), and also 2) critically engage with and reflect on the example code to learn generalizable programming concepts. We designed Step Tutor to achieve this goal through a feedback window that facilitates comparison, code running, and self-explanation.

Because Step Tutor extends Snap!, instructors can easily integrate it into widely-used Snap! programming curricula (e.g., the Beauty and Joy of Computing [ ]). Although Step Tutor currently
only supports Snap!-based programming, the design and algorithm we used to create Step Tutor and its feedback are language-agnostic.

Figure 1: An example step given by Step Tutor, which includes a Code Comparison Panel, two Click-and-Run Buttons, and a Self-Explanation Prompt.
3.1 The Step Tutor Feedback Window
Consider a student who gets stuck when working on a homework problem in Snap!, perhaps due to a bug in her code, a misconception, or uncertainty about how to proceed. Rather than waiting for office hours or a response on a forum, the student can click the "Show Example" button displayed on the programming interface to ask for help. Step Tutor added this "Show Example" button to the original Snap! interface to remind students about this added option to view a step. It flashes every 90 seconds, inspired by prior work [ ], which suggests that students can become too engaged in solving a challenging problem to notice or act on their own need for help [ ]. When the student clicks the button, she sees the Step Tutor feedback window (Figure 1), which shows a carefully-selected example step (explained in Section 3.2). The feedback window guides the student through learning the step in three ways, designed to promote deliberate comparison and self-reflection, to help students learn the step, and learn how to apply it again in the future [44]:
Comparing and running the code: At the top of the feedback window, the student sees two code snapshots, which together give a meaningful, interpretable step that a student can take to proceed towards the solution, selected by the example selection algorithm in Section 3.2. The left "before" code is similar to the student's code, representing "before" the step is completed, and the right "after" code shows the changes needed to complete the step. The student can inspect the example step by comparing the left and right code. We want to encourage learning the step through comparison, because prior work in programming education suggests that comparison is a powerful way for a student to learn from examples and generalize domain principles [ ]. The student is also encouraged to run the code to understand the step. The click-and-run feature prompts students to actively engage with and reflect on the example code, which is an essential element of active learning defined by the ICAP framework [10].

Figure 2: Three possible solution paths for Task 2, made up of three example steps: (a) MakeRow: drawing a row of triangles; (b) TriangleToHill: changing the triangle to a striped hill; (c) AddColor: adding color for each triangle.
Writing self-explanation: After the student has viewed an example step, she can answer the self-explanation prompt: "In your own words, describe the difference between the two examples". Although answering the prompt takes time, prior work suggests that self-explanation is critical for learning from feedback [ ], and such self-explanation does not emerge spontaneously without carefully-designed prompts [2, 18]. After writing a self-explanation, she can either close the Step Tutor feedback window, or leave it open as a reference while she continues to write code.

When the student continues to make progress, but gets stuck again, she can use the "Show Example" button to ask for more help, seeing a new example step adapted to her current code. Instructors could easily add a limit on total requests to prevent help abuse [ ]. To explore a student's natural use of the system, we did not impose this limit in our study. Additionally, even if a student does abuse help by repeatedly asking for examples, she has still experienced a step-by-step Worked Example, which prior work shows leads to more efficient learning outcomes than solving the problem from scratch [48].
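The request limit mentioned above could be implemented with a simple per-student counter. A minimal sketch, assuming an instructor-configured cap; the `HelpRequestGate` name and interface are hypothetical illustrations, not part of Step Tutor:

```python
# Hypothetical sketch of an instructor-configured cap on example requests.
# Step Tutor imposed no limit in our study; this only illustrates how such
# a limit could gate the "Show Example" button.
class HelpRequestGate:
    def __init__(self, max_requests=None):
        # max_requests=None means unlimited, matching the study setup.
        self.max_requests = max_requests
        self.count = 0

    def request_allowed(self) -> bool:
        """Count and allow the request, unless the cap has been reached."""
        if self.max_requests is not None and self.count >= self.max_requests:
            return False
        self.count += 1
        return True
```

With a cap of, say, two requests per task, a third click would simply not open the feedback window.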
3.2 The Example Selection Algorithm
An important feature of Step Tutor is that it adaptively selects examples, tailoring them to a student's current code. It does so through an example selection algorithm, which searches for an example step. To help students achieve optimal learning, the example step should be one that the student has not completed but is ready to complete with some help – meaning one in the student's Zone of Proximal Development [ ]. Our approach extends our prior work [ ], and consists of the following steps:
An instructor creates a database of example steps: An instructor can first create a set of example steps for a problem, each consisting of "before" and "after" code (as shown in Figure 1), representing one meaningful step in completing the solution, which ideally alters the program's output. In many problems, solution steps can be completed in various orders [ ]. As shown in Figure 2, there are multiple ways to reach the solution. To illustrate each possible path to the solution, it is helpful to author an example pair (i.e., the "before" code and the "after" code) for each transition (i.e., each arrow in Figure 2). While in this work we author these steps by hand, our prior work suggests they can also be generated automatically from student data, matching the quality of expert examples [51].
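The database described above can be pictured as a small set of before/after pairs, one per arrow in the solution graph of Figure 2. A minimal sketch, assuming an `ExampleStep` record of our own design, with placeholder code strings:

```python
# Illustrative data model for an instructor-authored example-step database.
# The field names and placeholder code bodies are our own assumptions.
from dataclasses import dataclass

@dataclass
class ExampleStep:
    name: str             # e.g., "MakeRow", "TriangleToHill", "AddColor"
    before: str           # code snapshot before the step is completed
    after: str            # code snapshot after the step is completed
    completed_steps: int  # solution steps already present in `before`

# One entry per transition (arrow) in Figure 2; code bodies are placeholders.
EXAMPLE_DB = [
    ExampleStep("MakeRow", "<starter code>", "<row of triangles>", 0),
    ExampleStep("TriangleToHill", "<row of triangles>", "<row of hills>", 1),
    ExampleStep("AddColor", "<row of triangles>", "<colored triangles>", 1),
]
```

Because two steps can share the same "before" snapshot (here, TriangleToHill and AddColor), the database naturally encodes the multiple solution orderings shown in Figure 2.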
The algorithm selects "before" and "after" code snapshots: When a student requests an example step, the selection algorithm attempts to select one with "before" code very similar to the student's, so it is easy to understand. The algorithm converts the student's code into an abstract syntax tree (AST), and then calculates the distance between the student's code AST and the "before" code AST of each of the expert examples using the SourceCheck code distance function [ ]. Among the two examples that have the smallest distance to the student's current code, we select the example with the fewest completed steps, to err on the side of giving away less of the solution. The algorithm then attempts to select "after" code that demonstrates a step the student can understand. Since all selected example pairs will complete at least one step beyond the student's code, the algorithm selects the example step with "after" code closest to the student's current code, which is most likely to be the step the student is currently working on.
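The selection order above can be sketched in a few lines. This is an illustration only: the real system compares ASTs with the SourceCheck distance, whereas here a simple token-overlap distance stands in for it, and examples are plain dictionaries:

```python
# Minimal sketch of the selection logic; `distance` is a stand-in for the
# SourceCheck AST distance, not the actual function used by Step Tutor.
from difflib import SequenceMatcher

def distance(code_a: str, code_b: str) -> float:
    """Stand-in similarity metric (0.0 = identical, 1.0 = no overlap)."""
    return 1.0 - SequenceMatcher(None, code_a.split(), code_b.split()).ratio()

def select_example(student_code, examples):
    # 1) Find the two examples whose "before" code is closest to the student's.
    nearest_two = sorted(examples,
                         key=lambda e: distance(student_code, e["before"]))[:2]
    # 2) Of those, pick the one with the fewest completed steps,
    #    to err on the side of giving away less of the solution.
    chosen = min(nearest_two, key=lambda e: e["completed"])
    # 3) Among steps starting from that "before" code, pick the "after"
    #    closest to the student's code -- likely the step being worked on.
    siblings = [e for e in examples if e["before"] == chosen["before"]]
    return min(siblings, key=lambda e: distance(student_code, e["after"]))
```

Under these assumptions, a student whose code matches a step's "before" snapshot is shown that step's "after" code as the demonstration.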
4 METHODS
Before deploying Step Tutor in a classroom, we wanted to collect formative data on how and why students used Step Tutor, to gain a better understanding of the strengths and weaknesses of the system, and whether it is likely to achieve its goals of supporting progress and learning. So, we recruited a small group of students to use Step Tutor while solving two programming problems, and conducted interviews to better understand their experiences.

We recruited nine undergraduate students from two introductory programming courses at a large, public university in the Southeastern United States. In both classes, students had been taught Snap! programming for one to two months. The students included five males and four females, with two identifying as Hispanic/Latino, three as Asian, three as White, and one undisclosed. Each student received a $25 gift card as compensation at the end of the study.
Students completed the procedures one at a time. To start, they read a short tutorial on Snap! to refresh their memories. They were then given up to 30 minutes to complete each of two programming tasks: Stairway and Row of Hills (the latter shown in Figure 2). Both tasks required students to use variables, loops, and nested loops. These tasks were appropriate for our study, because they contain decomposable steps, visual output, and concepts that were beyond what students had learned in their coursework (e.g., three nested loops). The researcher did not offer help to students except to confirm when the programming task was completed. After completing each programming task, the researcher conducted a semi-structured interview to ask about the student's experience in the task. Each of the two interview sessions lasted between five and 15 minutes.
Each interview included a retrospective think-aloud protocol [ ], during which we asked students to watch the video we recorded while they interacted with Step Tutor. While watching the video, they were asked to explain their thoughts and activities. The interviewer also asked about students' general experience with Step Tutor. Since one of the goals of Step Tutor was to provide more helpful feedback than next-step hints, and students had had access to next-step hints [ ] in their classrooms, we also included questions asking students to compare Step Tutor to next-step hints. To encourage impartial feedback, they were not informed that Step Tutor was designed by our research team, and the questions were designed to evoke open-ended responses (e.g., "Here's an example you have requested; how do you feel about it?").
5 RESULTS
Rather than a summative evaluation, we conducted a formative evaluation to understand students' experience with Step Tutor. As a pilot study, we used qualitative analysis to capture the various ways students interacted with Step Tutor. For this initial analysis, we focused on analyzing the interview data, because it offers a comprehensive understanding of students' experience.
5.1 Thematic Analysis
We used thematic analysis to summarize and identify central themes from our interviews [ ]. Using the six-phase thematic analysis method outlined by Braun and Clarke [ ], two researchers each independently read the transcripts thoroughly (Phase 1), open-coded conversation sentences with labels of interest, then met to discuss and refine the codes, producing 125 initial codes (Phase 2). The two researchers then iteratively analyzed and categorized codes to generate themes, i.e., general ideas that emerged in the codes (Phase 3). They then revisited the original data to refine the initial themes into main themes, each including several sub-themes (Phase 4). The two researchers then discussed and defined the themes (Phase 5). We present the results (Phase 6) in Section 5.2.
5.2 Findings
Our thematic analysis revealed three main themes: why I chose to use or not use Step Tutor (WHY); how I used Step Tutor (HOW); and what affordances Step Tutor offers (WHAT) (Table 1). During Phase 4 of our thematic analysis, when we refined initial themes into these three main themes and their sub-themes, we revisited the data and found a meaningful correspondence between the WHY, HOW, WHAT, and the frequency with which students asked for examples. This correspondence revealed three groups of students, with different levels of reliance on examples: a high-use group (P1, P3, P6; 2+ per task), a medium-use group (P4, P8; 0–2 per task), and a low-use group (P2, P5, P7, P9; 0–1 per task). Rather than providing a definitive categorization of student behaviors, we used these groups to draw descriptive connections between the WHY, HOW, and WHAT of each individual student. Table 1 provides the count of students in each group who discussed each theme.
5.2.1 WHY I chose to use or not use Step Tutor? Students' discussions of "WHY" can tell us how we may adjust our design goals to align with students' expectations. Students discussed high-level goals, such as affective and achievement goals, that are not problem-specific, and low-level goals, that describe specific outcomes students want in relation to their current code. Here we focus on presenting high-level goals, since low-level goals align strongly with HOW students used Step Tutor, discussed in Section 5.2.2.
The high-use group includes participants P1, P3, and P6; each was observed to have requested at least two example steps per task.
Step Tutor: Supporting Students through Step-by-Step Example-Based Feedback ITiCSE ’20, June 15–19, 2020, Trondheim, Norway
Themes H (3) M (2) L (4) Total
1. WHY I chose to use or not use Step Tutor
1.1. High-level goals
1.1.1. Progress 3 2 4 9
1.1.2. Assurance 1 0 0 1
1.1.3. Expedience 1 0 0 1
1.1.4. Independence 1 2 4 7
1.2. Low-level goals
1.2.1. Find next step 3 0 0 3
1.2.2. Find how to do a step 1 2 1 4
1.2.3. Fix a problem in my code 1 0 4 5
2. HOW I used Step Tutor
2.1. Run example code 3 1 3 8
2.2. Comparing example code 3 2 2 7
2.3. Write self-explanation 2 1 2 5
2.4. Locate the change 0 0 4 4
2.5. Copy example code 1 0 0 1
3. WHAT aordances Step Tutor oers
3.1. Comparison with hints 3 2 3 8
3.2. Roadmapping
3.2.1. Connect the roadmap 2 0 0 2
3.2.2. Roadmap transfer 1 0 0 1
Table 1: Themes discussed by students from high-use (H),
medium-use (M), and low-use (L) groups.
High-level goals. A primary high-level motivation expressed by all students was the desire to progress in the assignment. When asked why they asked for an example, one student stated "I don't wanna like, keep getting stuck on this one little piece" [P1/H]. This aligns with one of the primary goals of Step Tutor: to help students progress. We also saw other motivations that we had not anticipated. For example, one student noted her desire for assurance: "students like me who are not like pretty confident in programming, having an example makes us feel ... like, this is how you need to do it." [P3/H]. Step Tutor may address this need for assurance with the "before" code snapshot, which allows them to confirm the correctness of what they have already written. Another student's motivation for using Step Tutor was an expedience goal [ ], to reduce their own effort: "coz it's easy I guess, it just shows you what to do?" [P6/H]. These two goals were unique among the high-use group, who may have lower self-efficacy [ ], suggesting that Step Tutor needs to address their particular needs for reassurance, and discourage expedient help use. Students also expressed motivations for not using Step Tutor, such as a desire for independence. As in prior work [ ], students avoided using help to maintain independence: "I figure things out on my own, so I learn more thoroughly." [P8/M]. But unlike prior work, another student noted that Step Tutor actually gave her a sense of having independence, because it allows students to continue working without the need for other forms of help: "even though [Step Tutor is] helping, they are doing it independently, they are doing it by themselves, instead of calling [teaching assistant] every other time." [P3/H].
2. A quotation from participant P1 within the high-use (H) group.
5.2.2 HOW I used Step Tutor? We collected the "HOW" theme through the retrospective think-aloud protocols during the interviews. We found various ways students interacted with Step Tutor, including those we intended (run, compare, self-explain), and some we did not:

Run, compare, and self-explain. Students discussed interleaved activities of running, comparing, and self-explaining the examples, which aligned with our design goals. Students found running the examples to be useful, since it shows "the difference between these two codes" [P1/H], and how it leads to "the difference in the outputs." [P1/H]. Then, seeing the output triggers them to think more: "the output of the examples ... make me wonder for a sec like, why it gives me example [sic] like that." [P1/H]. Students also noted engaging in comparison, not only by running the left and right example snapshots to "see what's going on" [P4/M], but also by comparing their own code with the left or right code snapshot to see if their code "matches up" [P4/M] with the example code. We found students had mixed feelings about writing self-explanations, as in prior work [ ]. Some expressed that self-explanation was distracting: "when I can look at what's going on, and understand, [writing self-explanation] kind of gets in the way." [P5/L]. In contrast, other students expressed that it helped them reflect and think more: "if you write it down as a reflection, ... it would just get into your head that there were these differences and that's what I have to do next." [P9/L]. One student also appreciated the chance for expression, explaining that the self-explanation prompt "gives a place where I can explain what I'm feeling" [P3/H]. The interleaved activities of running, comparing, and reflecting show that our tool offers an active and engaged learning experience, and suggest that we should design the self-explanation prompts carefully, to help students without frustrating them.
Locate the change and copy example code. We noticed the low-use group generally did not run the example code "because I felt like I already knew what [the snapshots] are going to do." [P5/L]. Instead, they used the example code to locate where they needed to update their code. For example, a student who asked for an example to find how to use a "turn" block explained: "I look for whichever one that already have [sic] a turn degrees." [P5/L]. These students also expressed their low-level goals as only using the example steps when they knew what to do next, but were unclear how, or needed to fix a problem with their code. We also observed another behavior: copying the example code, which all three high-use group students employed (based on our observations). These students also expressed all three different low-level goals, including to "find the next step", indicating that they were unclear about what they wanted the program to do [ ]. While two students seemed to have interleaved the code-copying experience with running, comparing, and critically reflecting on the examples, one expressed "when I read the example, I was just copying it." [P6/H], and critically commented that it made him "think a lot less" [P6/H]. While this student seemed to have abused Step Tutor's help, suggesting a limitation of our tool, we discuss the positive impact of example-copying for the other two students below.
5.2.3 WHAT aordances Step Tutor oers? Other than the specic
interactions students experienced with Step Tutor, students also
expressed perceptions of Step Tutor as a whole.
ITiCSE ’20, June 15–19, 2020, Trondheim, Norway
Comparison with hints. During the interviews, among the eight interviewees who recalled using next-step hints in class, five preferred Step Tutor, one preferred hints, and two didn't express a strong preference. Students commented that a "hint is like, no information"[P6/H], and that hints "don't really do anything"[P6/H] because they tell "something I already know"[P1/H]. In contrast, they appreciated the interpretability of Step Tutor: compared to next-step hints, an example step "gives the entire coding [sic]"[P3/H], and tells "how to combine the things I already know"[P1/H]. They also appreciated the demonstration offered by Step Tutor, since it "teaches better"[P6/H] by "[showing] how the stuff runs"[P6/H]. But regarding the amount of information given, three students believed that next-step hints have advantages over Step Tutor. They discussed that a hint "doesn't give me as much of the answer as the example does"[P5/L], so it "makes you think more on your own"[P5/L], indicating that students need more control over how much information they see.
Roadmapping. Unlike the experience with one example step, the "roadmapping" theme describes how the series of example steps together helped students understand the high-level structure of the solution: "I can see how the example [given] is... evolving, from one single square to all these squares and then increase the thickness."[P1/H]. A student in the high-use group expressed that the series of examples in one task (e.g., the Step Tutor tutorial) helped them learn the task structure: "in the example before, it was drawing one circle first, and then... it was drawing many circles..."[P3/H]. She then was able to apply this pattern to a subsequent task which used a similar solution structure: "I was expecting the same thing..."[P3/H], showing an effort to transfer knowledge from one task to another.
Here we compare students' experience with Step Tutor to the help-seeking and learning behaviors highlighted by prior work, and examine whether Step Tutor combines the benefits of promoting progression offered by next-step hints with the benefits of enabling learning and transfer provided by Worked Examples.
Does Step Tutor help students progress? Eight out of nine students successfully completed the tasks within 30 minutes in our study. Like findings on students' help-seeking behaviors with next-step hints [ ], students actively employed help-seeking as a problem-solving strategy to progress when stuck [ ], indicating Step Tutor may offer similar benefits to next-step hints. Unlike challenges observed in students' experience with Worked Examples, such as difficulties in understanding examples [ ], we found many students followed the steps suggested by Step Tutor, which shows Step Tutor offers clear enough information for students to trust the step [38], and use it to progress.
Does Step Tutor help students learn? One concern instructors may have is the "Assistance Dilemma" [ ]: Step Tutor may give away too much information, allowing students to progress, but without understanding how, as in the expedient help-seeking or help-abuse commonly seen in students' interactions with next-step hints [ ]. However, unlike next-step hints that were perceived as not addressing students' needs [ ], students described Step Tutor as offering clear and interpretable information. We also saw that most students, including two of the three students who copied examples, engaged in running, comparing, and self-explaining, showing a deliberate attempt to make sense of the step, which is critical for learning [ ]. Not only did we see reflection on individual examples, but also on the larger problem structure. Even when a student copied all the examples, she was able to connect the series of examples to create a problem-solving schema [ ], and was able to transfer this schema to another task [ ], indicating a learning process through reflection-in-action [43].
We also saw that Step Tutor supports different levels of prior knowledge. When a student is unclear about what to do next, she takes a longer time to interact with an example, and uses various activities to make sense of the example step, such as running, comparing, copying the code, and reflecting on the step. When a student only needs a quick debugging tip to move forward, she simply searches through the example code and finds the change she needs to make. This finding aligns with the expertise reversal effect [ ], indicating that the more expertise one has, the less information she needs when asking for help. We believe the variety of ways of interacting with Step Tutor provides different students with tailored support.
Our user study has several limitations. 1) With only a small sample of students and two short programming tasks, our study may not generalize to other student groups. However, instead of gathering broad data to draw conclusions about the benefits of the system, we conducted an in-depth analysis by closely examining each individual student's experience. 2) As an initial pilot study, our analysis did not measure learning or progress quantitatively. Therefore, we cannot make strong claims about our system's ability to support progress and learning. However, we collected rich data that depicts a variety of experiences and informs us of Step Tutor's affordances. We also discovered limitations in the Step Tutor system itself. While no student reported inherent deficiencies in the system, the different levels of interaction with Step Tutor indicate that students can bring different predispositions (e.g., prior knowledge, goal orientation, programming preference) into their programming experience, and thus should benefit from different levels of programming support. While we were able to offer such personalization through choices of interactions in Step Tutor, we should also create more personalized and flexible forms of feedback for students in the future.
In conclusion, Step Tutor provides a combination of the immediate relevance offered by next-step hints and the learning benefits provided by Worked Examples. Our user study shows the wide variety of interactions students employed with Step Tutor, and also suggests the need to personalize support for each individual student.
This material is based upon work supported by the National Science
Foundation under Grant No. 1917885.
Vincent Aleven. 2013. Help seeking and intelligent tutoring systems: Theoretical
perspectives and a step towards theoretical integration. In International handbook
of metacognition and learning technologies. Springer, 311–335.
Vincent Aleven, Ido Roll, Bruce M McLaren, and Kenneth R Koedinger. 2016.
Help helps, but only so much: Research on help seeking with intelligent tutoring
systems. International Journal of Artificial Intelligence in Education 26, 1 (2016),
Vincent Aleven, Elmar Stahl, Silke Schworm, Frank Fischer, and Raven Wallace. 2003. Help seeking and help design in interactive learning environments. Review of educational research 73, 3 (2003), 277–320.
Computing Research Association et al. 2017. Generation CS: Computer science undergraduate enrollments surge since 2006. Retrieved March 20, 2017.
Robert K Atkinson, Alexander Renkl, and Mary Margaret Merrill. 2003. Transitioning from studying examples to solving problems: Effects of self-explanation prompts and fading worked-out steps. Journal of educational psychology 95, 4
(2003), 774.
Albert Bandura. 1977. Self-efficacy: toward a unifying theory of behavioral
change. Psychological review 84, 2 (1977), 191.
Virginia Braun and Victoria Clarke. 2006. Using thematic analysis in psychology.
Qualitative research in psychology 3, 2 (2006), 77–101.
Peter Brusilovsky. 2001. WebEx: Learning from Examples in a Programming Course. In WebNet, Vol. 1. 124–129.
Ruth Butler. 1998. Determinants of help seeking: Relations between perceived rea-
sons for classroom help-avoidance and help-seeking behaviors in an experimental
context. Journal of Educational Psychology 90, 4 (1998), 630.
Michelene TH Chi and Ruth Wylie. 2014. The ICAP framework: Linking cognitive
engagement to active learning outcomes. Educational psychologist 49, 4 (2014),
Ruth C Clark, Frank Nguyen, and John Sweller. 2011. Efficiency in learning:
Evidence-based guidelines to manage cognitive load. John Wiley & Sons.
Jarno Coenen, Sebastian Gross, and Niels Pinkwart. 2017. Comparison of Feed-
back Strategies for Supporting Programming Learning in Integrated Development
Environments (IDEs). In International Conference on Computer Science, Applied
Mathematics and Applications. Springer, 72–83.
Allan Collins, John Seely Brown, and Susan E Newman. 1988. Cognitive appren-
ticeship: Teaching the craft of reading, writing and mathematics. Thinking: The
Journal of Philosophy for Children 8, 1 (1988), 2–10.
Albert T Corbett and John R Anderson. 2001. Locus of feedback control in
computer-based tutoring: Impact on learning rate, achievement and attitudes.
In Proceedings of the SIGCHI conference on Human factors in computing systems.
ACM, 245–252.
Bob Edmison, Stephen H. Edwards, and Manuel A. Pérez-Quiñones. 2017. Using
Spectrum-Based Fault Location and Heatmaps to Express Debugging Suggestions
to Student Programmers (ACE ’17). Association for Computing Machinery, New
York, NY, USA, 48–54.
Dan Garcia, Brian Harvey, and Tiffany Barnes. 2015. The beauty and joy of
computing. ACM Inroads 6, 4 (2015), 71–79.
Dedre Gentner, Jeffrey Loewenstein, and Leigh Thompson. 2003. Learning and
transfer: A general role for analogical encoding. Journal of Educational Psychology
95, 2 (2003), 393.
Peter Gerjets, Katharina Scheiter, and Richard Catrambone. 2004. Designing
instructional examples to reduce intrinsic cognitive load: Molar versus modular
presentation of solution procedures. Instructional Science 32, 1-2 (2004), 33–58.
Mary L Gick and Keith J Holyoak. 1983. Schema induction and analogical transfer.
Cognitive psychology 15, 1 (1983), 1–38.
Sebastian Gross, Bassam Mokbel, Benjamin Paassen, Barbara Hammer, and Niels
Pinkwart. 2014. Example-based feedback provision using structured solution
spaces. International Journal of Learning Technology 9, 3 (2014), 248–280.
Sebastian Gross and Niels Pinkwart. 2015. Towards an integrative learning
environment for java programming. In 2015 IEEE 15th International Conference
on Advanced Learning Technologies. IEEE, 24–28.
Luke Gusukuma, Austin Cory Bart, Dennis Kafura, and Jeremy Ernst. 2018.
Misconception-driven feedback: Results from an experimental study. In Proceed-
ings of the 2018 ACM Conference on International Computing Education Research.
ACM, 160–168.
Stuart Hansen and Erica Eddy. 2007. Engagement and frustration in programming
projects. ACM SIGCSE Bulletin 39, 1 (2007), 271–275.
Gregory Hume, Joel Michael, Allen Rovick, and Martha Evens. 1996. Hinting as
a tactic in one-on-one tutoring. The Journal of the Learning Sciences 5, 1 (1996),
Michelle Ichinco, Kyle J Harms, and Caitlin Kelleher. 2017. Towards Understanding Successful Novice Example Use in Blocks-Based Programming. Journal of Visual Languages and Sentient Systems 3 (2017), 101–118.
Michelle Ichinco and Caitlin Kelleher. 2015. Exploring novice programmer ex-
ample use. In 2015 IEEE Symposium on Visual Languages and Human-Centric
Computing (VL/HCC). IEEE, 63–71.
[27] Slava Kalyuga. 2009. The expertise reversal effect. IGI Global.
Stuart A Karabenick. 2004. Perceived achievement goal structure and college
student help seeking. Journal of educational psychology 96, 3 (2004), 569.
[29] Andrew J Ko, Brad A Myers, and Htet Htet Aung. 2004. Six learning barriers in
end-user programming systems. In 2004 IEEE Symposium on Visual Languages-
Human Centric Computing. IEEE, 199–206.
Kenneth R Koedinger and Vincent Aleven. 2007. Exploring the assistance dilemma
in experiments with cognitive tutors. Educational Psychology Review 19, 3 (2007),
Kenneth R Koedinger, Albert T Corbett, and Charles Perfetti. 2012. The
Knowledge-Learning-Instruction framework: Bridging the science-practice
chasm to enhance robust student learning. Cognitive science 36, 5 (2012), 757–798.
Hannu Kuusela and Paul Pallab. 2000. A comparison of concurrent and retrospec-
tive verbal protocol analysis. The American journal of psychology 113, 3 (2000),
Samiha Marwan, Joseph Jay Williams, and Thomas Price. 2019. An Evaluation of
the Impact of Automated Programming Hints on Performance and Learning. In
Proceedings of the 2019 ACM Conference on International Computing Education
Research. 61–70.
Samiha Marwan, Nicholas Lytle, Joseph Jay Williams, and Thomas Price. 2019.
The Impact of Adding Textual Explanations to Next-step Hints in a Novice Pro-
gramming Environment. In Proceedings of the 2019 ACM Conference on Innovation
and Technology in Computer Science Education. 520–526.
Elizabeth Patitsas, Michelle Craig, and Steve Easterbrook. 2013. Comparing and
contrasting dierent algorithms leads to increased student learning. In Proceed-
ings of the ninth annual international ACM conference on International computing
education research. ACM, 145–152.
Thomas Price, Rui Zhi, and Tiffany Barnes. 2017. Evaluation of a Data-driven
Feedback Algorithm for Open-ended Programming. International Educational
Data Mining Society (2017).
Thomas W Price, Yihuan Dong, and Dragan Lipovac. 2017. iSnap: towards
intelligent tutoring in novice programming environments. In Proceedings of the
2017 ACM SIGCSE Technical Symposium on Computer Science Education (SIGCSE
’17). ACM, 483–488.
Thomas W Price, Zhongxiu Liu, Veronica Cateté, and Tiffany Barnes. 2017. Factors Influencing Students' Help-Seeking Behavior while Programming with
Human and Computer Tutors. In Proceedings of the 2017 ACM Conference on
International Computing Education Research. ACM, 127–135.
Thomas W Price, Rui Zhi, and Tiffany Barnes. 2017. Hint generation under uncertainty: The effect of hint quality on help-seeking behavior. In International Conference on Artificial Intelligence in Education. Springer, 311–322.
Bethany Rittle-Johnson and Jon R Star. 2007. Does comparing solution methods
facilitate conceptual and procedural knowledge? An experimental study on
learning to solve equations. Journal of Educational Psychology 99, 3 (2007), 561.
Kelly Rivers. 2017. Automated Data-Driven Hint Generation for Learning Pro-
gramming. (2017).
Ido Roll, Ryan SJ d Baker, Vincent Aleven, and Kenneth R Koedinger. 2014. On the
benets of seeking (and avoiding) help in online problem-solving environments.
Journal of the Learning Sciences 23, 4 (2014), 537–560.
Donald A Schön. 1987. Teaching artistry through reflection in action: Educating the reflective practitioner. (1987).
Benjamin Shih, Kenneth R Koedinger, and Richard Scheines. 2011. A response
time model for bottom-out hints as worked examples. Handbook of educational
data mining (2011), 201–212.
John Sweller. 1988. Cognitive load during problem solving: Effects on learning. Cognitive science 12, 2 (1988), 257–285.
John Sweller. 2006. The worked example effect and human cognition. Learning and instruction (2006).
Thomas W. Price, Joseph Jay Williams, Jaemarie Solyst, and Samiha Marwan. 2020. Engaging Students with Instructor Solutions in Online Programming Homework. In Proceedings of the 2020 ACM Conference on Human Factors in Computing Systems (ACM CHI '20).
John Gregory Trafton and Brian J Reiser. 1994. The contributions of studying
examples and solving problems to skill acquisition. Ph.D. Dissertation. Citeseer.
Lev Vygotsky. 1978. Interaction between learning and development. Readings on
the development of children 23, 3 (1978), 34–41.
Wengran Wang, Rui Zhi, Alexandra Milliken, Nicholas Lytle, and Thomas W.
Price. 2020. Crescendo: Engaging Students to Self-Paced Programming Practices
(SIGCSE ’20). ACM, 859–865.
Rui Zhi, Samiha Marwan, Yihuan Dong, Nicholas Lytle, Thomas W Price, and
Tiany Barnes. 2019. Toward Data-Driven Example Feedback for Novice Pro-
gramming. Proceedings of the International Conference on Educational Data Mining
(2019), 218–227.
Rui Zhi, Thomas W Price, Nicholas Lytle, Yihuan Dong, and Tiffany Barnes.
2018. Reducing the State Space of Programming Problems through Data-Driven
Feature Detection. In Educational Data Mining in Computer Science Education
(CSEDM) Workshop@ EDM.
Rui Zhi, Thomas W Price, Samiha Marwan, Alexandra Milliken, Tiffany Barnes,
and Min Chi. 2019. Exploring the Impact of Worked Examples in a Novice
Programming Environment. In Proceedings of the 50th ACM Technical Symposium
on Computer Science Education. ACM, 98–104.