Using learning analytics to assess students’ behavior in
open-ended programming tasks
Paulo Blikstein
Transformative Learning Technologies Lab
Stanford University School of Education and (by courtesy) Computer Science.
520 Galvez Mall, CERAS 232, Stanford, CA, 94305
paulob@stanford.edu
ABSTRACT
There is great interest in assessing student learning in unscripted,
open-ended environments, but students’ work can evolve in ways
that are too subtle or too complex to be detected by the human
eye. In this paper, I describe an automated technique to assess,
analyze, and visualize students’ learning of computer programming. I
logged hundreds of snapshots of students’ code during a
programming assignment, and I employed different quantitative
techniques to extract students’ behaviors and categorize them in
terms of programming experience. First, I review the literature on
educational data mining, learning analytics, computer vision
applied to assessment, and emotion detection; I then discuss the
relevance of the work and describe one case study with a group of
undergraduate engineering students.
Categories and Subject Descriptors
K.3.2 [Computer and Information Science Education]:
Computer Science Education.
General Terms
Algorithms, Measurement, Performance, Language.
Keywords
Learning Analytics, Educational Data Mining, Logging,
Automated Assessment, Constructionism.
1. INTRODUCTION
Researchers are unanimous in stating that we need to teach the so-called “21st century skills”: creativity, innovation, critical
thinking, problem solving, communication, and collaboration.
None of those skills are easily measured using current assessment
techniques, such as multiple choice tests, open items, or
portfolios. As a result, schools are paralyzed by the push to teach
new skills, and the lack of reliable ways to assess them. One of
the difficulties is that current assessment instruments are based on
products (an exam, a project, a portfolio), and not on processes
(the actual cognitive and intellectual development while
performing a learning activity), due to the intrinsic difficulties in
capturing detailed process data for large numbers of students.
However, new data collection, sensing, and data mining
technologies are making it possible to capture and analyze
massive amounts of data in all fields of human activity. These
sources include logs of email and web servers, computer
activity capture, wearable cameras, wearable sensors, biosensors
(e.g., skin conductivity, heartbeat, brain waves), and eye-tracking,
analyzed with techniques such as machine learning and text mining. Such
techniques are enabling researchers to have an unprecedented
insight into the minute-by-minute development of several
activities. In this paper, we propose that such techniques could be
used to evaluate some cognitive strategies and abilities, especially
in learning environments where the outcome is unpredictable such
as a robotics project or a computer program.
In this work, we focused on students learning to program a
computer using the NetLogo language. Hundreds of snapshots for
each student were captured, filtered, and analyzed. I will describe
some prototypical coding trajectories and discuss how they relate
to students’ programming experience, as well as the implication
for the teaching and learning of computer programming.
2. PREVIOUS WORK
Two examples of the current attempts to use artificial intelligence
techniques to assess human learning are text analysis and emotion
detection. The work of Rus et al. [13], for example, makes
extensive use of text analytics within a computer-based
application for learning about complex phenomena in science.
Students were asked to write short paragraphs about scientific
phenomena – Rus et al. then explored which machine learning
algorithm would enable them to most accurately classify each
student in terms of their content knowledge, based on
comparisons with expert-formulated responses. However, some
authors have tried to use even less intrusive technologies; for
example, speech analysis further removes the student from the
traditional assessment setting by allowing them to demonstrate
fluency in a more natural setting. Beck and Sison [4] have
demonstrated a method for using speech recognition to assess
reading proficiency in a study with elementary school students
that combines speech recognition with knowledge tracing (a form
of probabilistic monitoring.)
The second area of work is the detection of emotional states using
non-invasive techniques. Understanding student sentiment is an
important element in constructing a holistic picture of student
progress, and it also helps enable computer-based systems to
interact with students in emotionally supportive ways. Using the
Facial Action Coding System (FACS), researchers have been able
to develop a method for recognizing student affective state by
simply observing and (manually) coding their facial expressions
and applying machine learning to the data produced [11].
Researchers have also used conversational cues to detect students’
emotional states. Similar to the FACS study, Craig et al. designed
an application that could use spoken dialogue to recognize the
states of boredom, frustration, flow, and confusion. They were
able to establish the validity of their findings through comparison to
emote-aloud activities (a derivative of talk-aloud where
participants describe their emotions as they feel them) while
students interacted with AutoTutor.
Even though researchers have been trying to use all these artificial
intelligence techniques for assessing students’ formal knowledge
and emotional states, the field is currently benefiting from three
important additions: 1) detailed, multimodal student activity data
(gestures, sketches, actions) as a primary component of analysis,
2) automation of data capture and analysis, 3) multidimensional
data collection and analysis. This work is now coalescing into the
nascent field of Learning Analytics or Educational Data Mining
[1, 3], and has been used in many contexts to measure students’
learning and affect. However, in Baker and Yacef’s review of its
current uses, the majority of the work is focused on cognitive
tutors or semi-scripted environments [2]. Open-ended tasks and
unscripted learning environments have only been in the reach of
qualitative, human-coded methods. However, qualitative
approaches present some crucial shortcomings: (1) there is no
persistent trace of the evolution of the students’ artifacts
(computer code, robots, etc.), (2) crucial learning moments within
a project can last only seconds, and are easy to miss with
traditional data collection techniques (i.e., field notes or video
analysis), and (3) such methodologies are hard to scale for large
groups or extended periods of time. The cost of recording,
transcribing, and analyzing data is a known limiting factor for
qualitative researchers.
At the same time, most previous work on EDM has been used
to assess specific and limited tasks – but the “21st century skills”
we need to assess now are much more complex, such as creativity,
the ability to find solutions to ill-structured problems and navigate
in environments with sparse information, as well as dealing with
uncertainty. Unscripted learning environments are well-known for
being challenging to measure and assess, but recent advances in
both data collection and machine learning could make it possible to
understand students’ trajectories in these environments.
For example, researchers have attempted to automate the
collection of action data, such as gesture and emotion. Weinland
et al. [15] and Yilmaz et al. [17] were able to detect basic human
actions related to movement. Craig et al. [10] created a system for
automatic detection of facial expressions (the FACS study). The
technique that Craig et al. validated is a highly non-invasive
mechanism for realizing student sentiment, and can be coupled
with computer vision technology and biosensors to enable
machines to automatically detect changes in emotional state or
cognitive-affect.
Another area of active development is speech and text mining.
Researchers have combined natural language processing and
machine learning to analyze student discussions and writing,
leveraging Independent Component Analysis of student
conversations – a technique whose validity has been repeatedly
reproduced. The derived text is subsequently analyzed using
Latent Semantic Analysis [13]. Given the right training and
language model, LSA can give a clearer picture of each student’s
knowledge development throughout the course of the learning
activity.
In the realm of exploratory learning environments, Amershi and
Conati [1] and Bernardini and Conati [6] built student models combining
supervised and unsupervised classification, both with log files and
eye-tracking, and showed that meaningful events could be
detected with the combined data. Montalvo et al. [11], also using a
combination of automated and semi-automated real-time coding,
showed that they could identify meaningful meta-cognitive
planning processes when students were conducting experiments in
an online virtual lab environment.
However, most of these studies did not involve the creation of
completely open-ended artifacts, with almost unlimited degrees of
freedom. Even though the work around these environments is
incipient, some attempts have been made (see [7, 8]). Another such
example is the work of Berland & Martin [5], who, by logging
data, found that novice students developed successful program
code by following one of two progressions: planner and tinkerer.
Planners found success by carefully structuring programs over
time, and tinkerers found success by accreting programs over
time. In their study, students were generally unsuccessful if they
didn't follow one of those paths.
In this paper, I will present one exploratory case study on the
possibility of using learning analytics and educational data mining
to inspect students’ behavior and learning in project-based,
unscripted, constructionist [12] learning environments, in which
traditional assessment methods might not capture students’
evolution. My goal is to establish a proof of existence that
automatically-generated logs of students programming can be
used to infer patterns in how students go about programming, and
that by inspecting those patterns we could design better support
materials and strategies, as well as detect critical points in the
writing of software in which human assistance would be more
needed. Since my data relies on just nine subjects, I don’t make
claims of statistical significance, but the data points present
meaningful qualitative distinctions between students.
3. METHODS AND DATASET
To collect the programming logs, I employed the NetLogo [16]
programming environment. NetLogo can log to an XML file all
users’ actions, such as key presses, button clicks, changes in
variables and, most importantly, changes in the code. I developed
techniques and custom tools to automatically store, filter, and
analyze snapshots of the code generated by students.
The logging module uses a special configuration file, which
specifies which actions are to be logged. This file was distributed
to students along with instructions about how to enable
logging, collect the logfiles, and send those files back for analysis.
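As a sketch of what processing such a log looks like, the snippet below parses a toy XML log and keeps only the code-change events. The element and attribute names here are illustrative assumptions, not NetLogo’s actual logging schema:

```python
import xml.etree.ElementTree as ET

# Toy excerpt of a logfile; element/attribute names are illustrative,
# not NetLogo's actual logging schema.
LOG = """<events>
  <event type="button" timestamp="10"/>
  <event type="globals" timestamp="12"/>
  <event type="code" timestamp="15">to setup clear-all end</event>
  <event type="code" timestamp="90">to setup clear-all reset-ticks end</event>
</events>"""

def code_snapshots(xml_text):
    """Return (timestamp, code) pairs for code-change events only."""
    root = ET.fromstring(xml_text)
    return [(int(e.get("timestamp")), e.text)
            for e in root.iter("event") if e.get("type") == "code"]

snaps = code_snapshots(LOG)
print(len(snaps))   # 2 code snapshots out of 4 logged events
```

Filtering to code-change events early keeps the downstream analysis focused on the evolution of the program text rather than interface activity.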
Nine students in a sophomore-level engineering class had a 3-
week programming assignment. The task was to write a computer
program to model a scientific phenomenon of their choice.
Students had the assistance of a ‘programming’ teaching
assistant, following the normal class structure. The teaching
assistant was available for about 3-4 hours a week for each
student, and an individual, 1-hour programming tutorial session
was conducted with each of the students on the first week of the
study.
A total of 158 logfiles were collected. Using a combination of XQuery and
regular expression processors (such as ‘grep’), the files were
processed, parsed, and analyzed (1.5 GB and 18 million lines of
uncompressed text files). Below is a summary of the collected
data (in this order): total number of events logged, total number of
non-code events (e.g., variable changes, button presses), percent
of non-code events, and actual coding snapshots.
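The derived columns of that summary (coding snapshots and percentage of non-code events) follow directly from the raw counts; a minimal sketch, using the figures for two of the students in Table 1:

```python
# Recompute the derived columns of Table 1 from the raw event counts.
def summarize(total_events, non_code_events):
    code_snapshots = total_events - non_code_events          # coding snapshots
    pct_non_code = round(100.0 * non_code_events / total_events, 1)
    return code_snapshots, pct_non_code

# Paul: 218 events, 15 of them non-code
print(summarize(218, 15))     # (203, 6.9)
# Che: 5970 events, 928 of them non-code
print(summarize(5970, 928))   # (5042, 15.5)
```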
Table 1. Number of events collected per student

Name     Events    Non-code   Non-code %    Code
Chuck    258036      257675        99.9%     361
Che        5970         928        15.5%    5042
Leah       2836         525        18.5%    2311
Liam    4044723     4041123        99.9%    3600
Leen     253112      241827        95.5%   11285
Luca      92631       86708        93.6%    5923
Nema       3690         649        17.6%    3041
Paul        218          15         6.9%     203
Shana   4165657     4159327        99.8%    6330
Total   8826873     8788777        99.6%   38096

The overwhelming majority of events collected were non-coding
events, such as variable changes, buttons pressed, and clicks.
These particular kinds of events take place when students are
running or testing models: every single variable change gets
recorded, which accounts for the very large number of events
(almost 9 million). Since the analysis of students’ interaction with
models is out of the scope of this paper, all non-coding events
were filtered out from the main dataset, so we were left with 1187
events for 9 users.

For further data analysis, a combination of techniques was used.
First, I developed a series of Mathematica scripts to count
meaningful events within the dataset, such as number of
characters, keywords used, code compilations, and types of error
messages. Then, I used the resulting plots to look at particular
snapshots where seemingly atypical coding activity took place,
plateaus, and sharp decreases or increases. To examine the
snapshots, I developed a custom software tool, the “Event
Navigator” (Figure 1). The software enables researchers to go
back and forth in time, “frame-by-frame,” tracking students’
progression and measuring statistical data.

Figure 1. Screenshot of the Event Navigator software, which
allows researchers to go back and forth in time, tracking how
students created a computer program.

4. DATA ANALYSIS
For the analysis, I will first focus on one student and conduct an
in-depth exploration of her coding strategies. Then, I will compare
her work with other students, and show how differences in
previous ability and experience might have determined their
performance.

4.1 Coding strategies
4.1.1 Luca
Luca is a sophomore engineering student and built a scientific
model in her domain area. She had modest previous experience
with computers, and her grade in the class was also around the
average, which makes her a good example for an in-depth
analysis of her log files.

Figure 2 is a visualization of Luca’s model building logs. The
continuous (red) curve represents the number of characters in her
code, the (blue) dots (mostly) underneath the curve represent the
time between two code compilations (secondary y-axis to the
right), (green) dots placed at y=1800 represent successful
compilations, and (orange) dots placed at y=1200 represent
unsuccessful compilations (the y coordinates for those two data
series are arbitrary and were chosen just for visualization
purposes). In the following paragraphs, I will analyze each of the
6 regions of the plot. The analysis was done by looking at the
overall increase in character count (Figure 2), and then using the
Event Navigator tool (Figure 1) to locate the exact point in time
when the events happened.

Figure 2. Code size, time between compilations, and errors,
for Luca’s logfiles

The following are the main coding events for Luca:
1. Luca started with one of the exemplar programs seen in the
tutorial. In less than a minute, she deleted the unnecessary code
and ended up with a skeleton of a new program (see the big
drop in point A).
2. She spent the next half-hour building her first procedure.
During this time, between A and B, she had numerous
unsuccessful compilations (see orange dots), and the code goes
from 200 to 600 characters.
3. The size of the code remains stable for 12 minutes (point B),
until there is a sudden jump from 600 to 900 characters (just
before point C). This jump corresponds to Luca copying and
pasting her own code: she duplicated her first procedure as a
basis for a second one. During this period, also, she opens
many of the sample programs within NetLogo.
4. Luca spends some time making her new duplicate procedure
work. The frequency of compilation decreases (see the density
of orange and green dots), the average time per compilation
increases, and again we see a plateau for about one hour
(point D).
5. After one hour in the plateau, there is another sudden increase
in code size, from 900 to 1300 characters (between D and E).
Actually, what Luca did was to open a sample program and
copy a procedure that generated something she needed for her
model. Note that code compilations are even less frequent.
6. After making the “recycled” code work, Luca got to her final
1200 characters of code. She then spent about 20 minutes
“beautifying” the code, fixing the indentation, changing names
of variables, etc. No real changes in the code took place, and
there are no incorrect compilations.

Luca’s narrative suggests, thus, four prototypical coding events:
1. Stripping down an existing program as a starting point.
2. Long plateaus of no coding activity, during which students
browse other sample programs (or their own code) for useful
pieces.
3. Sudden jumps in character count, when students import code
from other programs, or copy and paste code from within their
working program.
4. A final phase in which students fix the formatting of the code,
indentation, variable names, etc.

4.1.2 Shana, Liam, Leen, and Che
Using the character count time series, it is possible to examine the
logfiles from other students in search of similarities. In the
following, I show plots from four different students (Luca, Che,
Leen, and Shana, Figure 3), which include all of their activities,
including opening and manipulating other models (note that the
plot in Figure 2 did not show all of Luca’s activities, but only
activities within her model, i.e., excluding opening and
manipulating other models).

Figure 3. Code size versus time for four students: Shana,
Luca, Che, and Leen. The spikes show moments in which
students opened sample code.

First, let’s examine Shana’s logfiles. After many spikes and
almost no change in the baseline character count, there is a sudden
jump (at time=75) from about 200 to 4,000 characters of code. A
closer, systematic examination revealed that Shana employed a
different approach than Luca. After some attempts to incorporate
the code of other programs into her own (the spikes), she gave up
and decided to do the opposite: start from a ready-made program
and add her code to it. She then chose a very well-established
sample program (provided as part of the initial tutorial) and built
hers on top of it. The sudden jump to 4,000 characters indicates
the moment when she loaded the sample program and started to
make it ‘her own’ by adding procedures. She seamlessly
integrated the pre-existing code into her new one, adding
significant new features.

Leen, on the other hand, had yet another coding style. He did open
other sample programs for inspiration or cues, but did not copy
and paste code. Instead, he built his procedures in small
increments by trial-and-error. In Table 2 we can observe how he
coded a procedure to “sprout” a variable number of white screen
elements (the action lasted 30 minutes). The changes in the code
(“diffs”) are indicated with the (red) greyed-out code.
Table 2. Leen’s attempts to write a procedure

Initial code:

    to Insert-Vacancies
      sprout 2
      [ set breed vacancies
        set color white ] ]
    end

Ask patches is introduced, and then one-of:

    to Insert-Vacancies
      ask patches
      [ sprout 2
        [ set breed vacancies
          set color white ] ]
    end

    to Insert-Vacancies
      ask one-of patches
      [ sprout 2
        [ set breed vacancies
          set color white ] ]
    end

Leen experiments with different numbers of patches (1, 5, 3):

    to Insert-Vacancies
      ask one-of patches
      [ sprout 1
        [ set breed vacancies
          set color white ] ]
    end

    to Insert-Vacancies
      ask 5 patches
      [ sprout 2
        [ set breed vacancies
          set color white ] ]
    end

Tries patches-from and then introduces a loop:

    to Insert-Vacancies
      ask patches-from
      [ sprout 2
        [ set breed vacancies
          set color white ] ]
    end

    to Insert-Vacancies
      loop
      [ ask one-of patches
        [ sprout 2
          [ set breed vacancies
            set color white ] ] ]
    end

Tries another loop approach, with a while command:

    to Insert-Vacancies
      while
      [ ask one-of patches
        [ sprout 2
          [ set breed vacancies
            set color white ] ] ]
    end

    to Insert-Vacancies
      while n < Number-of-Vacancies
      [ ask one-of patches
        [ sprout 2
          [ set breed vacancies
            set color white ] ] ]
    end

Gives up looping, tries a fixed number of patches:

    to Insert-Vacancies
      ask one-of patches
      [ sprout 2
        [ set breed vacancies
          set color white ] ]
    end

    to Insert-Vacancies
      ask 35 of patches
      [ sprout 2
        [ set breed vacancies
          set color white ] ]
    end

Gives up a fixed number, creates a slider, and introduces n-of:

    to Insert-Vacancies
      ask n-of Number-of-Vacancies patches
      [ sprout 2
        [ set breed vacancies
          set color white ] ]
    end
Leen’s trial-and-error method had an underlying pattern: he went
from simpler to more complex structures. For example, he first
attempts a fixed, “hardcoded” number of events (using the sprout
command), then introduces control structures (loop, while) to
generate a variable number of events, and finally introduces new
interface widgets to give the user control over the number of
events. Leen reported having a high familiarity with programming
languages (compared to Luca and Shana), which might explain his
different coding style. He seemed to be much more confident
generating code from scratch instead of opening other sample
programs to get inspiration or import code.
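The snapshot-to-snapshot “diffs” of Table 2 can be recovered with a standard diff routine; a minimal sketch using Python’s difflib (a stand-in for the tooling actually used), applied to two of Leen’s attempts:

```python
import difflib

# Two consecutive code snapshots from Leen's logs (see Table 2).
before = """to Insert-Vacancies
  ask patches
  [ sprout 2 [ set breed vacancies set color white ] ]
end"""
after = """to Insert-Vacancies
  ask one-of patches
  [ sprout 2 [ set breed vacancies set color white ] ]
end"""

# Keep only the removed ("- ") and added ("+ ") lines, dropping the
# unchanged context and the "? " hint lines that ndiff emits.
changed = [line[2:] for line in difflib.ndiff(before.splitlines(),
                                              after.splitlines())
           if line.startswith(("+ ", "- "))]
print(changed)   # ['  ask patches', '  ask one-of patches']
```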
Che, with few exceptions, did not open other models during
model building. Like Leen, he employed an
incremental, trial-and-error approach, but we can clearly detect
many more long plateaus in his graph. Therefore, based on these
logfiles, seven canonical coding strategies can be inferred:
1. Stripping down an existing program as a starting point.
2. Starting from a ready-made program and adding one’s own
procedures.
3. Long plateaus of no coding activity, during which students
browse other sample programs (or their own) for useful code.
4. Long plateaus of no coding activity, during which students
think of solutions without browsing other programs.
5. Period of linear growth in the code size, during which
students employ a trial-and-error strategy to get the code right.
6. Sudden jumps in character count, when students import code
from other programs, or copy and paste code from within their
working program.
7. A final phase in which students fix the formatting of the
code, indentation, variable names, etc.
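Several of these strategies leave direct traces in the character-count series, so they can be flagged mechanically. A minimal sketch that labels plateaus (strategies 3 and 4) and sudden jumps (strategy 6), with illustrative thresholds rather than the ones used in the study:

```python
# Label plateaus (long runs with no change) and sudden jumps (large growth
# between snapshots) in a character-count time series. The thresholds are
# illustrative, not taken from the study.
def label_events(char_counts, jump=200, plateau_len=5):
    events = []
    flat = 0
    for i in range(1, len(char_counts)):
        delta = char_counts[i] - char_counts[i - 1]
        if delta >= jump:
            events.append(("jump", i))       # e.g. imported / pasted code
        if delta == 0:
            flat += 1
            if flat == plateau_len:          # start index of the flat run
                events.append(("plateau", i - plateau_len + 1))
        else:
            flat = 0
    return events

series = [200, 210, 210, 210, 210, 210, 210, 600, 640]
print(label_events(series))   # [('plateau', 2), ('jump', 7)]
```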
Based on those strategies, and the previous programming
knowledge of students determined from questionnaires, the data
suggest three coding profiles:
“Copy and pasters:” more frequent use of strategies 1, 2, 3, 6, and 7.
Mixed-mode: a combination of strategies 3, 4, 5, and 7.
“Self-sufficients:” more frequent use of strategies 4 and 5.
The empirical verification of these canonical coding strategies and
coding profiles has important implications for design, in
particular, for learning environments in which students engage in
project-based learning. Each coding strategy and profile might demand
different support strategies. For example, students with more
advanced programming skills (many of whom exhibited the
“self-sufficient” behavior) might require detailed and easy-to-find
language documentation, whereas “copy and pasters” need more
working examples with transportable code. In fact, it could be that
more expert programmers find it enjoyable to figure out the solutions
themselves, and would dislike being helped when they are
problem-solving. Novices, on the other hand, might welcome
some help, since they exhibited a much more active help-seeking
behavior. The data suggest that students are in fact relatively
autonomous in developing apt strategies for their own expertise
level, and remain consistent. Therefore, echoing previous work
on epistemological pluralism, the data suggest that it would be
beneficial for designers to provide multiple forms of support to
cater to each style (see, for example, [14]).
4.1.3 Code compilation
Despite these differences, one behavior seemed to be rather
similar across students: the frequency of code compilation. Figure
4 shows the moving average of unsuccessful compilations (thus,
the error rate) versus time, i.e., the number of unsuccessful
compilations within one period (the moving average period was
10% of the overall duration of the logfile; if there were 600
compilation attempts, the moving average period would be 60
attempts). The higher the value, the higher the number of
unsuccessful compilations within one period.

Figure 4. Error rate versus compilation attempts (time)

For all four students, after we eliminate the somewhat noisy first
instants, the error rate curve follows an inverse parabolic shape. It
starts very low, reaches a peak halfway through the project, and
then decreases, reaching values close to zero. Also, the (blue) dots
on top of y=0 (correct compilations) and y=1 (incorrect
compilations) indicate the actual compilation attempts. Most of
them are concentrated in the first half of the activity,
approximately 2/3 in the first half to 1/3 in the second half. This
confirms the data from the previous logfile analysis, in which we
observed that the process of learning to program and generating
code is not homogeneous and simple, but complex and comprised
of several different phases. In the case of code compilations, we
can distinguish three distinct segments: an initial exploration
characterized by few unsuccessful compilations, followed by a
phase with intense code evolution and many compilation
attempts, and a final phase of final touches and compilation fixes,
with a lower error rate.

5. CONCLUSION
This paper is an initial step towards developing metrics
(compilation frequency, code size, code evolution pattern,
frequency of correct/incorrect compilations, etc.) that could both
serve as formative assessment tools and as pattern-finding lenses
into students’ free-form explorations in technology-rich learning
environments.

The frequency of code compilations, together with the code size
plots previously analyzed, enables us to trace a reasonable
approximation of each prototypical coding profile and style. Such
an analysis has important implications for the design of
project-based learning environments.

First, to design and allocate support resources, moments of
greater difficulty in the programming process should be
identified. The data indicate that those moments happen mid-way
through the project and not towards the end, as I initially
suspected (given the deadline crunch anecdotally reported by
many students). The proposed metrics can be calculated during
the programming assignment and not only at the end, so
instructors and facilitators could monitor students in real time and
offer help only when the system indicates that students are in a
critical zone. These zones might be detected when, for example,
several incorrect compilations occur with few changes in
character count, an atypical error rate curve is identified, or the
code is changing in size too much for a long period of time.

Second, support materials and strategies need to be designed to
cater to diverse coding styles and profiles. A “self-sufficient”
coder might not need too many examples, but will appreciate
good command reference. Similarly, novices might benefit more
from well-documented, easy to find examples with easy-to-adapt
code.
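The error-rate metric of Section 4.1.3 (a moving average of unsuccessful compilations, with the window set to 10% of the attempts) can be sketched as follows; the sample outcome sequence is illustrative:

```python
# Moving average of unsuccessful compilations (1 = failed, 0 = successful),
# with the window set to 10% of the number of compilation attempts.
def error_rate(outcomes):
    window = max(1, len(outcomes) // 10)   # 10% of the attempts
    return [sum(outcomes[i:i + window]) / window
            for i in range(len(outcomes) - window + 1)]

# 20 attempts whose failures cluster mid-way, as observed in the study
attempts = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
rates = error_rate(attempts)
print(rates[0], max(rates), rates[-1])   # low start, mid-project peak, low end
```

Computed in real time, such a curve could trigger the kind of mid-project support alerts described above.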
By better understanding each student’s coding style and behavior,
we also have an additional window into students’ cognition.
Paired with other data sources (interviews, tests, surveys), the data
could offer a rich portrait of the programming process and how it
affects students’ understanding of the programming language and
more sophisticated skills such as problem solving.
However, the implications of this class of techniques are not
limited to programming. Granted, programming offers a relatively
reliable way to collect 'project snapshots,' even several times per
hour. But, given the right computer vision toolkit, similar approaches
could be employed with educational software, or even with tangible
interfaces.
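To illustrate how such project snapshots might be collected, a minimal client-side logger could poll the student's source file and append a record whenever the code changes. This is a hypothetical sketch, not the logging tool used in the study; the file names, the JSON-lines format, and the polling interval are all assumptions.

```python
# Sketch of client-side snapshot logging (hypothetical, not the study's tool).
import hashlib
import json
import time
from pathlib import Path

def log_snapshots(path, out, interval=60, max_polls=None):
    """Poll a source file; append a snapshot record whenever it changes."""
    last_hash = None
    polls = 0
    with open(out, "a") as log:
        while max_polls is None or polls < max_polls:
            code = Path(path).read_text()
            h = hashlib.sha1(code.encode()).hexdigest()
            if h != last_hash:  # only record actual edits
                log.write(json.dumps({
                    "t": time.time(),
                    "size": len(code),   # character count, as in the analysis
                    "sha1": h,           # lets an analyst detect reverts
                }) + "\n")
                last_hash = h
            polls += 1
            time.sleep(interval)
```

Writing each snapshot as one JSON line makes the log trivially appendable and, if posted to a centralized repository instead of a local file, would also address the data-loss problem noted in the next section.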
6. LIMITATIONS AND FUTURE WORK
Due to the low number of participants, the current study does not
make any claims of statistical significance. Also, because of
the length of the assignment (3 weeks), some students lost part of
their log files and their data could not be considered. For future
studies, we will use a centralized repository that avoids local
storage of the log files, increasing their reliability and reducing
data loss. Another limitation is that I do not log what students do
outside of the programming environment, so I might mistake a long
period of thinking for a pause in work.
7. REFERENCES
[1] Amershi, S., & Conati, C. (2009). Combining Unsupervised
and Supervised Classification to Build User Models for
Exploratory Learning Environments. Journal of Educational
Data Mining, 1(1), 18-71.
[2] Baker, R. & Yacef, K. (2009). The State of Educational Data
Mining in 2009: A Review and Future Visions. Journal of
Educational Data Mining, 1(1).
[3] Baker, R. S., Corbett, A. T., Koedinger, K. R., & Wagner, A.
Z. (2004). Off-task behavior in the cognitive tutor classroom:
when students “game the system”. Paper presented at the
Proceedings of the SIGCHI conference on Human factors in
computing systems.
[4] Beck, J., & Sison, J. (2006). Using knowledge tracing in a
noisy environment to measure student reading proficiencies.
International Journal of Artificial Intelligence in Education,
16(2), 129-143.
[5] Berland, M. & Martin, T. (2011). Clusters and Patterns of
Novice Programmers. The meeting of the American
Educational Research Association. New Orleans, LA.
[6] Bernardini, A., & Conati, C. (2010). Discovering and
Recognizing Student Interaction Patterns in Exploratory
Learning Environments. In V. Aleven, J. Kay & J. Mostow
(Eds.), Intelligent Tutoring Systems (Vol. 6094, pp. 125-
134): Springer Berlin / Heidelberg.
[7] Blikstein, P. (2009). An Atom is Known by the Company it
Keeps: Content, Representation and Pedagogy Within the
Epistemic Revolution of the Complexity Sciences. PhD
dissertation, Northwestern University, Evanston, IL.
[8] Blikstein, P. (2010). Data Mining Automated Logs of
Students' Interactions with a Programming Environment: A
New Methodological Tool for the Assessment of
Constructionist Learning. American Educational Research
Association Annual Conference (AERA 2010), Denver, CO.
[9] Conati, C., & Merten, C. (2007). Eye-tracking for user
modeling in exploratory learning environments: An empirical
evaluation. Knowledge-Based Systems, 20(6), 557-574.
[10] Craig, S. D., D'Mello,S., Witherspoon, A. and Graesser, A.
(2008). 'Emote aloud during learning with AutoTutor:
Applying the Facial Action Coding System to cognitive-
affective states during learning', Cognition & Emotion, 22: 5,
777-788.
[11] Montalvo, O., Baker, R., Sao Pedro, M., Nakama, A., &
Gobert, J. (2010) Identifying Students’ Inquiry Planning
Using Machine Learning. Educational Data Mining
Conference, Pittsburgh, PA.
[12] Papert, S. (1980). Mindstorms : children, computers, and
powerful ideas. New York: Basic Books.
[13] Rus, V., Lintean, M. and Azevedo,R. (2009). Automatic
Detection of Student Mental Models During Prior
Knowledge Activation in MetaTutor. In Proceedings of the
2nd International Conference on Educational Data Mining
(Jul. 1-3, 2009). Pages 161-170.
[14] Turkle, S., & Papert, S. (1990). Epistemological Pluralism.
Signs, 16, 128-157.
[15] Weinland, D., Ronfard, R., and Boyer, E. (2006). Free
viewpoint action recognition using motion history volumes.
Comput. Vis. Image Underst. 104, 2 (Nov. 2006), 249-257.
[16] Wilensky, U. (1999, updated 2006). NetLogo [Computer
software]. Evanston, IL: Center for Connected Learning and
Computer-Based Modeling.
[17] Yilmaz, A. and Shah, M. (2005). Recognizing Human
Actions in Videos Acquired by Uncalibrated Moving
Cameras. In Proceedings of the Tenth IEEE international
Conference on Computer Vision (ICCV ‘05) Volume 1 -
Volume 01 (October 17 - 20, 2005). ICCV. IEEE Computer
Society, Washington, DC, 150-157.