Conference PaperPDF Available

Say Hello to ‘Coding Tutor’! Design and Evaluation of a Chatbot-based Learning System Supporting Students to Learn to Program



The overall goal of this design science research project is to design and evaluate a chatbot-based learning system that is able to support novice programmers to learn to write software code in formal learning settings. Our results indicate that such a conversational intelligent programming tutor is suited to take over tasks of teaching assistants in times when no human teaching assistant or lecturer is available for help, e.g., due to resource constraints. For instance, our so-called 'Coding Tutor' is able to respond to open-ended knowledge questions, to assess submitted source code automatically or to guide students step-by-step through programming exercises using natural language communication. In addition to providing our situated software artifact, we document our results based on the core components of a design theory as proposed by Gregor and Jones (2007). Thus, we provide the first step toward a nascent design theory for chatbot-based learning systems in IS education.
Design and Evaluation of a Chatbot-based Learning System
Fortieth International Conference on Information Systems, Munich 2019 1
Say Hello to ‘Coding Tutor’!
Design and Evaluation of a Chatbot-based
Learning System Supporting Students to
Learn to Program
Completed Research Paper
Sebastian Hobert
University of Goettingen
Platz der Göttinger Sieben 5, D-37073 Göttingen
The overall goal of this design science research project is to design and evaluate a chatbot-
based learning system that is able to support novice programmers to learn to write
software code in formal learning settings. Our results indicate that such a conversational
intelligent programming tutor is suited to take over tasks of teaching assistants in times
when no human teaching assistant or lecturer is available for help, e.g., due to resource
constraints. For instance, our so-called ‘Coding Tutor’ is able to respond to open-ended
knowledge questions, to assess submitted source code automatically or to guide students
step-by-step through programming exercises using natural language communication. In
addition to providing our situated software artifact, we document our results based on
the core components of a design theory as proposed by Gregor and Jones (2007). Thus,
we provide the first step toward a nascent design theory for chatbot-based learning
systems in IS education.
Keywords: Chatbot-based learning system, intelligent programming tutor, pedagogical
conversational agent, automatic assessment system, design theory
The importance of technology in our daily life has been growing rapidly for many years and it will be
essential for shaping our future world. Understanding technology and being able to apply it in the
increasingly digital business world is an essential skill for students who want to get adequate preparation
for future workplaces. This is and will not only be true for computer scientists and information systems
employees, but also for most future workplaces (Popat and Starkey 2019). One important skill that is
essential for getting an in-depth understanding of technology, is learning to understand and write software
programs known as programming or coding. However, currently, only less than 10% of all adults are
capable of coding and do it on a regular basis (OECD 2018). Due to the growing importance, teaching to
code software is getting part of today’s curriculums of primary, secondary and higher education. However,
students are often not adequately skilled in programming and consider it as a difficult and complex task
(Daradoumis et al. 2019). One main reason why learning to code is challenging (Vial and Negoita 2018) for
many students is that it requires multiple knowledge dimensions (like conceptual and procedural
knowledge; Passier 2017) and includes several cognitive processes (like understanding, analyzing and
evaluating) according to Bloom’s revised taxonomy (Anderson and Krathwohl 2001; Bloom et al. 1956). For
instance, before being able to create a new piece of software, students need the abilities (1) to understand
Design and Evaluation of a Chatbot-based Learning System
Fortieth International Conference on Information Systems, Munich 2019 2
the programming task, (2) to analyze it in order to identify proper solutions and (3) to evaluate different
solution options.
To face this challenge in education, introductory programming courses usually combine multiple teaching
approaches like large-scale lectures to teach basic algorithmic skills and classroom tutorials in smaller
groups to provide guidance for programming exercises that encourage students to practice coding at home.
Despite these in comparison to other courses far-reaching support offers that are common in today’s
programming courses, many students still struggle with it. Even if they have understood the learning
contents in the lectures, the homework tasks often pose a great challenge to many of the novice
programmers. The creative process of writing software code seems particularly demanding for beginners,
especially if no guidance is available (Passier 2017). To address this, in-class tutorial sessions are a starting
point as experienced teaching assistants can support the students. However, due to resource constraints,
the students-to-lecturer ratio is usually still high in tutorial sessions (often up-to 20-to-1 or even higher)
and the timeframe in which teaching assistants are available for help is often limited to only a few hours
per week.
To further assist novice programmers, e-learning tools are available as outlined by Sim and Lau (2018) and
Crow et al. (2018) in two recent systematic reviews of prior research. Particularly, so-called intelligent
programming tutors, are state-of-the-art. While there is no distinct feature set of intelligent programming
tutors, the most common features include (1) the provision of learning content (e.g., tutorials, explanations
of concepts, and formative assessments to query factual knowledge using quiz questions), and (2) automatic
assessment tools that evaluate the students’ homework assignments automatically. Those existing e-
learning systems may improve the level of support of novice programmers and reduce the workload of
teaching assistants. However, current systems are not yet able to provide equally good support as teaching
assistants would do during in-class tutorial sessions. Particularly, the ability to individually respond to
arising questions of novice programmers and to provide adequate explanations are not yet possible in
common intelligent tutoring systems. Thus, programmers still face the challenge to solve programming
tasks on their own. To address this problem of insufficient communication opportunities and missing
individualized support of novice programmers, we leverage the advances in intelligent tutoring systems
particularly in chatbot-based learning systems known as pedagogical conversational agents (Tamayo-
Moreno and Perez-Marin 2016) and adapt them to the field of programming education. Using these
technologies, we aim at developing an intelligent programming tutoring system that is able to communicate
with novice programmers in natural language using a chatbot interface in order to assist them on an
individualized level while solving programming tasks in a programming environment. To achieve our goal
of providing students with assistance in a similar manner as teaching assistants would do, we follow a design
science research approach (Hevner 2007) to answer the following research questions:
RQ1: What requirements should be considered when designing a chatbot-based learning system
that aims at supporting students to solve coding tasks?
RQ2: How do novice programmers assess the changed learning concept based on the developed
chatbot-based learning system?
To answer these research questions, the remainder of this paper is structured as follows: In the next section,
we describe the theoretical background by focusing on related research and the ICAP framework (Chi and
Wylie 2014) as our kernel theory. Following this, we describe our research design based on the design
science research process and outline in detail how we design our chatbot-based learning system based on
both, scientific literature as well as empirical results. After describing our design and evaluation process in
eight consecutive steps, we discuss the results and document the generated design knowledge as proposed
by Gregor and Jones (2007). Finally, we summarize our results, discuss limitations and point out future
research directions in the conclusion.
Theoretical Background
Chatbot-based Learning Systems
Chatbots (also known as chatterbots, conversational agents or talkbots) can be defined as interactive
information systems that provide natural language user interfaces and attempt to conduct conversations
similar to human beings (Winkler and Söllner 2018). By applying methods known from artificial
Design and Evaluation of a Chatbot-based Learning System
Fortieth International Conference on Information Systems, Munich 2019 3
intelligence, machine learning, and natural language processing, chatbots usually act autonomously either
in a reactive or proactive way. To handle natural language input and output, typical technical architectures
of chatbots consist of a natural language understanding component that preprocesses the user’s input and
redirects it to a dialog manager. After computing an appropriate answer based on inputs from a knowledge
base, the natural language generation component generates an output message that is sent to the user.
Chatbots have been researched since the 1960ies when Weizenbaum (1966) invented ELIZA the first
chatbot that focused on natural language communication. Other prior chatbots, like the widely known
chatbot A.L.I.C.E (Wallace 2009), mainly applied rule-based natural language processing tools. Nowadays
more sophisticated natural language processing tools are available. Due to these technology improvements
in recent years, chatbots started to spread across many domains. As indicated by recent literature reviews
targeting chatbot-based learning systems (e.g., Hobert and Meyer von Wolff 2019; Winkler and Söllner
2018), this also resulted in an increased availability and research interest of so-called pedagogical
conversational agents. Pedagogical conversational agents have been developed and analyzed in prior
research: For instance, Mikic et al. (2009) developed Charlie, a chatbot based on the AIML rule-based
language, which provides an alternative user-interface to an e-learning platform. The MentorChat software
by Tegos et al. (2011) is a conversational agent that implements functionalities to support collaborative
learning tasks in chat discussions. Further, Graesser et al. (2017) implemented the AutoTutor prototype
that focuses on presenting complex problems that require some form of reasoning. Besides these mentioned
chatbot-based learning systems further software prototypes have been researched (see e.g. Hobert and
Meyer von Wolff (2019) for further examples). However, formal learning settings have only been researched
in few cases. Additionally, most available chatbots focus on very specific use cases. Thus, students usually
use those chatbots only for a very short time frame. In contrast to that, the learning settings in programming
courses as described in the introduction setting would require the regular use of a chatbot-based learning
system during the whole semester. For instance, if students need to solve programming tasks every single
week, the chatbot needs to be capable of supporting them each time. Thus, the scope of the chatbot needs
to be much broader.
E-Learning Approaches in Introductory Programming Courses
E-learning or technology-enhanced learning systems have been used by lecturers of introductory
programming courses for a long time. They are known in the literature under the term intelligent
programming tutors (Crow et al. 2018; Sim and Lau 2018). Often the aim of such systems is to provide
learning contents about basic programming skills via learning management systems or to give access to
formative assessments (e.g., single- or multiple-choice quizzes). By providing the learning contents online
instead of teaching it in a lecture, blended learning scenarios (e.g., Albrecht et al. 2018) have already been
tested. In one case, a conversational agent was used to support students by answering questions about Java
terms (Müller et al. 2018). In addition to providing the learning contents via e-learning systems, automatic
assessment systems like JACK (Goedicke et al. 2018), Praktomat (Breitner et al. 2017; Zeller 2000) or many
others (e.g., Daradoumis et al. 2019) are widespread among programming courses and are even used in
MOOCs (Bey et al. 2018). Often the main reason for using such automatic assessment systems is to reduce
the workload of lecturers or teaching assistants to evaluate the submitted homework assignments. This
enables lectures to be conducted even if the number of students is increasing rapidly in a programming
course as evaluating the exercises is not dependent on the students-to-lecturer ratio anymore (Bey et al.
2018). Students also can benefit from automatic assessment systems as they get immediate feedback about
their submitted tasks, i.e. after uploading their source code, the system automatically evaluates it and is
able to provide instant feedback (Ihantola et al. 2010). From a technical perspective, automatic assessment
systems are capable of evaluating the students’ source code with static or dynamic code analysis. Common
test scenarios are for instance to compile the submitted source code and to apply dynamic software tests,
e.g., JUnit for the Java programming language.
The currently used e-learning systems are capable of supporting novice programmers and can reduce the
workload of teaching assistants as homework assignments can be evaluated automatically. Thus, both
common usage approaches of intelligent programming tutors the provision of contents via a learning
management system and the automatic evaluation of programming exercises seems suited to deal with a
growing number of students and limited teaching assistant resources. However, the research problem we
outlined in the introduction section has not been adequately targeted in prior research. To solve this
problem of missing individualized support (e.g., answering of arising questions while trying to solve
Design and Evaluation of a Chatbot-based Learning System
Fortieth International Conference on Information Systems, Munich 2019 4
homework assignments) and missing guidance for novice programmers in times when no lecturers or
teaching assistants are available, more sophisticated solutions are required that focus specifically on the
needs of students. Chatbot-based learning systems are particularly suited to provide such an interactive and
individualized interaction as outlined in the previous subsection. Hence, combining a natural language-
based personal assistant interface of chatbot-based learning systems with an automatic assessment system
of intelligent programming tutors, pledges to be beneficial for providing a suited learning support. It is to
be expected that such a chatbot-enabled intelligent tutoring system is capable of answering arising
questions of novice programmers and guiding them individually through solving programming tasks. Thus,
such a system to be designed in this research project combines tasks of common intelligent programming
tutoring systems with tasks that are currently carried out by teaching assistants.
ICAP Framework as a Kernel Theory
The added value of using the technology of chatbots in formal learning settings in comparison to other
common technology-enhanced learning systems is given by an increased engagement of the students due
to the conversational interaction of the learners with the chatbot. According to the ICAP Framework (i.e.,
Interactive, Constructive, Active and Passive Framework) by Chi and Wylie (2014), the learners’
engagement with learning materials can range from passive to active to constructive to interactive(Chi
and Wylie 2014) and will result in an improved learning outcome. Whereas in passive engagement activities
students only consume or receive learning materials, in active engagement activities students are able to
manipulate the content presentation, e.g., by highlighting important text passages. In the two most
engaging forms of interaction according to Chi and Wylie (2014), students deepen their interaction, e.g., by
comparing the learning materials with prior knowledge (constructive engagement), by debating with others
or asking and answering questions (interactive engagement).
Following the hypothesis of the ICAP Framework, chatbot-based learning systems are capable of fostering
the students’ engagement as they add the new component of dialoguing to e-learning systems in
introductory programming courses. In comparison to common e-learning approaches like e-learning
modules or automatic assessment systems as described above, chatbots can also start discourses and
discussions with students about the learning contents just like human teaching assistants would do. Thus,
an interactive engagement according to the ICAP Framework can be achieved even in times when no human
teaching assistants are available.
Research Design
To address our research goal by developing an innovative learning system for supporting students of
introductory programming courses individually when no teaching assistants are available for help, we apply
a design science research (DSR) method. Using this approach, we aim at (1) providing a relevant solution
for the predominant problem setting as described in the introductory section by applying a scientific
approach and (2) deriving generalized design implications for the Information Systems (IS) research
discipline according to Gregor and Jones (2007).
We use the three cycle information systems research framework by Hevner (2007) and Hevner et al. (2004)
as a basis and apply eight research steps accordingly in the relevance, rigor and design cycles as displayed
in Figure 1. In the first step, we formulated the research problem by deriving it from common settings in
introductory programming courses as outlined in the introduction section. In addition to this practice-
oriented motivation, we specified the problem from an educational perspective by deriving the learning
objectives of our problem setting using the revised Bloom’s Taxonomy (Anderson and Krathwohl 2001;
Bloom et al. 1956) as outlined in the following step 1 of our research processes. Following the problem
specification, we identified user stories and related requirements from both, scientific literature and
educational practice. To get insights into the educational practice, we conducted an expert workshop with
eight participants (lecturers, e-learning experts and instructional designers) in which we discussed how
pedagogical conversational agents could be used for supporting students in introductory university courses.
Based on this, we derived design principles and a conceptual artifact in a first design iteration. We evaluated
the conceptual artifact afterward with both, 28 students and 16 teaching assistants, to get feedback about
this first iteration of the artifact early in the design process. Using the feedback of this first evaluation, we
revised our design features and the conceptual design. As the outcome of our second design iteration, we
got a fully functional software artifact called Coding Tutor for the specified problem. In a second evaluation
Design and Evaluation of a Chatbot-based Learning System
Fortieth International Conference on Information Systems, Munich 2019 5
setting, 40 other novice programming students tested our software artifact in a field test in the following
semester term. During this second evaluation, the students used our Coding Tutor artifact to solve a
predefined programming task. By first evaluating the design features in the conceptual design (step 5) and
then the software artifact (step 7), we conducted a systematic evaluation before and after implementing the
artifact. During both evaluation studies, we followed the evaluation strategy for design science research as
proposed by Venable et al. (2016). Finally, after finishing the evaluation, we documented the design
knowledge using the core components of a design theory as proposed by Gregor and Jones (2007).
Figure 1. Adapted Research Approach (based on Hevner et al. 2004; Hevner 2007; Rietsche et al. 2018)
Designing and Evaluating the Coding Tutor
In the following, we outline the results of all eight steps of our design science research approach with the
aim of developing our software artifact called Coding Tutor.
#1 Specifying the Problem Statement
As motivated in the introduction section of this paper, we address the challenge of supporting students in
large-scale introductory programming courses when they solve homework exercises. Our main goal from
an educational perspective is to provide novice programmers with an e-learning system that guides them
individually through solving programming tasks when there is no human teaching assistant available (e.g.,
at the weekend). Main reasons for the challenge to provide individual support in large scale lectures are
resource constraints (Hien et al. 2018). To overcome this challenge, the learning system to be created needs
to be able to support students in a similar way a teaching assistant would do in in-class trainings.
To specify the research problem, we define the learning objectives based on the revised Bloom’s Taxonomy
(Anderson and Krathwohl 2001; Bloom et al. 1956). The taxonomy differentiates four different dimensions
of knowledge: factual, conceptual, procedural and metacognitive. Additionally, six categories exist in the
cognitive process dimension: remember, understand, apply, analyze, evaluate and create. By combining
both dimensions, learning objectives can be categorized as intersections between knowledge type and
cognitive process.
In programming exercises of introductory programming courses particularly three main learning objectives
are targeted: First, novice programmers need to understand the syntax of the programming language
Design and Evaluation of a Chatbot-based Learning System
Fortieth International Conference on Information Systems, Munich 2019 6
(objective 1). Especially important is to get familiar with the syntax and semantic of commands including
their functionality. Thus, this first objective serves as a basis for programming and can be assigned as factual
knowledge. In order to be able to solve programming exercises, conceptual knowledge is needed as well. In
particular, students need to understand and to be able to apply the underlying algorithms of the
programming problem to be solved (objective 2a). Otherwise, it is usually not possible to solve a
programming task, if the underlying theoretical knowledge is missing. In addition to that, the students
should be able to break the algorithmic problem into smaller subtasks (objective 2b). This is especially
important as soon as the programming tasks are getting more complex. Finally, the actual coding process
of writing source code can be categorized as procedural knowledge. This task encompasses, in particular,
to translate the algorithm into source code and to structure the overall code. Due to this, the actual coding
during exercises goes beyond apply and can be classified as an analytic cognitive process. Table 1
summarizes the learning objectives in the revised Bloom’s Taxonomy (Anderson and Krathwohl 2001).
Cognitive Process Dimension
Objective 1
Objective 2a
Objective 2b
Objective 3
Table 1. Learning Objectives Taxonomy (based on Anderson and Krathwohl 2001)
#2 Deriving Requirements from Environment and Knowledge Base
To deduce requirements for the specified problem, we considered the environmental perspective
(relevance) as well as the knowledge base (rigor). To analyze the environmental perspective, we first
conducted an expert workshop with eight participants (lecturers, e-learning experts and instructional
designers) in which we derived application scenarios of chatbot-based learning systems. Based on this, we
identified three user stories that are relevant for supporting students in introductory programming courses
(see Figure 2 below): (U1) Students need to understand the theoretical problem statement of programming
exercises, (U2) students need to be able to apply the underlying algorithm and (U3) students need to be
able to transfer the theoretical algorithms into source code. Those user stories are directly related to the
learning objectives as described in step 1. Second, we use the task characteristics of teaching assistants in
programming courses as a basis to derive further requirements. This seems especially important as the
learning system to be designed should take over tasks of teaching assistants in times when no one is
available for help. In in-class programming tutorial sessions, teaching assistants usually deepen the
learning contents by explaining the exercises (T1) and the theoretical background (e.g., underlying
algorithms; T2). Even though lectures are usually not suited to answer all questions of all students, the
teaching assistants often have enough capacity to answer remaining open questions about the learning
contents (T3). To help novice programmers in-class, the teaching assistants usually instruct them based on
the students’ individual needs and experiences (T4). After the students finished the tasks, the teaching
assistants need to correct them (T5) and give the students individual formative feedback (T6).
Additionally, we rely on the ICAP framework as a kernel theory as outlined in the Theoretical Background
section and supplement it with further educational literature. Based on the framework, we derived the
overall solution approach to design a chat-based learning system for targeting the problem of insufficient
support in introductory programming courses. Based on the assumption of the ICAP framework that an
increased interactivity results in enhanced learning effectiveness, the first requirement (R1) for our artifact
solution is to provide a natural language interface, which can be used by the students to interact with a
chatbot. In this setting, the chatbot takes over the tasks of the instructors (teaching assistants) who are
capable of explaining the tasks and learning content. Due to this form of chat-based interactivity, two types
of interaction the learner-content interaction as well as the learner-instructor can be enhanced, which
is advantageous according to Moore (1993). Using the natural language user interface of chatbots,
particularly the answering (T3), instructing (T4) and feedback (T6) tasks can be realized.
Design and Evaluation of a Chatbot-based Learning System
Fortieth International Conference on Information Systems, Munich 2019 7
Additionally, a source code editor that is integrated into the learning system is required (R2) in order to
support the actual coding process (U3). It is thereby important that the learning system including the source
code editor is accessible independently of a specific place or time as the main objective of the system is to
support students outside of traditional learning settings (e.g., students should be able to use it at home
when practicing how to code).
Based on the first two requirements targeting the actual user interface design, six additional function-
oriented requirements are needed, which are directly related to the teaching assistants’ task characteristics.
Thus, the chatbot-based learning system needs to be able to provide explanations of the programming
exercises (R3 based on T1 and U1) and of the theoretical background of the exercises (R4 based on T2, U1,
and U2), when students need additional advice. Besides this mostly static information provision, teaching
assistants have the task to answer on-demand questions (R5 based on T3, U1) and to provide step-by-step
guidance to students that require additional input (R6 based on T4 and U3). With these requirements, the
essential tasks that are usually carried out by teaching assistants can be addressed. In addition to these in-
class tasks, the teaching assistants usually need to correct submitted assignments (R7 based on T5 and T6)
after the in-class sessions. Based on this, the chatbot-based system should give the students feedback and
additional advice (R8 based on T6 and U3).
A detailed overview of the relationships between the students’ user stories (U1 - U3), the teaching assistants’
task characteristics (T1 - T6) and the requirements (R1 R8) is displayed in Figure 2. It also visualizes the
connection between the requirements and the design principles that are derived in the next section.
#3 Deriving Design Principles
Based on the eight requirements, we derived five design principles and a conceptual artifact that is the first
iteration of our design process (see next subsection). Whereas two design principles are targeted at the user
interface, three design principles encompass the functional aspects of the artifact.
The first design principle defines that the main interactivity as derived from the kernel theory should take
place in a natural language-based user interface (DP1). To meet this requirement in the learning system, a
chatbot is required that is able to automatically communicate with the students. Thus, a chat-based user
interface needs to be integrated into a web-based programming environment (DP2). Due to this, the
students will be able to interact with the system independently of time and place as required by R2. Close
integration of DP1 and DP2 is needed as the requirement elicitation in the previous step indicated that the
chat-based communication needs to be adapted based on the written source code of students. Otherwise, it
would, for instance, not be possible to provide individualized feedback to students as for this, the system
needs to analyze the source code in order to formulate adequate natural language-based communication.
As these two design principles target the overall user interface design, they need to be considered when
implementing the following function-oriented design principles.
As the main function-oriented design principle, a learning path is required for each programming exercise
that should be provided (DP3). This enables that students can be guided step-by-step through solving the
task. As a major aspect for enabling interactivity based on the individual needs of the students, the learning
paths need to be adaptive to the students’ state of knowledge, i.e. their programming skills. From a
didactical point of view, adaptive learning paths seem particularly desirable as they pledge an enhanced
teaching efficiency for the students. Due to adaptive learning paths, more experienced students do not have
to deal with learning contents targeted at beginners, but they can interact with more advanced learning
content. Thus, the individual skills of the students can be promoted, which could not be easily achieved by
human tutors.
As it is usually not feasible that learning paths cover all theoretically possible topics of a whole lecture, an
additional question and answering component is required that can answer open questions concerning the
learning contents (DP4). By implementing such a component, the requirements R4 and R5 can be met. As
a basis for the generation of answers, a learning object database is required that acts as the underlying
knowledge base. To be effective, the learning system should be able to answer all exercise related questions,
which is only possible if the knowledge base is large enough to cover all related topics of the programming
exercise and even contains learning objects that go beyond this.
Finally, automatic feedback should be enabled (DP5), which is the basis for correcting the students’
assignments automatically and for providing formative feedback which helps to improve their performance
Design and Evaluation of a Chatbot-based Learning System
Fortieth International Conference on Information Systems, Munich 2019 8
(Rietsche et al. 2018). The concepts of static and dynamic testing known from common software testing
approaches seem particularly important in order to provide as individualized feedback as possible.
Figure 2. Overview of Elicitation of Requirements (step 3) and Design Principles (step 4)
#4 First Iteration: Designing the Conceptual Artifact
Based on the derived design principles, we implemented a conceptual artifact called Coding Tutor in the
first design iteration. To this aim, we implemented first a clickable mockup as a web-based front-end that
visualizes how the user interface (DP1 and DP2) can be designed. Second, we designed an exemplary
learning path that visualizes possible usage flows of students in the learning system to address the
remaining function-oriented design principles.
We developed the clickable mockup front-end using state-of-the-art web technology (HTML5, CSS3, and
JavaScript) and implemented a simple but predefined chat dialog that enables us to visualize the
interactivity. To provide a ubiquitously accessible front-end, we used the Bootstrap 4 framework that
enables a responsive layout. Due to this, the front-end can be accessed on any computer with varying screen
sizes and equipped with state-of-the-art browsers. Even though the user interface is designed to be
responsive to various screen sizes, using it on smartphones seems not suited to us due to the usual
complexity of programming exercises. As displayed in Figure 3, the front-end is structured in two main
parts. On the left side, the natural language interface (DP1) is shown as a chat as known from common
instant messengers. The programming environment (DP2) is displayed on the right side and occupies the
largest space of the front-end as this is the most important user interface component where the actual
programming task takes place. The programming environment is implemented using the well-known Ace
editor that enables common features like syntax highlighting, automatic intents and line wrapping (Ace
The interaction of the functional design principles (DP3, DP4, and DP5) takes place within the chat-based
component. Both, predefined learning paths and on-demand question and answer dialogs, can be realized.
Design and Evaluation of a Chatbot-based Learning System
Fortieth International Conference on Information Systems, Munich 2019 9
When a predefined learning path is taught, the interaction is pushed forward by the Coding Tutor as the
main intent is to provide explanations or additional hints. For instance, the explanation of the programming
exercise is presented, and the students are asked whether they need more learning contents related to the
theoretical background (e.g., a suited algorithm that can be used to solve the task). If the students ask the
Coding Tutor for such theoretical learning contents, a predefined learning path is visualized (DP3). Besides
this, students can always interrupt the current conversation to ask further open questions to request specific
explanations and learning contents on-demand (DP4). Additionally, the Coding Tutor is always capable of
analyzing the students’ source code to provide textual feedback in the chat component (DP5).
Figure 3. Screenshot of the Clickable Front-end Mockup of Coding Tutor
To visualize the learning process supported by Coding Tutor, we designed an exemplary learning path that
shows the most important learning steps and exemplary chat dialog excerpts. The learning path of the first
iteration is displayed in Figure 4 on the left side. The displayed transitions indicate that the students can
choose different ways through the learning process. In some cases, the transition can be triggered by Coding
Tutor automatically, e.g., if errors are identified in a student’s source code, Coding Tutor should proactively
inform the student and offers assistance to solve them. Based on both, the students’ choice and the
automatic evaluation of the source code, adaptive learning can be achieved.
#5 Evaluating the Conceptual Artifact
After finishing the first design iteration by designing the user interface of Coding Tutor and outlining an
exemplary learning path, we collected feedback about the resulting conceptual artifact. To this aim, we
presented the overall concept of providing an interactive chatbot-based learning system as well as a
clickable mockup of the system’s user interface to the students of an introductory Java programming course
in an information systems study program and requested their feedback. In addition to that, we asked
teaching assistants who are assisting novice programmers in multiple introductory programming classes as
well to get their opinion about the level of support a solution like Coding Tutor can add. In total, we asked
28 students and 16 teaching assistants in this first evaluation to participate and received 22 and 16 valid
responses respectively. In this first evaluation, we were interested in the students’ and teaching assistants’
overall opinion concerning the concept using open-ended free-text questions and more importantly we
wanted to get suggestions for improvement before we started the actual software implementation.
Overall, the students rated the concept as a useful addition to the teaching concept of introductory
programming courses. It would make it easier for them to complete the exercises. For instance, one student
evaluated the overall concept as follows: I think the concept makes a lot of sense, mainly because you don't
always find the right answers to your questions on the Internet when problems during solving the exercises
Design and Evaluation of a Chatbot-based Learning System
Fortieth International Conference on Information Systems, Munich 2019 10
arise. Particularly, the learning path was highlighted by students as very helpful for explanations and
understanding”. Overall, approx. 72% of the students supported the idea of the Coding Tutor concept and
only 9% would rather not use it. Especially students that rated their programming skills as high, would
rather not use the Coding Tutor: It wouldn't help me very much, because I can solve the tasks on my own
without any problems.
The students’ comments and suggestions for improvement focused on three aspects: (1) The step-by-step
guidance was on the one hand rated as very helpful, but on the other hand some students had concerns
whether the guidance would reduce the learning effectiveness when too much guidance is provided: I am
unsure if the step-by-step guidance is useful. After the course, there are no step-by-step instructions
available in the reality of a programmer”. Another student added that “with too many explanations, students
no longer have to think for themselves”. (2) In addition to that, students had concerns whether the Coding
Tutor will be able to understand questions properly and whether it will have answers to all questions. For
instance, one student acknowledged: Brilliant idea, with the problem of a sufficient knowledge base”. (3)
Finally, the students requested that the Coding Tutor should not only indicate when the source code was
faulty, but it should also explain how to solve it and provide the original error message from the compiler.
This would help in the long-term, to solve programming errors on their own: In addition to the
explanations of the artificial tutor, the errors of the compiler should also be displayed.
The teaching assistants rated the overall concept of Coding Tutor similarly. Remarkable is that no teaching
assistant evaluated the concept negatively. The teaching assistants also shared concerns regarding too much
guidance as it “could inhibit a student's ability to seek individual solutions”. One assistant added that it's
important that beginners don't just only learn a programming language but learn that programming
sometimes means long troubleshooting”. As a solution, one assistant suggested that instead of detailed step-
by-step instructions that are strictly related to the solution, “guided instructions should focus on solution
strategies”. Another one suggested to “not use the Coding Tutor during the whole semester for all tasks”,
but only in the beginning. Also, the answering of open questions was mentioned as “it only makes sense if
it works well and is strongly related to the topic.” Thus, the teaching assistants requested that the Coding
Tutor needs to cover all contents of the lecture.
Based on this valuable feedback of the students and teaching assistants, we revised our conceptual artifact
and implemented a fully working software artifact as described in the next step.
#6 Second Iteration: Revising the Concept and Implementing the Software Artifact
As outlined in step 5, the participants of our evaluation of the conceptual artifact confirmed that the overall
idea as well as the overall concept are suited for the given problem. However, guidance based on learning
paths (DP3) was particularly discussed critically. Both, students and teaching assistants, had concerns that
too many hints based on guided instructions could possibly negatively affect the students’ ability to code on
their own. However, despite this concern of providing too much help, the provision of guidance was not
questioned in general. Only the level of support should be chosen carefully.
We agreed with the opinion of the students and teaching assistants and revised the design principle DP3
accordingly. Using adaptive learning paths based on the concept of scaffolding (Kim and Hannafin 2011;
van de Pol et al. 2010) could solve the problem of providing too much guidance to students. As proposed in
the scaffolding concept, the provision of learning support should be adapted to meet the learners’ level of
knowledge. Hence, the learners should be encouraged and supported to solve tasks or problems they
wouldn’t be able to solve on their own without proper support. Scaffolding is thus transferable to the
problem of solving programming exercises at hand: If a student is a novice programmer and cannot solve
an exercises on his/her own, the Coding Tutor should provide detailed step-by-step instructions to guide
the student. When the student’s ability to code evolves during the semester term, Coding Tutor should
reduce the guidance to a minimum to encourage the students to solve the exercises on their own. In contrast
to that, for a student who already has some prior knowledge about programming, the guidance should be
reduced to a minimum from the beginning. Instead, Coding Tutor should only provide assistance if a
student is not able to solve the task on his/her own. By implementing this guidance behavior, the students’
and teaching assistants’ concerns can be addressed properly by adaptive learning paths based on
scaffolding. To take this into account, we have revised our learning concept by adding further transitions,
branches, and states, as shown in Figure 4. Thus, the guidance can be adapted based on the students’ needs.
Design and Evaluation of a Chatbot-based Learning System
Fortieth International Conference on Information Systems, Munich 2019 11
Figure 4. First and Second Iteration of the Learning Path including Exemplary Dialog Excerpts
The remaining two remarks of the evaluation participants, the Coding Tutors’ capabilities to answer open,
topic-related questions and the provision of detailed error message from the compiler, are valuable for our
implementation as well. However, a revision of the design principles proposed in step 3 is not necessary, as
both aspects focus on very specific implementation details and are covered by the existing design principles.
To address the first aspect in the Coding Tutor prototype, the underlying knowledge base that is used to
answer the students’ open-ended questions needs to cover all important terms and concepts that are
relevant for the programming exercises. Additionally, the natural language understanding component
needs to be trained properly that it will recognize the students’ questions properly. Thus, this aspect is an
additional specification of DP4 (Q&A component based on learning object database). The second aspect
the provision of detailed error messages is also already covered. DP5 specifies that automatic feedback
should be provided to the students. To address the students’ concerns, we specify DP5 further such that the
compiler’s error message should be provided to the students in addition to a natural language explanation.
Design and Evaluation of a Chatbot-based Learning System
Fortieth International Conference on Information Systems, Munich 2019 12
Figure 5. Software Architecture of the Coding Tutor Artifact
As the clickable front-end mockup as shown in Figure 3 (step 4) was positively evaluated by the students
and teaching assistants, we used it as the basis for our subsequent software implementation of Coding
Tutor. To integrate the process logic into the front-end, we rely on RESTful web services, i.e. we extended
the existing front-end with a JavaScript-based logic to call the needed functionalities via AJAX from our
backend infrastructure. As visualized in our software architecture (see Figure 5), in addition to the user
interface that implements DP1 and DP2, we implemented three backend components: (1) To implement the
adaptive learning paths based on scaffolding in the backend (DP3), we implemented a component that can
guide students based on the redefined learning paths through solving the programming task (see Figure 4).
The learning paths are implemented in Coding Tutor as finite state machines. Based on the students’
abilities to code as well as on their own decisions, our so-called quick response handler is responsible for
selecting the appropriate level of support. By relying on the dialog manager and the knowledge sources
(particularly the learning path object database), appropriate instructions are provided to the students. (2)
To implement DP4 and to enable answering based on open-ended questions, we added a natural language
processing component to Coding Tutor based on the open source library NLP.js by AXA Shared Services
Spain S.A. (2019). After detecting the student’s intent behind a question, an appropriate answer is
generated by the dialog manager using the independent learning object database that stores definitions of
all relevant terms and concepts that are related to the homework exercises or using the small talk response
database if the students intent was not topically related. We chose to integrate a small talk component as
well, as prior research indicates that it could improve the users’ acceptance even though it is not
complimentary for solving the homework tasks. (3) We implemented the source code analyzing component
to meet DP5. To this aim, we integrated both, a static and a dynamic code analyzer, as discussed in step 3.
Particularly important for dynamic code testing in our code execution handler is to ensure a secure
environment that cannot be exploited by the students. We chose to compile and execute the students’ source
code only after starting a sandbox environment on the server. In this secure environment that we reset after
each execution, we execute unit tests based on JUnit, which enables us to properly examine and evaluate
the students’ assignments. The results generated by the unit tests are particularly important to give the
students in-depth feedback automatically and are thus sent via the messenger interface to the students.
When testing the programming tasks, we proceed according to the black-box testing method. Therefore, we
do not force the implementation of a specific algorithm. Instead, the students can implement their problem
solution and our test procedure uses multiple exemplary input values to check whether expected results are
calculated by the program. As a result of this second design interaction, we resulted in a fully-functional
software artifact that implements all (revised) design principles as specified in step 3 and step 5.
Design and Evaluation of a Chatbot-based Learning System
Fortieth International Conference on Information Systems, Munich 2019 13
#7 Evaluating the Software Artifact
To evaluate the result of the second design cycle, we demonstrated the Coding Tutor artifact to 40 students
in an introductory programming course. According to a self-assessment before the evaluation, students
participating in the course evaluated their programming experience as low (approx. 50 %). Only 10 % rated
their skills as high. This self-evaluation is in line with our expectation as the course is targeted at
information systems students from the 3rd semester onwards. During the evaluation study, the participants
had the possibility to use the Coding Tutor software to solve a predefined programming exercise. In this
task, the students should code a program that can calculate the greatest common divisor of two specified
numbers using the Euclidean algorithm. To give the participants enough time and space to solve the task
and to test our software artifact carefully, we didn’t restrict the time duration for the test scenario. On
average, the students interacted with Coding Tutor for approx. 20 minutes. Whereas the more experienced
students only needed approx. 14 minutes to solve the task completely, the last student successfully finished
it after 32 minutes. Although participation in the evaluation was voluntary and successful completion of the
programming task was not enforced, a majority successfully solved the task completely. After completing
the test, we asked all participants to fill out a questionnaire in which we aimed at evaluating our design
principles. To measure the participants’ perception of our developed software artifact, we provided a
quantitative questionnaire based on a 5-point Likert scale (-2: strongly disagree to +2 strongly agree) in
which we focused on functional (i.e., usefulness of the implemented design principles) and form aspects
(i.e., ease of use of the implemented design principles) of each design principle by adapting the well-known
constructs perceived usefulness and perceived ease of use (Davis et al. 1989). Furthermore, we asked the
students if they consider Coding Tutor as suitable for practical use in introductory programming courses
and if they intent to use it in the future (Davis et al. 1989). In addition to these quantitative items, the
students had the possibility to provide written feedback, e.g., about required improvements and further
remarks. At the end of our evaluation, we received 35 valid questionnaires, which are the basis for our
analysis. We analyzed each design principle individually as shown in Figure 6 and tested whether the data
significantly differs from the mean using one-sided t-tests.
Figure 6. Overview of the evaluation results of function and form
The results show that all design principles were evaluated by the students as useful and easy to use. The
averages of all constructs are significantly better than the Likert scale’s mean value and no student rated
the usefulness of any design principle with -2. Particularly noticeable is the fact that the students rated the
usefulness of DP3 (adaptive learning path), DP4 (Q&A component), and DP5 (automatic feedback
provision) with average values of +1.26, +1.13 and +1.14 very positively. This indicates that the Coding Tutor
artifact is capable of performing tutoring tasks that are usually carried out by teaching assistants. This is
also reflected in the fact that the students rated the software as useable in practice (practicality) with an
average of +1.38 and intent to use it in the future (mean of +1.43).
In the written feedback, the students particularly emphasized the usefulness of the step-by-step guidance
and rated the “proactive offering of step-by-step guidance” at the beginning of the programming task as
well as in situations when the static code analysis indicates errors in the source code as “very helpful”.
Additionally, several students named the interactivity of the chatbot as a significant improvement” during
programming tasks when additional help is required to solve a task. Besides these positive remarks, two
aspects for improvements of Coding Tutor were mentioned: (1) Some students suggested improving the
Design and Evaluation of a Chatbot-based Learning System
Fortieth International Conference on Information Systems, Munich 2019 14
responsiveness of Coding Tutor, especially when providing feedback about the source code. In particular,
the execution and dynamic code testing using unit tests took some time in some situations. For instance,
during the evaluation study, a large number of students tried to execute and test their source code almost
at the same time. At this moment, the students needed to wait some seconds before Coding Tutor could
provide detailed feedback. As the duration for executing and evaluating the students’ source code is mainly
dependent on Coding Tutors server computing power, it can easily be improved by using a more powerful
server. During the field test, we only used a virtual server with low computing power and restricted memory.
By increasing the number of CPU cores of our virtual server after the test, we were able to significantly
increase Coding Tutors responsiveness and to solve the issue. (2) As a second possibility for improvements,
expending the knowledge base of the chatbot was named. This became apparent as two students tried to
figure out which questions could be answered by Coding Tutor and which could not. In our test setting, the
knowledge base was restricted to cover programming aspects related to the provided task. Thus, it was easy
for the students to identify the knowledge base’s boundaries. For later use in introductory programming
courses, it is necessary to provide a sufficient knowledge base. Overall both mentioned possibilities for
improvements didn’t question our design principles. However, both aspects need to be considered when
using chatbot-based learning systems in practice. Thus, a sufficient server capacity needs to be arranged to
provide Coding Tutor to larger audiences and the learning content should be extended to cover all relevant
aspects of the course’s content.
In summary, the evaluation of our fully-functional software artifact indicates that our conceptual design
and the design principles are valid for using chatbot-based learning systems in introductory programming
courses. The students rated both, the form and the function, of our implemented design principles as suited.
As the second evaluation cycle didn’t reveal any major design issues, we terminated our design process
successfully after the second design cycle.
#8 Documenting the Design Knowledge
To document the results of our design process and to communicate our results to the scientific knowledge
base, we use the core components of a design theory as proposed by Gregor and Jones (2007). In this way,
we summarize our theoretical contributions in form of a “design and action” theory (Gregor and Jones
2007) based on our rigorous design process. Thus, Table 2 summarizes our systematically derived design
knowledge using the components purpose and scope, constructs, principles of form and function, artifact
mutability, testable proposition and justificatory knowledge (Gregor and Jones 2007).
The purpose of the concept and implementation of Coding Tutor is to support novice
programmers in introductory programming courses to learn how to code while solving
programming tasks.
Chatbot, programming environment, learning paths, knowledge base of learning objects,
source code analyzer
DP1: Support students using natural language in a chatbot-based user interface; DP2:
Provide an easily accessible (web-based) programming environment; DP3: Provide
guidance based on adaptive learning paths using scaffolding; DP4: Answer topic-related
questions based on a sufficient knowledge base; DP5: Provide automatic feedback based
on static and dynamic code analysis
The artifact can be applied in every introductory programming course independently of
a specific programming language by revising the programming tasks, the knowledge base
and particularly the learning paths. By replacing the coding environment with other
interactive tools (e.g., modelling tools for UML), it can even be used to impart procedural
knowledge in other domains of (IS) education (see following discussion section).
To test the design principles and implementation, each aspect as mentioned above needs
to be surveyed. To evaluate the effects on students’ learning performance, the following
propositions should be considered: (1) Using Coding Tutor increases the students’ ability
to solve programming exercises. (2) Using Coding Tutor improves the provision of
guidance to students. (3) Using Coding Tutor reduces the required amount of teaching
Design and Evaluation of a Chatbot-based Learning System
Fortieth International Conference on Information Systems, Munich 2019 15
assistants’ resources when supporting students even in times when no lecture or in-class
tutorial is available.
Scientific literature, particularly the ICAP framework (Chi and Wylie 2014) and the
scaffolding concept (Kim and Hannafin 2011), as well as empirical knowledge,
particularly results from a workshop and tasks characteristics of teaching assistants.
Table 2. Documentation of the Design Knowledge based on Gregor and Jones (2007)
Discussion and Conclusion
The overall research goal of this DSR study was to design a chatbot-based learning system that aims at
supporting students to learn to code in university courses. In particular, the aim was t0 support novice
programmers in introductory programming courses in times when no teaching assistant or no in-class
support is available (e.g., due to resource constraints). To this aim, we derived requirements from scientific
literature (especially the ICAP framework by Chi and Wylie 2014), a focus group workshop and the task
characteristics of teaching assistants. Based on these inputs from the knowledge base (rigor) and the
current teaching practice (relevance), we deduced design principles and implemented our software artifact
called Coding Tutor in an iterative process by adapting the three cycle design science research approach
(Hevner et al. 2004; Hevner 2007). During this design process, we evaluated the results of each design
iteration and successfully terminated our design process after two iterations.
Besides the software artifact as a situated implementation of a chatbot-based learning system, we contribute
design knowledge to the scientific knowledge base. We systematically deduced design knowledge as
documented in our last step of the design process (see Table 2 in step 8). Due to the systematic procedure,
we aimed at generating a satisfying design contribution as indicated by Gregory and Muntermann (2014).
The resulting design knowledge is not only valid for our specific case but can also be transferred to further
use cases in programming education. For instance, it is easily possible to apply the concept of Coding Tutor
in courses that deal with other programming languages than Java that we used in our case. To this aim,
only the knowledge base of learning paths, learning objectives as well as the programming tasks need to be
adapted. The design principles of form and function and the overall system design don’t need to be adapted
for those use cases. Furthermore, it is also possible to transfer the design knowledge to other use cases that
target the training of procedural knowledge in other areas of (IS) education. For instance, if the application
of other IS-related software tools like UML-modelling should be trained, a similar chatbot-based learning
system can be used. However, in this case, the system design needs to be revised partially. Particularly the
coding environment needs to be replaced, e.g., by an UML-modelling tool. Due to this transferability of our
design knowledge, our research does not only provide a Level 1 DSR contribution by showing a situated
artifact implementation but also provides a nascent design theory (Level 2 contribution) (Gregor and
Hevner 2013).
In addition to these contributions to the scientific knowledge base, our results are also valuable for the
practical use of chatbot-based learning system in (IS) education. Our results indicate that the state-of-the-
art chatbot-technology is suited to design complex chatbot-based learning systems that are able to support
students individually even in complex teaching tasks when procedural knowledge and advanced cognitive
processes (like analyzing; Anderson and Krathwohl 2001) should be trained. We have thus shown that
more advanced support possibilities in programming education can be implemented using chatbot-based
learning systems in contrast to common e-learning approaches that are currently used in university
Despite these contributions to the scientific knowledge base and the practical application of chatbot-based
learning systems, limitations should be considered. For the aim of this study, we focused our research scope
on introductory programming courses. Even though it is reasonable to assume that the transferability to
other cases is possible without major changes, we cannot prove it with our research design. Additionally,
we focused our research on generating design knowledge by applying a DSR approach. Due to this, we
focused on deducing and evaluating design principles and assessing the students’ evaluation. For future
research, particularly analyzing the long-term effect of using Coding Tutor or a similar chatbot-based
learning system in IS education pledges interesting insights into the effectiveness and long-term acceptance
of chatbot-based learning systems. In doing so, not only the effects of these technology-enhanced learning
Design and Evaluation of a Chatbot-based Learning System
Fortieth International Conference on Information Systems, Munich 2019 16
systems on the learning behavior of students can be analyzed, but also (necessary) changes in the teaching
concept can be surveyed. Additionally, potential downsides of using chatbot-based learning systems should
be analyzed. Even though we do not plan to replace human teaching assistants in our learning setting but
provide students with our Coding Tutor as an additional learning opportunity, others might think of doing
this. Obviously, experienced human teaching assistants might be able to assist students who have
difficulties with solving programming tasks better than our Coding Tutor. Nevertheless, introducing a
chatbot-based learning system might still be beneficial in those cases as the workload of human teaching
assistants might be reduced as the Coding Tutor system might be able to answer most questions
automatically. Human teaching assistants are then able to focus on answering the remaining, more difficult
Ace 2019. Ace - The High Performance Code Editor for the Web. Accessed 29 March
Albrecht, E., Gumz, F., and Grabowski, J. 2018. “Experiences in Introducing Blended Learning in an
Introductory Programming Course,” in Proceedings of the 3rd European Conference Software
Engineering Education, J. Mottok (ed.), New York: ACM, pp. 93101.
Anderson, L. W., and Krathwohl, D. R. (eds.) 2001. A taxonomy for learning, teaching, and assessing: A
revision of Bloom's taxonomy of educational objectives, New York: Longman.
AXA Shared Services Spain S.A. 2019. axa-group/nlp.js. Accessed 4
April 2019.
Bey, A., Jermann, P., and Dillenbourg, P. 2018. “A Comparison between Two Automatic Assessment
Approaches for Programming: An Empirical Study on MOOCs,” Journal of Educational Technology &
Society (21:2), pp. 259272.
Bloom, B. S., Engelhart, M. B., Furst, E. J., Hill, W. H., and Krathwohl, D. R. 1956. Taxonomy of educational
objectives. The classification of educational goals (Handbook I. Cognitive Domain), New York:
Breitner, J., Hecker, M., and Snelting, G. 2017. “Der Grader Praktomat,” in Automatisierte Bewertung in
der Programmierausbildung, O. J. Bott, P. Fricke, U. Priss and M. Striewe (eds.): Waxmann Verlag
GmbH, pp. 159172.
Chi, M. T. H., and Wylie, R. 2014. “The ICAP Framework: Linking Cognitive Engagement to Active Learning
Outcomes,” Educational Psychologist (49:4), pp. 219243.
Crow, T., Luxton-Reilly, A., and Wuensche, B. 2018. “Intelligent tutoring systems for programming
education: a systematic review,” in Proceedings of the 20th Australasian Computing Education
Conference, Brisbane, pp. 5362.
Daradoumis, T., Marquès Puig, J. M., Arguedas, M., and Calvet Liñan, L. 2019. “Analyzing students'
perceptions to improve the design of an automated assessment tool in online distributed
programming,” Computers & Education (128), pp. 159170.
Davis, F. D., Bagozzi, R. P., and Warshaw, P. R. 1989. “User Acceptance of Computer Technology: A
Comparison of Two Theoretical Models,” Management Science (35:8), pp. 9821003.
Goedicke, M., Striewe, M., and Balz, M. 2018. Computer Aided Assessments and Programming Exercises
with JACK: DuEPublico: Duisburg-Essen Publications Online, University of Duisburg-Essen,
Graesser, A. C., Cai, Z., Morgan, B., and Wang, L. 2017. “Assessment with computer agents that engage in
conversational dialogues and trialogues with learners,” Computers in Human Behavior (76), pp. 607
Gregor, S., and Hevner, A. R. 2013. Positioning and presenting design science research for maximum
impact,” MIS Quarterly (37:2), pp. 337356.
Gregor, S., and Jones, D. 2007. “The Anatomy of a Design Theory,” Journal of the Association for
Information Systems (8:5), pp. 312335.
Gregory, R. W., and Muntermann, J. 2014. “Research Note —Heuristic Theorizing: Proactively Generating
Design Theories,” Information Systems Research (25:3), pp. 639653.
Hevner, A. 2007. “A Three Cycle View of Design Science Research,” Scandinavian Journal of Information
Systems (19:2), pp. 8792.
Design and Evaluation of a Chatbot-based Learning System
Fortieth International Conference on Information Systems, Munich 2019 17
Hevner, A., March, S., Park, J., and Ram, S. 2004. “Design Science in Information Systems Research,”
Management Information Systems Quarterly (28:1), pp. 75105.
Hien, H. T., Cuong, P.-N., Le Nam, N. H., Nhung, Ho Le Thi Kim, and Le Thang, D. 2018. “Intelligent
Assistants in Higher-Education Environments: The FIT-EBot, a Chatbot for Administrative and
Learning Support,” in Proceedings of the Ninth International Symposium on Information and
Communication Technology, New York: ACM, pp. 6976.
Hobert, S., and Meyer von Wolff, R. 2019. “Say Hello to Your New Automated Tutor – A Structured
Literature Review on Pedagogical Conversational Agents,” in 14th International Conference on
Wirtschaftsinformatik, pp. 301314.
Ihantola, P., Ahoniemi, T., Karavirta, V., and Seppälä, O. 2010. “Review of recent systems for automatic
assessment of programming assignments,” in Proceedings of the 10th Koli Calling International
Conference on Computing Education Research - Koli Calling '10, C. Schulte and J. Suhonen (eds.), New
York: ACM Press, pp. 8693.
Kim, M. C., and Hannafin, M. J. 2011. “Scaffolding problem solving in technology-enhanced learning
environments (TELEs): Bridging research and theory with practice,” Computers & Education (56:2),
pp. 403417.
Mikic, F. A., Burguillo, J. C., Llamas, M., Rodriguez, D. A., and Rodriguez, E. 2009. “CHARLIE: An AIML-
based chatterbot which works as an interface among INES and humans,” in 2009 EAEEIE Annual
Conference. 2009, pp. 16.
Moore, M. G. 1993. “Three types of interaction,” in Distance Education: New Perspectives, K. Harry, M.
John and D. Keegan (eds.), London: Routledge, pp. 1924.
Müller, S., Bergande, B., and Brune, P. 2018. “Robot Tutoring,” in Proceedings of the 3rd European
Conference Software Engineering Education, J. Mottok (ed.), New York: ACM, pp. 4549.
OECD 2018. Education at a Glance 2018: OECD Indicators, Paris: OECD Publishing.
Passier, H. 2017. “The role of Procedural Guidance in Software Engineering Education,” in Proceedings of
the International Conference on the Art, Science, and Engineering of Programming - Programming
'17, J. B. Sartor, T. D'Hondt and W. de Meuter (eds.), New York: ACM Press, pp. 12.
Popat, S., and Starkey, L. 2019. “Learning to code or coding to learn? A systematic review,” Computers &
Education (128), pp. 365376.
Rietsche, R., Duss, K., Persch, J. M., and Söllner, M. 2018. “Design and Evaluation of an IT-based Formative
Feedback Tool to Forster Student Performance,” in ICIS 2018 Proceedings, pp. 117.
Sim, T. Y., and Lau, S. L. 2018. “Online Tools to Support Novice Programming: A Systematic Review,” in
2018 IEEE Conference on e-Learning, e-Management and e-Services (IC3e), IEEE, pp. 9196.
Tamayo-Moreno, S., and Perez-Marin, D. 2016. “Adapting the design and the use methodology of a
pedagogical conversational agent of secondary education to childhood education,” in 2016
International Symposium on Computers in Education (SIIE), Piscataway, NJ: IEEE, pp. 16.
Tegos, S., Demetriadis, S., and Karakostas, A. 2011. “MentorChat: Introducing a Configurable
Conversational Agent as a Tool for Adaptive Online Collaboration Support,” in 2011 15th Panhellenic
Conference on Informatics, pp. 1317.
van de Pol, J., Volman, M., and Beishuizen, J. 2010. “Scaffolding in Teacher–Student Interaction: A Decade
of Research,” Educational Psychology Review (22:3), pp. 271296.
Venable, J., Pries-Heje, J., and Baskerville, R. 2016. “FEDS: a Framework for Evaluation in Design Science
Research,European Journal of Information Systems (25:1), pp. 7789.
Vial, G., and Negoita, B. 2018. “Teaching Programming to Non-Programmers: The Case of Python and
Jupyter Notebooks,” ICIS 2018 Proceedings, pp. 117.
Wallace, R. S. 2009. “The Anatomy of A.L.I.C.E,” in Parsing the Turing Test: Philosophical and
Methodological Issues in the Quest for the Thinking Computer, R. Epstein, G. Beber and G. Roberts
(eds.), Dordrecht: Springer Netherlands, pp. 181210.
Weizenbaum, J. 1966. “ELIZA - a computer program for the study of natural language communication
between man and machine,” Communications of the ACM (9:1), pp. 3645.
Winkler, R., and Söllner, M. 2018. “Unleashing the Potential of Chatbots in Education: A State-Of-The-Art
Analysis,” in Academy of Management Annual Meeting (AOM), Chicago, USA.
Zeller, A. 2000. “Making students read and review code,” in Proceedings of the 5th annual
SIGCSE/SIGCUE ITiCSEconference on Innovation and technology in computer science education -
ITiCSE '00, J. Tarhio, S. Fincher and D. Joyce (eds.), New York: ACM Press, pp. 8992.
... A different study used a goal-fulfillment map similar to a finite-state machine to map a dialog and allow longer and more dynamic interactions in customer support settings (Chakrabarti & Luger, 2015). A similar study implemented a finite-state machine chatbot to provide support for complex tasks such as e-learning and education (Hobert, 2019b). In this study, the chatbot continuously adjusted its current dialog state based on users' intent and triggered corresponding actions. ...
... In specific, many existing DSR studies have evaluated a specific chatbot's design principles and requirements to identify improvement potentials or generalize their findings (e.g., Feine et al., 2020;Gnewuch et al., 2017;Hobert, 2019b;Tavanapour et al., 2019;Winkler et al., 2020). Also, Carayannopoulos (2018) evaluated how chatbots' design elements and capabilities can help users navigate complex new situations and quickly provide them with necessary information. ...
... In doing so, chatbots should assess the current state and decide which path to follow and which steps, depending on the given inputs or decisions, to conduct. If needed, a chatbot should implement step-by-step guidance (Hobert, 2019b). Thus, the systems must be able to adapt the process or itself to the process's actual needs and current state. ...
... Já, Hobert [Hobert 2019] destaca queé essencial que o aluno seja encorajado a praticar o ato de programar para sedimentar os conceitos aprendidos em aulas teóricas e práticas. O processo criativo de programar demanda muito dos estudantes, essa situação se agrava quando a orientação por parte do professor carece de alguma maneira, por exemplo: as restrições do tempo destes profissionais e consequentemente a impossibilidade de acompanhar de forma contínua o progresso individual de cada aluno. ...
... Chatterbots geralmente buscam conduzir conversações de uma maneira proativa ou reativa, a partir da interação com o usuário [Hobert 2019]. A utilização de chatterbots em um contexto educacional abrange diversos espectros, como: suporte pedagógico [Clarizia et al. 2018] e sistemas de perguntas e respostas [Sinha and Basak 2020]. ...
... Um chatterbot tem como objetivo conduzir conversações de maneira similar a seres humanos [Hobert 2019]. Esse objetivoé atingido através da aplicação de metodologias de diferentesáreas como Processamento de Língua Natural (PLN) ou Aprendizado de Máquina (AM). ...
Conference Paper
A natureza abstrata dos conceitos de programação se mostra como um dos fatores que explicam as dificuldades que muitos estudantes encontram na absorção destes conteúdos. Além disso, os estudantes precisam se adaptar as diferentes sintaxes das linguagens de programação e entender como trechos de código se relacionam. Logo, abordagens alternativas de ensino de programação, principalmente em estágios iniciais, são necessárias. Tendo em vista este contexto, o objetivo deste trabalho é desenvolver um protótipo de chatterbot para auxiliar no ensino de programação na linguagem C. O protótipo de chatterbot desenvolvido, denominado C-BOT, é capaz de fornecer conhecimento sobre programação e testar os conhecimentos do usuário sobre o funcionamento de alguns algoritmos.
... Digital assistants provide solutions to support individuals and organizations in the dynamic conditions and demands arising from DT (Murphy, 2020). Various HEI, for instance, use virtual assistants, also known as chatbots, to supplement existing offers (Bouaiachi et al., 2014;Hobert, 2019;Ranoliya et al., 2017). According to Knote et al. (2019), chatbots are one of five archetypes of smart personal assistants (SPA). ...
... This intelligent human-computer conversation allows giving answers, hints, and suggestions for the user's questions (Meyer von Wolff et al., 2020;Mikic et al., 2009;Winkler & Söllner, 2018). Researchers and practitioners have already introduced many different assistants, which, for example, support students to learn to write program code, strengthen their argumentation skills, answer FAQs, or support study course selections (Bouaiachi et al., 2014;Hobert, 2019;Ranoliya et al., 2017;Wambsganss et al., 2020). ...
Full-text available
During the COVID-19 pandemic, the availability of online higher education programs and tools has grown rapidly. One example is an individual digital study assistant (IDSA) for students, which provides functionalities to train self-regulation skills, to engage with own educational goals and to offer automated, first-level support to higher education institution (HEI) units and employees. An IDSA further can guide students through HEI and their administration. But, what are the critical success factors (CSF) and challenges for an IDSA? We deduce these using a mixed methods approach with one quantitative student survey, two rounds of interviews with various HEI experts, and a literature review. We classified our results according to the information system (IS) success model of DeLone & McLean (2016). Our results and findings show, e.g., that skilled and reliable HEI personnel, well-organized and useful content, cross-platform usability, ease of use, and students’ social factors are essential. Attractive IDSA functionalities are a major challenge because students use many apps, daily. Based on our CSF and challenges, we deduce theoretical and practical recommendations and develop a further research agenda.
... The evolution of intelligent tutoring systems, particularly chatbots, has prompted researchers to utilize this technology as a means to solve the problem of insufficient personalised support for inexperienced programmers [6], [9]. ...
... Researchers have approached this problem in several ways. Starting from conventional methods such as text-based tutorials, creating guidelines for effective user experiences [2], to introducing intelligent technologies such as chatbots [3] and conversational agents [4] to name a few. In this paper, I present the concept of a multi-layered framework that (1) draws from simple to sophisticated data representations and algorithmic approaches in the back-end and which (2) captures the user's contexts and needs in the front-end to deliver effective support, accordingly. ...
Full-text available
Using a complex software system can pose a difficult challenge for its users, especially for those without technical backgrounds. This paper outlines a concept to deploy Recommender Systems coupled with Intelligent User Interfaces as an effective mechanism to guide a user of a complex software system through context-aware suggestions. More concretely, we focus on an authoring tool that our research draws motivation from and will find opportunities for evaluation---it is a platform that aims at empowering medical professionals to create virtual reality experiences to help their work. This paper elaborates on the envisioned concept and the plan for implementation and evaluation.
Full-text available
Besides increasing transparency and demonstrating awareness of the author, self-reported limitations enable other researchers to effectively learn from, build on, validate, and extend the original work. However, this topic is understudied in information systems design science research (IS DSR). The study has assessed 243 IS DSR papers published in the period 2013–2022 and built a typology of the 19 most relevant limitations, organized into four categories: (1) Input Knowledge and Technology, (2) Research Process, (3) Resulting Artifact, and (4) Design Knowledge. Further, the contribution suggests actions to mitigate each type of limitation throughout the entire IS DSR project lifecycle. The authors have also created guidelines to report the limitations in a useful way for knowledge accumulation. The proposed typology and guidelines enable reviewers and editors to better frame self-reported limitations, assess rigor and relevance more systematically, and provide more precise feedback. Moreover, the contribution may help design researchers identify, mitigate, and effectively communicate the uncertainties inherent to all scientific advances.
Full-text available
The programming domain is a challenging field of study for most novice programmers. Novices have to learn different skills in the Programming 1 course at the same time. This study prepared and introduced a chatbot in the course. The chatbot emphasizes syntax and semantics, problem solving strategies, course details, and common programming syntax and semantics errors. The influence of the chatbot on students' learning in the Programming 1 course was determined by performing a survey and analyzing the final grades of the course. The survey results reveal that the chatbot approach was appreciated by both female and male students in the Programming 1 course. The chatbot helped novices in comprehending the fundamental programming concepts and understanding common syntax and semantic errors. The analysis of final grades shows that female students perform well in the medium and low achiever categories, and male students show better performance in the high achiever category.
Many organizations are pursuing the implementation of chatbots to enable automation of service processes. However, previous research has highlighted the existence of practical setbacks in the implementation of chatbots in corporate environments. To gain practical insights on the issues related to the implementation processes from several perspectives and stages of deployment, we conducted semi-structured interviews with developers and experts of chatbot development. Using qualitative content analysis and based on a review of literature on human computer interaction (HCI), information systems (IS), and chatbots, we present an implementation framework that supports the successful deployment of chatbots and discuss the implementation of chatbots through a user-oriented lens. The proposed framework contains 101 guiding questions to support chatbot implementation in an eight-step process. The questions are structured according to the people, activity, context, and technology (PACT) framework. The adapted PACT framework is evaluated through expert interviews and a focus group discussion (FGD) and is further applied in a case study. The framework can be seen as a bridge between science and practice that serves as a notional structure for practitioners to introduce a chatbot in a structured and user-oriented manner.
Nowadays, universities are using more technologies dealing with students’ interactions. The chatbot supported with Artificial Intelligence (AI) – Deep Learning (DL) technology exhibited a better ability and efficiency in various assistant situations. However, the effectiveness of the education chatbot is still not satisfactory. This paper proposes a new chatbot framework that integrated students’ learning profiles and enhanced chatbot components to improve student interaction. The new chatbot framework uses knowledge from the PS2CLH model and AI - DL to build a proactive chatbot for assisting students’ learning on their academic subjects and controllable learning factors. One of the principal novelties of the chatbot framework lies in the student-lecturer/assistant facilitator. The proactive chatbot applies multimodality to students’ learning process to retain students’ attention and explain the content in different ways using text, image, video, and audio to assist students and improve their learning experience effectively. Furthermore, the chatbot proactively suggests controllable learning factors for students to work on, improving their academic performance. The testing results demonstrated that the proactive chatbot offered sound accuracy and more effective learning support than other chatbots.KeywordsAI-Deep Learning chatbotFrameworkStudents’ assistant
Conference Paper
Full-text available
The purpose of this paper is to discuss about smart learning environments and present the FIT-EBot, a chatbot, which automatically gives a reply to a question of students about the services provided by the education system on behalf of the academic staff. The chatbot can play the role of an intelligent assistant, which provides solutions for higher-education institutions to improve their current services, to reduce labor costs, and to create new innovative services. Various artificial intelligence techniques such as text classification, named entity recognition are used in this work to enhance the system performance.
Conference Paper
Full-text available
Today’s world is changing faster than ever before. Students need to constantly acquire new knowledge and the required skills shift more and more to higher order thinking skills. One major approach to support the students in this shift is through formative feedback, which initiates self-regulated learning processes. However, given the constant rise of student numbers while public spending remains constant, one of the major challenges is how to effectively provide formative feedback in large-scale lectures. The aim of our project is to develop an IT-tool which provides students with formative feedback to increase their performance in large-scale lectures. Our results show that through the use of the IT-tool the objective and perceived student performance can be significantly increased. By documenting our design knowledge according to the eight components of Gregor and Jones (2007), we take a first step towards creating a design theory for formative feedback tools.
Conference Paper
Full-text available
Chatbots are becoming a ubiquitous trend in many fields such as medicine, product and service industry, and education. Chatbots are computer programs used to conduct auditory or textual conversations. A growing body of evidence suggests that these programs have the potential to change the way students learn and search for information. Especially in large-scale learning scenarios with more than 100 students per lecturer, chatbots are able to solve the problem of individual student support. However, until now, there has been no systematic, structured overview of their use in education. The aim of this paper is therefore to conduct a systematic literature review based on a multi-perspective framework, from which we have derived initial search questions, synthesized past research, and highlighted future research directions. We reviewed titles and abstracts of 1405 articles drawn from management, education, information systems, and psychology literature before examining and individually coding a relevant subset of 80 articles. The results show that chatbots are in the very beginning of entering education. Few studies suggest the potential of chatbots for improving learning processes and outcomes. Nevertheless, past research has revealed that the effectiveness of chatbots in education is complex and depends on a variety of factors. With our literature review, we make two principal contributions: first, we structure and synthesize past research by using an input-process-output framework, and secondly, we use the framework to highlight research gaps for guiding future research in that area.
Conference Paper
Full-text available
A variety of intelligent tutoring systems have been created for the purpose of teaching computer programming. Most published literature focuses on systems that have been developed to teach programming within tertiary courses. A majority of systems have been developed to teach introductory programming concepts; other systems tutor more specific aspects of programming like scope or recursion. Literature reports that these systems address many of the difficulties associated with teaching programming to novices; however, individual systems vary greatly, and there is a large range of supplementary features developed in these systems. Most intelligent programming tutors involve some form of interactive programming exercises, but the use of supplementary features like plans, quizzes and worked solutions vary greatly between different systems. This systematic review reports key information about existing systems and the prevalence of different features within them. An overview of how supplementary features are integrated into these systems is given, along with implications for how intelligent programming tutors could be improved by supporting a wider range of supplementary features.
The resurgence of computer programming in the school curriculum brings a promise of preparing students for the future that goes beyond just learning how to code. This study reviewed research to analyse educational outcomes for children learning to code at school. A systematic review was applied to identify relevant articles and a thematic analysis to synthesise the findings. Ten articles were included in the synthesis and an overarching model was developed which depicts the themes. The results demonstrate that although students are learning to code, a range of other educational outcomes can be learnt or practiced through the teaching of coding. These included mathematical problem-solving, critical thinking, social skills, self-management and academic skills. The review also identified the importance of instructional design for developing these educational outcomes through coding.
Designing an automated assessment tool in online distributed programming can provide students with a meaningful distributed learning environment that improves their academic performance. However, it is a complex and challenging endeavor that, as far as we know, has not been investigated yet. To address this research gap, this work presents a new automated assessment tool in online distributed programming, called DSLab. The tool was evaluated in a real long-term online educational experience by analyzing students’ perceptions with the aim of improving its design. A quantitative analysis method was employed to collect and analyze data concerning students’ perceptions as to whether using the DSLab tool was really a worthwhile experience. Our study shows that the DSLab tool includes acceptable utility and efficiency features. It also identifies factors that influence current design efficiency with the aim of improving DSLab design by suggesting new functionalities and ideas.
Conference Paper
Cognitive Systems and Artificial Intelligence (AI) are getting increasingly popular in many application domains. An upcoming field for AI are educational applications. Natural language processing (NLP) in combination with cognitive systems may provide many interesting possibilities regarding automatic advice and tutoring of students, e.g. by answering students’ questions. Therefore, this paper investigates the possibilty of using a cognitive software, namely IBM Watson, as a tool for developing a virtual tutor to answer common questions of students during an introductory Java programming course at a German University of Applied Sciences. The prototype software is evaluated using a qualitative study among participants. While the results indicate the future potential of such systems in education, in its current form it could only replace a tutor in very limited and specific scenarios.
Conference Paper
As students shall be able to apply their gained theoretical knowledge practically, usually exercises are offered. But assessment of the exercises is a very time-consuming task and due to the increasing number of students often even infeasible. Though programming exercises can be efficiently tested against expected output, the assessment systems often only deliver feedback regarding the correctness and sometimes additionally which test cases fail. But so far, these systems are not able to identify the reason for the error. Therefore, there should also be human tutors available with which students can discuss their solution and which can help them in identifying their misconceptions. In this paper, we describe the experiences we made when we introduced blended learning to our introductory programming course, which problems occurred, and how we plan to deal with the problems.