TAŞCI, T., PARLAK, Z., KİBAR, A., TAŞBAŞI, N., & CEBECİ, H. İ. (2014). A Novel Agent-Supported Academic Online Examination System. Educational Technology & Society, 17(1), 154–168.
A Novel Agent-Supported Academic Online Examination System
Tuğrul TAŞCI1*, Zekeriya PARLAK2, Alpaslan KİBAR3, Nevzat TAŞBAŞI1 and H.
İbrahim CEBECİ1
1Department of Computer Engineering, Faculty of Informatics, Sakarya University, Turkey // 2Department of Mechanical Engineering, Faculty of Engineering, Sakarya University, Turkey // 3Distance Education Center, Sakarya University, Turkey // ttasci@sakarya.edu.tr // zparlak@sakarya.edu.tr // kibar@sakarya.edu.tr // ntasbasi@sakarya.edu.tr // hcebeci@sakarya.edu.tr
*Corresponding author
(Submitted August 9, 2012; Revised December 8, 2012; Accepted March 13, 2013)
ABSTRACT
Proper execution of exams aimed at assessment and evaluation is of critical importance in Learning Management Systems (LMS). Problems arising from human-centered errors or technical difficulties may call the exams into question, and with them the reliability and efficiency of the distance education system. The online examination system architecture proposed in this paper provides integrated management of the main functions, such as question pool creation and update, exam authoring, execution and evaluation, and management of student feedback, while ensuring that analysis reports on questions and exams, created by an intelligent agent, are used in the decision-making processes. Analyses conducted on the distance education (DE) system of Sakarya University showed that the proposed intelligent-agent-supported online exam system detects most of the problems that arise and enables instructors to decide more easily and in a shorter time. The system's flexible structure allows it to be extended with additional intelligent features for resolving different problems, and it can also be adapted to other institutions that use online examination systems.
Keywords
Architectures for educational technology system, Distance education and telelearning, Evaluation methodologies,
Intelligent tutoring systems
Introduction
Following rapid developments in information technology, online education has gained importance as an alternative to traditional teaching models. Because online education is independent of time and space constraints, it has brought a rapid increase in student numbers. For the evaluation of such large numbers of students, online testing methods are used more than classical methods (Zhang et al., 2006). Structural problems of classical evaluation, such as requiring too much time (Gawali & Meshram, 2009) and not providing sufficient means for student and instructor feedback (Guo & Mao, 2010; Zhang et al., 2006), can be eliminated thanks to the functions of online test systems, such as efficient management of the evaluation process, a feedback system, and the diversity and pace of student evaluation (Tallent-Runnels et al., 2006).
Studies on online examination systems have constituted a major part of the research on online education in recent years (Tallent-Runnels et al., 2006). Many of these studies provide theoretical models of exam systems along with application approaches. Mustakerov & Borissova (2011) reported that the two-layered examination system they proposed provided significant benefits owing to its flexible adjustment approach during the testing process, and asserted that their system could be used for both formal and informal evaluation. In other theoretical studies, examination systems are designed with three or more layers using broader architectures. Although named differently across studies, such examination systems are typically composed of a preliminary preparation layer covering administrator appointments and student enrollments (Brusilovsky & Miller, 1999; Gawali & Meshram, 2009; Keleş et al., 2009; Aye & Thwin, 2008; Guo et al., 2008; Zhang et al., 2006), an exam preparation layer consisting of questions, question pools and exams (Brusilovsky & Miller, 1999; Gawali & Meshram, 2009; Keleş et al., 2009; Jun, 2009; Aye & Thwin, 2008; Guo et al., 2008; Li & Wu, 2007; Zhang et al., 2006), and an evaluation layer where exam results are calculated (Brusilovsky & Miller, 1999; Gawali & Meshram, 2009; Keleş et al., 2009; Jun, 2009; Li & Wu, 2007; Zhang et al., 2006). Aimin & Jipeng (2009) designed the three layers as administrator, student and instructor, whereas Jin & Ma (2008) built the exam architecture with student and instructor layers alone.
Appropriate management of feedback received from students, together with the reporting and analysis performed on exam results, is important for proper execution of examination systems. In a study on a collaborative learning process, Wilson (2004) investigated how much the feedback process affects student performance by comparing online students with those receiving classical face-to-face education, and emphasized that student feedback may make the evaluation process more efficient. Crisp & Ward (2008), on the other hand, stated that collecting feedback in systems with very many students can only be achieved with the help of online systems. Shen et al. (2001) emphasized the importance of adjustment and update processes based on student feedback for the proper execution of online exams, and designed an intelligent monitoring system with automatic reporting and filtering capability. In another study emphasizing the importance of monitoring systems (Tinoco et al., 1997), a module in which exam results are evaluated and reported, and in which comments are derived from these reports through analysis, is added to the examination system architecture; analyses of exam results are also included. Hang (2011) stated that an online examination system architecture must cover re-examination, automatic scoring, question pool updates, flexible question design and system security.
The architectural flexibility and strong information technology infrastructure of online examination systems make them well suited to processes based on intelligent approaches. In this context, there are many studies in the literature based on, or supported by, intelligent approaches. Keleş et al. (2009) added two intelligent agents, named Advisor and Planner, to their learning platform ZOSMAT, which they designed to support students both online and face-to-face. The Advisor agent makes proposals mainly by analysing test results and reports and by taking educational science into consideration, while the Planner agent matches the information received from the Advisor agent with the system resources. In the decision-making process, the Advisor agent employs a rule-based expert system approach. In the agent-based study by Jin & Ma (2008), formation of the exam and evaluation of the results are carried out by two different agents. Alexakos et al. (2006) likewise carried out the evaluation process with intelligent agents and used genetic algorithms along with Bayesian networks for these processes. Unlike other studies, Gawali & Meshram (2009) designed the examination system in its entirety, not only in part, on the basis of three agents: Stationary, Mobile and Main. Aimin & Jipeng (2009), on the other hand, incorporated intelligent approaches into the exam system architecture through linear algorithms making random choices in the question selection and exam formation processes, without using agent technology.
Proper evaluation of students affects their learning performance (Wang et al., 2008). It can be said that eliminating human- and/or system-centered errors in examination systems, and running them well, indirectly affects student performance and satisfaction. In this regard, it is important to design an auxiliary system or an intermediary layer that identifies possible difficulties in the main examination processes through reports and student feedback and generates solution proposals. Given the effective use of information technologies in this process, intelligent agents can be used for the reporting and filtering work in this intermediate layer (Shen et al., 2001). The system should also be able to give the student another exam entry right in case of problems, update erroneous or deficient questions in the question pool, update scores after a correction, and support components such as flexible question design (Hang, 2011). Considering all of these findings from the literature, there is an obvious need for the design and implementation of an integrated and flexible examination system architecture with an intelligent intermediate layer capable of producing solutions to varying problems.
In this study, an integrated online exam system architecture is proposed that includes Administration, Implementation and Finalization main layers and a Support Layer (SL). This architecture ensures integrated management of the concurrent changes triggered across all related layers by the decisions made in the SL (such as updating question text and difficulty level, giving the student another exam entry right, and updating scores and reports). Also, a Monitoring Agent (MA) is designed to generate reports by analyzing the online exam results, enabling instructors to review all possible problems and make decisions more easily, in order to reduce the problems arising in the operation of the online examination system to a negligible level.
The primary contribution of this study is the proposed online examination system with an integrated and flexible structure. Unlike other systems described in the literature, it manages, within a single module, functions such as collecting feedback from students, generating reports and submitting them to the administrators, and providing proposals in the decision-making processes on the basis of analysis.
Agent-supported academic online examination system architecture
The Agent-Supported Academic Online Examination System (ASOES), the subject of this study, is designed as a basic component of the Learning Management System AkademikLMS (SAUPORT), which executes the online exams of the distance education programs of Sakarya University. While developing ASOES, the goal was to alleviate the potential problems that could be encountered in online exams and to make the exam management process easier.
ASOES, developed to ensure effective management of the assessment and evaluation processes of academic institutions that run many distance education programs and to offer ease of use to students and instructors, has an architecture comprising three main layers and one support layer, as shown in Figure 1. Operation of ASOES starts with an integration application that automatically replicates course, user and user-course enrollment information from the databases of an external system to the ASOES database at regular intervals. The Administration Layer (AL) covers operations such as creating the distance education programs, their courses and their users, as well as defining course-user associations. Questions and exam question pools are created, and the exams are published for registered students, in the Implementation Layer (IL). The Finalization Layer (FL) covers calculating and updating exam results and creating the related reports. The Support Layer (SL) contains a Help Desk (HD), operated on the basis of problem feedback, and an exam monitoring module that analyzes the system reports and forwards the results to the Help Desk.
Figure 1. ASOES architecture
Administration layer
Figure 2. Administration layer structure
Within the AL, using the information automatically replicated from the external system to the ASOES database at regular intervals, the distance education programs are created, and courses and users are registered in these programs. Data on student-course and instructor-course associations are then used for matching. Associations of students with the courses they take are defined by Enrollment records, and associations of instructors with the courses they give are defined by Management records (see Figure 2). Each student must be enrolled in the course related to an exam in order to take the exam and to apply to the HD about problems; this registration information is sent to the Publishing process in the IL and to the Help Desk process in the SL through the Enrollment record. Likewise, in order for an instructor to create an exam and manage the HD, information on the related course is sent to the Authoring process in the IL and to the Help Desk process in the SL through the Management record.
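The role of the Enrollment and Management records can be illustrated with a minimal sketch; the record fields and routing functions below are hypothetical, since the paper does not prescribe an implementation:

    from dataclasses import dataclass

    @dataclass
    class Enrollment:
        # Associates a student with a course he/she takes (created in the AL)
        student_id: int
        course_id: int

    @dataclass
    class Management:
        # Associates an instructor with a course he/she gives (created in the AL)
        instructor_id: int
        course_id: int

    def route_enrollment(record: Enrollment, publishing, help_desk):
        """Enrollment records feed the Publishing process (IL) and the Help Desk (SL)."""
        publishing.register_student(record)
        help_desk.register_student(record)

    def route_management(record: Management, authoring, help_desk):
        """Management records feed the Authoring process (IL) and the Help Desk (SL)."""
        authoring.register_instructor(record)
        help_desk.register_instructor(record)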
Implementation layer
Figure 3. Implementation layer structure
As seen in Figure 3, the IL is composed of the Authoring process, where the questions and exam question pools are created and their features defined, and the Publishing process, where the exams are published.
Users with Instructor authority, arriving through the Management record from the AL, add questions, together with their features, to the question pools of their courses in the Question Pool Creation sub-process. Features such as question type, difficulty level, keyword and shuffle are entered by the instructor for each question.
The Exam Formation sub-process is the other part of the Authoring process. In this process, basic exam features optionally come from the AL for each distance education program. If this information is not provided, it is created by the instructor as Custom Features within the process. The exam question pool is then created and recorded, together with the exam features, in the Exam Features and Exam Question Pool database.
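A rough sketch of the question features and the Exam Formation step described above; the field and function names are illustrative assumptions, not taken from ASOES:

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Question:
        # Features entered by the instructor in the Question Pool Creation sub-process
        text: str
        choices: List[str]
        correct_choice: int      # index of the correct option
        question_type: str       # e.g., "multiple choice"
        difficulty_level: str    # "Easy", "Moderate" or "Hard"
        keyword: str
        shuffle: bool            # whether the options may be shuffled

    @dataclass
    class Exam:
        # Basic exam features come from the AL or are set as Custom Features
        course_id: int
        duration_minutes: int
        question_pool: List[Question] = field(default_factory=list)

    def form_exam(course_id: int, duration: int,
                  course_pool: List[Question], selected: List[int]) -> Exam:
        """Exam Formation sub-process: build the exam question pool from the course pool."""
        exam = Exam(course_id=course_id, duration_minutes=duration)
        exam.question_pool = [course_pool[i] for i in selected]
        return exam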
If, after an exam, a decision is taken in the Authoring process to change a question's text, answer choice or category as a result of problems reported by the MA or submitted to the HD through student feedback, these items are updated by the instructors so that problems in subsequent exams are prevented.
In the Publishing process, the exam question pool and the defined exam features are combined with the Enrollment record from the AL, and the exam is published for each student who starts a session.
Support layer
In an online education system, tens of thousands of online exam sessions may be run, and it is always possible to face human-centered or technical problems during online exams. Auxiliary tools are therefore needed to overcome the difficulty of administering the exams and to identify and resolve the problems that arise. The SL, designed for this purpose, is used for smooth execution of the exams (see Figure 4). The SL is composed of the MA process, which creates reports, and the HD process, where these reports and student feedback are evaluated.
Figure 4. Support layer structure
Monitoring agent
The literature indicates that e-learning platforms need some kind of item analysis module in order to run large numbers of online exam sessions smoothly. Post & Hargis (2012) indicated that providing item analyses and statistics is essential for any examination system. Another study emphasizes that the information provided by item analysis assists not only in evaluating performance but also in improving item quality (Fotaris et al., 2010). The online examination modules of the popular LMS platforms (Blackboard, Canvas and Moodle) were evaluated in this context. Blackboard has a built-in item analysis module that produces question and exam reports with discrimination, difficulty, average score, standard deviation and standard error statistics. Moodle has an add-in item analysis module capable of presenting reports similar to Blackboard's, with additional statistics including the skewness and kurtosis of the grade distribution, the coefficient of internal consistency and discriminative efficiency. Canvas has no item analysis module, despite user requests in this direction.
The proposed system with its monitoring agent, on the other hand, does not only provide reports and statistics but also generates suggestions for the instructors. By virtue of these suggestions, an instructor gains the opportunity to give an additional exam-entry right to a particular student, to change the difficulty level of a question, or to update its body and options. Compared with the other platforms, one of the most significant functions of ASOES is that it presents the related instructor with only the list of problematic questions and exams, rather than standard reports for every question and exam. Instructors may perform updates following the system's suggestions or may keep their previous decisions after evaluating the related questions and reviewing the related students' reports. All of these functions are presented to the instructors within a dashboard-type interface so that they can work with ease. The MA processes the reports it receives from the Reporting process within the FL, subject to rules created from expert opinion, and conducts Category, Exam Response and False Exam analyses.
• Category Analysis
In the Category analysis, the difficulty level of each question, as defined in the IL, is compared with the item difficulty index calculated by the system. The MA calculates the item difficulty index (p) as follows (Bachman, 2004):

p = (C_U + C_L) / N (Eq. 1)

Here C_U and C_L give the number of students answering the question correctly in the upper and lower groups, respectively, while N is the total number of students in the two groups (Bachman, 2004). Kelley (1939) proposed that the upper and lower groups be selected, equally, as the first 27% and the last 27% of students according to the success ranking.
After the p value is calculated, the difficulty level interval containing this value is determined. The literature emphasizes that difficulty level intervals should be determined by the instructor preparing the exam, taking the purpose of the exam into consideration (Atılgan et al., 2006; Kubiszyn & Borich, 1990). The difficulty level intervals used by the MA are given in Table 1.
Table 1. Difficulty level intervals
Difficulty Level   Difficulty Level Interval
Easy               0.65 – 1.00
Moderate           0.35 – 0.65
Hard               0.00 – 0.35
A difficulty-level change proposal report, prepared according to Rule 1, is sent to the HD process.
Rule 1: “If the difficulty level interval that corresponds to the p value calculated by the system is different from the
difficulty level determined by the instructor, create a report of proposal to change the difficulty level.”
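The Category analysis can be summarized as a short sketch, assuming per-question correctness flags ordered by overall student success are available; it applies Equation 1 with Kelley's 27% split, the intervals of Table 1, and Rule 1 (function names are illustrative):

    def item_difficulty(correct_flags_by_rank):
        """Eq. 1: p = (C_U + C_L) / N for one question.
        correct_flags_by_rank: list of True/False answers, ordered from the most to the
        least successful student; the upper and lower 27% form the two groups."""
        n = len(correct_flags_by_rank)
        k = max(1, round(0.27 * n))                  # Kelley (1939): 27% in each group
        upper = correct_flags_by_rank[:k]
        lower = correct_flags_by_rank[-k:]
        return (sum(upper) + sum(lower)) / (2 * k)   # N = total students in both groups

    def level_from_p(p):
        """Map the item difficulty index to the intervals in Table 1."""
        if p >= 0.65:
            return "Easy"
        if p >= 0.35:
            return "Moderate"
        return "Hard"

    def rule_1(p, instructor_level):
        """Rule 1: propose a new difficulty level if the computed interval differs."""
        proposed = level_from_p(p)
        return proposed if proposed != instructor_level else None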
• Exam Response Analysis
In the Exam Response analysis, the distribution of the incorrect answer choices is investigated. For each false choice, the ratio of the number of times the false choice was selected (CF) to the number of times the true choice was selected (CT) is calculated as in Equation 2:

r = CF / CT (Eq. 2)

If the largest of these ratios exceeds a threshold value (T), Rule 2 is applied, and a report asking for a review of the correct answer is created and sent to the HD.

Rule 2: "If max(CF / CT) > T, create a false answer report."
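A sketch of the Exam Response analysis under the same assumptions; the choice counts and the 0.70 threshold reported later in the paper are taken as inputs, and the function name is illustrative:

    def false_answer_report_needed(choice_counts, correct_choice, threshold=0.70):
        """Eq. 2 and Rule 2: for every false choice compute r = CF / CT and report
        the question to the HD if the largest ratio exceeds the threshold T."""
        ct = choice_counts[correct_choice]
        if ct == 0:
            return True                          # nobody chose the marked answer: review anyway
        ratios = [cf / ct for choice, cf in choice_counts.items() if choice != correct_choice]
        return max(ratios) > threshold

    # Example (first row of Table 7): counts A..E are 7, 8, 11, 27, 20 and D is marked correct;
    # the largest ratio is 20/27 = 0.74 > 0.70, so a false answer report is created.
    print(false_answer_report_needed({"A": 7, "B": 8, "C": 11, "D": 27, "E": 20}, "D"))  # True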
• False Exam Analysis
In the False Exam analysis, for all cases except those in which students end their session voluntarily or the system terminates it at the end of the exam duration, the error codes in Table 2 are created together with the parameters linked to them.
Table 2. Error codes
Error Code   Description                         Parameters
E1           Server-based error                  P1
E2           Client-server communication error   P2, P3, P4
Table 3. Error Parameters
Parameter
Description
P1
Presence of active exam sessions during time when the server does not reply
P2
Abandoning the session deliberately by student
P3
Ratio of number of students experiencing problems to all student entered for the same exam
P4
The completion status of the student regarding his/her other exam sessions
The rule that combines the parameters in Table 3 with the error codes is as follows:
Rule 3: "IF P1 = 1 THEN S{NEER} = Yes ELSE
IF P2 = 1 THEN S{NEER} = No ELSE
IF P3 > T1 THEN S{NEER} = Yes ELSE
IF P4 > T2 THEN S{NEER} = Yes ELSE S{NEER} = No"
*S{NEER}: new exam entry right proposal
Here T1 and T2 are threshold values for the P3 and P4 parameters, respectively. These values are calculated before the exam from frequency tables of past exams. Reports prepared according to the inference in Rule 3 are sent to the HD process.
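Rule 3 reads as a four-step decision procedure; the following sketch assumes the parameters are precomputed for the session under review (P1 and P2 as flags, P3 and P4 as ratios; names are illustrative):

    def new_exam_entry_right(p1, p2, p3, p4, t1, t2):
        """Rule 3: does the MA propose a new exam entry right (S{NEER}) for this session?"""
        if p1:                 # server did not reply while the session was active
            return True
        if p2:                 # the student abandoned the session deliberately
            return False
        if p3 > t1:            # unusually many students had problems in this exam
            return True
        if p4 > t2:            # the student normally completes his/her other sessions
            return True
        return False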
Help desk

The HD mainly comprises decision-making processes related to the problems encountered during exams, drawing on the problem feedback submitted by students and the analysis reports prepared by the MA. This decision-making is undertaken by the instructors in their role as exam administrators. The instructor may request special reports from the Reporting process, if deemed necessary, as an aid to deciding. Students who could not complete their exam session, or who claim there are errors in the questions, choose one of the categories in Table 4. A problem ticket specific to the student is created together with this category information and sent to the relevant decision-making process within the HD.
Table 4. Problem notification categories and the relevant decision-making processes
Problem notification category   Category ID   Relevant decision-making process
False Question                  FQ            False Question Validation
False Exam                      FE            New Exam Entry Right
Problems labeled with the FQ category code in the problem database are sent to the False Question Validation decision-making process. The instructor reviews the related question and decides whether or not it is false. If the question is not false, the ticket is closed and an information message is sent to the student. If the question is false, a notification is sent to the Evaluation process in the FL for re-evaluation and to the Authoring process in the IL so that the problem in the question is fixed.
Problems recorded with the FE code in the problem database are sent to the New Exam Entry Right decision-making process. Proposal reports specific to the student's exam session, created at the end of the False Exam analysis in the MA, are matched with the problem notifications during this process. Problems that cannot be detected by the MA, and therefore cannot be matched with problems in the problem database, are sent to the New Exam Entry Right decision-making process as a separate report. After evaluating the problem, the instructor may give the student a new exam entry right or close the ticket without granting one. In both cases, a notification mail is sent to the student.
Reports automatically generated by the MA as a result of the Exam Response and Category analyses are sent to the Analysis Based Question and Category Validation decision-making process. The instructor considers the data from the Exam Response analysis and decides whether the pre-specified answer choices of the questions are false, and uses the data from the Category analysis to decide whether the difficulty levels of the questions are defined correctly. He or she then sends the update request for the correct choice and/or difficulty level of the question to the Authoring process.
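The routing of student tickets and MA reports to the HD decision-making processes described above can be sketched as a simple dispatch; process names follow Table 4 and the text, while the functions themselves are illustrative:

    def dispatch_ticket(category_id):
        """Route a student problem ticket to its decision-making process (Table 4)."""
        processes = {
            "FQ": "False Question Validation",
            "FE": "New Exam Entry Right",
        }
        return processes[category_id]

    def dispatch_ma_report(report_type):
        """Route an automatically generated MA report to its decision-making process."""
        if report_type == "False Exam Analysis":
            return "New Exam Entry Right"        # matched with FE tickets where possible
        if report_type in ("Exam Response Analysis", "Category Analysis"):
            return "Analysis Based Question and Category Validation"
        raise ValueError("unknown report type: " + report_type)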
Finalization layer
Figure 5. Finalization layer structure
As shown in Figure 5, the FL consists of the Evaluation process, where exam result information is processed and turned into scores, and the Reporting process, where all information relating to the exam is processed and turned into reports. In the Evaluation process, the answers of a student who has completed the exam are matched with the answer key and a score is calculated for that student. In the Reporting process, all reports needed for the analyses performed by the MA are generated automatically after the exam; all other kinds of reports on the exam can also be created within this process at the instructor's request. After the reports are evaluated within the SL, if a re-evaluation decision is made, the student's scores are updated in the Evaluation process.
ASOES implementation
Distance education at Sakarya University was launched in 2001 with IBM Lotus LearningSpace. Online examinations of 400 students could barely be conducted because of the extensive technical problems reported via e-mail, phone or discussion forums. Exam reports showed that only 65% of exam sessions were started and that 84% of those were completed without any error.
From 2003 onwards, the number of students rose to 2,500. It therefore became necessary to find an alternative solution, given the unmanageable structure of LearningSpace and the expanding institutional requirements. Thus, a new LMS called SAULMS was developed in 2005, with useful functions such as calendar-based online exam execution and exam-entry-right management. During the period 2005-2010, 73% of exam sessions were started and 94% of those were finalized successfully.
By 2008 it had become clear that high-end LMS software was needed, in line with the increase in the number of e-learning programs as well as the number of students. During the search period, worldwide LMS platforms including IBM LMS, Blackboard, Moodle and Sakai were evaluated. Because the e-learning systems under evaluation offered either no customization support or a quite complicated configuration and maintenance process, the development of new software specific to institutional requirements (SAUPORT) was started. SAUPORT has been in use since 2010, serving more than 20,000 students in 23 e-learning programs.
The number of online examination sessions is now over 75,000 per semester, and they are executed more efficiently than ever before thanks to the most recently added functions, including an improved examination calendar, multimedia-supported questions and options, a question bank, detailed user- or course-based reporting, and an advanced exam-entry-right manager. According to the latest reports, the average rates of entrance into and errorless completion of exam sessions between 2010 and 2012 were 82% and 99%, respectively.
As seen in Table 5, 76,156 exam sessions were defined during the 2011-2012 fall semester; students started 62,280 of these sessions, and 61,285 of them were completed successfully. Accordingly, 995 sessions were labeled as false exam sessions.
Table 5. Numbers of exam sessions
Degree                     Total number of exam sessions   Number of started exam sessions   Number of completed exam sessions
Associate                  45318                           33095                             32433
Undergraduate              1538                            1129                              1111
Master                     4829                            4076                              4026
Undergraduate completion   24471                           23980                             23715
Figure 6 shows the number of failed exam sessions and the number of problems reported by students to the HD. As the graph shows, the number of incomplete sessions and the number of messages received differ. The most important reason is that students may think they have completed the exam successfully, or may not write to the HD despite facing a problem. As the figure shows, about two thirds of incomplete exam sessions are not reported as problems by students. Through the MA in the proposed system, it was possible to develop a solution for the problems of students who do not report anything to the system, as well as for those who do.
Figure 6. Incomplete exam sessions and problems notified
Exam implementation
Through integration with the external system (the Student Affairs database), course, instructor and student information is included in the ASOES database. To prepare an exam, there must first be a sufficient number of questions in the question pool of the course. The instructor selects questions from among those he/she has previously entered and publishes the exam. When a student enters a published exam, a session starts. The exam score of a student who completes the session without a problem is calculated by evaluating the session data. Students experiencing a problem can report it to the HD.
Problems reported by students, and the results of the analysis made by the MA, are evaluated by the instructor, who then decides on new exam entry rights, re-calculation of scores, and updates to question text, answer keys and difficulty levels as deemed necessary; after this, the exam process is finalized and the exam results are published.
Monitoring agent implementation
The MA has three basic functions: Category, Exam Response and False Exam analysis. The MA analysis results can be managed by the instructor through a single screen (see Figure 7).
Figure 7. Monitoring agent
In the Category analysis, the item difficulty index is calculated from the answers given to the exam questions, and the new difficulty level proposal, the average difficulty levels of the same question in previous exams, and the difficulty level pre-specified by the instructor are presented to the instructor. In the light of this information, the instructor can update the difficulty level through the MA interface.
As an example of the Category analysis, data from the Computer for Beginners course in the 2011/2012 fall semester of the Sakarya University distance education programs were used. According to the analysis performed, the MA proposed changing the difficulty level for 9 out of the 40 questions belonging to this course (see Table 6).
Proposals to change the difficulty level are made according to whether the item difficulty index values fall within the intervals in Table 1. The instructor decision column of Table 6 shows that the instructor accepted 4 of the 9 proposals and decided to change the difficulty level. When deciding whether to change the difficulty level, the instructor can consider how close the item difficulty index value is to the interval limits; in the MA interface, the calculated difficulty index values are presented to the instructor together with the new level information.
As can be seen in Figure 8, the 29th question, assigned to the Easy category, and the 2nd, 3rd and 18th questions, assigned to the Moderate category, lie quite far from the interval limits. In such cases the instructor's decision to change the difficulty level is a reasonable one. For values near the limits, the instructor may choose not to change the difficulty level; Table 6 shows that the instructor decided not to change the difficulty level for questions 5, 20, 33, 37 and 38, whose item difficulty indices are very close to the limits.
Table 6. Category analysis result table summary
Question No.   Difficulty Level Specified by Instructor   Item Difficulty Index*   Difficulty Level Proposed by MA   Decision of Instructor
2              MODERATE                                   0.211                    HARD                              HARD
3              MODERATE                                   0.121                    HARD                              HARD
5              EASY                                       0.628                    MODERATE                          EASY
18             MODERATE                                   0.112                    HARD                              HARD
20             MODERATE                                   0.322                    HARD                              MODERATE
29             EASY                                       0.513                    MODERATE                          MODERATE
33             MODERATE                                   0.347                    HARD                              MODERATE
37             MODERATE                                   0.343                    HARD                              MODERATE
38             MODERATE                                   0.342                    HARD                              MODERATE
* Calculated by Equation 1
Figure 8. Distribution of the item difficulty index: (a) Easy, (b) Moderate, (c) Hard
When instructors create questions and add them, together with their answers, to the question pool, they can make errors in assigning the correct answer choice. It is almost impossible for students to identify such problems. The Exam Response analysis is therefore performed on questions used for the first time in the system, in order to determine whether there are answer choice errors.
In the Exam Response analysis, the choices selected by the students are compared with the answer key from the IL. From this comparison, the number of correct answers and the distribution of the incorrect answers are determined for each question. The ratio of each incorrect choice to the choice defined as correct by the instructor is then calculated, and the system checks whether this ratio is larger than the threshold value, set to 0.70 on the basis of experience from previous years. Table 7 gives the values for the questions for which a change proposal was made, from the analysis of questions used for the first time in the online midterm exams of the 2011/2012 fall semester.
Table 7. Exam Response analysis results
Distribution (A / B / C / D / E)   Correct Choice   Convergence Ratios (incorrect choices, in A–E order)   Correct Choice Update Decision
7 / 8 / 11 / 27 / 20               D                0.26, 0.30, 0.41, 0.74                                  Yes
5 / 47 / 10 / 19 / 48              E                0.10, 0.98, 0.21, 0.40                                  No
53 / 30 / 38 / 2 / 11              A                0.57, 0.72, 0.04, 0.21                                  No
12 / 10 / 17 / 43 / 39             D                0.28, 0.23, 0.40, 0.91                                  Yes
9 / 68 / 6 / 51 / 3                B                0.13, 0.09, 0.75, 0.04                                  No
5 / 53 / 4 / 6 / 49                E                0.10, 1.08, 0.08, 0.12                                  No
47 / 33 / 15 / 13 / 23             A                0.70, 0.32, 0.28, 0.49                                  No
10 / 16 / 9 / 36 / 26              D                0.28, 0.44, 0.25, 0.72                                  Yes
0 / 51 / 9 / 11 / 57               E                0.00, 0.89, 0.16, 0.19                                  No
5 / 9 / 14 / 37 / 41               D                0.14, 0.24, 0.38, 1.11                                  Yes
31 / 85 / 7 / 79 / 39              B                0.36, 0.08, 0.93, 0.46                                  Yes
6 / 3 / 45 / 4 / 17                E                0.35, 0.18, 2.65, 0.24                                  Yes
3 / 15 / 15 / 26 / 21              D                0.12, 0.58, 0.58, 0.81                                  No
3 / 3 / 15 / 3 / 21                C                0.20, 0.20, 0.20, 1.40                                  Yes
62 / 19 / 26 / 20 / 88             E                0.70, 0.22, 0.30, 0.23                                  No
57 / 33 / 26 / 32 / 50             E                1.14, 0.66, 0.52, 0.64                                  Yes
68 / 41 / 61 / 21 / 7              C                1.11, 0.67, 0.34, 0.11                                  No
3 / 4 / 27 / 7 / 4                 D                0.43, 0.57, 3.86, 0.57                                  Yes
59 / 2 / 1 / 0 / 54                A                0.03, 0.02, 0.00, 0.92                                  Yes
The Exam Response analysis was performed for 148 new questions, and the MA created reports for 19 questions with possibly erroneous answer choices (see Table 7). After review by the instructor, answer choice errors were detected in 10 questions, and information was sent to the Authoring process in the IL for updating the answer key and to the Evaluation process in the FL for re-calculating the scores.
In the False Exam analysis, proposal decisions for the exam sessions that students could not complete are created with the help of the inference algorithm in the MA (see Rule 3). A four-step algorithm is executed in which the P1, P2, P3 and P4 parameter values determined by the system are queried in turn.
In the first step of the algorithm, the P1 parameter, which corresponds to the presence of active exam sessions during a period in which the server did not reply, is checked in order to propose a new entry right for such sessions. Otherwise, the second step checks whether the student abandoned the session deliberately (P2); if so, the proposal is not to grant a new right. Otherwise the algorithm passes to the third step. To make a proposal at this stage, the T1 threshold value must be calculated by an exam-based evaluation of the exam sessions of previous semesters, together with the P3 parameter for the active exam.
In this study, the T1 value was calculated by analyzing 370 exams from the 2010/2011 spring semester. From the frequency values in Table 8, the maximum false exam session rate in 351 exams (95% of all exams) was found to be 4.88%, and this value was taken as the T1 threshold.
Table 8. False exam sessions summary frequency table
Cumulative Number of Exams   Cumulative / Total Number of Exams (%)   Maximum Error Ratio (%)
141                          38.1                                     0.00
185                          50                                       0.70
222                          60                                       0.96
259                          70                                       1.45
296                          80                                       2.14
332                          90                                       3.23
351                          95                                       4.88
370                          100                                      14.81
If the P3 value is larger than the T1 threshold, the MA proposes granting a new right; otherwise the algorithm passes to the fourth step. In the fourth step, the P4 value for the student whose problem is being investigated is calculated, along with the T2 threshold derived from the frequencies of all P4 parameter values calculated over all exams in the active semester. In this study, the T2 value was calculated by analyzing the exam sessions of 5,585 students in the 2011/2012 fall semester.
Table 9. Student exam completion status summary frequency table
Cumulative Number of Students   Cumulative / Total Number of Students (%)   Minimum Session Completion Rate (%)
4347                            77.8                                        100
4469                            80                                          97.67
4741                            85                                          95.92
5023                            90                                          92.98
5307                            95                                          86.84
5585                            100                                         0.00
From the frequency values in Table 9, it was found that 95% of the students successfully completed at least 86.84% of the exam sessions they started, and this value was taken as the T2 threshold.
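The derivation of T1 and T2 from Tables 8 and 9 amounts to a 95% coverage cut-off over historical data; the sketch below assumes lists of per-exam false-session ratios and per-student completion rates are available (function names are illustrative):

    def t1_from_history(false_session_ratios, coverage=0.95):
        """T1: the largest false-exam-session ratio among the 'coverage' share of past
        exams with the lowest ratios (Table 8: 4.88% for 95% of 370 exams)."""
        ordered = sorted(false_session_ratios)
        k = max(1, int(coverage * len(ordered)))
        return ordered[k - 1]

    def t2_from_history(completion_rates, coverage=0.95):
        """T2: the smallest session completion rate among the 'coverage' share of students
        with the highest rates (Table 9: 86.84% for 95% of 5585 students)."""
        ordered = sorted(completion_rates, reverse=True)
        k = max(1, int(coverage * len(ordered)))
        return ordered[k - 1]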
Conclusion
Online exams are used as an assessment and evaluation tool in distance education systems that today serve very large numbers of students. For such systems, good execution of the exams is of critical importance. Problems arising from human-centered errors or technical difficulties may call the exams, and thus the reliability and efficiency of the distance education system, into question.
In this study, an intelligent-agent-supported integrated online examination system architecture is proposed. The architecture provides integrated management of main functions such as question pool creation and update, exam authoring, execution and evaluation, and management of student feedback, while ensuring that the analysis reports on questions and exams created by an intelligent agent are used in the decision-making processes. In this way, the problems that can arise in the operation of the exam system can be reduced to an acceptable level, thanks to the easier and faster decision-making that the system provides to the instructors.
Student feedback is important for the efficiency of online examination systems. However, this feedback is not always enough to identify all problems. Cases such as inappropriately set question difficulty levels, erroneous answer keys, and incomplete sessions for which no problem is reported may remain unsolved. In other cases, problem feedback does not provide sufficient information for decisions such as granting a new exam entry right, re-calculating the exam score, or changing the question content and answer key. A mechanism that generates reports according to certain rules was therefore deemed necessary, and for this purpose the mechanism named MA was designed. The resulting intelligent-agent-supported online examination system, ASOES, has been in effective use since 2011 for the operation of 23 distance education programs of Sakarya University, serving 7,000 students studying for associate, undergraduate, undergraduate completion and master's degrees.
In practice it was observed that instructors became aware of problems only when students informed them, and that they spent too much time solving these problems. The results of the analyses conducted show that the intelligent-agent-supported online examination system largely identifies such problems and allows the instructors to decide more easily.
In this study, an intelligent agent structure with three functions was designed to meet the needs of Sakarya University, with the aim of solving the most frequent problems. Different error conditions may nevertheless be encountered in examination systems, depending on how different institutions use them. Thanks to its flexible structure, the proposed system can be extended with additional intelligent functions aimed at solving these different errors. The intelligent agent structures used in this study can also be extended using semantic and fuzzy inference methods.
References
Aimin, W., & Jipeng, W. (2009, May). Design and implementation of web-based intelligent examination system. WRI World
Congress on Software Engineering, 2009 (Vol. 3, pp. 195-199). doi: 10.1109/WCSE.2009.77
Alexakos, C. E., Giotopoulos, K. C., Thermogianni, E. J., Beligiannis, G. N., & Likothanassis, S. D. (2006, May). Integrating e-learning environments with computational intelligence assessment agents. Proceedings of World Academy of Science, Engineering and Technology (Vol. 13, pp. 233-238). Retrieved from http://waset.org/publications/10745/integrating-e-learning-environments-with-computational-intelligence-assessment-agents
Atılgan, H., Kan, A., & Doğan, N. (2006). Eğitimde ölçme ve değerlendirme [Assessment and Evaluation in Education]. Ankara,
Turkey: Anı Press.
Aye, M. M., & Thwin, M. M. T. (2008). Mobile agent based online examination system. Proceedings of the 5th International
Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (Vol. 1, pp. 193-
196). doi: 10.1109/ECTICON.2008.4600405
Bachman, L. F. (2004). Statistical analyses for language assessment. Cambridge, UK: Cambridge University Press.
Blackboard. (2012). Running item analysis on a test. Retrieved November 30, 2012, from http://help.blackboard.com
Brusilovsky, P., & Miller, P. (1999). Web-based testing for distance education. WebNet World Conference on the WWW and
Internet (Vol. 1999, No. 1, pp. 149-154). Chesapeake, VA: AACE.
Canvas (2012). Canvas instructor guide - Quizzes. Retrieved November 29, 2012, from
http://guides.instructure.com/s/2204/m/4152/c/23861
Crisp, V., & Ward, C. (2008). The development of a formative scenario-based computer assisted assessment tool in psychology
for teachers: The PePCAA project. Computers & Education, 50(4), 1509-1526.
Fotaris, P., Mastoras, T., Mavridis, I., & Manitsaris, A. (2010). Extending LMS to support IRT-based assessment test calibration.
Technology Enhanced Learning. Quality of Teaching and Educational Reform, 73, 534-543.
Gawali, R. D., & Meshram, B. B. (2009). Agent-based autonomous examination systems. Proceedings of the International
Conference on Intelligent Agent & Multi-Agent Systems (pp. 1-7). doi: 10.1109/IAMA.2009.5228095
Guo, P., Yu, H., & Yao, Q. (2008). The research and application of online examination and monitoring system. Proceedings of
IEEE International Symposium on IT in Medicine and Education (pp. 497-502). doi: 10.1109/ITME.2008.4743914
Guo, S., & Mao, Y. (2010). OPES: An on-line practice and examination system based on web. Proceedings of the International
Conference on E-Business and E-Government (ICEE) (pp. 5470-5473). doi: 10.1109/ICEE.2010.1370
Hang, B. (2011). The design and implementation of on-line examination system. Proceedings of the International Symposium on
Computer Science and Society (ISCCS) (pp. 227-230). doi: 10.1109/ISCCS.2011.68
Jin, X., & Ma, Y. (2008). Design and implementation to intelligent examination system for e-business application operation. Knowledge-Based Intelligent Information and Engineering Systems (pp. 318-323). Berlin, Germany: Springer. doi: 10.1007/978-3-540-85565-1_40
Jun, L. (2009). Design of online examination system based on web service and COM. Proceedings of the International
Conference on Information Science and Engineering (ICISE) (pp. 3276-3279). doi: 10.1109/ICISE.2009.484
Keleş, A., Ocak, R., Keleş, A., & Gülcü, A. (2009). ZOSMAT: Web-based intelligent tutoring system for teaching–learning
process. Expert Systems with Applications, 36(2), 1229-1239.
Kelley, T. L. (1939). The selection of upper and lower groups for the validation of test items. Journal of Educational Psychology,
30(1), 17.
Kubiszyn, T., & Borich, G. D. (1990). Educational testing and measurement: Classroom application and practice. New York, NY: Harper Collins Publishers.
Li, X., & Wu, Y. (2007). Design and development of the online examination and evaluation system based on B/S structure.
Proceedings of the International Conference on Wireless Communications, Networking and Mobile Computing (pp. 6223-6225).
doi: 10.1109/WICOM.2007.1525
Moodle. (2012). Quiz statistics calculations. Retrieved November 29, 2012, from
http://docs.moodle.org/dev/Quiz_statistics_calculations
Mustakerov, I., & Borissova, D. (2011). A conceptual approach for development of educational Web-based e-testing system.
Expert Systems with Applications, 38(11), 14060-14064.
Post, G. V., & Hargis, J. (2012). Design features for online examination software. Decision Sciences Journal of Innovative
Education, 10(1), 79-107.
Shen, R., Tang, Y., & Zhang, T. (2001). The intelligent assessment system in web-based distance learning education. Proceedings
of the Frontiers in Education Conference (Vol. 1, pp. TIF-7). doi: 10.1109/FIE.2001.963855
Tallent-Runnels, M. K., Thomas, J. A., Lan, W. Y., Cooper, S., Ahern, T. C., Shaw, S. M., & Liu, X. (2006). Teaching courses
online: A review of the research. Review of Educational Research, 76(1), 93-135.
Tinoco, L. C., Barnette, N. D., & Fox, E. A. (1997, March). Online evaluation in WWW-based courseware. ACM SIGCSE
Bulletin, 29(1), 194-198. doi: 10.1145/268085.268156
Wang, T. H., Wang, K. H., & Huang, S. C. (2008). Designing a Web-based assessment environment for improving pre-service
teacher assessment literacy. Computers & Education, 51(1), 448-462.
Wilson, E. V. (2004). ExamNet asynchronous learning network: Augmenting face-to-face courses with student-developed exam
questions. Computers & Education, 42(1), 87-107.
Zhang, L., Zhuang, Y. T., Yuan, Z. M., & Zhan, G. H. (2006). A web-based examination and evaluation system for computer
education. Proceedings of the Sixth International Conference on Advanced Learning Technologies (pp. 120-124). doi:
10.1109/ICALT.2006.1652383