Automated Grading of Class Diagrams
Weiyi Bian
Trent University
Peterborough, Canada
Omar Alam
Trent University
Peterborough, Canada
Jörg Kienzle
McGill University
Montreal, Canada
Abstract—Drawing UML diagrams, such as class diagrams, is
an essential learning task in many software engineering courses.
In course assignments, students are tasked to draw models
that describe scenarios, model requirements, or system designs.
The course instructor usually grades the diagrams manually by
comparing a student’s solution model with a template solution
model made by the instructor. However, modelling is not an exact
science, and multiple correct solutions or variants may exist. This
makes grading UML assignments a cumbersome task, especially
when there are many assignments to grade. Therefore, there is a
need for an automated grading tool that aids the instructor in the
grading process. This paper presents an approach for automated
grading of UML class diagrams. We propose a metamodel that
establishes mappings between the instructor solution and all the
solutions for a class. The approach uses a grading algorithm
that uses syntactic, semantic and structural matching to match
a student’s solutions with the template solution. We evaluate the
algorithm on a real assignment for modeling a Flight Ticketing
domain model for a class of 20 students and report our findings.
Index Terms—automated grading, class diagrams, model comparison

I. INTRODUCTION
Software engineering education is in high demand, driven by the fast-changing job market. This has created a supply-demand imbalance between computing college graduates and the available technology jobs. According to the US employment projections, there will be three times more available computer science jobs than graduates who could fill them through the year 2022 [1]. This has created renewed interest in the field of computing. Currently, computing schools experience an increase in enrolment as students rush into computer science programs in record numbers [2]. The increasing number of computing students increases the workload on instructors, as they have to grade a large number of assignments. Besides the increased workload, instructors struggle to grade assignments and exams fairly, which is not an easy task. It is difficult for human graders to precisely follow the grading formulae when grading each individual assignment, especially when grading subjective topics. Therefore, automated grading techniques are very important to aid instructors.
Although a number of approaches have been proposed to automatically assess programming assignments [3], [4], grading UML models, e.g., class diagrams, has received little attention. Class diagram designs, and UML models in general, are considered ill-defined problems, where multiple solutions may exist for a particular problem [5], [6]. Unlike well-defined problems, where a solution can be either correct or incorrect, a design problem involving class diagrams can have a large solution space. For example, solutions can vary based on the class names, i.e., a student's solution can use a synonym for a class name instead of the exact name used in the teacher's solution. Solutions can also vary based on the structure, e.g., adding attributes to the subclasses instead of to the superclass. These variations create additional overhead for the instructors when grading assignments, as they have to spend more time evaluating a student's answer. Furthermore, instructors often revise their marking scheme after grading several student papers. For example, an instructor may want to redistribute the grades when she discovers that students had trouble with a particular part of the model, which is an indication that the problem description was perhaps not clear. In such cases, the instructor might want to adjust the grading weights for parts of the model to compensate. Unfortunately, this means that she has to manually update the grades for the students she has already graded by revisiting their solutions using the new marking scheme. Finally, after receiving their grades, many students may request that their copies be reevaluated, often because the instructor may not have been consistent when grading, for example, a large class over a longer period of time.
Motivated by the aforementioned reasons, we propose an automated grading approach for UML class diagrams. We introduce a metamodel that stores grades for each model element, e.g., classes, attributes, and associations. We present an algorithm that establishes mappings between model elements in the instructor's solution and elements in the student solutions, exploiting syntactic, semantic and structural matching strategies. The student gets full marks for each element that is perfectly matched. Partial marks are given to solutions that are partially correct, e.g., an attribute that is placed in a wrong class. The mappings and student grades are stored using another metamodel, which makes it possible to update the grading scheme later on. We implemented the algorithm and grading metamodels in the TouchCORE tool [7], which visually shows the marks on the classes and prints out feedback to the student. We ran this algorithm on a real assignment for modeling a Flight Ticketing domain model for a class of 20 undergraduate students. On average, our algorithm was able to automatically grade the assignments within a 14% difference of the instructor's grade. One important benefit of our approach is that it can easily update the grades of the students when the instructor changes the grading scheme.
2019 ACM/IEEE 22nd International Conference on Model Driven Engineering Languages and Systems Companion
978-1-7281-5125-0/19/$31.00 ©2019 IEEE
DOI 10.1109/MODELS-C.2019.00106
Fig. 1. Instructor Solution for University Model
Fig. 2. Student Solution Model 1
The rest of this paper is organized as follows. The next
section motivates our paper with some examples. Section 3
introduces our grading metamodels. Section 4 discusses our
grading algorithm. Section 5 reports the results on our Flight
Ticketing case study. Section 6 discusses the related work and
Section 7 concludes the paper.
II. MOTIVATING EXAMPLES

In this section, we motivate our approach using a simple class diagram modeling a university. Fig. 1 shows the instructor solution. The first student solution, shown in Fig. 2, names the Teacher class Instructor, misspells Student as Studemt, and uses Select instead of Selection. Although the words that were used are not the same,
we want our matching algorithm to determine that Instructor
is a synonym for Teacher, which we call a semantic match.
The class Student should be matched with the class Studemt
syntactically, even though there is a spelling difference. In a
similar way, the class Select should be matched with Selection
and the operation printinfo matched with printinformation.
We also notice that the attribute location was misplaced, i.e.,
it was added to the class Select, which is wrong. Although
location is misplaced, one could argue that the student should
receive partial marks for including it to the model. Finally,
Fig. 3. Student Solution Model 2
Fig. 4. Student Solution Model 3
two elements, the attribute department and the operation
selectCourse are missing in the student solution, i.e. they could
not be matched syntactically or semantically with any element.
Fig. 3 shows a solution by a different student. There are
three important comparison checkpoints in this model: (1)
Class Subject has the same attributes as the class Course in the
instructor’s solution in Fig. 1. It is reasonable to consider that
these two classes should match due to their similar structure.
(2) The class Register has associations with class Student
and Subject (Subject is matched with Course using semantic
match). Therefore, we can consider that class Register is
matched with Selection, although their names do not match,
neither syntactically nor semantically. Again, this is a struc-
tural match based on the similarity of the associations with
other classes in the respective models. (3) In the instructor’s
model, the attribute age belongs to the superclass Person.
In the solution model in Fig. 3, the student added two age
attributes to the subclasses, Teacher and Student. We should
give these two attributes partial marks.
The third student solution shown in Fig. 4 illustrates two
interesting cases, class splitting and class merging. (1) Class
Classroom does not syntactically or semantically match any
class. Furthermore, its content does not provide enough infor-
mation to match with any class structurally. However, based
on attribute matching, the attribute location, which belongs to the class Course in the instructor's model, has been misplaced in the class Classroom by the student. Together, the classes Course and Classroom in the student's model have the same attributes as the class Course in the instructor's model. Also,
Fig. 5. Student Solution Model 4
there is a 1-to-multiple association between class Classroom and class Course in the student's model, allowing a particular value for location to be associated with multiple courses.
We can therefore consider that the student has split the
class Course into two classes, Course and Classroom. (2)
Class Selection seems to be missing from the student's model because it fails to match with any element using the matching methods that we discussed before. Based on the attribute and operation matching results, we detect that all properties of class Selection, i.e., the attribute mark, have been misplaced into class Student in the student's model. Also, in the instructor's model, class Selection has an association with class Student. Therefore, we consider that the class Student in the student's model is a combination of class Student and class Selection in the instructor's model, and might want to give partial marks.
The fourth solution, shown in Fig. 5, illustrates how associations are matched. In this model, the student forgot the
class Selection. There is no association between the class
Student and Course in the instructor’s model, but the class
Selection has two associations, with class Student and Course
with multiplicities 1 on both ends. Therefore, the student’s
association between Student and Course can be considered a
derivative association and should receive partial marks.
From all the examples above, we identified several matching
strategies which should be taken into account by our algorithm.
First, strict string matching is not sufficient for grading. It is
essential to combine syntactic matching (eliminating spelling
mistakes) and semantic matching (considering synonyms and
words with related meaning) for strings in our algorithm.
Second, structural matching strategies should be incorporated,
e.g. matching by comparing the contents of a class, similarity
based on the associations with other classes, and considering
classes that are split or merged. Third, the algorithm should
handle class inheritance properly, i.e. handle the class elements
that are misplaced within the inheritance hierarchy. Fourth,
the algorithm should be able to match associations, including
finding potential derivative associations.
III. GRADING METAMODELS

This section discusses the metamodels we defined to support our automated grading approach. Rather than augmenting the class diagram metamodel to support the definition of grades and matchings for model elements, we decided to define separate metamodels. This is less invasive, as it leaves the
class diagram metamodel unchanged, and hence all existing
modelling tools can continue to work. Furthermore, we avoid referring to class diagram metaclasses directly, but instead use the generic EClass, EAttribute and EReference (as we are assuming metamodels expressed in the metametamodelling language ECore provided by the Eclipse Modelling Framework). As a result, our grading metamodels can be applied to any modelling language with a metamodel expressed in ECore.
Figure 6 shows the metamodel that augments any model expressed in ECore with grades. The GradeModel maps EObject to EObjectGrade, which contains a points attribute. That way, any modelling element in a language that is modelled with a metaclass can be given points. In order to give points for properties of modelling elements, EObjectGrade maps EStructuralFeature, the ECore superclass of EAttribute and EReference, to EStructuralFeatureGrade, which again contains a points attribute.
To illustrate the use of the grade metamodel, imagine a metamodel for class diagrams where attributes are modelled with a metaclass CDAttribute that has a type EReference that stores the type of the attribute. Now imagine the case where we want to give 2 points for the age attribute of the Person class in Figure 1, and an additional point if the type of the attribute is int. In this case one would create an EObjectGrade, insert it into the grade map using as a key CDAttribute, and assign the points value 2.0. Additionally, one would create an EStructuralFeatureGrade, insert it into the grade map using as a key the type EReference of CDAttribute, and assign it the points value 1.0.
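As an illustration, the bookkeeping described above can be sketched with plain dictionaries. This is a hypothetical Python stand-in for the ECore-based metamodel (the actual tool uses EMF/Java); the class names follow Fig. 6, while the element keys are illustrative.

```python
class EObjectGrade:
    """Points for one model element, plus points for individual features."""
    def __init__(self, points):
        self.points = points
        self.feature_grades = {}  # feature name -> points (EStructuralFeatureGrade)

class GradeModel:
    """Maps a model element key to its EObjectGrade, as in Fig. 6."""
    def __init__(self):
        self.grades = {}  # element key -> EObjectGrade

    def total_points(self):
        # Sum element points and all per-feature points.
        return sum(g.points + sum(g.feature_grades.values())
                   for g in self.grades.values())

# 2 points for the 'age' attribute, 1 extra point for its 'type' feature (int).
gm = GradeModel()
age = EObjectGrade(2.0)
age.feature_grades["type"] = 1.0
gm.grades["Person.age"] = age
print(gm.total_points())  # 3.0
```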
Figure 7 depicts the Classroom metamodel which is used
after the automated grading algorithm is run to store the
mappings that were discovered. It simply associates with each
model element in the teacher solution (EObject key) a list of
EObjects in the student solutions that were matched by the
algorithm. After the algorithm has been run, the matchings in
this data structure can be updated by the grader if necessary.
The information can also be used to automatically update the
grades of the students in case the teacher decides to change
the point weights in the teacher solution.
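The regrading benefit can be sketched as follows: because the matches are stored, a weight change only requires re-summing, not re-marking. The element keys and weights here are illustrative, not from the paper.

```python
def total_grade(matched_elements, weights):
    """Recompute a student's grade from the stored matches and the
    current per-element weights, without re-running the matching."""
    return sum(weights[e] for e in matched_elements if e in weights)

# Matches recorded for one student (illustrative element keys).
matches = ["Person", "Person.age", "Student"]

# The instructor lowers the weight of Person after grading:
# the stored mapping lets the grade update directly.
old_weights = {"Person": 2.0, "Person.age": 1.0, "Student": 2.0}
new_weights = {"Person": 1.0, "Person.age": 1.0, "Student": 2.0}
print(total_grade(matches, old_weights))  # 5.0
print(total_grade(matches, new_weights))  # 4.0
```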
IV. GRADING ALGORITHM

In this section, we discuss the algorithm for automated grading of class diagrams. The overall algorithm is divided into six parts: matching classes, split classes, merged classes, attributes and operations, associations, and enumerations. In the following, the six parts are explained in detail.
Algorithm 1 illustrates this process in detail. The algorithm takes as input the instructor model, InstructorModel, and the student model, StudentModel.
Two different strategies are used to compare the names of the classes. To perform a syntactic match (line 5), the Levenshtein distance [8] is used to measure the similarity between the two names. The Levenshtein distance calculates the minimum number of single-character edits required to change one word into another. Two classes are matched when their Levenshtein distance is smaller than 40 percent of the length of the longer name.
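The syntactic check can be sketched as follows. This is a self-contained Python sketch; the 40% threshold follows the text, while the case-insensitive comparison is an assumption.

```python
def levenshtein(a, b):
    """Minimum number of single-character edits turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute ca -> cb
        prev = cur
    return prev[-1]

def syntactic_match(n1, n2):
    """Match when the distance is under 40% of the longer name's length."""
    return levenshtein(n1.lower(), n2.lower()) < 0.4 * max(len(n1), len(n2))

print(syntactic_match("Student", "Studemt"))    # True: distance 1
print(syntactic_match("Teacher", "Instructor")) # False: names too far apart
```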
Fig. 6. Grade Metamodel
Fig. 7. Classroom Metamodel
Algorithm 1 Compare Classes
1: procedure COMPARECLASSES(InstructorModel, StudentModel)
2:   instList ← InstructorModel.getClass()
3:   studList ← StudentModel.getClass()
4:   for all Class Ci in instList, Cs in studList do
5:     if syntacticMatch(Ci.name, Cs.name) or
6:        semanticMatch(Ci.name, Cs.name) or
7:        contentMatch(Cs.content, Ci.content) then
8:       storePossibleMatch(Ci, Cs)
9:   for all Class Ci in instList do
10:    if matched classes exist for Ci then
11:      find among the matches of Ci the class Cbest
12:        that obtains the highest mark among the matches
13:      classMatchMap.put(Ci, Cbest)
14:    else
15:      missClassList.add(Ci)
16:  for all Class Ci in missClassList do
17:    for all Class Cs in studList do
18:      if no match exists for Cs then
19:        ListI ← Ci.getAssociationEnds()
20:        ListS ← Cs.getAssociationEnds()
21:        if assocMatch(ListS, ListI) then
22:          classMatchMap.put(Ci, Cs)
23:  return classMatchMap, missClassList
The second strategy involves a semantic match (line 6). We used three algorithms available from WS4J (WordNet Similarity for Java) [9], which all calculate a similarity metric between two words based on the WordNet database: the Hirst and St-Onge measure (HSO) [10], Wu and Palmer (WUP) [11] and LIN [12]. The combined use of three measures performs better than using only a single measure. If the determined score is satisfactory, then the match is stored.
HSO: This measure compares two concepts based on the path distances between them in the WordNet database. It measures similarity by the number of direction changes needed to connect one concept to another.
WUP: Given two concepts, WUP measures their similarity by the number of common concepts on the paths from the root concepts to these two concepts.
LIN: LIN is an improvement of the Resnik measure [13] and uses the Information Content (IC) of two concepts to calculate their semantic similarity. The IC of a term is calculated by measuring its frequency in a collection of documents.
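Since WS4J requires the WordNet database, the following self-contained sketch substitutes a toy synonym table for the three measures; only the scheme of combining several scores reflects the approach above, while the table, scores, and threshold are illustrative assumptions.

```python
# Toy stand-in for one WordNet-based measure: 1.0 for listed synonyms, else 0.0.
SYNONYMS = {frozenset({"teacher", "instructor"}), frozenset({"course", "subject"})}

def toy_measure(w1, w2):
    return 1.0 if frozenset({w1.lower(), w2.lower()}) in SYNONYMS else 0.0

def semantic_match(w1, w2, measures=(toy_measure, toy_measure, toy_measure),
                   threshold=0.5):
    """Combine several similarity measures (HSO, WUP and LIN in the paper)
    by averaging their normalized scores; the threshold is an assumption."""
    score = sum(m(w1, w2) for m in measures) / len(measures)
    return score >= threshold

print(semantic_match("Teacher", "Instructor"))  # True
print(semantic_match("Student", "Selection"))   # False
```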
The class structural similarity matching strategy (named contentMatch in line 7) includes a property similarity match, which compares the properties of two classes. Two properties, e.g., two attributes, with matched names are regarded as similar. The number of properties that need to be edited to change one class into another determines whether two classes can be regarded as structurally similar.
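A minimal sketch of such a property-edit check follows; the symmetric-difference counting and the 50% edit budget are assumptions, not values from the paper.

```python
def content_match(instr_props, stud_props, max_edit_ratio=0.5):
    """Structurally similar when few property edits (additions/removals,
    counted via the symmetric difference of matched names) are needed."""
    edits = len(set(instr_props) ^ set(stud_props))
    return edits <= max_edit_ratio * max(len(instr_props), len(stud_props), 1)

# Class Subject in Fig. 3 has the same attributes as Course in Fig. 1.
print(content_match({"name", "location"}, {"name", "location"}))  # True
print(content_match({"name", "location"}, {"mark"}))              # False
```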
After we find all potentially matched classes, we calculate the grades for each potential match (lines 9 - 13). If Cs (in the student solution) is a potential match for class Ci (in the instructor's solution), we calculate the points that Cs would give the student based on the grades attributed to Ci and its content. The matched class that gets the highest grade in the student solution is then retained as the final match for Ci.
After finding possible matching classes based on their names and content, we additionally search for classes that could be matched based on their associations to other classes. The last part of Algorithm 1 illustrates this process. For each pair of classes that is not yet matched, we look at their association ends, and if two classes have similar association ends, we consider them matched.
While Algorithm 1 matches the classes, there could be attributes or operations that are misplaced, i.e., placed in the wrong class in the student model. Let Ai be one property (attribute or operation) in the instructor model and As one property in the student model. There are four scenarios: (1) The names of Ai and As match and their containing classes also match. (2) The names of Ai and As match while their containing classes do not match; in this case, As is considered misplaced. Based on the grading policy, misplaced properties score fewer points. For example, for the case study presented in the next section, we deducted 0.5 points for each misplaced property. (3) The names of Ai and As match, but Ai belongs to a superclass and As belongs to one of the subclasses. If Ai is not private, then Ai and As are considered matched; however, in this case, the student may only get partial marks because the scope of the property is too limited. (4) Ai and As do not match with each other at all. Algorithm 2 finds the matched attributes and operations in two models. In addition to the instructor and student models, this algorithm takes as input the matched class map which was populated by Algorithm 1, classMatchMap. The algorithm starts by finding the matched attributes in the matched classes; if it does not find a corresponding matched attribute in the same class, it looks for it in the superclass. It is not shown in Algorithm 2, but we traverse the inheritance hierarchy all the way up. If the algorithm does not find the attribute in the superclass, it looks for it in other classes in the model that are not matched with the class. If the attribute exists in an unmatched class, then it is considered misplaced and should be given
Algorithm 2 Compare attributes and operations in InstructorModel and StudentModel
1: procedure COMPARECONTENT(InstructorModel, StudentModel)
2:   instList ← InstructorModel.getAttribute()
3:   studList ← StudentModel.getAttribute()
4:   for all Attribute Ai in instList, As in studList do
5:     Ci ← Ai.eContainer()
6:     Cs ← As.eContainer()
7:     if Ai is a syntactic or semantic match for As then
8:       if classMatchMap.get(Cs).equals(Ci) then
9:         matchedAttrMap.put(As, Ai)
10:      else if Ci is a superclass of classMatchMap.get(Cs) and Ai is not private then
11:        matchedAttrMap.put(As, Ai)
12:  for all Attribute Ai in instList, As in studList do
13:    if As not matched and Ai is a syntactic or semantic match for As then
14:      misplaceAttrMap.put(As, Ai)
15:  instList ← InstructorModel.getOperation()
16:  studList ← StudentModel.getOperation()
17:  for all Operation Oi in instList, Os in studList do
18:    Ci ← Oi.eContainer()
19:    Cs ← Os.eContainer()
20:    if Oi.synMatch(Os) or Oi.semanticMatch(Os) then
21:      if classMatchMap.get(Cs).equals(Ci) then
22:        matchedOperMap.put(Os, Oi)
23:      else if Ci is a superclass of classMatchMap.get(Cs) and Oi is not private then
24:        matchedOperMap.put(Os, Oi)
25:  for all Operation Oi in instList, Os in studList do
26:    if Os not matched and Oi.synMatch(Os) or Oi.semanticMatch(Os) then
27:      misplaceOperMap.put(Os, Oi)
28:  return matchedAttrMap, misplaceAttrMap, matchedOperMap, misplaceOperMap
Algorithm 3 Check whether a class is split into two classes
1: procedure CLASSSPLITMATCH(InstructorModel, StudentModel)
2:   instList ← InstructorModel.getClass()
3:   studList ← StudentModel.getClass()
4:   for all Class Cs0 in studList, Cs1 in studList do
5:     if Cs0 and Cs1 have a 1-to-multiple association then
6:       for all Class Ci in instList do
7:         if Ci has the same properties as Cs0 and Cs1 then
8:           splitClassMap.put(Ci, <Cs0, Cs1>)
9:           break
10:  return splitClassMap
a partial grade. Operations are matched in a similar way. After finding all the matches, the algorithm returns a map of matched attributes, matchedAttrMap, a map of misplaced attributes, misplaceAttrMap, a map of matched operations, matchedOperMap, and a map of misplaced operations, misplaceOperMap.
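The four placement scenarios for a name-matched property pair can be sketched as follows. The data shapes are hypothetical: a dict from student class to matched instructor class standing in for classMatchMap, and a dict from class to its set of ancestors standing in for the inheritance traversal.

```python
def classify_property(ci, cs, class_match_map, superclasses, ai_private=False):
    """Classify a name-matched property pair by its containing classes:
    ci/cs are the instructor/student containing classes."""
    matched_ci = class_match_map.get(cs)
    if matched_ci == ci:
        return "matched"        # scenario 1: same (matched) class
    if ci in superclasses.get(matched_ci, set()) and not ai_private:
        return "partial"        # scenario 3: pushed down into a subclass
    return "misplaced"          # scenario 2: wrong class, fewer points

# Fig. 3: 'age' belongs to Person, but the student put it in the subclasses.
match_map = {"Teacher": "Teacher", "Student": "Student"}
supers = {"Teacher": {"Person"}, "Student": {"Person"}}
print(classify_property("Person", "Teacher", match_map, supers))  # partial
print(classify_property("Person", "Student", match_map, supers,
                        ai_private=True))                         # misplaced
```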
Algorithm 3 checks whether the student split one class into two classes. Let Cs0 and Cs1 be two classes in the student
Algorithm 4 Check whether a class is merged into another
1: procedure CLASSMERGEMATCH(InstructorModel, StudentModel)
2:   for all Class Ci1 in InstructorModel matched with Cs in StudentModel do
3:     for all Class Ci2 in InstructorModel whose content is misplaced in Cs do
4:       if Ci1 has an association with Ci2 then
5:         mergeClassMap.put(Cs, <Ci1, Ci2>)
6:         break
7:   return mergeClassMap
model. The algorithm first checks if there is a 1-to-multiple association between Cs0 and Cs1 (line 5). If an attribute is extracted from a class A and placed in a different class B, then there should be a 1-to-multiple association from B to A. This allows a value for the attribute in B to be associated with multiple instances of A, as discussed previously in the example of Fig. 4. Then, if there exists one class Ci in the instructor model that has similar properties to both Cs0 and Cs1, we consider that class Ci has been split into Cs0 and Cs1 by the student. The algorithm returns a map of split classes, splitClassMap.
Algorithm 4 checks whether two classes in the instructor model can be matched with one class in the student model, which means the student merged the two classes into one class in her solution. Let Ci1 and Ci2 be two classes in the instructor model, where all properties of Ci1 have been misplaced into class Cs in the solution model. If Cs is already matched with Ci2 based on the class matching algorithm and Ci1 and Ci2 have an association between them (line 4), we can consider that the student used Cs to combine both Ci1 and Ci2. We only give points when two classes are merged into one class; we do not give points when more than two classes are merged into one class, since such a merged class becomes quite complex and less cohesive. After finding all the merged classes, the algorithm returns a map of merged classes, mergeClassMap.
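The split check of Algorithm 3 can be sketched with sets. The data shapes are illustrative: 1-to-multiple associations are given as class-name pairs, and each class's properties as a set of names.

```python
def split_match(one_to_many_pairs, stud_props, instr_props):
    """Detect split classes: two student classes joined by a 1-to-multiple
    association whose combined properties equal one instructor class's."""
    split_map = {}
    for cs0, cs1 in one_to_many_pairs:
        combined = stud_props[cs0] | stud_props[cs1]
        for ci, props in instr_props.items():
            if combined == props:
                split_map[ci] = (cs0, cs1)
                break
    return split_map

# Fig. 4: Course was split into Course and Classroom.
instr = {"Course": {"name", "location"}}
stud = {"Course": {"name"}, "Classroom": {"location"}}
print(split_match([("Course", "Classroom")], stud, instr))
# {'Course': ('Course', 'Classroom')}
```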
Algorithm 5 matches the associations in two models. Contrary to the other matching algorithms mentioned before, this algorithm does not compare associations based on their names; rather, it compares them based on the classes that an association connects. Let C0 and C1 be two classes connected by association Ai in the instructor model, and C2 and C3 be two classes connected by association As in the student model. If C0 and C1 in the instructor model and C2 and C3 in the student model can be matched as two pairs of classes, As and Ai should also be matched.
Then, if some classes are missing in the student model, we try to find potential derivative associations that could go through the missing class. For each missing class, we first find the classes that it is connected with (lines 8 - 9). We do this recursively, although it is not shown in the algorithm. This means that we also find classes that the missing class is connected with indirectly, i.e., through other
Algorithm 5 Compare associations in InstructorModel and StudentModel
1: procedure COMPAREASSOC(InstructorModel, StudentModel)
2:   instAssocList ← InstructorModel.getAssociation()
3:   studAssocList ← StudentModel.getAssociation()
4:   for all Association Ai in instAssocList, As in studAssocList do
5:     if Ai and As connect two pairs of matched classes then
6:       associationMatchMap.put(As, Ai)
7:   for all Class C in missClassList do
8:     for all Class Ci in InstructorModel connected with C do
9:       possibleAssocMap.get(C).add(Ci)
10:  for all Association As in studAssocList do
11:    endClass1 ← As.getEnd1()
12:    endClass2 ← As.getEnd2()
13:    for all key Class C in possibleAssocMap do
14:      possibleClassList ← possibleAssocMap.get(C)
15:      if endClass1 in possibleClassList and endClass2 in possibleClassList then
16:        derivationList.add(As)
17:  return associationMatchMap, derivationList
classes. Then, we search whether there is an association in the student model whose two ends are classes connected with the missing class (lines 10 - 16). The algorithm returns a map of matched associations, associationMatchMap, and a list of derived associations, derivationList. It is important to note that a grader may want to give grades for any derivative association, i.e., not necessarily one derived from a missing class in the student solution. In that case, we have to relax the condition check for missing classes in this algorithm. In the case study discussed in the next section, the instructor opted to give grades for any derived association.
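The derivative-association check can be sketched as follows. The shapes are hypothetical: instructor and student associations are given as class-name pairs, from which the missing class's neighbours are collected.

```python
def derived_associations(missing_class, instr_assocs, stud_assocs):
    """Flag student associations whose two ends are both classes that the
    missing class is associated with in the instructor model."""
    neighbours = ({b for a, b in instr_assocs if a == missing_class} |
                  {a for a, b in instr_assocs if b == missing_class})
    return [assoc for assoc in stud_assocs
            if assoc[0] in neighbours and assoc[1] in neighbours]

# Fig. 5: the student forgot Selection, which links Student and Course.
instr = [("Selection", "Student"), ("Selection", "Course")]
stud = [("Student", "Course"), ("Student", "Teacher")]
print(derived_associations("Selection", instr, stud))  # [('Student', 'Course')]
```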
An enumeration is a type that has a defined number of possible values. Algorithm 6 matches the enumerations of two models. The straightforward way to match enumerations is to compare their names and their literal values. Let Ei be an enumeration in the instructor model and Es be an enumeration in the solution model. If the entries of Ei and Es can be matched by their names, Ei and Es are considered matched. It is possible that the student does not model the enumeration perfectly, in which case there will be entries in Ei that are not matched with entries in Es. The algorithm returns a map of matched enumerations, enumMatchMap.
For the missing literals in the enumeration, it is possible that the student used other model elements, such as an attribute or a class, to represent a missing entry in the enumeration. If a literal e in Ei cannot be matched with any entry in enumeration Es, we search whether there is a class or attribute in the student solution whose name matches e. Depending on the grading scheme, the instructor can opt for giving a full grade or a partial grade when a student uses an attribute or a class to represent an enum literal.
Algorithm 6 Compare enumerations in InstructorModel and StudentModel
1: procedure COMPAREENUM(InstructorModel, StudentModel)
2:   instENUMList ← InstructorModel.getENUM()
3:   studENUMList ← StudentModel.getENUM()
4:   for all ENUM Ei in instENUMList, Es in studENUMList do
5:     if syntacticMatch(Ei.name, Es.name) or
6:        semanticMatch(Ei.name, Es.name) then
7:       enumMatchMap.put(Es, Ei)
8:     else if Es and Ei have similar literal values then
9:       enumMatchMap.put(Es, Ei)
10:  studClassList ← StudentModel.getClass()
11:  studAttrList ← StudentModel.getAttribute()
12:  for all ENUM Ei in instENUMList do
13:    for all literal L in Ei.literal do
14:      for all Attribute As in studAttrList do
15:        if As.Name.syntacticMatch(L.Name) or As.Name.semanticMatch(L.Name) then
16:          consider that As represents L
17:      for all Class Cs in studClassList do
18:        if Cs.Name.syntacticMatch(L.Name) or Cs.Name.semanticMatch(L.Name) then
19:          consider that Cs represents L
20:  return enumMatchMap
V. CASE STUDY

In this section, we apply our approach to an assignment to draw a domain model for a Flight Ticketing system. Students were given an assignment handout detailing the assignment questions and requirements, which is not shown here. The assignment was given to a third-year software engineering class. All students were taught how to design domain models and had prior knowledge about class diagrams. Twenty students submitted the assignment, and we used their submitted solutions to run our experiment. This assignment was given a year prior to developing our tool. Therefore, neither the instructor who graded the students nor the students themselves made any assumption that the assignments would be automatically graded by a tool. Fig. 8 shows the instructor's solution and his grading scheme. Based on this scheme, the maximum grade that could be achieved is 55. Table II lists the grade that each student received. It shows the instructor's grading, our tool's grading, and the reason for the difference between the two gradings. The classroom average based on the instructor's grading was 36.9, compared to 34.7 achieved automatically by our tool. The average difference between the instructor's grade and our tool's grade was 4.65, i.e., our tool was able to automatically grade the students within less than a 14% difference of the instructor's grade. The previous section discussed the matching algorithm, but did not discuss the grades that are provided for each matched or missing element. Our tool implementation allows the instructor to change the grading scheme that is shown in Fig. 8. Because we keep a mapping to the student solutions using the metamodels that were discussed earlier, it is easy to update the grades of the students based on the new grading scheme. In addition, our implementation allows changing the
Authorized licensed use limited to: University of Glasgow. Downloaded on June 04,2020 at 02:56:11 UTC from IEEE Xplore. Restrictions apply.
Fig. 8. Instructor’s Solution for Flight Ticketing Domain Model
deduction policy to be close to a particular instructor’s style.
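Because the classroom metamodel stores, for every student, the match between each student element and the corresponding instructor element, a scheme change only requires re-walking those stored matches rather than re-running the matching algorithm. A sketch of the idea, in which all names, the map layout, and the simple deduction rule are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class Match:
    instructor_elem: str  # name of the matched element in the instructor model
    kind: str             # e.g. 'match' or 'misplaced'

# Hypothetical classroom map, built once by the matching algorithm:
# student id -> stored matches against the instructor solution.
classroom = {"student12": [Match("Flight", "match"), Match("date", "misplaced")]}

def regrade(scheme, deduct):
    """Recompute every grade from the stored matches; no re-matching needed.

    scheme maps instructor elements to their marks; deduct(full_mark, kind)
    applies the instructor's deduction policy.
    """
    return {sid: sum(deduct(scheme[m.instructor_elem], m.kind) for m in matches)
            for sid, matches in classroom.items()}
```

For instance, with the scheme `{"Flight": 2.0, "date": 1.0}` and a policy that deducts half a mark for a misplaced attribute, the student above would receive 2.5; changing either the scheme or the policy re-derives all grades in one pass.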
We examined closely how the instructor graded the students for this case study, and adapted our algorithm accordingly. Here is a summary of the deduction policy that we adopted for this case study:
- Misplaced attribute/operation: deduct half a mark.
- Derived association: give half of the mark of the correct association.
- Missing element: deduct the whole mark.
- Attribute or class representing an enumeration entry: give the average mark; i.e., if the enumeration has 3 entries and was assigned 1 total mark in the instructor's solution, we assign 1/3 of a mark for each attribute or class representing an entry of this enumeration.
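The policy above could be captured in a small scoring function; the following is a sketch in which the kind labels and the function shape are our own naming, not the tool's API:

```python
def score_element(full_mark: float, kind: str, enum_entries: int = 1) -> float:
    """Apply the case-study deduction policy to one graded element."""
    if kind == "match":        # element fully matched
        return full_mark
    if kind == "misplaced":    # misplaced attribute/operation: half a mark off
        return full_mark - 0.5
    if kind == "derived":      # derived association: half of the correct mark
        return full_mark / 2
    if kind == "missing":      # missing element: the whole mark is deducted
        return 0.0
    if kind == "enum_entry":   # enum entry modelled as attribute/class: average
        return full_mark / enum_entries
    raise ValueError(f"unknown kind: {kind}")
```

For example, a 1-mark misplaced date attribute scores 0.5, and each entry of a 3-entry enumeration worth 1 mark in total scores 1/3, matching the (I:1, A:0.5) and 0.33 entries that appear in Table II.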
We investigated the elements that the algorithm graded differently than the instructor. When we discussed the differences in Table II, we listed each element that the algorithm graded differently. For example, for student 1, for the class CheckedIn, the instructor gave 2 points while the algorithm gave 0.33 points (I:2, A:0.33). For the same student, the algorithm gave 0 marks for a wrong attribute. This case is interesting, because the algorithm completely disagreed with the instructor. Student 1 had an attribute called airlinedesignator, which did not match any attribute in the instructor model. The instructor chose to give it a point regardless. In other similar cases, we found that the instructor was lenient when grading students; therefore, he gave points for elements that do not match. In other cases, the instructor gave 0 to an element while our algorithm gave some points. For example, student 5 had a class called Time which had a date/time attribute. The instructor did not give any points for this, while our algorithm matched it with the class FlightOccurrence based on its content. Furthermore, because Time was matched with FlightOccurrence, the algorithm was able to give points to two associations: an association between Time and Flight and one between Time and Person. The first matches the association between FlightOccurrence and Flight in the instructor model. The second could be derived from two associations: the association between FlightOccurrence and Trip, and the association between Trip and Person, as shown in Fig. 8.
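The derivation check for student 5 amounts to asking whether the instructor model connects two classes through a short chain of associations when no direct association exists. A sketch under that reading (class names follow Fig. 8; the two-hop limit is our assumption):

```python
from collections import deque

# Unordered association pairs from the instructor model (excerpt of Fig. 8).
ASSOCS = {frozenset(p) for p in [("FlightOccurrence", "Flight"),
                                 ("FlightOccurrence", "Trip"),
                                 ("Trip", "Person")]}

def is_derived(a: str, b: str, assocs=ASSOCS, max_hops: int = 2) -> bool:
    """True when a-b is not a direct association but is reachable via a
    chain of at most max_hops associations in the instructor model."""
    if frozenset({a, b}) in assocs:
        return False  # direct associations are matched, not derived
    frontier, seen = deque([(a, 0)]), {a}
    while frontier:
        node, hops = frontier.popleft()
        if node == b:
            return True
        if hops < max_hops:
            for pair in assocs:
                if node in pair:
                    for nxt in pair - {node}:
                        if nxt not in seen:
                            seen.add(nxt)
                            frontier.append((nxt, hops + 1))
    return False
```

Here `is_derived("FlightOccurrence", "Person")` holds via Trip, so the student's Time–Person association (with Time matched to FlightOccurrence) would earn the derived-association mark under the policy above.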
Fig. 9 shows the grades for student 12. This model is shown in the TouchCORE tool [7], which we used to implement our approach. We extended the tool to use the grading metamodels described earlier and implemented the matching algorithms.
TABLE II
No. Instructor Algorithm Reason for Difference
1 31 28.33 Class CheckedIn: (I:2, A:0.33), Attribute airlinedesignator: (I:1, A:0)
2 15 22.5 Class Ticket: (I:0, A:2), Associations Ticket-Flight, Person-Flight, Ticket-Person: (I:0, A:6), Attribute flightClass misplaced: (I:0, A:0.5)
3 47 41.83 Class Section: (I:3, A:0), Class CheckedIn: (I:2, A:0.33), Attribute Date misplaced: (I:1, A:0.5)
4 41 41 all matched
5 22 28.33 Class Time: (I:0, A:2), Associations Time-Flight, Time-Person: (I:0, A:4), Attribute isBoarded: (I:0, A:0.33)
6 44 33.83 Class CheckedIn: (I:2, A:0.33), Associations CheckedIn-Trip, CheckedIn-Person: (I:4, A:0), Class BookedFlight: (I:4, A:0), Attribute Date misplaced: (I:1, A:0.5)
7 34 39.33 Class Ticket: (I:0, A:2), Associations Person-Ticket, Seat-Ticket, Ticket-Flight: (I:0, A:6), Class Checking: (I:2, A:0.33), Attributes flightClass, date misplaced: (I:2, A:1)
8 39 37 Attribute Dep: (I:2, A:0)
9 38 31.83 Attributes departure, arrival: (I:4, A:0), Attribute boarded: (I:1, A:0.33), Class Class: (I:2, A:1), Attribute date misplaced: (I:1, A:0.5)
10 32 29 Attribute seatNumber: (I:1, A:0), Attribute fullName: (I:2, A:0)
11 34 36.83 Class Company: (I:0, A:2), Association Ticket-Seat: (I:0, A:2), Attribute Board: (I:1, A:0.33), Attribute date misplaced: (I:1, A:0.5)
12 44 42.83 Attribute date misplaced: (I:1, A:0.5), Attribute isCheckedIn: (I:1, A:0.33)
13 41 41 Class Date: (I:0, A:2), Association Reservation-Seat: (I:2, A:0)
14 44 32.83 Class CheckedIn: (I:2, A:0.33), Associations Person-CheckIn, CheckIn-Flight: (I:4, A:0), Attribute number: (I:3, A:0), Attribute date misplaced: (I:1, A:0.5), Class BookedFlight: (I:2, A:0)
15 44 41.33 Attribute CheckedIn: (I:2, A:0.33), Association Luggage-Ticket: (I:2, A:1)
16 29 18.16 Attribute isBoarded: (I:1, A:0.33), Attribute date misplaced: (I:1, A:0.5), Attribute seat: (I:4, A:0), Associations Passenger-Luggage, Flight-Passenger: (I:4, A:0), Class CheckedIn: (I:2, A:0.33)
17 40 34.83 Attribute luggage: (I:4, A:0), Attribute CheckedIn: (I:1, A:0.33), Attribute date misplaced: (I:1, A:0.5)
18 45 38.83 Attribute date misplaced: (I:1, A:0.5), Class CheckIn: (I:2, A:0.33), Associations CheckIn-Luggage, CheckIn-Booking: (I:4, A:0)
19 43 44.5 Attribute date misplaced: (I:1, A:0.5), Association Flight-Seat: (I:0, A:2)
20 30 34.66 Associations Luggage-City, Flight-Passenger, Flight-Ticket, Luggage-Seat: (I:0, A:8), Class CheckedIn: (I:2, A:0.33), Attribute isBoarded: (I:0, A:0.33), Attribute date misplaced: (I:1, A:0.5), Attribute name: (I:2, A:0)
City: 2.0/2.0, matches with Class City
Flight: 2.0/2.0, matches with Class Flight
Ticket: 2.0/2.0, matches with Class FlightOccurrence
number in Class Flight: 1.0/1.0, matches with number in Class Flight
status in Class BookedFlight: 0.0/2.0, missing attribute
date in Class FlightOccurrence: 0.5/1.0, misplaced attribute
City-Flight: 2.0/2.0, matches with association between Flight and City
Seat-Ticket: 2.0/2.0, matches with association between BookedFlight and Seat
Ticket-Seat: 2.0, derived association matched with associations FlightOccurrence-BookedFlight-Seat
The grades are shown in circles around each class or attribute. When the instructor loads a model, she can automatically grade it by pressing the last button on the right (below the + button). The tool graded two elements differently than the instructor (highlighted in yellow in Fig. 9). The first element is the attribute date in the class Flight. This attribute should belong to the class Ticket, but the student misplaced it, because the class Ticket is matched with FlightOccurrence, which has date as an attribute in the instructor model. However, the instructor was lenient and decided to give it a full point regardless. The second element that the tool graded differently is isCheckedIn in the class Ticket. This attribute belongs to the enumeration PassengerStatus in the instructor solution. The instructor was again lenient and did not deduct points for placing the attribute in Ticket instead of having it as an enum literal in PassengerStatus.
Fig. 9. Solution Model for student 12 showing their grading
The instructor was not always lenient in grading. For example, for student 20, the instructor did not give points for the four derived associations, although he chose to give points for derived associations to other students. When we talked to the instructor, he admitted that he was not completely consistent in grading and that he should have given some points to this student. The tool also prints a feedback sheet for the student, listing the points that she received and the points that she missed and explaining where marks were received or lost. An excerpt of this sheet is shown in Table II.
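The per-element lines of such a feedback sheet could be produced by a simple formatter over the grading results; a sketch in which the tuple layout is hypothetical:

```python
def feedback_lines(results):
    """results: (element, awarded, full, note) tuples from the grading pass."""
    for elem, awarded, full, note in results:
        yield f"{elem}: {awarded}/{full}, {note}"

# Two lines mirroring the excerpt shown earlier.
sheet = list(feedback_lines([
    ("City", 2.0, 2.0, "matches with Class City"),
    ("status in Class BookedFlight", 0.0, 2.0, "missing attribute"),
]))
```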
A number of approaches have been proposed to compare UML class diagrams. Hoggarth and Lockyer [14] proposed a framework that provides feedback to the student based on a comparison of the student's model with the teacher's model. Ali et al. [15] proposed an assessment tool that compares a class diagram drawn by students to the teacher's diagram. Soler et al. [16] developed a web-based tool for correcting UML class diagrams. Their approach checks for errors in a student's model by comparing it with models in a repository of similar models. Unlike our approach, theirs does not assign grades to models. Hasker [17] introduced UMLGrader, a tool for automated grading of UML models that compares a student's solution with a standard solution, as we do in our approach. However, their approach relies only on string matching, whereas our matching algorithm uses syntactic, semantic, and structural matching to compare models.
There are also approaches that propose automated assessment of other kinds of UML models, e.g., use case specifications. Jayal and Shepperd [18] proposed a label-matching method for UML diagrams using different levels of decomposition and syntactic matching. They evaluated their approach in a case study on matching activity diagrams. Tselonis et al. [19] introduced a diagram marking technique based on graph matching. Thomas et al. [20] introduced a framework that uses synonyms and an edit distance algorithm to mark graph-based models. Vachharajani and Pareek [21] introduced a framework for automatic assessment of use case diagrams using syntactic and semantic matching. Sousa and Leal [22] introduced a structural approach for graphs that establishes mappings from a teacher's solution to elements in the student solution that maximize the student's grade. Finally, our tool provides feedback to the student about the deducted marks. A number of educational tools for learning programming provide different kinds of feedback to the learner; Keuning et al. [23] provide a summary of feedback-generation approaches for programming tasks.
There are three main differences between our approach and the approaches discussed above: (1) our approach combines syntactic, semantic, and structural matching for grading class diagrams. In addition to using Levenshtein distance for syntactic matching, our approach uses three algorithms for semantic matching and performs structural matching between two diagrams. Based on the matching results, the approach assigns marks to the model elements. Most of the above approaches are limited to syntactic matching of names. (2) Our approach proposes a non-invasive grading metamodel that stores the determined grades alongside the model as feedback to the students. (3) Our approach proposes a new classroom metamodel that allows saving and automatically updating the grades of a group of students in case the teacher changes the grading scheme.
UML diagrams in general, and class diagrams in particular, are widely used in computer science and software engineering education. In many courses, computer science students are required to solve assignments or answer exam questions involving class diagrams. Instructors usually grade these diagrams manually by comparing each student's solution with the template solution that they prepared for the assignment or exam. This can be a cumbersome task, especially when they have to grade a large number of student papers. Furthermore, a particular problem can have different possible design solutions using class diagrams. Solutions can vary in the names of the classes, their properties, or the relationships between classes. Therefore, instructors have to spend more time examining each student's solution. In this paper, we proposed an automated grading approach for class diagrams. In particular, we proposed two metamodels: one establishes mappings between an instructor's solution and student solutions, and the other assigns grades to model elements and stores them. Furthermore, we introduced a grading algorithm that matches model elements in the student model with elements in the instructor model. Based on the matching, students are provided with their grades. We implemented our ideas in the TouchCORE tool and used it to automatically grade a third-year assignment to draw a domain model for a Flight Ticketing system. Our tool was able to automatically grade 20 students within a 14% difference of the grading given by the instructor.
In the future, we plan to expand our approach to grade other UML models, e.g., sequence diagrams and state machine diagrams. We also plan to run more experiments with assignments obtained from different instructors.
[1] J. Adams, "Computing Is The Safe STEM Career Choice Today," November 3, 2014. [Online].
[2] N. Singer, "The Hard Part of Computer Science? Getting Into Class," January 24, 2019. [Online].
[3] P. Ihantola, T. Ahoniemi, V. Karavirta, and O. Seppälä, "Review of recent systems for automatic assessment of programming assignments," in Proceedings of the 10th Koli Calling International Conference on Computing Education Research, ser. Koli Calling '10. New York, NY, USA: ACM, 2010, pp. 86–93.
[4] J. C. Caiza and J. M. del Álamo Ramiro, "Programming assignments automatic grading: review of tools and implementations," 2013.
[5] N.-T. Le, F. Loll, and N. Pinkwart, "Operationalizing the continuum between well-defined and ill-defined problems for educational technology," IEEE Trans. Learn. Technol., vol. 6, no. 3, pp. 258–270, Jul. 2013.
[6] P. Fournier-Viger, R. Nkambou, and E. M. Nguifo, Building Intelligent Tutoring Systems for Ill-Defined Domains. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010, pp. 81–101.
[7] M. Schöttle, N. Thimmegowda, O. Alam, J. Kienzle, and G. Mussbacher, "Feature modelling and traceability for concern-driven software development with TouchCORE," in Companion Proceedings of the 14th International Conference on Modularity, MODULARITY 2015, Fort Collins, CO, USA, March 16–19, 2015, pp. 11–14.
[8] V. I. Levenshtein, "Binary codes capable of correcting deletions, insertions, and reversals," in Soviet Physics Doklady, vol. 10, no. 8, 1966.
[9] H. Shima, "WordNet Similarity for Java (WS4J)," https://code.google.com/p/ws4j/, 2016.
[10] G. Hirst, D. St-Onge et al., "Lexical chains as representations of context for the detection and correction of malapropisms," WordNet: An Electronic Lexical Database, vol. 305, pp. 305–332, 1998.
[11] Z. Wu and M. Palmer, "Verb semantics and lexical selection," in Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 1994, pp. 133–.
[12] D. Lin, "An information-theoretic definition of similarity," in Proceedings of the Fifteenth International Conference on Machine Learning, ser. ICML '98. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 1998, pp. 296–304.
[13] P. Resnik, "Semantic similarity in a taxonomy: An information-based measure and its application to problems of ambiguity in natural language," J. Artif. Int. Res., vol. 11, no. 1, pp. 95–130, Jul. 1999.
[14] G. Hoggarth and M. Lockyer, "An automated student diagram assessment system," SIGCSE Bull., vol. 30, no. 3, pp. 122–124, Aug. 1998.
[15] N. Haji Ali, Z. Shukur, and S. Idris, "Assessment system for UML class diagram using notations extraction," International Journal of Computer Science and Network Security, vol. 7, no. 8, pp. 181–187, 2007.
[16] J. Soler, I. Boada, F. Prados, J. Poch, and R. Fabregat, "A web-based e-learning tool for UML class diagrams," in IEEE EDUCON 2010 Conference, April 2010, pp. 973–979.
[17] R. W. Hasker, "UMLGrader: An automated class diagram grader," J. Comput. Sci. Coll., vol. 27, no. 1, pp. 47–54, Oct. 2011.
[18] A. Jayal and M. Shepperd, "The problem of labels in e-assessment of diagrams," J. Educ. Resour. Comput., vol. 8, no. 4, pp. 12:1–12:13, Jan.
[19] C. Tselonis, J. Sargeant, and M. McGee Wood, "Diagram matching for human-computer collaborative assessment," 2005.
[20] P. Thomas, K. Waugh, and N. Smith, "Learning and automatically assessing graph-based diagrams," in Beyond Control: Learning Technology for the Social Network Generation. Research Proceedings of the 14th Association for Learning Technology Conference (ALT-C, 4–6 September, Nottingham, UK, 2007), 2007, pp. 61–74.
[21] V. Vachharajani and J. Pareek, "Framework to approximate label matching for automatic assessment of use-case diagram," International Journal of Distance Education Technologies (IJDET), vol. 17, no. 3, pp. 75–95.
[22] R. Sousa and J. P. Leal, "A structural approach to assess graph-based exercises," in International Symposium on Languages, Applications and Technologies. Springer, 2015, pp. 182–193.
[23] H. Keuning, J. Jeuring, and B. Heeren, "Towards a systematic review of automated feedback generation for programming exercises," in Proceedings of the 2016 ACM Conference on Innovation and Technology in Computer Science Education, ser. ITiCSE '16. New York, NY, USA: ACM, 2016, pp. 41–46.