Automated Grading of Class Diagrams
Weiyi Bian
Trent University
Peterborough, Canada
weiyibian@trentu.ca
Omar Alam
Trent University
Peterborough, Canada
omaralam@trentu.ca
Jörg Kienzle
McGill University
Montreal, Canada
Joerg.Kienzle@mcgill.ca
Abstract—Drawing UML diagrams, such as class diagrams, is
an essential learning task in many software engineering courses.
In course assignments, students are tasked to draw models
that describe scenarios, model requirements, or system designs.
The course instructor usually grades the diagrams manually by
comparing a student’s solution model with a template solution
model made by the instructor. However, modelling is not an exact
science, and multiple correct solutions or variants may exist. This
makes grading UML assignments a cumbersome task, especially
when there are many assignments to grade. Therefore, there is a
need for an automated grading tool that aids the instructor in the
grading process. This paper presents an approach for automated
grading of UML class diagrams. We propose a metamodel that
establishes mappings between the instructor solution and all the
solutions for a class. The approach uses a grading algorithm that applies syntactic, semantic, and structural matching to match a student’s solution with the template solution. We evaluate the
algorithm on a real assignment for modeling a Flight Ticketing
domain model for a class of 20 students and report our findings.
Index Terms—automated grading, class diagrams, model comparison
I. INTRODUCTION
Software engineering education is in high demand, driven by the fast-changing job market. This has created a supply-demand imbalance between computing college graduates and the available technology jobs. According to US employment projections, through the year 2022 there will be three times more available computer science jobs than graduates who could fill them [1]. This has created a renewed interest in the field of computing. Currently, computing schools are experiencing an increase in enrolment as students rush into computer science programs in record numbers [2]. The increasing number of computing students increases the workload on instructors, as they have to grade a large number of assignments. Besides the increased workload, instructors struggle to grade assignments and exams fairly, which is not an easy task. It is difficult for human graders to precisely follow the grading formulae when grading each individual assignment, especially when grading subjective topics. Therefore, automated grading techniques are very important to aid instructors.
Although a number of approaches have been proposed to automatically assess programming assignments [3], [4], grading UML models, e.g., class diagrams, has received little
attention. Class diagram designs, and UML models in general,
are considered ill-defined problems, where multiple solutions
may exist for a particular problem [5], [6]. Unlike well-
defined problems, where a solution can be either correct or
incorrect, a diagram design problem involving class diagrams
can have a large solution space. For example, solutions can
vary based on the class names, i.e., a student’s solution can
use a synonym for a class name instead of the exact name used
in the teacher’s solution. Solutions can also vary based on the structure, e.g., adding attributes to the subclasses instead of to the superclass. These variations create an additional overhead
on the instructors when grading assignments, as they have to
spend more time evaluating a student’s answer. Furthermore, instructors often revise their marking scheme after grading several student papers. For example, an instructor may want to redistribute the grades when she discovers that students had trouble with a particular part of the model, which is an indication that the problem description was perhaps not clear. In such cases, the instructor might want to adjust the grading weights for parts of the model to compensate. Unfortunately, this means that she has to manually update the grades for the students she already graded, revisiting those students’ solutions again using the new marking scheme. Finally, after receiving
their grades, many students may request that their copies be
reevaluated, often because the instructor may not have been
consistent when grading, for example, a large class over a
longer period of time.
Motivated by the aforementioned reasons, we propose an
automated grading approach for UML class diagrams. We
introduce a metamodel that stores grades for each model
element, e.g., classes, attributes, and associations. We present
an algorithm that establishes mappings between model elements in the instructor’s solution and elements in the student solutions, exploiting syntactic, semantic, and structural matching strategies. The student gets full marks for each element that is perfectly matched. Partial marks are given to solutions that are partially correct, e.g., an attribute that is placed in
a wrong class. The mappings and student grades are stored
using another metamodel, which makes it possible to update
the grading scheme later on. We implemented the algorithm
and grading metamodels in the TouchCORE tool [7], which vi-
sually shows the marks on the classes and prints out feedback
to the student. We ran this algorithm on a real assignment for
modeling a Flight Ticketing domain model for a class of 20
undergraduate students. On average, our algorithm was able to automatically grade the assignments within a 14% difference of
the instructor’s grade. One important benefit of our approach
is that it can easily update the grades of the students when the
instructor changes the grading scheme.
Fig. 1. Instructor Solution for University Model
Fig. 2. Student Solution Model 1
The rest of this paper is organized as follows. The next
section motivates our paper with some examples. Section 3
introduces our grading metamodels. Section 4 discusses our
grading algorithm. Section 5 reports the results on our Flight
Ticketing case study. Section 6 discusses the related work and
Section 7 concludes the paper.
II. MOTIVATING EXAMPLES
In this section, we motivate our approach using a simple
class diagram modeling a university. Fig. 1 shows the instruc-
tor solution. The first student solution, shown in Fig. 2, uses the word Instructor as the name for the Teacher class, misspells Student as Studemt, and uses Select instead of Selection. Although the words that were used are not the same,
we want our matching algorithm to determine that Instructor
is a synonym for Teacher, which we call a semantic match.
The class Student should be matched with the class Studemt
syntactically, even though there is a spelling difference. In a
similar way, the class Select should be matched with Selection
and the operation printinfo matched with printinformation.
We also notice that the attribute location was misplaced, i.e.,
it was added to the class Select, which is wrong. Although
location is misplaced, one could argue that the student should receive partial marks for including it in the model. Finally,
Fig. 3. Student Solution Model 2
Fig. 4. Student Solution Model 3
two elements, the attribute department and the operation selectCourse, are missing in the student solution, i.e., they could not be matched syntactically or semantically with any element.
Fig. 3 shows a solution by a different student. There are
three important comparison checkpoints in this model: (1)
Class Subject has the same attributes as the class Course in the
instructor’s solution in Fig. 1. It is reasonable to consider that
these two classes should match due to their similar structure.
(2) The class Register has associations with class Student
and Subject (Subject is matched with Course using semantic
match). Therefore, we can consider that class Register is
matched with Selection, although their names do not match,
neither syntactically nor semantically. Again, this is a struc-
tural match based on the similarity of the associations with
other classes in the respective models. (3) In the instructor’s
model, the attribute age belongs to the superclass Person.
In the solution model in Fig. 3, the student added two age
attributes to the subclasses, Teacher and Student. We should
give these two attributes partial marks.
The third student solution shown in Fig. 4 illustrates two
interesting cases, class splitting and class merging. (1) Class
Classroom does not syntactically or semantically match any
class. Furthermore, its content does not provide enough infor-
mation to match with any class structurally. However, based
on attribute matching, the attribute location, which belongs to the class Course in the instructor’s model, has been misplaced in the class Classroom by the student. Together, the class Course and the class Classroom in the student’s model have the same attributes as the class Course in the instructor’s model. Also,
Fig. 5. Student Solution Model 4
there is a 1-to-multiple association between the class Classroom and the class Course in the student’s model, allowing a particular
value for location to be associated with multiple courses.
We can therefore consider that the student has split the
class Course into two classes, Course and Classroom. (2)
Class Selection seems to be missing from the student’s model
because it fails to match with any element using the matching
methods that we discussed before. Based on the attribute and
operation matching results, we detect that all properties of class Selection, i.e., the attribute mark, have been misplaced into class Student in the student’s model. Also, in the instructor’s
model, class Selection has an association with class Student.
Therefore, we consider that the class Student in the student’s model is a combination of class Student and class Selection in the instructor’s model, and we might want to give partial marks.
The fourth solution, shown in Fig. 5, illustrates how asso-
ciations are matched. In this model, the student forgot the class Selection. There is no association between the classes Student and Course in the instructor’s model, but the class Selection has two associations, one with class Student and one with class Course, with multiplicity 1 on both ends. Therefore, the student’s
association between Student and Course can be considered a
derivative association and should receive partial marks.
From all the examples above, we identified several matching
strategies which should be taken into account by our algorithm.
First, strict string matching is not sufficient for grading. It is
essential to combine syntactic matching (eliminating spelling
mistakes) and semantic matching (considering synonyms and
words with related meaning) for strings in our algorithm.
Second, structural matching strategies should be incorporated,
e.g., matching by comparing the contents of a class, similarity based on the associations with other classes, and considering classes that are split or merged. Third, the algorithm should handle class inheritance properly, i.e., handle class elements that are misplaced within the inheritance hierarchy. Fourth,
the algorithm should be able to match associations, including
finding potential derivative associations.
III. GRADING METAMODELS
This section discusses the metamodels we defined to support
our automated grading approach. Rather than augmenting the
class diagram metamodel to support the definition of grades
and matchings for model elements, we decided to define
separate metamodels. This is less invasive, as it leaves the
class diagram metamodel unchanged, and hence all existing
modelling tools can continue to work. Furthermore, we avoid
referring to class diagram metaclasses directly, but instead use the generic EClass, EAttribute, and EReference (as we are assuming metamodels expressed in the meta-metamodelling language ECore provided by the Eclipse Modelling Framework). As a result, our grading metamodels can be applied to any modelling language with a metamodel expressed in ECore.
Figure 6 shows the metamodel that augments any model expressed in ECore with grades. The GradeModel maps EObject to EObjectGrade, which contains a points attribute. That way, any modelling element of a language that is modelled with a metaclass can be assigned points. In order to give points for properties of modelling elements, EObjectGrade maps EStructuralFeature, the ECore superclass of EAttribute and EReference, to EStructuralFeatureGrade, which again contains a points attribute.
To illustrate the use of the grade metamodel, imagine a metamodel for class diagrams where attributes are modelled with a metaclass CDAttribute that has a type EReference that stores the type of the attribute. Now imagine the case where we want to give 2 points for the age attribute of the Person class in Figure 1, and an additional point if the type of the attribute is int. In this case, one would create an EObjectGrade, insert it into the grade map using the age CDAttribute as a key, and assign it the points value 2.0. Additionally, one would create an EStructuralFeatureGrade, insert it into the grade map of the EObjectGrade using the type EReference of CDAttribute as a key, and assign it the points value 1.0.
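To make this concrete, the following Java sketch approximates how such grade entries could be created. It uses plain Java classes in place of the EMF-generated metamodel code; only EObject and EStructuralFeature come from ECore, and the helper name gradeAgeAttribute is hypothetical.

import java.util.HashMap;
import java.util.Map;
import org.eclipse.emf.ecore.EObject;
import org.eclipse.emf.ecore.EStructuralFeature;

// Plain-Java approximation of the grade metamodel in Fig. 6.
class EStructuralFeatureGrade {
    double points;
    EStructuralFeatureGrade(double points) { this.points = points; }
}

class EObjectGrade {
    double points;
    // points awarded for individual features of the graded element
    Map<EStructuralFeature, EStructuralFeatureGrade> featureGrades = new HashMap<>();
    EObjectGrade(double points) { this.points = points; }
}

class GradeModel {
    Map<EObject, EObjectGrade> grades = new HashMap<>();

    // Give 2 points for the attribute itself and 1 point for its type feature,
    // e.g. the age attribute of Person and its (hypothetical) type EReference.
    void gradeAgeAttribute(EObject ageAttribute, EStructuralFeature typeFeature) {
        EObjectGrade g = new EObjectGrade(2.0);
        g.featureGrades.put(typeFeature, new EStructuralFeatureGrade(1.0));
        grades.put(ageAttribute, g);
    }
}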
Figure 7 depicts the Classroom metamodel which is used
after the automated grading algorithm is run to store the
mappings that were discovered. It simply associates with each
model element in the teacher solution (EObject key) a list of
EObjects in the student solutions that were matched by the
algorithm. After the algorithm has been run, the matchings in
this data structure can be updated by the grader if necessary.
The information can also be used to automatically update the
grades of the students in case the teacher decides to change
the point weights in the teacher solution.
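A minimal sketch of this automatic update, reusing the EObjectGrade type from the sketch above: because the classroom model preserves the teacher-to-student mappings, regrading after a weight change is a walk over the map rather than a re-run of the matching. The flat summation is our simplification; the actual tool also handles partial marks.

import java.util.List;
import java.util.Map;
import org.eclipse.emf.ecore.EObject;

class Regrader {
    // matched: teacher element -> student elements matched to it (Fig. 7)
    // grades:  teacher element -> its (possibly updated) weight (Fig. 6)
    static double regrade(Map<EObject, List<EObject>> matched,
                          Map<EObject, EObjectGrade> grades) {
        double total = 0.0;
        for (Map.Entry<EObject, List<EObject>> entry : matched.entrySet()) {
            EObjectGrade weight = grades.get(entry.getKey());
            if (weight != null && !entry.getValue().isEmpty()) {
                total += weight.points;  // element was matched: award its weight
            }
        }
        return total;
    }
}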
IV. GRADING ALGORITHM
In this section, we discuss the algorithm for automated grad-
ing of class diagrams. The overall algorithm is divided into six parts: matching classes, split classes, merged classes, attributes and operations, associations, and enumerations. In
the following, the six parts are explained in detail.
Algorithm 1 illustrates this process in detail. The algorithm
takes as input the instructor model, InstructorModel, and the student model, StudentModel.
Two different strategies are used to compare the names
of the classes. To perform a syntactic match (line 5), the
Levenshtein distance [8] is used to measure the similarity
between the two names. The Levenshtein distance calculates
the minimum number of single-character edits required to
change one word into another. Two classes are matched when their Levenshtein distance is smaller than 40 percent of the length of the longer of the two names.
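A minimal Java sketch of this check: the dynamic-programming distance computation is standard, the 40 percent threshold is the one stated above, and the case-insensitive comparison is our assumption.

class SyntacticMatcher {
    // Syntactic match: Levenshtein distance below 40% of the longer name's length.
    static boolean syntacticMatch(String s1, String s2) {
        String a = s1.toLowerCase(), b = s2.toLowerCase();  // case handling: our assumption
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;   // deletions
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;   // insertions
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1,    // delete
                                            d[i][j - 1] + 1),   // insert
                                   d[i - 1][j - 1] + cost);     // substitute
            }
        }
        return d[a.length()][b.length()] < 0.4 * Math.max(a.length(), b.length());
    }
}

For example, Studemt vs. Student has distance 1, well below the threshold of 0.4 × 7 = 2.8, so the two class names are matched.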
Fig. 6. Grade Metamodel (GradeModel maps EObject to EObjectGrade, which has a points: EDouble attribute; EObjectGrade in turn maps EStructuralFeature to EStructuralFeatureGrade, which also has a points: EDouble attribute)
Fig. 7. Classroom Metamodel (ClassroomModel associates each EObject of the teacher solution with the 0..* EObjects of the student solutions matched to it)
Algorithm 1 Compare Classes
 1: procedure COMPARECLASS(InstructorModel, StudentModel)
 2:   instList ← InstructorModel.getClass()
 3:   studList ← StudentModel.getClass()
 4:   for all Class Ci in instList, Cs in studList do
 5:     if syntacticMatch(Cs.name, Ci.name) or
 6:        semanticMatch(Cs.name, Ci.name) or
 7:        contentMatch(Cs.content, Ci.content) then
 8:       storePossibleMatch(Ci, Cs)
 9:   for all Class Ci in instList do
10:     if ∃ matched classes for Ci then
11:       find among the matches of Ci the class Cbest
12:         that obtains the highest mark among the matches
13:       classMatchMap.put(Ci, Cbest)
14:     else
15:       missClassList.add(Ci)
16:   for all Class Ci in missClassList do
17:     for all Class Cs in studList do
18:       if no match exists for Cs then
19:         ListI ← Ci.getAssociationEnds()
20:         ListS ← Cs.getAssociationEnds()
21:         if assocMatch(ListS, ListI) then
22:           classMatchMap.put(Ci, Cs)
23:   return classMatchMap, missClassList
The second strategy involves a semantic match (line 6).
We used three algorithms available from WS4J (WordNet Similarity for Java) [9], which all calculate a similarity metric between two words based on the WordNet database: the Hirst and St-Onge measure (HSO) [10], Wu and Palmer (WUP) [11], and LIN [12]. The combined use of the three measures performs better than using only a single measure. If the determined score is satisfactory, then the match is stored.
• HSO: This measure compares two concepts based on the path distances between them in the WordNet database. It measures similarity by the number of direction changes needed to connect one concept to another.
• WUP: Given two concepts, WUP measures their similarity by the number of common concepts on the paths from the root concept to the two concepts.
• LIN: LIN is an improvement of the Resnik measure [13] and uses the Information Content (IC) of two concepts to calculate their semantic similarity. The IC of a term is calculated by measuring its frequency in a collection of documents.
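The sketch below shows how the three measures can be queried through the public WS4J API. The per-measure thresholds and the majority vote are our assumptions; the paper only requires that the combined score be satisfactory. Note that HSO produces scores in roughly [0, 16], while WUP and LIN are normalized to [0, 1].

import edu.cmu.lti.lexical_db.ILexicalDatabase;
import edu.cmu.lti.lexical_db.NictWordNet;
import edu.cmu.lti.ws4j.impl.HirstStOnge;
import edu.cmu.lti.ws4j.impl.Lin;
import edu.cmu.lti.ws4j.impl.WuPalmer;

class SemanticMatcher {
    private static final ILexicalDatabase db = new NictWordNet();

    static boolean semanticMatch(String w1, String w2) {
        int votes = 0;
        // thresholds are hypothetical; each measure has its own score range
        if (new HirstStOnge(db).calcRelatednessOfWords(w1, w2) >= 4.0) votes++;
        if (new WuPalmer(db).calcRelatednessOfWords(w1, w2) >= 0.9) votes++;
        if (new Lin(db).calcRelatednessOfWords(w1, w2) >= 0.9) votes++;
        return votes >= 2;  // require agreement of at least two measures
    }
}

With such a setup, semanticMatch("teacher", "instructor") would be expected to succeed even though the two strings share few characters.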
The class structural similarity match strategy (named contentMatch in line 7) includes a property similarity match, which compares the properties of two classes. Two properties, e.g., two attributes, with matched names are regarded as similar. The number of properties that would need to be edited to change one class into the other is key to determining whether two classes can be regarded as structurally similar.
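A sketch of this check over property names, reusing syntacticMatch and semanticMatch from the sketches above; the edit threshold of half the larger property count is our assumption, as the paper does not state a concrete bound.

import java.util.List;

class ContentMatcher {
    // Structural match: few property edits are needed to turn one class into the other.
    static boolean contentMatch(List<String> instProps, List<String> studProps) {
        int matched = 0;
        for (String p : instProps) {
            for (String q : studProps) {
                if (SyntacticMatcher.syntacticMatch(p, q)
                        || SemanticMatcher.semanticMatch(p, q)) { matched++; break; }
            }
        }
        // unmatched properties on either side would have to be added or removed
        int edits = (instProps.size() - matched) + (studProps.size() - matched);
        return edits <= Math.max(instProps.size(), studProps.size()) / 2;
    }
}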
After we find all potential matched classes, we calculate the grades for each potential match (lines 9-13). If Cs (in the student solution) is a potential match for class Ci (in the instructor’s solution), we calculate the points that Cs would give the student based on the grades attributed to Ci and its content. The matched class that gets the highest grade in the student solution is then retained as the final match for Ci.
After finding possible matching classes based on their names and content, we additionally search for classes that could be matched based on their associations to other classes. Lines 16-22 illustrate this process. For each pair of classes that is not yet matched, we look at their association ends, and if two classes have similar association ends, we consider them matched.
While Algorithm 1 matches the classes, there could be attributes or operations that are misplaced, i.e., that are placed in the wrong class in the student model. Let Ai be one property (attribute or operation) in the instructor model and As one property in the student model. There are four scenarios: (1) The names of Ai and As match and their containing classes also match. (2) The names of Ai and As match while their containing classes do not match. In this case, As is considered misplaced. Based on the grading policy, misplaced properties should score fewer points. For example, for the case study presented in the next section, we deducted 0.5 points for each misplaced property. (3) The names of Ai and As match; however, Ai belongs to a superclass and As belongs to one of the subclasses. If Ai is not private, then Ai and As are considered as matched. However, in this case, the student could also get only partial marks because the scope of the property is too limited. (4) Ai and As could not match with each other at all. Algorithm 2 finds the matched attributes and operations in two models. In addition to the instructor and student models, this algorithm takes as input the matched class map which was populated by Algorithm 1, classMatchMap.
The algorithm starts by finding the matched attributes in the same classes, i.e., the same matched classes (lines 4-9); if it does not find a corresponding matched attribute in the same class, it looks for it in the superclass (lines 10-11). It is not shown in Algorithm 2, but we traverse the inheritance hierarchy all the way up. If the algorithm does not find the attribute in the superclass, it looks for it in other classes in the model that are not matched with the class. If the attribute exists in an unmatched class, then it is considered to be misplaced and should be given
Algorithm 2 Compare attributes and operations in InstructorModel with StudentModel
 1: procedure COMPARECONTENT(InstructorModel, StudentModel, classMatchMap)
 2:   instList ← InstructorModel.getAttribute()
 3:   studList ← StudentModel.getAttribute()
 4:   for all Attribute Ai in instList, As in studList do
 5:     Ci ← Ai.eContainer()
 6:     Cs ← As.eContainer()
 7:     if Ai is a syntactic or semantic match for As then
 8:       if classMatchMap.get(Cs).equals(Ci) then
 9:         matchedAttrMap.put(As, Ai)
10:       else if Ci is a superclass of classMatchMap.get(Cs) and Ai is not private then
11:         matchedAttrMap.put(As, Ai)
12:   for all Attribute Ai in instList, As in studList do
13:     if As not matched and Ai is a syntactic or semantic match for As then
14:       misplaceAttrMap.put(As, Ai)
15:   instList ← InstructorModel.getOperation()
16:   studList ← StudentModel.getOperation()
17:   for all Operation Oi in instList, Os in studList do
18:     Ci ← Oi.eContainer()
19:     Cs ← Os.eContainer()
20:     if Oi.synMatch(Os) or Oi.semanticMatch(Os) then
21:       if classMatchMap.get(Cs).equals(Ci) then
22:         matchedOperMap.put(Os, Oi)
23:       else if Ci is a superclass of classMatchMap.get(Cs) and Oi is not private then
24:         matchedOperMap.put(Os, Oi)
25:   for all Operation Oi in instList, Os in studList do
26:     if Os not matched and (Oi.synMatch(Os) or Oi.semanticMatch(Os)) then
27:       misplaceOperMap.put(Os, Oi)
28:   return matchedAttrMap, misplaceAttrMap, matchedOperMap, misplaceOperMap
Algorithm 3 Check whether a class is split into two classes
 1: procedure CLASSSPLITMATCH(InstructorModel, StudentModel)
 2:   instList ← InstructorModel.getClass()
 3:   studList ← StudentModel.getClass()
 4:   for all Class Cs0 in studList, Cs1 in studList do
 5:     if Cs0 and Cs1 have a 1-to-multiple association then
 6:       for all Class Ci in instList do
 7:         if Ci has the same properties as Cs0 and Cs1 combined then
 8:           splitClassMap.put(Ci, <Cs0, Cs1>)
 9:           break
10:   return splitClassMap
a partial grade. Operations are matched in a similar way (lines 15-27). After finding all the matches, the algorithm returns a map of matched attributes, matchedAttrMap, a map of misplaced attributes, misplaceAttrMap, a map of matched operations, matchedOperMap, and a map of misplaced operations, misplaceOperMap.
Algorithm 3 checks whether the student split one class into two classes. Let Cs0 and Cs1 be two classes in the student
Algorithm 4 Check whether a class is merged into another class
 1: procedure CLASSMERGEMATCH(InstructorModel, StudentModel)
 2:   for all Class Ci1 in InstructorModel matched with Cs in StudentModel do
 3:     for all Class Ci2 in InstructorModel whose content is misplaced in Cs do
 4:       if Ci1 has an association with Ci2 then
 5:         mergeClassMap.put(Cs, <Ci1, Ci2>)
 6:         break
 7:   return mergeClassMap
model. The algorithm first checks if there is a 1-to-multiple association between Cs0 and Cs1 (line 5). If an attribute is extracted from a class A and placed in a different class B, then there should be a 1-to-multiple association from B to A. This allows a value for the attribute in B to be associated with multiple instances of A, as discussed previously in the example of Fig. 4. Then, if there exists one class Ci in the instructor model that has similar properties to both Cs0 and Cs1 combined, we consider that class Ci has been split into Cs0 and Cs1 by the student. The algorithm returns a map of split classes, splitClassMap.
Algorithm 4 checks whether two classes in the instructor model can be matched with one class in the student model, which means the student merged the two classes into one class in her solution. Let Ci1 and Ci2 be two classes in the instructor model, where all properties of Ci2 have been misplaced into class Cs in the solution model. If Cs is already matched with Ci1 based on the class matching algorithm and Ci1 and Ci2 have an association between them (line 4), we can consider that the student used Cs to combine both Ci1 and Ci2. We only give points when two classes are merged into one class. We do not give points when more than two classes are merged into one class; in that case, the merged class becomes quite complex and less cohesive. After finding all the merged classes, the algorithm returns a map of merged classes, mergeClassMap.
Algorithm 5 matches the associations in two models. Contrary to the other matching algorithms mentioned before, this algorithm does not focus on comparing associations based on their names; rather, it compares them based on the classes that an association is connected with. Let C0 and C1 be two classes connected by association Ai in the instructor model, and C2 and C3 be two classes connected by association As in the student model. If C0 and C1 in the instructor model, and C2 and C3 in the student model, can be matched as two pairs of classes, As and Ai should also be matched. Then, if some classes are missing in the student model, we try to find potential derivative associations that could go through a missing class. For each missing class, we first find the classes that it is connected with (lines 8-9). We do this process recursively, although this is not shown in the algorithm. This means that we also find classes that the missing class is connected with indirectly, i.e., through other classes.
Algorithm 5 Compare associations in InstructorModel and StudentModel
 1: procedure COMPAREASSOC(InstructorModel, StudentModel, missClassList)
 2:   instAssocList ← InstructorModel.getAssociation()
 3:   studAssocList ← StudentModel.getAssociation()
 4:   for all Association Ai in instAssocList, As in studAssocList do
 5:     if Ai and As connect two pairs of matched classes then
 6:       associationMatchMap.put(As, Ai)
 7:   for all Class C in missClassList do
 8:     for all Class Ci in InstructorModel connected with C do
 9:       possibleAssocMap.get(C).add(Ci)
10:   for all Association As in studAssocList do
11:     endClass1 ← As.getEnd1()
12:     endClass2 ← As.getEnd2()
13:     for all key Class C in possibleAssocMap do
14:       possibleClassList ← possibleAssocMap.get(C)
15:       if endClass1 in possibleClassList and endClass2 in possibleClassList then
16:         derivationList.add(As)
17:   return associationMatchMap, derivationList
Then, we search whether there is an association in the student model whose two ends are both among the classes connected with the same missing class (lines 10-16). The algorithm returns a map of matched associations, associationMatchMap, and a list of derived associations, derivationList. It is important to note that a grader may want to give grades for any derivative association, i.e., not necessarily one derived from a missing class in the student solution. In that case, we have to relax the condition check for missing classes in this algorithm. In the case study discussed in the next section, the instructor opted to give grades for any derived association.
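The following sketch illustrates the recursive step, which Algorithm 5 itself omits: starting from a missing class, we collect the classes it reaches, passing through other missing classes on the way, so that a student association whose two ends both appear in the resulting set can be flagged as a candidate derivative association. The worklist formulation is ours.

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import org.eclipse.emf.ecore.EObject;

class DerivativeAssociations {
    // neighbours: class -> classes it is directly associated with (instructor model)
    static Set<EObject> reachable(EObject missing, Set<EObject> missingClasses,
                                  Map<EObject, List<EObject>> neighbours) {
        Set<EObject> visited = new HashSet<>();
        Set<EObject> result = new HashSet<>();
        visited.add(missing);
        Deque<EObject> work =
            new ArrayDeque<>(neighbours.getOrDefault(missing, List.of()));
        while (!work.isEmpty()) {
            EObject c = work.pop();
            if (!visited.add(c)) continue;          // already seen
            if (missingClasses.contains(c)) {
                // another missing class: keep traversing through it
                work.addAll(neighbours.getOrDefault(c, List.of()));
            } else {
                result.add(c);  // a present class the missing class connects to
            }
        }
        return result;
    }
}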
An enumeration is a type that has a defined number of possible values. Algorithm 6 matches the enumerations of two models. The straightforward way to match enumerations is to compare their names and their literal values. Let Ei be an enumeration in the instructor model and Es be an enumeration in the solution model. If the entries of Ei and Es can be matched by their names, Ei and Es are considered as matched. It is possible that a student does not model the enumeration perfectly, in which case there will be entries in Ei that are not matched with entries in Es. The algorithm returns a map of matched enumerations, enumMatchMap.
For the missing literals in the enumeration, it is possible that the student used other model elements, such as an attribute or a class, to represent a missing entry in the enumeration. If a literal e in Ei cannot be matched with any entry in enumeration Es, we search whether there is a class or attribute in the student solution whose name matches e. Depending on the grading scheme, the instructor can opt for giving a full grade or a partial grade when a student uses an attribute or a class to represent an enum literal.
Algorithm 6 Compare ENUM in InstructorModel and StudentModel
 1: procedure COMPAREENUM(InstructorModel, StudentModel)
 2:   instENUMList ← InstructorModel.getENUM()
 3:   studENUMList ← StudentModel.getENUM()
 4:   for all ENUM Ei in instENUMList, Es in studENUMList do
 5:     if syntacticMatch(Es.name, Ei.name) or
 6:        semanticMatch(Es.name, Ei.name) then
 7:       enumMatchMap.put(Es, Ei)
 8:     else if Es and Ei have similar literal values then
 9:       enumMatchMap.put(Es, Ei)
10:   studClassList ← StudentModel.getClass()
11:   studAttrList ← StudentModel.getAttribute()
12:   for all ENUM Ei in instENUMList do
13:     for all literal L in Ei.literal do
14:       for all Attribute As in studAttrList do
15:         if As.name.syntacticMatch(L.name) or As.name.semanticMatch(L.name) then
16:           consider As to represent L
17:       for all Class Cs in studClassList do
18:         if Cs.name.syntacticMatch(L.name) or Cs.name.semanticMatch(L.name) then
19:           consider Cs to represent L
20:   return enumMatchMap
V. CASE STUDY IN FLIGHT TICKETING DOMAIN MODEL
In this section, we apply our approach to an assignment to draw a domain model for a Flight Ticketing system. Students
were given an assignment handout detailing the assignment
questions and requirements, which is not shown here. The as-
signment was given to a third year software engineering class.
All students were taught how to design domain models and had
prior knowledge about class diagrams. Twenty students sub-
mitted the assignment and we used their submitted solutions
to run our experiment. This assignment was given a year prior to developing our tool. Therefore, neither the instructor who graded the students nor the students themselves made any assumption that the assignments would be automatically graded
by a tool. Fig. 8 shows the instructor’s solution and his grading
scheme. Based on this scheme, the maximum grade that could
be achieved is 55. Table I lists the grading that each student received. It shows the instructor’s grading, our tool’s grading, and the reason for the difference between the two gradings.
The classroom average based on the instructor’s grading was
36.9, compared to 34.7 achieved automatically by our tool.
The average difference between the instructor’s grade and our tool’s grade was 4.65, i.e., our tool was able to automatically grade the students within less than a 14% difference from the instructor’s grade. The previous section discussed the matching
algorithm, but did not discuss the grades that are provided for
each matched or missing element. Our tool implementation
allows the instructor to change the grading scheme that is
shown in Fig. 8. Because we keep a mapping to the student solutions using the metamodels that were discussed earlier, it is easy
to update the grades of the students based on the new grading
scheme. In addition, our implementation allows changing the
Fig. 8. Instructor’s Solution for Flight Ticketing Domain Model
deduction policy to be close to a particular instructor’s style. We examined closely how the instructor graded the students for this case study, and adapted our algorithm accordingly.
Here is a summary of the deduction policy that we adopted
for this case study:
• Misplaced attribute/operation: deduct half a mark.
• Derived association: give half of the mark of the correct association.
• Missing element: deduct the whole mark.
• Attribute or class representing an enumeration entry: give the average mark, i.e., if the enumeration has 3 entries and was assigned 1 total mark in the instructor’s solution, we assign 1/3 mark for each attribute or class representing an entry of this enumeration.
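As an illustration, this policy can be collapsed into a single scoring function; the MatchKind enumeration and the method name are ours, and weight stands for the element’s mark in the instructor’s grading scheme of Fig. 8.

class DeductionPolicy {
    enum MatchKind { EXACT, MISPLACED, DERIVED_ASSOCIATION, ENUM_ENTRY_AS_ELEMENT, MISSING }

    static double score(MatchKind kind, double weight, int enumEntries) {
        switch (kind) {
            case EXACT:                 return weight;                      // full marks
            case MISPLACED:             return Math.max(0.0, weight - 0.5); // deduct half a mark
            case DERIVED_ASSOCIATION:   return weight / 2.0;                // half of the full mark
            case ENUM_ENTRY_AS_ELEMENT: return weight / enumEntries;        // average mark per entry
            case MISSING:
            default:                    return 0.0;                         // whole mark deducted
        }
    }
}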
We investigated the elements that the algorithm graded differently than the instructor. In Table I, we list each element that the algorithm graded differently. For example, for student 1, the instructor gave the class CheckedIn 2 points while the algorithm gave it 0.33 points (I:2, A:0.33). For the same student, the algorithm gave 0 marks for a wrong attribute.
This case is interesting, because the algorithm completely
disagreed with the instructor. Student 1 had an attribute called
airlinedesignator which did not match with any attribute in
the instructor model. The instructor chose to give it a point
regardless. In other similar cases, we found that the instructor was lenient when grading students; therefore, he gave points for elements that do not match. In other cases, the instructor
gave 0 to an element while our algorithm gave some points.
For example, student 5 had a class called Time which had a date/time attribute. The instructor did not give any points for this, while our algorithm matched it with the class FlightOccurrence based on its content. Furthermore, because Time was matched with FlightOccurrence, the algorithm was able to give points to two associations: an association between Time and Flight and one between Time and Person. The first association matches the association between FlightOccurrence and Flight in the instructor model. The second association could be derived from two associations: the association between FlightOccurrence and Trip, and the association between Trip and Person, as shown in Fig. 8.
Fig. 9 shows the grades for student 12. This model is
shown in the TouchCORE tool [7], which we used to implement our approach. We extended the tool to use the grading
TABLE I
GRADES FOR FLIGHT TICKETING DOMAIN MODEL

No. | Instructor | Algorithm | Reason for Difference
 1  | 31 | 28.33 | Class CheckedIn: (I:2, A:0.33), Attribute airlinedesignator: (I:1, A:0)
 2  | 15 | 22.5  | Class Ticket: (I:0, A:2), Associations Ticket-Flight, Person-Flight, Ticket-Person: (I:0, A:6), Attribute flightClass misplaced: (I:0, A:0.5)
 3  | 47 | 41.83 | Class Section: (I:3, A:0), Class CheckedIn: (I:2, A:0.33), Attribute Date misplaced: (I:1, A:0.5)
 4  | 41 | 41    | all matched
 5  | 22 | 28.33 | Class Time: (I:0, A:2), Associations Time-Flight, Time-Person: (I:0, A:4), Attribute isBoarded: (I:0, A:0.33)
 6  | 44 | 33.83 | Class CheckedIn: (I:2, A:0.33), Associations CheckedIn-Trip, CheckedIn-Person: (I:4, A:0), Class BookedFlight: (I:4, A:0), Attribute Date misplaced: (I:1, A:0.5)
 7  | 34 | 39.33 | Class Ticket: (I:0, A:2), Associations Person-Ticket, Seat-Ticket, Ticket-Flight: (I:0, A:6), Class Checking: (I:2, A:0.33), Attributes flightClass, date misplaced: (I:2, A:1)
 8  | 39 | 37    | Attribute Dep: (I:2, A:0)
 9  | 38 | 31.83 | Attributes departure, arrival: (I:4, A:0), Attribute boarded: (I:1, A:0.33), Class Class: (I:2, A:1), Attribute date misplaced: (I:1, A:0.5)
10  | 32 | 29    | Attribute seatNumber: (I:1, A:0), Attribute fullName: (I:2, A:0)
11  | 34 | 36.83 | Class Company: (I:0, A:2), Association Ticket-Seat: (I:0, A:2), Attribute Board: (I:1, A:0.33), Attribute date misplaced: (I:1, A:0.5)
12  | 44 | 42.83 | Attribute date misplaced: (I:1, A:0.5), Attribute isCheckedIn: (I:1, A:0.33)
13  | 41 | 41    | Class Date: (I:0, A:2), Association Reservation-Seat: (I:2, A:0)
14  | 44 | 32.83 | Class CheckedIn: (I:2, A:0.33), Associations Person-CheckIn, CheckIn-Flight: (I:4, A:0), Attribute number: (I:3, A:0), Attribute date misplaced: (I:1, A:0.5), Class BookedFlight: (I:2, A:0)
15  | 44 | 41.33 | Attribute CheckedIn: (I:2, A:0.33), Association Luggage-Ticket: (I:2, A:1)
16  | 29 | 18.16 | Attribute isBoarded: (I:1, A:0.33), Attribute date misplaced: (I:1, A:0.5), Attribute seat: (I:4, A:0), Associations Passenger-Luggage, Flight-Passenger: (I:4, A:0), Class CheckedIn: (I:2, A:0.33)
17  | 40 | 34.83 | Attribute luggage: (I:4, A:0), Attribute CheckedIn: (I:1, A:0.33), Attribute date misplaced: (I:1, A:0.5)
18  | 45 | 38.83 | Attribute date misplaced: (I:1, A:0.5), Class CheckIn: (I:2, A:0.33), Associations CheckIn-Luggage, CheckIn-Booking: (I:4, A:0)
19  | 43 | 44.5  | Attribute date misplaced: (I:1, A:0.5), Association Flight-Seat: (I:0, A:2)
20  | 30 | 34.66 | Associations Luggage-City, Flight-Passenger, Flight-Ticket, Luggage-Seat: (I:0, A:8), Class CheckedIn: (I:2, A:0.33), Attribute isBoarded: (I:0, A:0.33), Attribute date misplaced: (I:1, A:0.5), Attribute name: (I:2, A:0)
TABLE II
FEEDBACK FOR STUDENT 12

Classes:
City: 2.0/2.0, matches with Class City
Flight: 2.0/2.0, matches with Class Flight
Ticket: 2.0/2.0, matches with Class FlightOccurrence
...
Attributes:
number in Class Flight: 1.0/1.0, matches with number in Class Flight
status in Class BookedFlight: 0.0/2.0, missing attribute
date in Class FlightOccurrence: 0.5/1.0, misplaced attribute
...
Associations:
City-Flight: 2.0/2.0, matches with association between Flight and City
Seat-Ticket: 2.0/2.0, matches with association between BookedFlight and Seat
Ticket-Seat: 2.0, derivative association matched with associations FlightOccurrence-BookedFlight-Seat
...
metamodels described earlier and implemented the matching algorithms. The grades are shown in circles next to each class or attribute. When the instructor loads a model, she can automatically grade it by pressing the last button on the right (below the + button). The tool graded two elements differently than the instructor (highlighted in yellow in Fig. 9). The first element is the attribute date in the class Flight. This attribute should belong to the class Ticket, but the student misplaced it, because the class Ticket is matched with FlightOccurrence, which has date as an attribute in the instructor model. However, the instructor was lenient and decided to give it a full point regardless. The second element that the tool graded differently is isCheckedIn in the class Ticket. This attribute belongs to the enumeration PassengerStatus in the instructor solution. The instructor was again lenient and did not deduct points for misplacing the attribute in the class Ticket instead of having it as an enum literal in PassengerStatus.
The instructor was not always lenient in grading.
Fig. 9. Solution Model for student 12 showing their grading
For example, for student 20, the instructor did not give points for the four derived associations, although he chose to give points for derived associations of other students. When we talked to the instructor, he admitted that he was not completely consistent in grading and that he should have given some points to this student. The tool also prints a feedback sheet for the student, listing the points that they received and the points that they missed and explaining where they gained or lost marks. An excerpt of this sheet is shown in Table II.
VI. RELATED WORK
A number of approaches have been proposed to compare UML class diagrams. Hoggarth and Lockyer [14] proposed a framework that provides feedback to the student based on a comparison of the student’s model with the teacher’s model. Ali et al. [15] proposed an assessment tool that compares a class diagram drawn by students to the teacher’s diagram. Soler et al. [16] developed a web-based tool for correcting UML class diagrams. Their approach checks for errors in a student’s model by comparing it with models in a repository of similar models. Unlike our approach, they do not assign grades to models. Hasker [17] introduced UMLGrader, a tool for automated grading of UML models by comparing a student’s solution with a standard solution, as we do in our approach. However, their approach relies only on string matching, whereas our matching algorithm uses syntactic, semantic, and structural matching to compare models.
There are also approaches that propose automated assessment of other kinds of UML models, e.g., use case specifications. Jayal and Shepperd [18] proposed a method for label matching for UML diagrams using different levels of decomposition and syntactic matching. They evaluate their approach using a case study on matching activity diagrams. Tselonis et al. [19] introduced a diagram marking technique based on graph matching. Thomas et al. [20] introduced a framework that uses synonyms and an edit distance algorithm to mark graph-based models. Vachharajani and Pareek [21] introduced a framework for automatic assessment of use case diagrams using syntactic and semantic matching. Sousa and Leal [22] introduced a structural approach for graphs that establishes mappings from a teacher’s solution to elements in the student solution that maximize a student’s grade. Finally, our tool provides feedback to the student about the deducted marks. A number of educational tools for learning programming provide different kinds of feedback to the learner.
Keuning et al. provide a summary of feedback generation approaches for programming tasks [23].
There are three main differences between our approach
and the approaches discussed above: (1) our approach com-
bines syntactic, semantic and structural matching for grading
class diagrams. In addition to using the Levenshtein distance for syntactic matching, our approach uses three algorithms for
semantic matching and performs structural matching between
two diagrams. Based on the matching results, the approach
assigns marks to the model elements. Most of the above
approaches are limited to syntactic matching of names. (2)
Our approach proposes a non-invasive grading metamodel that
stores the determined grades alongside the model as feedback
to the students. (3) Our approach proposes a new classroom
metamodel that allows saving and automatically updating the grades of a group of students in case the teacher changes the grading scheme.
VII. CONCLUSION
UML diagrams in general, and class diagrams in particular,
are widely used in computer science and software engineer-
ing education. In many courses, computer science students
are required to solve assignments or answer exam questions
involving class diagrams. Instructors usually grade these dia-
grams manually by comparing each student solution with the
template solution that they prepared for the assignment/exam.
This can be a cumbersome task, especially when they have to grade a large number of student papers. Furthermore, a
particular problem could have different possible design solu-
tions using class diagrams. Solutions could vary based on the
names of the classes, their properties, or relationships between
classes. Therefore, instructors have to spend more time examining each student’s solution. In this paper, we propose an
automated grading approach for class diagrams. In particular, we propose two metamodels: one establishes mappings between an instructor’s solution and student solutions, and the other assigns grades to model elements and stores them. Furthermore, we introduced a grading algorithm that matches model elements in the student model with elements in the instructor model. Based on the matching, students are provided with their grades. We implemented our ideas in the TouchCORE tool and used it to automatically grade a third-year assignment to draw a domain model for a Flight Ticketing system. Our tool was able to automatically grade 20 students within a 14% difference from the grading given by the instructor.
In future work, we plan to expand our approach to grade other UML models, e.g., sequence diagrams and state machine diagrams. We also plan to run more experiments with assignments obtained from different instructors.
REFERENCES
[1] J. Adams, Computing Is The Safe STEM Career Choice Today, Novem-
ber 3, 2014. [Online]. Available: https://cacm.acm.org/blogs/blog-
cacm/180053-computing-is-the-safe-stem-career-choice-today/fulltext
[2] N. Singer, The Hard Part of Computer Sci-
ence? Getting Into Class, January 24, 2019. [Online].
Available: https://www.nytimes.com/2019/01/24/technology/computer-
science-courses-college.html
[3] P. Ihantola, T. Ahoniemi, V. Karavirta, and O. Seppälä, “Review of
recent systems for automatic assessment of programming assignments,”
in Proceedings of the 10th Koli Calling International Conference
on Computing Education Research, ser. Koli Calling ’10. New
York, NY, USA: ACM, 2010, pp. 86–93. [Online]. Available:
http://doi.acm.org/10.1145/1930464.1930480
[4] J. C. Caiza and J. M. del Álamo Ramiro, “Programming assignments
automatic grading: review of tools and implementations,” 2013.
[5] N.-T. Le, F. Loll, and N. Pinkwart, “Operationalizing the continuum be-
tween well-defined and ill-defined problems for educational technology,”
IEEE Trans. Learn. Technol., vol. 6, no. 3, pp. 258–270, Jul. 2013.
[6] P. Fournier-Viger, R. Nkambou, and E. M. Nguifo, Building Intelligent
Tutoring Systems for Ill-Defined Domains. Berlin, Heidelberg: Springer
Berlin Heidelberg, 2010, pp. 81–101.
[7] M. Schöttle, N. Thimmegowda, O. Alam, J. Kienzle, and G. Muss-
bacher, “Feature modelling and traceability for concern-driven software
development with touchcore,” in Companion Proceedings of the 14th
International Conference on Modularity, MODULARITY 2015, Fort
Collins, CO, USA, March 16 - 19, 2015, 2015, pp. 11–14.
[8] V. I. Levenshtein, “Binary codes capable of correcting deletions, inser-
tions, and reversals,” in Soviet physics doklady, vol. 10, no. 8, 1966, pp.
707–710.
[9] H. Shima, “WordNet Similarity for Java (WS4J),” https://code.google.com/p/ws4j/, 2016.
[10] G. Hirst, D. St-Onge et al., “Lexical chains as representations of
context for the detection and correction of malapropisms,” WordNet:
An electronic lexical database, vol. 305, pp. 305–332, 1998.
[11] Z. Wu and M. Palmer, “Verbs semantics and lexical selection,” in Pro-
ceedings of the 32nd annual meeting on Association for Computational
Linguistics. Association for Computational Linguistics, 1994, pp. 133–
138.
[12] D. Lin, “An information-theoretic definition of similarity,” in
Proceedings of the Fifteenth International Conference on Machine
Learning, ser. ICML ’98. San Francisco, CA, USA: Morgan
Kaufmann Publishers Inc., 1998, pp. 296–304. [Online]. Available:
http://dl.acm.org/citation.cfm?id=645527.657297
[13] P. Resnik, “Semantic similarity in a taxonomy: An information-based
measure and its application to problems of ambiguity in natural lan-
guage,” J. Artif. Int. Res., vol. 11, no. 1, pp. 95–130, Jul. 1999.
[14] G. Hoggarth and M. Lockyer, “An automated student diagram assess-
ment system,” SIGCSE Bull., vol. 30, no. 3, pp. 122–124, Aug. 1998.
[15] N. Haji Ali, Z. Shukur, and S. Idris, “Assessment system for uml class
diagram using notations extraction,” International Journal of Computer
Science and Network Security, vol. 7, no. 8, pp. 181–187, 2007.
[16] J. Soler, I. Boada, F. Prados, J. Poch, and R. Fabregat, “A web-
based e-learning tool for uml class diagrams,” in IEEE EDUCON 2010
Conference, April 2010, pp. 973–979.
[17] R. W. Hasker, “Umlgrader: An automated class diagram grader,” J.
Comput. Sci. Coll., vol. 27, no. 1, pp. 47–54, Oct. 2011.
[18] A. Jayal and M. Shepperd, “The problem of labels in e-assessment of
diagrams,” J. Educ. Resour. Comput., vol. 8, no. 4, pp. 12:1–12:13, Jan.
2009.
[19] C. Tselonis, J. Sargeant, and M. McGee Wood, “Diagram matching for
human-computer collaborative assessment,” 2005.
[20] P. Thomas, K. Waugh, and N. Smith, “Learning and automatically as-
sessing graph-based diagrams,” in Beyond Control: learning technology
for the social network generation. Research Proceedings of the 14th
Association for Learning Technology Conference (ALT-C, 4–6 September, Nottingham, UK, 2007), 2007, pp. 61–74.
[21] V. Vachharajani and J. Pareek, “Framework to approximate label match-
ing for automatic assessment of use-case diagram,” International Journal
of Distance Education Technologies (IJDET), vol. 17, no. 3, pp. 75–95,
2019.
[22] R. Sousa and J. P. Leal, “A structural approach to assess graph-based
exercises,” in International Symposium on Languages, Applications and
Technologies. Springer, 2015, pp. 182–193.
[23] H. Keuning, J. Jeuring, and B. Heeren, “Towards a systematic review
of automated feedback generation for programming exercises,” in Pro-
ceedings of the 2016 ACM Conference on Innovation and Technology in
Computer Science Education, ser. ITiCSE ’16. New York, NY, USA:
ACM, 2016, pp. 41–46.