UML collaboration diagram syntax: An empirical study of comprehension

Conference Paper · February 2002
DOI: 10.1109/VISSOF.2002.1019790 · Source: IEEE Xplore
Conference: Visualizing Software for Understanding and Analysis, 2002. Proceedings. First International Workshop on
Helen C. Purchase*, Linda Colpoys, Matthew McGill and David Carrington
*Department of Computing Science, University of Glasgow
School of Information Technology and Electrical Engineering, University of Queensland
hcp@dcs.gla.ac.uk, davec@itee.uq.edu.au
Abstract
The UML syntactic notation used in texts, papers,
documentation and CASE tools is often different,
despite UML being considered a software engineering
standard. Our initial empirical study considered
variations in the notation used for UML class
diagrams; the experiment reported here concentrates on
UML collaboration diagrams.
The decision as to which of the semantically
equivalent notational variations within the UML
standard to use appears to be according to the personal
preference of the author or publisher, rather than based
on any consideration of the ease with which the
notation can be understood by human readers. This
paper reports on an experiment that takes a human
comprehension perspective on UML collaboration
diagrams. Five notations were considered: for each,
two semantically equivalent (yet syntactically or
stylistically different) variations were chosen from
published texts. Our experiment required subjects to
indicate whether a supplied pseudo-code specification
matched each of a set of experimental UML
collaboration diagrams.
The results reveal that our informal, personal
intuitions (which were based on our view of the
complexity of the notation) are validated with respect to
confirming that a specification matches a diagram, but
not when errors in a diagram are to be identified. The
subjects' preferences are in favour of the more concise
notational variants.
1. Introduction
In recent years, the Unified Modelling Language
(UML) has emerged as the de facto standard for the
representation of software engineering diagrams [15].
While adopted as a standard by the membership of the
OMG (Object Management Group) in 1997 [15], a
glance through different texts that use UML, or an
investigation of current CASE tools, reveals a number
of notational variations [2, 4, 9, 10].
As a standard, UML is still evolving, and within
each version of the standard are many semantically
equivalent notational variations from which authors or
publishers may choose, according to their personal
preference. There seems to be no basis for the choice of
one notational variation over another.
It may be that one notational variation is easier to
comprehend than another. Our previous study [14]
investigated notational variants of UML class diagrams,
and found that the best performing notation may depend
on the task for which it is used, and that our personal,
intuitive predictions were partly confirmed. This study
considers various notations for UML collaboration
diagrams in a similar manner, and aims to determine
which notational variants are easier for humans to
understand.
The research reported in this paper attempts to
determine whether there are any comprehension
differences between five notational variations for UML
collaboration diagrams, through an empirical study of
subjects' performance on a specification matching task.
The notational variants may be equally good, in which
case it does not matter which one is used. However, if
there are comprehension differences, the
results of this study could assist in the definition of
appropriate notational standards (from a human
understanding point of view), and in determining the
most suitable notation to use in UML texts, CASE tools,
and practical software engineering tasks.
Some theoretical analytical work has been
performed on software engineering notations and visual
programming languages, but no empirical studies on
human comprehension of these notations have been
found. While software engineering notations (including
UML) have been analysed and compared using the
Cognitive Dimensions framework [1, 5], this framework
is theoretical, and these analyses, while providing
interesting structured perspectives on the notational
features, are not based on experimental data [3, 6, 7, 8].
The experiment reported here complements such
analytical work.
Proceedings of the First International Workshop on Visualizing Software for Understanding and Analysis (VISSOFT’02)
0-7695-1662-9/02 $17.00 © 2002 IEEE
[Figure 1 diagram: an actor (User) and the objects ui:UI, sys:System, los:ListOfStudents and s:Student. Message labels: showAge(name:String); 1: showAge(name); 2: s:=getStudent(name); 3: getName(); 4: i:=getAge(); 5: displayAge(i).]
Figure 1. Example of a UML collaboration diagram.
1.1. Experimental aim and definitions
Our previous study [12] considered user preferences
for some UML class and collaboration notations and
layout features. This experiment concentrates on user
performance, rather than preference, and focuses on
collaboration diagrams.
We identified two syntactically different ways of
representing each of five different semantic constructs
in UML collaboration diagrams. The aim of this
experiment was to determine which of the two variants
of each of these five notations is more suitable with
respect to human performance. By asking subjects to
perform comprehension tasks on a number of
semantically equivalent UML collaboration diagrams
that vary with respect to the notations used, we aimed to
identify the notational variants that resulted in the best
performance.
UML collaboration diagrams describe the
interactive view of a UML class diagram [15], detailing
the objects and interactions of the software system.
UML collaboration diagrams demonstrate how objects
in the system interact with each other, and the order in
which interactions occur by way of message passing.
Figure 1 is an example of a small UML collaboration
diagram, showing the interactions between the objects
in a simple system that finds and displays the age of a
student, whose name is entered by the user.
1.2. Notational variations
Five independent notational variations for UML
collaboration diagrams were considered (see Figure 2):
each notation (N1, N2, N3, N4, N5) has two variations
(a) and (b), with identical semantics.
N1(a) and N1(b) represent the syntax of two
different notational variants for numbering
messages.
N2(a) and N2(b) represent the syntax of two
different notational variants for depicting
message arrows.
N3(a) and N3(b) represent the syntax of two
different notational variants for depicting self-
messages.
N4(a) and N4(b) represent the syntax of two
different notational variants for associating
message labels with arrows.
N5(a) and N5(b) represent the syntax of a
further two different notational variants for
associating message labels with arrows.
All notational variations had been found in
published UML documents, except N5(b), which was
used in a prior experiment which considered user
preference of notational variations [12]. Our informal
predictions, based purely on our personal intuition and
experience, and on informal discussions with
colleagues, were that the (a) notations would produce
better performance, as we considered them to be more
concise.
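As an illustration of what "identical semantics" means for N1, the following minimal Python sketch renumbers the nested (decimal) labels of N1(b) as the flat sequence of N1(a). It assumes that nested sequence numbers are ordered component by component (so that 1.1 falls between 1 and 2); the function names are hypothetical, and the labels are those shown in Figure 2.

```python
# Hypothetical helpers; the message labels are the N1(b) labels from Figure 2.

def parse_sequence_number(label):
    """Split a label such as '1.1: message2()' into ((1, 1), 'message2()')."""
    number, _, message = label.partition(":")
    components = tuple(int(part) for part in number.strip().split("."))
    return components, message.strip()


def to_flat_numbering(labels):
    """Renumber nested (decimal) labels as a flat 1, 2, 3, ... sequence.

    Assumption: nested numbers are ordered component by component,
    so (1,) < (1, 1) < (2,).
    """
    parsed = sorted(parse_sequence_number(label) for label in labels)
    return [f"{i}: {message}" for i, (_, message) in enumerate(parsed, start=1)]


nested = ["1: message1()", "2: message3()", "1.1: message2()"]
flat = to_flat_numbering(nested)
print(flat)  # ['1: message1()', '2: message2()', '3: message3()']
```

Under this assumption the two variants carry the same ordering information; they differ only in how much of the nesting structure is made visible to the reader.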
2. Experimental task
The comprehension task was to match a given
textual code description against a set of diagrams,
indicating whether each diagram correctly matches the
description or not. The set of diagrams included both
correct and incorrect diagrams.
2.1. Experimental materials
2.1.1. The application domain
The collaboration diagram used was based on a
simple domain, which models a preferential voting
system, in which the voter must rank the candidates in
order of preference. The example includes an actor, 7
objects, 11 links and 13 messages (see Figure 3). It
would be infeasible to use large diagrams in a
[Figure 2 diagrams: each variation is drawn with the objects a:Class1, b:Class2 and c:Class3 unless noted otherwise. The notational differences, message labels and published sources are:]
N1. Message numbering: whole or partial
    (a) 1: message1(); 2: message2(); 3: message3() (Larman, 1998)
    (b) 1: message1(); 1.1: message2(); 2: message3() (Fowler, 2000)
N2. Arrows associated with messages: placement and size
    (a) 1: message1(); 2: message2(); 3: message3() (Booch et al., 1999)
    (b) 1: message1(); 2: message2(); 3: message3() (Larman, 1998)
N3. Representation of self-messages
    (a) a:Class1; 1: message1() (Maciaszek, 2001)
    (b) a:Class1, self:Class1; 1: message1() (Page-Jones, 2000)
N4. Broken arrows, placement of message text
    (a) 1: message1(); 2: message2(); 3: message3() (Fowler, 2000)
    (b) 1: message1(); 2: message2(); 3: message3() (Maciaszek, 2001)
N5. Message adjacency, horizontal or vertical placement of text
    (a) 1: message1(); 2: message2(); 3: message3() (Booch et al., 1999)
    (b) 1: message1(); 2: message2(); 3: message3(), with the label of message 3 set vertically (Purchase et al, 2000)
Figure 2. The five notational variations.
comprehension task like that used in this experiment, as
a large domain would take too long for the subjects to
read and comprehend. In addition, the modularity
principle encourages designers to partition large
diagrams into multiple smaller ones, so the size of the
application used in this study is realistic.
A textual code description of this application
domain was produced in pseudo code, along with a
short description of the system being modelled. The
subjects were asked to match the experimental diagrams
against this specification.
2.1.2. UML tutorial and worked example
A tutorial sheet explained the meaning of UML
collaboration diagrams, and, using a simple example,
described the semantics of all the notational variations.
While the subjects had some prior knowledge of UML
as part of their studies, this tutorial provided all the
UML background information they required for the
experimental task, and was intended to ensure that all
subjects had a consistent base of UML knowledge. A
worked example demonstrated the task that the subjects
were to perform, by presenting a small specification
with four different diagrams, and for each diagram
indicating whether it matched the given specification or
not. Where the diagram did not match the specification,
the errors were explained. Care was taken to ensure that
neither the tutorial nor the worked example would bias
the subjects towards one notational variation over
another by using all notational variations equivalently.
[Figure 3 diagram: an actor (Moderator) and the objects ui:UI, sys:System, vs:VoteSet, v:Vote, cs:CandidateSet, c:Candidate and d:Candidate. Message labels, with guard conditions shown next to the message they accompany:
startCount()
1: startCount()
2: assignVotes()
3: c:=getFirstPref()
4: add(v)
5: dropLowest() [there's no clear winner]
6: i:=getCount()
7: dropped(c) [c has the smallest number of votes]
8: remove(c)
9: reassignVotes()
10: d:=getNextPref()
11: add(v)
12: dropLowest() [there's no clear winner]]
Figure 3. The application domain for the experiment.
2.2. The experimental diagrams
The UML diagram representing the experimental
domain was drawn four times, in four differing layouts.
Each layout was constructed with a view to minimising
the potential for any confounding layout factors. Thus,
each layout had no edge bends, a similar number of
sloping lines, no edge crosses, comparable
orthogonality (the property of fixing nodes and edges to
the intersections and lines of an invisible unit grid), and
was of a similar size. Different layouts were required so
that the subjects would not merely use visual pattern
matching in performing the comprehension tasks: if all
the diagrams had identical layout, the differences
between them would be visually obvious and detectable
without the subject needing to understand the
information embodied in the diagram.
Using these four layouts of the diagram, the
following experimental diagrams were produced:
2.2.1. Correct diagrams (24)
o 1 diagram using all the (a) notations, in four different
layouts (N0t)
o 1 diagram using the (b) notation for N1, (a) notation
otherwise, in four different layouts (N1t)
o 1 diagram using the (b) notation for N2, (a) notation
otherwise, in four different layouts (N2t)
o 1 diagram using the (b) notation for N3, (a) notation
otherwise, in four different layouts (N3t)
o 1 diagram using the (b) notation for N4, (a) notation
otherwise, in four different layouts (N4t)
o 1 diagram using the (b) notation for N5, (a) notation
otherwise, in four different layouts (N5t)
The naming convention of these diagrams is that the
number (0-5) indicates which notation is being varied
(where 0 represents that none of the notations is varied,
that is, the (a) notation is used throughout), and the
letter (t) indicates that these are true (correct) diagrams.
2.2.2. Incorrect diagrams (20)
o For N1: For two of the layouts of N0t, errors were
introduced which affected the representation of
sequence order (N0d1). For the corresponding two
layouts of N1t, the same errors were introduced
(N1d). The error introduced was that two of the
message numbers (but not the actual messages)
were swapped and propagated throughout the
diagram.
o For N2: For two of the layouts of N0t, errors were
introduced which affected the representation of
message direction (N0d2). For the corresponding
two layouts of N2t, the same errors were introduced
(N2d). The error introduced was that two of the
arrows adjacent to the links, indicating the flow of
the diagram, were reversed.
o For N3: For two of the layouts of N0t, errors were
introduced which affected the representation of
self-loops (N0d3). For the corresponding two
layouts of N3t, the same errors were introduced
(N3d). The error introduced was that two of the
nodes were swapped, one being involved in the
self-loop.
o For N4: For two of the layouts of N0t, errors were
introduced which affected the representation of
sequencing (N0d4). For the corresponding two
layouts of N4t, the same errors were introduced
(N4d). The error introduced was that two of the
messages were exchanged.
o For N5: For two of the layouts of N0t, errors were
introduced which affected the representation of
sequencing (N0d5). For the corresponding two
layouts of N5t, the same errors were introduced
(N5d). The error introduced was that two of the
messages were exchanged.
The naming convention of these diagrams is that the
letter (d) indicates that these are "defective" (incorrect)
diagrams. Diagrams beginning with N0 are those that
use the (a) variation throughout, with the final number
(1-5) indicating to which notation the diagram alteration
relates. The other diagrams (N1-N5) indicate which
notation has been varied to use (b) instead of (a).
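The composition of the diagram set can be checked with a short enumeration. The sketch below is illustrative only: it follows the naming convention described above, and the layout indices are placeholders, since the text does not say which two of the four layouts carried the planted errors.

```python
# Illustrative enumeration of the experimental diagram set.
CORRECT_LAYOUTS = 4    # each correct diagram was drawn in four layouts
DEFECTIVE_LAYOUTS = 2  # errors were planted in two layouts per diagram set
NOTATIONS = range(1, 6)


def correct_diagrams():
    """N0t (all (a) notations) plus N1t..N5t, each in four layouts."""
    names = ["N0t"] + [f"N{k}t" for k in NOTATIONS]
    return [(name, layout)
            for name in names
            for layout in range(1, CORRECT_LAYOUTS + 1)]


def incorrect_diagrams():
    """For each notation k: N0dk and Nkd, each in two layouts.

    The layout numbers used here are placeholders, not the actual layouts.
    """
    names = []
    for k in NOTATIONS:
        names += [f"N0d{k}", f"N{k}d"]
    return [(name, layout)
            for name in names
            for layout in range(1, DEFECTIVE_LAYOUTS + 1)]


all_diagrams = correct_diagrams() + incorrect_diagrams()
print(len(correct_diagrams()), len(incorrect_diagrams()), len(all_diagrams))
# prints: 24 20 44
```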
2.3. Experimental procedure
2.3.1. Preparation
The students were given preparatory materials to
read as an introduction to the experiment. These
materials consisted of a consent form, an instruction
sheet, a tutorial on UML collaboration diagrams and
notation, and a worked example of the experimental
task.
As part of this document set, the subjects were also
given the textual pseudo-code specification of the UML
case study to be used in the experiment: this was the
specification against which they would need to match
the experimental diagrams. The subjects were asked to
study this specification closely, and memorise it if
possible.
The subjects were given 20 minutes to sign the
consent form, read through and understand the
materials, ask questions, take notes, or draw diagrams
as necessary.
2.3.2. Online task
The subjects then used an online system to perform
the experimental task. A copy of the textual description
of the process in pseudo-code was placed in front of the
computer for easy reference, and UML collaboration
diagrams were presented in random order on the
computer screen for each subject. The subjects gave a
yes/no response to each presented diagram, indicating
whether they thought the diagram matched the
specification or not: two keys on the keyboard were
used for this input.
Sixteen practice diagrams (randomly selected from the 44
experimental diagrams) were presented first. The data
from these diagrams was not collected, and the subjects
were not aware that these diagrams were not part of the
experiment. These diagrams gave the subjects an
opportunity to practise the task before experimental data
was collected.
All 44 experimental diagrams were then presented
in a different random order for each subject, in blocks
of eight, with a rest break between each block (the
length of which was controlled by the subject). The
final block was four diagrams in length.
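A presentation schedule of this shape could be generated as in the following sketch. It is not the online system used in the experiment; the function name and the use of a seeded random generator are assumptions made so that the example is reproducible.

```python
import random


def presentation_blocks(diagrams, block_size=8, rng=None):
    """Shuffle the diagrams and split them into consecutive blocks.

    A separate shuffle per subject gives each subject a different order.
    """
    rng = rng or random.Random()
    order = list(diagrams)
    rng.shuffle(order)
    return [order[i:i + block_size] for i in range(0, len(order), block_size)]


# 44 diagrams, identified here simply by the indices 0..43.
blocks = presentation_blocks(range(44), rng=random.Random(1))
print([len(block) for block in blocks])  # [8, 8, 8, 8, 8, 4]
```

With 44 diagrams and a block size of eight, the schedule has five full blocks and a final block of four, matching the description above.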
Each diagram was displayed until the subject
answered Y or N, or 60 seconds had passed. A beep
indicated to the subject when the next diagram was
displayed after a timeout (which was recorded as an
error). The practice diagrams helped the subjects get
used to the length of the allocated time period. The
timeout period and the time needed for the subjects to
prepare for the experiment were determined as
appropriate through extensive pilot tests.
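The scoring rule implied by this procedure can be written down compactly. The sketch below is an assumption about how a response might be scored, not a description of the actual system: the function name is hypothetical, and a missing answer is represented by None.

```python
TIMEOUT_SECONDS = 60  # timeout period stated in the procedure


def score_response(diagram_matches_spec, answer, elapsed_seconds):
    """Return True if the response counts as accurate.

    answer is "Y", "N", or None when no key was pressed before the timeout.
    """
    if answer is None or elapsed_seconds >= TIMEOUT_SECONDS:
        return False  # a timeout was recorded as an error
    return (answer == "Y") == diagram_matches_spec


print(score_response(True, "Y", 12.4))    # correct diagram accepted
print(score_response(False, "Y", 30.0))   # defective diagram accepted
print(score_response(False, None, 60.0))  # timeout
```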
A within-subjects analysis was used to reduce any
variability that may have been attributed to differences
between subjects: thus, each subject's performance on
one notation was compared with his or her performance
on the alternative notation. The practice diagrams and
the randomisation of the order of presentation of the
experimental diagrams for each subject helped counter
the learning effect (whereby subjects' performance on
the task may improve over time, as they become more
competent in the task).
2.3.3. Data collection
The response time and accuracy of the subjects'
responses to the 44 experimental diagrams were
recorded by the online system.
2.4. Subjects
The 35 subjects were second and third year
Computer Science and Information Systems students at
the University of Queensland. The subjects were paid
$15 for their time, and, as an incentive for them to take
the experiment seriously, the best performer was given
a CD voucher.
3. Results
Both the speed and accuracy of each subject's
responses were measured, enabling the analysis of two
different measures of understanding. Analysis was
performed on both the subjects' performance in
identifying the correct diagrams and their performance
in identifying the incorrect diagrams. Preference data
was also collected: students were asked which of the
variants they preferred for each notation, and they were
asked to state reasons for their preference.
3.1. Performance Data:
Figures 4 and 5 show the time and accuracy results
for each notation, for both the correct diagrams and the
incorrect diagrams. They show the average time taken
for the subjects to respond to the diagrams embodying
each of the notational variants, and the average
accuracy of their responses.
To determine whether any of the performance
differences between the (a) and (b) variants for any of
the notations could be attributed to chance (and
therefore could not contribute anything meaningful to
our analysis), we conducted a two-tailed t-test to
identify which of these results are statistically
significant.
The statistically significant results are:
- Notation 1: accuracy
  o for matching the specification to correct diagrams:
    (a) is better than (b) (p<0.05)
- Notation 1: time
  o for matching the specification to correct diagrams:
    (a) is faster than (b) (p<0.05)
  o for identifying an error in the diagrams: (a) is
    faster than (b) (p=0.059, only approaching
    significance)
- Notation 2: accuracy
  o for matching the specification to correct diagrams:
    (a) is better than (b) (p<0.05)
- Notation 3: accuracy
  o for matching the specification to correct diagrams:
    (a) is better than (b) (p<0.05)
- Notation 4:
  o no significant results for either accuracy or time
- Notation 5: accuracy
  o for matching the specification to correct diagrams:
    (a) is better than (b) (p<0.05)
All other performance differences between the (a) and
(b) variants could be attributed to chance.
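For readers unfamiliar with this kind of comparison, the following sketch computes a paired t statistic, which is one plausible reading of a t-test in a within-subjects design (each subject's score on variant (a) is paired with the same subject's score on variant (b)). The paper does not state the exact computation, so this is an assumption, and the response times below are invented for illustration; they are not the experimental data.

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for paired samples.

    t = mean(d) / (stdev(d) / sqrt(n)), where d are the per-subject
    differences between the two conditions.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired samples must have the same length")
    differences = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(differences)
    standard_error = statistics.stdev(differences) / math.sqrt(n)
    return statistics.mean(differences) / standard_error, n - 1


# Invented response times (seconds) for five hypothetical subjects.
times_a = [20.0, 18.5, 25.0, 22.0, 19.5]
times_b = [23.0, 18.0, 28.5, 24.0, 22.5]
t, df = paired_t_statistic(times_a, times_b)
print(round(t, 3), df)
```

The two-tailed p-value is then obtained from the t distribution with the returned degrees of freedom; a library routine such as SciPy's `scipy.stats.ttest_rel` performs both steps.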
3.1.1. Identifying the correct diagrams:
[Figure 4 charts: two bar charts comparing variants (a) and (b) for notations 1 to 5. "Average Times": Time (sec) by Notation. "Notational Accuracy": Accuracy (%) by Notation.]
Figure 4. The time and accuracy results for subject
performance on the correct diagrams.
3.1.2. Identifying the incorrect diagrams:
[Figure 5 charts: two bar charts comparing variants (a) and (b) for notations 1 to 5. "Average Times": Time (sec) by Notation. "Notational Accuracy": Accuracy (%) by Notation.]
Figure 5. The time and accuracy results for subject
performance on the incorrect diagrams.
3.2. Correlation Data:
Tables 1 and 2 show the linear correlations between
the accuracy and response time for each of the
diagrams, according to their notational variants. A
statistically significant negative linear correlation
indicates that a decrease in accuracy is related to an
increase in response time (and vice versa). The
interpretation of the
[Figure 6 chart: "Subject preference for notation", a bar chart of the number of subjects preferring variant (a) or (b) for notations 1 to 5.]
Figure 6. User preferences for each notational variation.
time and accuracy data for those notational variants for
which the linear correlation is significant needs to take
this relationship into account.
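The correlation check can be sketched as follows: compute the Pearson coefficient between response time and accuracy, then compare its magnitude with the critical value quoted in the table captions (Rcrit). The function names and the sample data are illustrative assumptions; only the coefficients -0.68 and -0.20 and the critical value 0.25 are taken from Table 2.

```python
import math


def pearson_r(xs, ys):
    """Pearson linear correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)


def is_significant(r, r_crit):
    """Two-tailed check: significant when |r| exceeds the critical value."""
    return abs(r) > r_crit


# Invented data: slower responses paired with wrong answers (1 = accurate).
times = [10, 20, 30, 40]
accuracy = [1, 1, 0, 0]
r = pearson_r(times, accuracy)
print(round(r, 4))

# Coefficients reported in Table 2, with Rcrit = 0.25.
print(is_significant(-0.68, 0.25), is_significant(-0.20, 0.25))  # True False
```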
Table 1. Linear correlations for accuracy and response
time for the correct diagrams: there are no significant
correlations (Rcrit = 0.19, α = 0.05, two-tailed)
Diagram Set    Correlation Coefficient
N0t             0.11
N1t            -0.07
N2t             0.13
N3t             0.08
N4t             0.08
N5t             0.15
Table 2. Linear correlations for accuracy and response
time for the incorrect diagrams: significant values are
marked with an asterisk (Rcrit = 0.25, α = 0.05, two-tailed)
Error operation             Variant (a)       Variant (b)
1: message no.s swapped     N0d1  -0.20       N1d   0.04
2: arrows reversed          N0d2  -0.68*      N2d  -0.20
3: nodes reversed           N0d3  -0.24       N3d  -0.03
4: messages exchanged       N0d4  -0.44*      N4d  -0.53*
5: messages exchanged       N0d5  -0.45*      N5d  -0.35*
3.3. Preference Data:
The subjects' preference results are shown in Figure
6. A two-tailed t-test showed a statistically significant
preference for variation (a) over variation (b) for all
five notations (p << 0.01).
3.3.1. Notation 1
Most subjects who preferred the (a) notation (simple
number sequencing) gave reasons like "easier to
follow", "clearer" and felt that the simple number
sequencing was more "natural" and therefore more
easily understood than the decimal number sequencing.
Those who preferred the (b) notation felt that the multi-
level approach was "more structured", "processes are
classified in an organised manner" and it helped them to
understand the steps involved (even though the simple
numbering was easier at a glance).
3.3.2. Notation 2
Subjects who preferred notation (a) thought the long
adjacent arrows were "clearer", "more indicative of the
flow", "easier to follow" and were less subject to
ambiguity, especially on sloping lines. Some went on to
say that it didn't really matter much but that they
preferred notation (a). Subjects preferring notation (b)
gave reasons like "less clutter", "looks neater" and
"from experience there is not a lot of room when doing
diagrams".
3.3.3. Notation 3
Most subjects preferred the (a) notation and their
rationale for this was "it is easier to understand instead
of looking at the words in the box then to realise that it
is trying to call it self", "it is more self-explanatory that
the operation is performed on the same object", and also
because it was more compact while still an adequate
representation of the same thing. Those preferring
notation (b) thought it was "a little bit clearer, less room
for error if any system classes exist" though some
subjects preferred notation (b) because "[in notation (a)]
I don't really know which way is the right direction, and
what happens if the loop is upside down?" These people
obviously misunderstood the semantics of this notation
because it does not matter which way the arrow points -
it is a loop, so the end result is the same.
3.3.4. Notation 4
The subjects tended to prefer the (a) notation
because it was "clearer", "easier to follow the arrow as
the 'label' doesn't break the arrow up", "more
standardised and neat", "found the breaking of the lines
distracting" and "arrows through words are misleading
and can clutter diagrams". Some subjects stated that
they liked both notations; one even went on to say they
liked "[notation (a)] if there is only one task happening.
[notation (b)] if multiple tasks can occur." One subject
thought the diagrams were the same and consequently
had no preference for either notation. Subjects who
preferred notation (b) explained their choice with "no
special reason, feel like it", "clear indication of arrow
belonging to process" and "doesn't make any difference
on this diagram, but when they are complex, [notation
(b)] could be easier to read".
3.3.5. Notation 5
The experimental subjects who preferred notation
(a) all seemed to follow the same theme, with responses
like "definitely [notation (a)] because when you have
paper you can turn it to read but when you are looking
at a screen that is impossible", "easier to read", "don't
need to turn head around" and "it is easier to read
horizontal text". The two subjects who preferred the (b)
notation said "compact and easier to follow
associations" and "this way the whole diagram looks a
lot more clear, especially for beginners", while one
subject did not prefer either notation.
4. Analysis
In performing this task, the subjects would be
looking for mistakes in each diagram, and, as soon as
they identified a single mistake, would press the N key.
The accuracy data therefore needs to be analysed with
the following points in mind:
o We do not know the reason why subjects have
rejected diagrams: it may be because they indeed
have identified the planted errors relating to the
notation in question, in which case, the rejection is
valid.
o If subjects have a misunderstanding about one of
the variations of a notation, then they may identify
correct diagrams (incorrectly) as incorrect (by
identifying what they believe to be an error), and
incorrect diagrams (correctly) as incorrect (but for
the wrong reason: an invalid rejection).
o As the (a) notations represented those variations
that we considered to be intuitively the simpler,
invalid rejections may be more likely to occur with
the (b) notation, as this was the more complex.
o Invalid rejections relating to the (b) variation of any
one notation would not affect the results for the
other notations: the diagrams for each notation
include their own variations, but use the (a)
notation otherwise.
Unlike the results from the UML class diagrams
experiment [14], there was no significant accuracy data
for the task of identifying the incorrect diagrams, so the
possibility of invalid rejections did not need to be
considered in the analysis.
4.1. Accuracy
All notations except notation 4 produced significant
data that indicated that the (a) variant produced better
accuracy, but this was only the case for the task of
identifying the correct diagrams. As the (b) variation is
the more complex, we had thought that subjects might
have been more vigilant in identifying the errors when
the (b) notation was used (as appeared to be the case in
the UML class diagrams experiment [14]), but there
was no significant accuracy data for the task of
identifying errors for any of the notations.
While the accuracy data for notation 4 looks as
though it ought to be significant for the task of
identifying correct diagrams (as the respective means
are the same as those for notation 5), on closer
inspection, we observed that there was a difference in
the standard deviations (notation 4: 0.85; notation 5:
0.92). It is this difference that means that the accuracy
data for notation 4 is not significant (p=0.10).
The accuracy for the incorrect diagrams is
noticeably less than that of the correct diagrams (in all
but notation 2), indicating that subjects did not notice
many of the planted errors. However, there was no
significant difference in the performance produced by
the (a) and (b) variants for any notation for this task.
4.2. Time
Only notation 1 produced any significant data with
respect to response time (favouring notation (a)),
indicating that subjects spent no longer on interpreting
the other four (b) notations than they did on interpreting
the respective (a) diagrams. This significant result was
related to the task of identifying the correct diagrams,
although the time data for the task of noticing errors
approaches significance, also in favour of the (a)
notation over the (b) notation.
4.3. Time/accuracy correlations
There is a negative correlation between time and
accuracy for the task of identifying incorrect diagrams
for notations 2, 4 and 5, which means that independent
analyses of the time and accuracy data as two separate
measures of understanding may be inappropriate. None
of the t-test analyses for these notations for the task of
identifying incorrect diagrams were significant.
For notation 2, subjects responded quickly, with
high accuracy, on the incorrect diagrams that used all
the (a) notations, and which had an error introduced
which reversed the direction of arrows adjacent to the
links. This indicates that the reversal of these arrows
was very easy to perceive.
For notations 4 and 5, in both the (a) and (b) cases,
subjects took a long time to make an erroneous choice:
thus, although they looked at the diagrams thoroughly,
the introduced errors of exchanging two of the messages
were not easy to perceive. This may have been due to an
experimental oversight, whereby the textual description
of the process was not as clear to the subjects as we had
thought, or the subjects had been given insufficient time
to comprehend it fully.
4.4. Preference
The preference results were markedly in favour of
the (a) variations for all notations, in keeping with our
personal intuitive predictions.
5. Conclusions
The results concur with our intuitions that the (a)
notations would produce better performance than the (b)
notations, except in the case of notation 4, when it does
not matter whether the directional arc is broken by text
or not. The form of sequence numbering, the length of
the directional arrows, the representation of a self-loop,
and the direction of the text do matter when the task is
determining whether a specification matches a diagram
or not. The identification of errors is not supported any
better by any one of the notational variants.
While it is gratifying that our informal predictions
have been validated, the aim of this paper is wider than
merely confirming our informal judgements. Our aim
has been to raise the issue of the lack of consistency
between texts, to propose a methodology for attempting to
determine the most appropriate variation from the
perspective of human comprehension, and to make
some preliminary suggestions.
The experiment has been run in a formal empirical
manner, and is therefore subject to the constraints of
such a methodology: constraints on the nature of the
subjects, application domain and tasks. Case study
investigations including, for example, the use of UML
in a real world industrial application, or the learning of
UML by students over an entire semester, would give
greater insight into the suitability of the different
notational variations from a human comprehension
point of view. Such experiments may, in particular,
relate the effectiveness of the variations to the
task for which UML is being used. It may be the case,
for example, that some variations are preferable for
learning UML but need to be adapted for real
world use on a multi-person project. In addition, the
conventions and standards of different organisations
may require use of one notation over another.
In depicting UML diagrams in CASE tools and
UML texts, choices need to be made regarding which
notation to use between semantically equivalent
variations within the UML standard. Choosing the
variation that most supports the users' comprehension
can only enhance the value of the tool or text: empirical
studies can assist in determining which of the variations
are more suitable.
6. Acknowledgements
We are grateful to the students of the School of
Information Technology and Electrical Engineering at
the University of Queensland who willingly took part in
the experiment, to members of the Distributed Systems
Technology Centre at the University of Queensland for
their advice, and to the Australian Research Council,
which funded this research. Ethical clearance for this
study was granted by The University of Queensland,
2001.
References
[1] Blackwell, A.F. & Green, T.R.G. A Cognitive
Dimensions questionnaire optimised for users. In A.F.
Blackwell & E. Bilotta (Eds.) Proceedings of the
Twelfth Annual Meeting of the Psychology of
Programming Interest Group, Memoria, pp. 137-152,
2000.
[2] Booch, G., Jacobson, I. & Rumbaugh, J., The Unified
Modeling Language User Guide, Addison Wesley
Longman, Inc., Reading, Mass, 1999.
[3] Cox, K., Cognitive Dimensions of Use Cases -
Feedback from a student questionnaire. In A.F.
Blackwell & E. Bilotta (Eds.) Proceedings of the
Twelfth Annual Meeting of the Psychology of
Programming Interest Group, Memoria, pp. 99-121,
2000.
[4] Fowler, M. & Scott, K., UML Distilled, (2nd ed),
Addison Wesley Longman, Inc., Reading, Mass, 2000.
[5] Green, T. R. G. & Petre, M. Usability analysis of visual
programming environments: a 'cognitive dimensions'
framework. Journal of Visual Languages and
Computing, 7:131-174, 1996.
[6] Gurr, C. & Stevens, P., A cognitively informed approach
to describing product lines in UML, Human
Communication Research Centre, University of
Edinburgh, unpublished, 1999.
[7] Gurr, C. & Tourlas, K., Towards the Principled Design
of Software Engineering Diagrams, In Proceedings of
22nd International Conference on Software Engineering,
ACM Press, pp. 509-518, 2000.
[8] Kutar, M., Britton, C. & Wilson, J., Cognitive
Dimensions: An Experience Report. In A.F. Blackwell &
E. Bilotta (Eds.) Proceedings of the Twelfth Annual
Meeting of the Psychology of Programming Interest
Group, Memoria, pp. 81-98, 2000.
[9] Larman, C., Applying UML and Patterns, Prentice Hall,
Inc., Upper Saddle River, NJ, 1998.
[10] Maciaszek, L. A., Requirements Analysis and System
Design, Pearson Education Ltd, Harlow, England, 2001.
[11] Page-Jones, M., Fundamentals of Object-Oriented
Design in UML, Dorset House Pub, New York, 2000.
[12] Purchase, H.C., Allder, J-A. & Carrington, D., User
preference of Graph Layout Aesthetics: a UML study,
In J. Marks (Ed.) Proceedings of the Graph Drawing
Symposium, Lecture Notes in Computer Science 1984,
pp. 5-18, Springer Verlag, 2000.
[13] Purchase, H.C., Carrington, D. & Allder, J-A.,
Experimenting with aesthetics-based graph layout. In M.
Anderson, P. Cheng & V. Haarslev (Eds.) Proceedings
of the Theory and Application of Diagrams Conference,
Lecture Notes in Artificial Intelligence 1889, pp. 498-
501, Springer Verlag, 2000.
[14] Purchase, H.C., Colpoys, L., McGill, M., Carrington, D.
& Britton, C. UML class diagram syntax: an empirical
study of comprehension. Proceedings of the Australian
Information Visualisation Symposium, Eades, P. and
Pattison, T. (eds), Australian Computer Society, pp. 113-
120, 2001.
[15] Rumbaugh, J., Jacobson, I. & Booch, G., The Unified
Modeling Language Reference Manual, Addison
Wesley Longman, Inc., Reading, Mass, 1999.
[16] Schmuller, J., SAMS Teach Yourself UML in 24 hours,
SAMS, Indianapolis, 1999.