Empirical Study of Software Developers’ Experiences
R. Kline and A. Seffah
Human-Centered Software Engineering Group
Psychology Department and Computer Science Department, Concordia University
1455 Maisonneuve Blvd. W., Montréal, Québec, H3G 1M8, Canada
E-mail: {r_kline, seffah}@cs.concordia.ca
Abstract
There is evidence that CASE tools do not completely meet
the goals of their diverse users. Part of the problem seems
to be a gap between how programs are represented and
manipulated in the user interfaces of CASE tools and the
experiences of software developers, including maintainers,
testers, and programmers. The empirical study presented
in this paper uses two different methods to measure the
experiences of skilled and novice developers with the same
CASE tool for C++: a heuristic evaluation conducted with
the experienced developers and a psychometric evaluation
conducted with both groups. The results indicate that
experienced and inexperienced developers reported similar
kinds of problems, including poor program learnability,
difficulties with the visibility and usefulness of program
functionalities, and ambiguous error and help messages.
These findings are discussed in relation to other empirical
results about developers' experiences with CASE tools.
1. Introduction
Integrated CASE tools are intended to support all phases
of the software development lifecycle, including coding,
implementation, and maintenance. Unfortunately, there is
evidence that these integrated development environments
do not always fulfill this goal. For example, Iivari [1]
found that the best single negative predictor of CASE tool
use is perceived voluntariness: if developers perceive that
use of a CASE tool is voluntary, they tend not to use it.
Results of other surveys by Seffah and Rilling [2] and
Jarzabek and Huang [3] suggest that even highly
experienced developers often use only a small fraction of
a CASE tool's capabilities. Either developers are unaware
that these functionalities exist or they do not know how to
apply them in a different context. For example, a developer
may know how to use a visual debugger correctly during
development, but not how to use it during maintenance to
understand the structure of the application from its
execution graph.
Developers in the surveys cited above also described
CASE tools as relatively difficult to learn, and they cited
numerous problems with the user interfaces (UIs) of CASE
tools. Included among these problems are unclear error
messages and a non-intuitive organization of icons, menus,
or toolbars (i.e., low affordance). They also reported that
graphical representations intended to decrease memory
load often have just the opposite effect.
Findings like those just summarized have been interpreted
as evidence for a cognitive gap between developers'
experiences and the way the user interaction of CASE
tools is organized [1-3]. Understanding developers'
experiences with extant tools is an important step towards
developing more human-centric CASE tools. The obvious
question we address here is how developer experiences
can be quantified or otherwise assessed. In what follows,
we present a measurement-oriented framework for
understanding and quantifying developer experience. As
an example, we consider the case of a widely used CASE
tool, Microsoft Visual C++ 6.0.
2. Measuring Developer Experiences
There are several different methods to measure developer
experiences with CASE tools, including ethnographic
interviews, laboratory observation, remote testing,
measurement of formal usability metrics, psychometric
evaluation, and heuristic evaluation. Interview methods
can be rather subjective, and laboratory or remote testing
can be expensive or difficult to implement. Heuristic
evaluation and psychometric evaluation both have the
advantage of being relatively low cost. In a heuristic
evaluation, about 5-10 experienced users evaluate a
program's user interface against a set of accepted
principles, or heuristics [4]. Early lists of heuristics were
lengthy and rather difficult to apply, but Nielsen [4]
proposed a relatively small set, which we specialized for
use in this study (see Table 1).
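Tallying the output of such an evaluation is simple enough to script. The following sketch is our own illustration rather than part of the study procedure: it assumes hypothetical evaluator reports in which each reported problem is tagged with one of the heuristics from Table 1, and it counts how many evaluators flagged at least one problem under each heuristic.

from collections import defaultdict

# Heuristics from Table 1 (names only; definitions omitted).
HEURISTICS = [
    "Visibility", "Real World Match", "Consistency", "Error Handling",
    "Recognition, not Recall", "Flexibility", "Minimalist Design",
    "Relevant Help",
]

# Hypothetical reports: one list of (heuristic, comment) pairs per evaluator.
reports = [
    [("Error Handling", "Compiler error message is ambiguous"),
     ("Visibility", "No feedback while the project rebuilds")],
    [("Error Handling", "No hint on how to recover from a failed build")],
    [("Real World Match", "Dialog uses unexplained technical terminology")],
]

def evaluators_per_heuristic(reports):
    """Count how many evaluators flagged at least one problem per heuristic."""
    counts = defaultdict(int)
    for report in reports:
        for heuristic in {h for h, _ in report}:  # count each evaluator once
            counts[heuristic] += 1
    return {h: counts[h] for h in HEURISTICS}

for heuristic, n in evaluators_per_heuristic(reports).items():
    print(f"{heuristic}: flagged by {n} of {len(reports)} evaluators")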
Psychometric evaluation, in which developers complete
objective questionnaires, offers a more systematic way to
measure and compare experiences across different
developers and CASE tools. The questionnaire selected
for this study, the Software Usability Measurement
Inventory (SUMI; [6]), facilitates these ends through its
multi-factor format, generally excellent psychometric
characteristics, and relatively large normative sample. The
50-item SUMI measures five different usability areas
(affect, i.e., whether users like the program; program
helpfulness; learnability; efficiency; and control, i.e.,
whether users feel they know what the program is doing),
plus overall satisfaction. Ratings of individual developers
can be compared against those in the SUMI's normative
sample in a metric where a score of 50 is average, the
standard deviation is 10, and higher scores indicate greater
satisfaction. The SUMI scoring program also identifies
individual items where the respondents report statistically
lower levels of satisfaction. Inspection of these critical
items may identify specific kinds of usability problems.
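The scale metric just described is a standard linear transformation to T-scores. As an illustration only (the normative means and standard deviations below are hypothetical placeholders, since the actual SUMI norms belong to its scoring program), a developer's raw scale scores could be converted and screened like this:

# Convert raw SUMI-style scale scores to the normative metric described above
# (mean 50, SD 10). The norms here are hypothetical placeholders, not the
# real SUMI normative values.
NORMS = {  # scale: (normative mean, normative SD) -- hypothetical
    "affect": (30.0, 6.0),
    "helpfulness": (28.0, 5.5),
    "learnability": (27.0, 5.0),
    "efficiency": (29.0, 6.5),
    "control": (28.5, 5.5),
}

def to_t_score(scale, raw):
    """Standardize a raw scale score so that 50 is average and the SD is 10."""
    mean, sd = NORMS[scale]
    return 50.0 + 10.0 * (raw - mean) / sd

def profile(raw_scores, low_cutoff=40.0):
    """Return T-scores and flag scales more than one SD below the norm."""
    t = {scale: to_t_score(scale, raw) for scale, raw in raw_scores.items()}
    flags = [scale for scale, score in t.items() if score < low_cutoff]
    return t, flags

t_scores, flagged = profile({"affect": 31, "helpfulness": 27,
                             "learnability": 21, "efficiency": 30,
                             "control": 29})
print(t_scores)   # learnability comes out at 38 with these inputs
print(flagged)    # ['learnability']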
Heuristic                  Definition
Visibility                 Informs the user about system status.
Real World Match           Speaks the developer's language.
Consistency                The user feels in control of what they are
                           developing (the program), not just of the
                           CASE tool.
Error Handling             Clear error messages and recovery.
Recognition, not Recall    Minimizes memory load.
Flexibility                Suits users of varying experience.
Minimalist Design          Shows only relevant information.
Relevant Help              Easy to search and find information.

Table 1. Definitions of heuristics (adapted from [5]).
Unlike heuristic evaluation, small samples are inadequate
for psychometric methods, because results from small
samples tend to be statistically unstable due to sampling
error. As a rule of thumb, there should be at least one
subject (user) for each item on a questionnaire. The SUMI
has 50 items; thus, a sample of at least 50 subjects is
needed in order to place much confidence in the stability
of the results. Psychometric evaluation is also more
suitable for novice developers. For these reasons, we used
both methods with a group of experienced developers and
only psychometric evaluation with a larger sample of
novice developers.
3. Method
3.1 Participants
A total of seven experienced developers participated in
the heuristic evaluation of a widely-used CASE tool for
C++. These same participants also completed the SUMI
questionnaire about this CASE tool. Most of these users
had similar backgrounds and demographic characteristics.
All but one were male, most were in their late twenties or
early thirties, and they had on average about 10 years of
general experience with computers and about 5 years of
professional programming experience. All reported that
they were very familiar with the C++ CASE tool they
rated.
The inexperienced developers made up a comparison
group. These 54 individuals were enrolled in an
undergraduate computer science program (64% male,
36% female; average age = 27.3 years), were taking an
advanced course in C++ programming, and used the same
CASE tool rated by the experienced developers. A total of
75% of the students reported using computers for over 3
years, 18% for 1-3 years, and only about 6% for less than
6 months. Students reported using the C++ CASE tool an
average of 3-10 hours a week.
3.2 Procedure
The experienced developers completed a rating form with
detailed definitions of the heuristics listed in Table 1. The
rating form also solicited comments about program
usability in each area and about usability problems not
covered by any of the heuristics. The experienced users
completed the SUMI at the same time. All questionnaires
were completed anonymously. The inexperienced
developers also completed the SUMI anonymously. Their
participation in this research project was voluntary and
had no bearing on their course standing.
4. Results
4.1 Heuristic evaluation
Review of the comments by the experienced developers
generally indicated few usability problems for the control,
flexibility, and minimalist design heuristics (Table 1).
Other areas did not fare so well. A notable example is
error handling, for which many problems were noted,
such as ambiguous messages and poor representation of
system errors. Software developers using the C++ CASE
tool for testing also generally complained of insufficient
information about recovering from errors. Concerns about
inadequate feedback from the C++ CASE tool were also
noted under the visibility heuristic. Other concerns were
expressed about the CASE tool's real world match: some
of its dialogs use specific technical terminology or are
ambiguously worded.
4.2 Psychometric evaluation
4.2.1 Experienced developers. Presented in Figure 1 is
the median SUMI satisfaction profile for the experienced
developers. This profile suggests about average overall
satisfaction with the C++ CASE tool. Specifically, the
experienced developers' descriptions of the efficiency,
helpfulness, and sense of control in using the tool are all
about average compared to the SUMI's normative sample.
However, they expressed below average satisfaction with
the program's learnability; that is, they described it as
difficult to learn. The content of critical SUMI items was
generally consistent with problems identified in the
heuristic evaluation by the same experienced developers.
Figure 1. Satisfaction profile for experienced
developers.
4.2.2 Inexperienced developers. Presented in Figure 2
is the median SUMI satisfaction profile for the
inexperienced developers. As a group, they expressed less
satisfaction with the C++ CASE tool than the experienced
developers. Specifically, the only median scores for the
inexperienced developers that were about average are in
the areas of affect and program helpfulness. However,
their below average score of about 40 on the learnability
scale is comparable to that of the experienced developers.
Thus, both groups described the C++ CASE tool as
relatively difficult to learn. The inexperienced developers
also reported additional problems in the areas of global
satisfaction, control, and efficiency. Results of
nonparametric statistical tests paint a similar picture: the
group medians do not differ statistically on the affect,
helpfulness, and learnability scales at the .05 level, but
they do on the rest of the scales.
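The paper does not name the specific nonparametric tests used; a common choice for comparing two independent groups on ordinal scale scores is the Mann-Whitney U test. As a hypothetical illustration only (the scores below are made up, not the study's data), such a comparison could be run as follows:

# Hypothetical comparison of the two groups on a SUMI scale using the
# Mann-Whitney U test; the original paper does not name its exact tests,
# and these scores are invented for illustration.
from scipy.stats import mannwhitneyu

experienced = {"learnability": [41, 38, 44, 39, 42, 40, 37]}
inexperienced = {"learnability": [40, 36, 43, 38, 41, 39, 42, 35, 44, 37]}

for scale in experienced:
    stat, p = mannwhitneyu(experienced[scale], inexperienced[scale],
                           alternative="two-sided")
    verdict = "differ" if p < 0.05 else "do not differ"
    print(f"{scale}: U = {stat:.1f}, p = {p:.3f} -> medians {verdict} at .05")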
Specific usability problems identified by SUMI critical
items among the inexperienced users include the lack of
helpful information on the screen when it is needed,
difficulties learning new functions, trouble moving data in
and out of the program, and disruption of a preferred way
of working.

Figure 2. Satisfaction profile for inexperienced
developers.
5. Discussion
It is not surprising that the experienced developers
reported greater overall satisfaction with the CASE tool
for C++ than the inexperienced developers did. Of greater
interest are the areas of agreement: both groups described
the CASE tool as difficult to learn and mentioned concerns
about the lack of understandable on-screen information.
These points of agreement are also consistent with the
results of other studies of programmer experiences with
CASE tools [1, 2, 3], which suggests that the findings of
this study may not be specific to the particular C++ CASE
tool evaluated here. It is also of interest that two very
different measurement methods, heuristic evaluation and
psychometric assessment, identified quite similar kinds of
user experiences. This suggests convergent validity: the
usability of a particular CASE tool is robust enough to be
evaluated in more than one way.

We believe that these results provide additional evidence
for a conceptual gap between the software engineers who
develop CASE tools and the software engineers who use
them. This study is also part of an ongoing effort to better
appreciate developers' experiences with extant software
engineering tools used in program development and
maintenance.
6. References
[1] Iivari, J., "Why are CASE Tools not Used?,"
Communications of the ACM, 39 (10), 1996, 94-103.

[2] Seffah, A., and Rilling, J., "Investigating the
Relationship between Usability and Conceptual Gaps for
Human-Centric CASE Tools," IEEE Symposium on
Human-Centric Computing Languages and Environments,
Stresa, Italy, September 5-7, 2001.

[3] Jarzabek, S., and Huang, R., "The Case for User-
Centered CASE Tools," Communications of the ACM, 41
(8), 1998, 93-99.

[4] Nielsen, J., "Heuristic Evaluation," in J. Nielsen and
R. L. Mack (Eds.), Usability Inspection Methods, John
Wiley, New York, 1994.

[6] Kirakowski, J., and Corbett, M., "SUMI: The
Software Usability Measurement Inventory," British
Journal of Educational Technology, 24 (5), 1993, 210-212.