How to Master Challenges in Experimental Evaluation of 2D versus 3D Software Visualizations

Richard Müller, Information Systems Institute, University of Leipzig (rmueller@wifa.uni-leipzig.de)
Pascal Kovacs, Information Systems Institute, University of Leipzig (kovacs@wifa.uni-leipzig.de)
Jan Schilbach, Information Systems Institute, University of Leipzig (schilbach@wifa.uni-leipzig.de)
Dirk Zeckzer, Institute of Computer Science, University of Leipzig (zeckzer@informatik.uni-leipzig.de)
ABSTRACT
Software visualizations in 3D and virtual reality are an interesting and debated research topic in academia. However, the benefits and drawbacks of 3D software visualizations in immersive environments compared to their 2D counterparts are not well understood due to the lack of empirical evaluations. The challenge is to plan valid experiments with analogous 2D and 3D visualization techniques while avoiding various influence factors and minimizing the threats to validity. In this paper, we present an experiment as part of a series using a structured approach to meet these challenges.
Index Terms: Information Interfaces and Presentation [H.5.1]: Multimedia Information Systems—Evaluation/methodology; Computer Graphics [I.3.7]: Three-Dimensional Graphics and Realism—Virtual reality
1 INTRODUCTION
Performing controlled experiments in software visualization while minimizing the threats to validity is hard to accomplish. There are many influence factors, such as the user, the task, the visualized software artifact, its representation with the corresponding navigation and interaction techniques, as well as the implementation.
Furthermore, a single experiment is not sufficient to derive general statements about the benefits or drawbacks of visualizations. Rather, a series of experiments is needed [9]. Thus, selected influence factors should be varied across experiments, while the remaining factors are kept constant or their influence is measured.
This paper makes two contributions. (1) We present an experiment as part of such a series to answer the question: Does the additional dimension in inherent 3D [17] software visualizations lead to advantages in solving software engineering tasks? (2) We explain how we met the challenge of controlling the influence factors when comparing 2D vs. 3D software visualizations.
2 RELATED WORK
To control the different influence factors and to minimize the threats to validity during an experiment and over the whole series, we used a structured approach for planning and conducting controlled experiments in software visualization [12]. The approach is based on the extended process model for the design and validation of visualizations by Munzner et al. [13, 10].
Further, we considered the lessons learned from other experiments in software visualization [15] as well as hints, guidelines, and frameworks [16, 5, 20].
Important prior work on controlled experiments comparing 2D and 3D information and software visualizations exists [18, 6, 19]. However, none of these experiments was performed as part of a series. Therefore, deducing general statements about the advantages and disadvantages of the third dimension in software visualization remains difficult.
3 THE EXPERIMENT IN A NUTSHELL
In the experiment, we investigated the comprehension of a medium-sized software system, focusing on three main research questions:
1. Does an inherent 3D software visualization reduce the time to solve a task, compared to a 2D software visualization?
2. Does an inherent 3D software visualization increase the correctness of the solution of a task, compared to a 2D software visualization?
3. Does an inherent 3D software visualization require more interaction to solve a task, compared to a 2D software visualization?
From these research questions, we derived the directed hypotheses given in Table 1. All hypotheses refer to a software comprehension task on medium-sized software systems.
Our research questions and hypotheses aim at comparing 2D versus 3D software visualizations. Therefore, the dimensionality of the software visualization is the independent variable to be varied. To verify our hypotheses, we measure the following dependent variables in our experiment: the time a participant needs to complete a task, the correctness of the participant's solution, and the click time the participant spends on interaction.
In order to measure the presumed effect, the random sample was divided into two groups. Both groups had to solve the same tasks, but the control group used a 2D visualization and the experimental group used an inherent 3D visualization. We applied a between-subjects design, i.e., every participant was a member of only one group.
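As an illustration of this between-subjects assignment, the following sketch randomly splits participants into the 2D control group and the 3D experimental group. This is a minimal sketch only; the participant IDs and the fixed random seed are hypothetical and not taken from the experiment.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Between-subjects design: each participant is randomly placed in exactly
// one group (2D control or 3D experimental).
public class GroupAssignment {
    public static void main(String[] args) {
        List<String> participants = new ArrayList<>();
        for (int i = 1; i <= 18; i++) {          // sample size of 18, as in the experiment
            participants.add("P" + i);
        }
        Collections.shuffle(participants, new Random(42)); // hypothetical fixed seed
        int half = participants.size() / 2;
        List<String> controlGroup2D = participants.subList(0, half);
        List<String> experimentalGroup3D = participants.subList(half, participants.size());
        System.out.println("2D control group:      " + controlGroup2D);
        System.out.println("3D experimental group: " + experimentalGroup3D);
    }
}
```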
4 CONTROLLING THE INFLUENCE FACTORS
Next, we describe how we controlled the influence factors in the experiment. Table 2 shows all considered factors and how we controlled each of them, i.e., whether we held it constant, varied it, or measured it.
| Alternative Hypothesis | Null Hypothesis |
|---|---|
| H1_1: The third dimension decreases the time to solve a task. | H1_0: The third dimension does not decrease the time to solve a task. |
| H2_1: The third dimension increases the correctness of solved tasks. | H2_0: The third dimension does not increase the correctness of solved tasks. |
| H3_1: The third dimension increases the interaction required to solve a task. | H3_0: The third dimension does not increase the interaction required to solve a task. |

Table 1: Directed hypotheses operationalizing the research questions.
| Factor / Sub-Factor | Characteristic | Control |
|---|---|---|
| User | | |
| Role | Developer, Maintainer | measured |
| Background | 20-40 years | measured |
| | Male, Female | measured |
| | Color Blindness | measured |
| | Ability of Stereoscopic Viewing | measured |
| Knowledge | Bachelor, Master, PhD, Post Doc | measured |
| | Programming Experience | measured |
| | Domain Knowledge | measured |
| Circumstances | Occupation | measured |
| | Familiarity with Study Object | measured |
| | Familiarity with Study Tools | measured |
| Task | | |
| Problem | T1: Find a method, T2: Identify dependencies | constant |
| Operation | Retrieve Value, Filter, Find Extremum | constant |
| Software Artifact | | |
| Type | Source Code: Java | constant |
| Size | Medium: 200K LOC | constant |
| Aspect | Structure | constant |
| Representation | | |
| Dimensionality | 2D, Inherent 3D | varied |
| Technique | Nested Node-Link | constant |
| Navigation & Interaction | | |
| Technique | Overview, Zoom, Details-on-Demand, Relate | constant |
| Input | Tablet | constant |
| Output | Virtual Reality Environment | constant |
| Implementation | | |
| Algorithm | Force-Directed Layout | constant |
| Platform Dependence | Platform Independent | constant |
| Automation | Full | constant |
| Data Abstraction | Famix | constant |

Table 2: Instance of the model: comparing a 2D vs. an inherent 3D software visualization [12].
4.1 User
Attributes of the user that might influence the results are color blindness, the ability of stereoscopic viewing, and individual experience in software development, software visualization, virtual reality, and using a tablet. Furthermore, the frequency of activities such as playing 3D games, watching 3D movies in the cinema or on TV, and 3D modeling might also have an effect. In order to check whether these factors were distributed approximately equally between both groups, we collected the necessary data and included it in our analysis.
Furthermore, the user experience was measured with a questionnaire. The participants were presented with pairs of words, each pair representing contrasting judgments of the visualization. This assessment was derived from the AttrakDiff questionnaire [8]. Finally, we asked for positive and negative aspects of the visualizations and for suggested improvements.
4.2 Task
A typical scenario in software development and re-engineering is the identification of dependencies in complex software systems in order to implement new features or to refactor code. The advantage of such tasks is that a deep understanding of the software is not necessary, so the training of the participants has less effect on the results. Thus, the effort to train the participants and the effort needed for others to review the experiment are reduced to a minimum [7].
Nevertheless, the participants must have at least some knowledge about the system to solve the tasks properly. Hence, the first task of the experiment was to identify a single method using the visualization, which gave a first insight into the visualized system. The task was solved if the participant identified the correct method. In the second task, the participant had to identify six dependencies of this method on other methods and attributes. To solve the second task, the participant had to identify all six dependencies; every missing or wrongly identified dependency was rated as a single mistake. The visualization for the first task did not contain any dependencies, to prevent participants from solving the second task at the same time. An additional benefit of this combination of tasks is that the participants did not have to search for the starting point of the second task, so that this search had no influence on the result of identifying the dependencies.
4.3 Software Artifact
The analyzed software system is the Apache Tomcat Project [1], a good example of a medium-sized, freely available software system. To keep the visualization manageable, the visualized code was limited to three larger packages, selected according to the number of contained classes, methods, attributes, and relations. Presenting only a subset of the whole system imitates a zoom interaction by the user and limits the time needed per participant.
4.4 Representation
A suitable representation technique of the software artifact for solving the two tasks is a nested node-link visualization. This choice has the advantage that corresponding shapes for 2D and 3D visualizations exist, e.g., a rectangle in 2D corresponds to a cuboid in 3D. Additionally, the containment relation, typical for the structure of software systems, was realized by nested elements. Packages, classes, methods, and attributes were mapped to nodes, while invocation relations were mapped to edges. The complete mapping is given in Table 3.
| Entity | 2D | 3D | Color |
|---|---|---|---|
| Package | Rectangle | Cuboid | Pink |
| Class | Rectangle | Cuboid | Red |
| Method | Ellipse | Sphere | Blue |
| Attribute | Ellipse | Sphere | Green |
| Method call | Line | Tube | Blue |
| Attribute call | Line | Tube | Green |

Table 3: Mapping of software entities in 2D and 3D.
4.5 Navigation & Interaction
The device for interacting with the visualization was a tablet (ODYS Xelio, Android 4) with a touch screen that communicated with the controlling computer via WLAN. To minimize the effect of the interaction on the outcome of the experiment, a customized interface was implemented as an app for the tablet, which provides similar interactions for both the 2D and the 3D visualizations using the touch screen. Figure 1 shows the graphical user interface for controlling 3D visualizations, containing buttons for moving left, right, up, and down, zooming in and out, rotating around the x-, y-, and z-axes, and resetting the position of the visualization. For 2D visualizations, the buttons for rotation were omitted, which was the only difference in interaction between the 2D and the 3D visualization. The app controls the user's movement in the scene via standard HTTP GET requests sent to the web server of the InstantPlayer.
Figure 1: Mockups of the graphical user interface for 2D and 3D.
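To illustrate this control protocol, the following sketch issues such HTTP GET requests from Java. It is a sketch only: the server address, path, and parameter names are hypothetical placeholders and do not reflect InstantPlayer's actual request format.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Each button press on the tablet issues a plain HTTP GET request to the
// web server of the rendering machine, which then updates the viewpoint.
// NOTE: host, port, path, and parameter names are hypothetical placeholders.
public class ViewpointController {
    private static final String SERVER = "http://192.168.0.10:8080"; // hypothetical address

    static void sendCommand(String command) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(SERVER + "/control?action=" + command))
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(command + " -> HTTP " + response.statusCode());
    }

    public static void main(String[] args) throws Exception {
        sendCommand("zoomIn");   // available in both the 2D and the 3D interface
        sendCommand("rotateX");  // rotation buttons exist only in the 3D interface
        sendCommand("reset");
    }
}
```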
The experiment was performed in a virtual reality environment with a powerwall composed of three connected screens as the output device. The light in the lab was controlled by darkening all windows. The image for the wall is generated by two projectors per screen using the INFITEC method. Therefore, the participants had to wear special 3D glasses to receive an immersive view of the 3D visualization. To eliminate the influence of the glasses on the results, the participants using the 2D visualization also had to wear them.
4.6 Implementation
For the 3D visualizations, the layout was computed with the FDL tool [21], whereas the FDP tool of Graphviz [3] was used for the 2D visualizations. Both tools apply a force-directed layout algorithm. Extracts from the resulting visualizations for the second task are shown in Figure 2.
We used a generator to create the visualizations for the experiment automatically [11]. The generator utilizes the generative and the model-driven paradigm to process different input formats and transform them into different output formats with minimal configuration effort. The input for both visualizations was the source code of the Apache Tomcat Project parsed into a Famix model [14]. The output format of the generator was Extensible 3D (X3D) [2]. X3D served as the format for both visualizations and is platform independent.
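As a rough illustration of how the mapping in Table 3 translates to X3D, the following sketch emits hand-written X3D fragments for a class and one of its methods. This is a much-simplified sketch: the real generator [11] derives geometry from the Famix model, and all coordinates, sizes, and the transparency value here are invented for illustration.

```java
// Table 3 mapping, simplified: a class becomes a red cuboid (Box) and a
// method a blue sphere placed inside the cuboid's volume. The semi-
// transparent material keeps nested elements visible.
public class X3DSketch {
    static String node(String shape, String size, String color, double x, double y, double z) {
        return "<Transform translation='" + x + " " + y + " " + z + "'>\n"
             + "  <Shape>\n"
             + "    <" + shape + " " + size + "/>\n"
             + "    <Appearance><Material diffuseColor='" + color + "' transparency='0.4'/></Appearance>\n"
             + "  </Shape>\n"
             + "</Transform>";
    }

    public static void main(String[] args) {
        // Class -> red cuboid; method -> blue sphere nested inside it.
        System.out.println(node("Box", "size='4 2 4'", "1 0 0", 0, 0, 0));
        System.out.println(node("Sphere", "radius='0.5'", "0 0 1", 0.5, 0.5, 0.5));
    }
}
```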
The visualizations were rendered with the InstantPlayer [4]. It has wide coverage of the X3D standard and supports different output device configurations, including stereoscopic virtual reality environments. Additionally, it comes with a built-in web server that allows accessing and modifying the scene via standard HTTP GET requests.
5 RESULTS
We measured the influence of the independent variable dimensionality on our dependent variables time, correctness, and click time. Due to the sample size of 18, we applied the non-parametric Mann-Whitney U test to check our hypotheses. We chose a significance level of α = 0.05, corresponding to a 95% confidence level. Two observations had to be partially excluded from the results due to technical problems: one measurement of time and one of click time. Table 4 shows the detailed results of the experiment.
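For readers who want to reproduce this kind of analysis, a Mann-Whitney U test can be computed, for example, with Apache Commons Math. This is a sketch under assumptions: the library choice is ours, not the paper's, and the sample values below are invented placeholders rather than the measured data from Table 4.

```java
import org.apache.commons.math3.stat.inference.MannWhitneyUTest;

// The non-parametric Mann-Whitney U test compares the 2D and 3D groups
// without assuming normally distributed data, which suits small samples.
// Requires Apache Commons Math 3 on the classpath.
public class HypothesisTest {
    public static void main(String[] args) {
        double[] times2D = {49.4, 71.9, 55.0, 90.3, 61.2, 47.8, 82.5, 66.1, 58.9}; // placeholder data
        double[] times3D = {40.9, 52.3, 38.7, 61.0, 45.5, 49.9, 36.2, 44.8};       // placeholder data

        MannWhitneyUTest test = new MannWhitneyUTest();
        double p = test.mannWhitneyUTest(times2D, times3D); // two-sided p-value
        System.out.printf("p = %.3f -> %s at alpha = 0.05%n",
                p, p < 0.05 ? "reject H0" : "cannot reject H0");
    }
}
```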
For the first task, none of the differences in time, correctness, or amount of interaction was significant (p_time,1 = 0.271, p_corr,1 = 0.235, p_click,1 = 0.303). The results indicate, however, that the experimental group (3D) took less time (-18.08%), was more accurate (+28.57%), and also used less interaction (-10.59%).
Figure 2: Extracts from the 2D and 3D nested node-link visualizations (overview/zoom).

For the second task, time and interaction differed significantly. There was an increase in time of +42.29% from the control group to the experimental group and an increase in correctness of +14.28%. While the null hypotheses could not be rejected for time and correctness (p_corr,2 = 0.405), the null hypothesis for interaction could be rejected: in the second task, there was a significant increase of +111.41% in interactions from the control group to the experimental group (p_click,2 = 0.030). Thus, there is a strong indication that 3D does not decrease the time needed to analyze a software system, and there was no significant difference in correctness. The results of the experiment rather suggest the following conclusions:
Time: The third dimension does not decrease the time to solve a software comprehension task in medium-sized software systems. The decrease of 18.08% for the first task was not significant; only the increase of 42.29% for the second task was significant.
Correctness: The third dimension does not increase the correctness of a solved software comprehension task in medium-sized software systems. Even though the 3D group made fewer mistakes, this difference was not significant.
Click time: The third dimension increases the interaction required to solve a software comprehension task in medium-sized software systems. There was a significant increase of 111.41%.
User experience: All participants were included in the questionnaire regarding the user experience, including the formerly excluded outliers. The experimental group, i.e., the 3D group, rated the visualization slightly more positively. The majority of this group found the inherent 3D visualization more motivating, less demanding, more inventive, more innovative, and more clearly structured.
6 DISCUSSION
In this section, we discuss the results of the experiment and the
feasibility of the structured approach.
6.1 Experiment
The main objective of the experiment was to investigate the influence of the dimensionality of software visualizations on time, correctness, and interaction. Of the three hypotheses, only the effect on interaction was significant, and only in the second task. This effect can be explained by the increased number of degrees of freedom necessary for navigation in inherent 3D software visualizations: in addition to the left, right, up, down, zoom in/out, and reset controls, there are rotations around the x-, y-, and z-axes, in both directions each.
Due to the small sample size and the choice of very basic tasks, the results have to be interpreted with caution. It is possible that outliers distort the results.
Time [s]

| | Task 1, 2D | Task 1, 3D | Task 2, 2D | Task 2, 3D |
|---|---|---|---|---|
| n | 9 | 8 | 9 | 8 |
| min | 20.86 | 16.31 | 29.38 | 72.98 |
| max | 130.29 | 102.80 | 247.05 | 191.19 |
| median | 49.42 | 40.86 | 71.92 | 128.93 |
| mean | 59.09 | 48.40 | 89.33 | 127.11 |
| difference [%] | | -18.08 | | +42.29 |
| σ | 33.27 | 31.43 | 63.92 | 38.67 |

Correctness [%]

| | Task 1, 2D | Task 1, 3D | Task 2, 2D | Task 2, 3D |
|---|---|---|---|---|
| n | 9 | 9 | 9 | 9 |
| min | 0 | 100 | 0 | 50.02 |
| max | 100 | 100 | 100 | 100 |
| median | 100 | 100 | 100 | 100 |
| mean | 77.78 | 100 | 77.78 | 88.89 |
| difference [%] | | +28.57 | | +14.28 |
| σ | 44.10 | 0 | 39.96 | 18.63 |

Click Time [s]

| | Task 1, 2D | Task 1, 3D | Task 2, 2D | Task 2, 3D |
|---|---|---|---|---|
| n | 8 | 9 | 8 | 9 |
| min | 1.37 | 1.10 | 0.80 | 1.48 |
| max | 4.27 | 4.84 | 15.06 | 26.16 |
| median | 2.73 | 1.88 | 2.47 | 6.59 |
| mean | 2.77 | 2.48 | 3.88 | 8.21 |
| difference [%] | | -10.59 | | +111.41 |
| σ | 1.10 | 1.48 | 4.69 | 7.39 |

Table 4: Descriptive statistics of the dependent variables (difference [%] is the relative change from the 2D to the 3D group).
In future experiments, we aim at a minimum of 20 participants per group to obtain more reliable results. Additionally, the choice of tasks should be improved: in our experimental design, the tasks were not independent of each other, so if a participant made an error in the first task, the second task mostly resulted in an error, too. For this reason, all tasks should be mutually independent. Moreover, there should be more than two tasks in such an experiment. Finally, participants should be assigned to the groups before the experiment based on a pre-questionnaire in order to ensure an equal distribution between the groups from the beginning.
6.2 Approach
Nonetheless, the experiment shows that our approach is suitable for planning and conducting comparative experiments in the field of software visualization. The structured approach was helpful for determining the relevant influence factors. Further, we were able to control these factors either by holding them constant or by measuring them. Hence, we measured the relevant characteristics of the participants. We used the same tasks, the same software artifact, and the same virtual reality environment. Moreover, we applied the same visualization technique (nested node-link) as well as the same navigation and interaction technique (a custom interface), reducing the differences between both groups to a minimum. Finally, we ensured that both visualizations contained the same information by using the generator together with Famix and X3D.
7 CONCLUSION AND FUTURE WORK
In this paper, we presented an experiment examining the influence of the third dimension in software visualization. The underlying structured approach supported us in planning and conducting this experiment and in creating comparable empirical data for our series. We outlined different approaches to overcome specific challenges in comparing 2D vs. 3D software visualizations.
For the moment, we state that the third dimension does not show significant advantages in solving tasks for medium-sized software systems. However, as discussed in the previous section, these results have to be interpreted with caution.
In the future, we will take the findings from this experiment into account and plan to conduct further experiments to investigate other factors responsible for the advantages and disadvantages of software visualizations.
REFERENCES
[1] Apache Tomcat. http://tomcat.apache.org/. Accessed: 2014-09-18.
[2] Extensible 3D (X3D). http://www.web3d.org/x3d/. Accessed: 2014-09-18.
[3] Graphviz. http://www.graphviz.org/. Accessed: 2014-09-18.
[4] InstantReality. http://www.instantreality.org/. Accessed: 2014-09-18.
[5] S. Carpendale. Evaluating information visualizations. In A. Kerren, J. Stasko, J.-D. Fekete, and C. North, editors, Inf. Vis. Human-Centered Issues Perspect., pages 19–45, Berlin, Heidelberg, 2008. Springer.
[6] A. Cockburn and B. McKenzie. 3D or not 3D?: Evaluating the effect of the third dimension in a document management system. In Proc. SIGCHI Conf. Hum. Factors Comput. Syst., pages 434–441. ACM, Mar. 2001.
[7] M. Di Penta, R. E. K. Stirewalt, and E. Kraemer. Designing your next empirical study on program comprehension. In 15th Int. Conf. Progr. Compr., pages 281–285, 2007.
[8] M. Hassenzahl, A. Platz, M. Burmester, and K. Lehner. Hedonic and ergonomic quality aspects determine a software's appeal. In CHI, pages 201–208, 2000.
[9] P. Irani and C. Ware. Diagramming information structures using 3D perceptual primitives. ACM Trans. Comput.-Hum. Interact., 10(1):1–19, 2003.
[10] M. Meyer, M. Sedlmair, and T. Munzner. The four-level nested model revisited: Blocks and guidelines. In Work. BEyond Time Errors: Nov. Eval. Methods Inf. Vis., pages 1–6, 2012.
[11] R. Müller, P. Kovacs, J. Schilbach, and U. Eisenecker. Generative software visualization: Automatic generation of user-specific visualizations. In Proc. Int. Work. Digit. Eng., pages 45–49, Magdeburg, Germany, 2011.
[12] R. Müller, P. Kovacs, J. Schilbach, U. Eisenecker, D. Zeckzer, and G. Scheuermann. A structured approach for conducting a series of controlled experiments in software visualization. In Proc. 5th Int. Conf. Vis. Theory Appl., pages 204–209, Lisbon, Portugal, 2014.
[13] T. Munzner. A nested model for visualization design and validation. IEEE Trans. Vis. Comput. Graph., 15(6):921–928, 2009.
[14] O. Nierstrasz, S. Ducasse, and T. Gîrba. The story of Moose: An agile reengineering environment. In Proc. 10th Eur. Softw. Eng. Conf. held jointly with 13th SIGSOFT Int. Symp. Found. Softw. Eng., pages 1–10, New York, USA, 2005. ACM.
[15] M. Sensalire, P. Ogao, and A. Telea. Evaluation of software visualization tools: Lessons learned. In 5th Int. Work. Vis. Softw. Underst. Anal., pages 19–26. IEEE, 2009.
[16] D. I. K. Sjøberg, T. Dybå, and M. Jørgensen. The future of empirical methods in software engineering research. In Fut. Softw. Eng., pages 358–378. IEEE, May 2007.
[17] J. Stasko and J. Wehrli. Three-dimensional computation visualization. In Proc. 1993 IEEE Symp. Vis. Lang., pages 100–107, 1993.
[18] C. Ware and G. Franck. Viewing a graph in a virtual reality display is three times as good as a 2D diagram. In IEEE Symp. Vis. Lang., pages 182–183, 1994.
[19] R. Wettel, M. Lanza, and R. Robbes. Software systems as cities: A controlled experiment. In Proc. 33rd Int. Conf. Softw. Eng., pages 551–560, Waikiki, Honolulu, USA, 2011. ACM.
[20] C. Wohlin, P. Runeson, M. Höst, M. C. Ohlsson, B. Regnell, and A. Wesslén. Experimentation in Software Engineering. Springer, 2012.
[21] D. Zeckzer, R. Kalcklösch, L. Schröder, H. Hagen, and T. Klein. Analyzing the reliability of communication between software entities using a 3D visualization of clustered graphs. In Proc. 4th ACM Symp. Softw. Vis., pages 37–46, New York, USA, Sept. 2008. ACM.