VISUFLOW: a Debugging Environment for Static Analyses
Lisa Nguyen Quang Do
Fraunhofer IEM
lisa.nguyen@iem.fraunhofer.de
Stefan Krüger
Paderborn University
stefan.krueger@upb.de
Patrick Hill
Paderborn University
pahill@campus.uni-paderborn.de
Karim Ali
University of Alberta
karim.ali@ualberta.ca
Eric Bodden
Paderborn University & Fraunhofer IEM
eric.bodden@upb.de
ABSTRACT
Code developers in industry frequently use static analysis tools to detect and fix software defects in their code. But what about defects in the static analyses themselves? While debugging application code is a difficult, time-consuming task, debugging a static analysis is even harder. We have surveyed 115 static analysis writers to determine what makes static analysis difficult to debug, and to identify which debugging features would be desirable for static analysis. Based on this information, we have created VISUFLOW, a debugging environment for static data-flow analysis. VISUFLOW is built as an Eclipse plugin, and supports analyses written on top of the program analysis framework Soot. The different components in VISUFLOW provide analysis writers with visualizations of the internal computations of the analysis, and actionable debugging features to support debugging static analyses. A video demo of VISUFLOW is available online: https://www.youtube.com/watch?v=BkEfBDwiuH4
CCS CONCEPTS
• Software and its engineering → Software testing and debugging; • Theory of computation → Program analysis; • Human-centered computing → Empirical studies in visualization;
KEYWORDS
Debugging, Static analysis, IDE, Survey, User Study, Empirical
Software Engineering
1 INTRODUCTION
As more and more complex software is written every day, ensuring its functionality, quality, and security becomes increasingly important and difficult to achieve. Static analysis is particularly useful in that regard, because it allows developers to reason even about partial/incomplete programs. Researchers and practitioners are continuously contributing to various static analysis frameworks [3, 4, 9, 13]. Yet, as application code gets more sophisticated, writing a static analysis for it becomes increasingly harder as well, and, as we show in this paper, debugging the analysis can prove more complicated than debugging the analyzed code.
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
ICSE '18 Companion, Gothenburg, Sweden
© 2018 Copyright held by the owner/author(s). 978-1-4503-5663-3/18/05...$15.00
DOI: 10.1145/3183440.3183470
We have conducted a large-scale survey [11] with 115 analysis writers to determine the difficulties of debugging static analysis compared to general application code. Our findings show that bugs found in static analyses are quite different from those generally found in application code. Current debugging tools do not support debugging those specific bugs (e.g., wrong assumptions on how the static analysis framework interprets the analyzed code), making it much harder for analysis writers to find the cause of an error. The main cause of this problem is the inability to visualize how the analysis behaves at a given point in time.
Based on the responses we received from our survey participants, we have identified the debugging features required to provide analysis writers with helpful visuals. In this paper, we present VISUFLOW, a debugging environment for static analysis built in Eclipse, an integrated development environment (IDE). VISUFLOW is designed to help analysis writers debug their analyses through comprehensive visualizations. In particular, it provides the following features:
• Access to the intermediate representation of the analyzed code.
• Interactive graph visualizations of the analyzed code.
• Overview of the intermediate results of the analysis.
• Breakpoint and stepping functionalities for both the analysis code and the analyzed code.
2 MOTIVATION
Our survey explores how analysis writers debug static analysis and which features are most useful when debugging static analyses compared to general application code. The 115 participants cover different branches of static analysis, including the analyzed language, the types of static analysis (e.g., data-flow, abstract interpretation), and the analysis frameworks. We have made the anonymized responses available online [10].
We observed that 5.3× more participants find static analyses harder to debug than application code, because debugging static analyses requires one to comprehend two different codebases (the analysis code and the analyzed code) instead of just the application code, resulting in more complex corner cases. Additionally, the correctness of an analysis is not directly verifiable, as opposed to the output of application code: "Static analysis code usually deals with massive amounts of data. [...] It is harder to see where a certain state is computed, or even worse, why it is not computed."
Those specic properties of static analysis directly inuence the
types of bugs that are found in static-analysis code. In our study,
81.3% of the participants report that the main cause of bugs in
application code is programming errors such as
“wrong conditions,
wrong loops statements”. While those bugs exist in static analyses,
89
2018 ACM/IEEE 40th International Conference on Software Engineering: Companion Proceedings
ICSE ’18 Companion, May 27-June 3, 2018, Gothenburg, Sweden Lisa Nguyen ang Do, Stefan Krüger, Patrick Hill, Karim Ali,
Eric Bodden
Graph visuals
Other visuals
IR
Test generation
Quick updates
Breakpoints
Stepping
0%
25 %
50 %
75 %
100 %
Not Important Neutral Important
Very Important N/A
Figure 1: Importance of the features for debugging static
analysis. IR denotes Intermediate Representation.
they are only mentioned by 41.7% of the participants. Corner cases,
algorithmic errors, and handling the analysis semantics and infras-
tructure are signicantly more prevalent in static analysis than in
application code.
Since the bugs found in static analyses are different from the ones found in application code, one would expect different debugging tools to be used for those two types of code. However, our survey showed that regardless of the type of code that participants write, they use the same techniques: breakpoints and stepping, variable inspection, printing intermediate results, debugging tools such as gdb or the Eclipse integrated debugger, and coding support such as auto-completion. Interestingly, participants expressed dissatisfaction with their debugging tools in general. This shows that existing debugging tools are not sufficient to fully support static analysis writers: "While the IDE can show a path through [my] code for a symbolic execution run, it doesn't show analysis states along that path." Overall, current debugging tools miss one crucial component to properly support static analysis: visibility of what is (not) computed in the analysis at a given point of its execution.
We asked the participants which debugging features would be useful in a debugging environment and noticed a significant difference between the features requested to debug static analysis and application code (p = 0.04 < 0.05 for a χ² test). For application code, participants requested better coding support and hot-code replacement. For static analysis, participants requested better visualizations of the analysis constructs such as the intermediate representation of the code or intermediate results of the analysis (18.4%), graph visualizations of the analysis (23.7%), omniscient debugging (13.2%) [8], and better breakpoints and stepping functionalities that would allow them to step through both the analysis code and the analyzed code at the same time. Figure 1 details the relative importance of those debugging features.
Through our survey, we see that current debugging tools are designed for application code, and while helpful, they are not sufficient to fully support debugging static analysis. We identified the following debugging features that a static analysis debugger should provide: graph visualizations, access to analysis intermediate results, and better conditional breakpoints.
3 VISUFLOW
Based on the features identified in the survey, we present VISUFLOW, a debugging environment for static analyses. We have implemented VISUFLOW on top of the Eclipse IDE, and it supports debugging static data-flow analyses written on top of the Soot [14] analysis framework. Our survey shows the importance of displaying both the analysis code and the analyzed code, and in particular, highlighting how the former handles the latter. With this goal in mind, we have designed VISUFLOW to provide such information, presented in an understandable and usable manner.
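To make the debugging target concrete, the listing below is a minimal sketch of the kind of Soot-based data-flow analysis VISUFLOW supports: a forward analysis over a method's control flow graph that tracks which locals have been assigned. This example is ours, not taken from VISUFLOW or the paper; the class name and choice of flow fact are illustrative, and minor API details may differ across Soot versions.

// A minimal Soot forward data-flow analysis: computes, per unit, the set of
// locals that have been assigned on some path reaching that unit.
import soot.Local;
import soot.Unit;
import soot.ValueBox;
import soot.toolkits.graph.UnitGraph;
import soot.toolkits.scalar.ArraySparseSet;
import soot.toolkits.scalar.FlowSet;
import soot.toolkits.scalar.ForwardFlowAnalysis;

public class DefinedLocalsAnalysis extends ForwardFlowAnalysis<Unit, FlowSet<Local>> {

    public DefinedLocalsAnalysis(UnitGraph graph) {
        super(graph);
        doAnalysis(); // runs the fixed-point iteration over the CFG
    }

    @Override
    protected void flowThrough(FlowSet<Local> in, Unit unit, FlowSet<Local> out) {
        in.copy(out);
        // Every local defined by this Jimple unit is added to the out-set.
        for (ValueBox def : unit.getDefBoxes()) {
            if (def.getValue() instanceof Local) {
                out.add((Local) def.getValue());
            }
        }
    }

    @Override
    protected FlowSet<Local> newInitialFlow() {
        return new ArraySparseSet<>();
    }

    @Override
    protected FlowSet<Local> entryInitialFlow() {
        return new ArraySparseSet<>();
    }

    @Override
    protected void merge(FlowSet<Local> in1, FlowSet<Local> in2, FlowSet<Local> out) {
        in1.union(in2, out); // may-analysis: union at control-flow joins
    }

    @Override
    protected void copy(FlowSet<Local> source, FlowSet<Local> dest) {
        source.copy(dest);
    }
}

An analysis writer would instantiate it over a method body, e.g. new DefinedLocalsAnalysis(new BriefUnitGraph(body)), and query getFlowBefore(unit) or getFlowAfter(unit); per-unit intermediate results of this kind are what VISUFLOW's views expose.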
We detail below the main functionalities of VISUFLOW, illustrated in Figure 2, using the corresponding numbers.

Figure 2: The graphical user interface of VISUFLOW. We describe the labeled views in Section 3.
1. Java Editor: We used the standard Eclipse Java Editor and its functionalities (e.g., Eclipse breakpoints) to display the analysis code, since providing users with familiar views and functionalities allows for a better integration of VISUFLOW into the Eclipse IDE. We extended the Java Editor to add navigation functions between this view and the other views, as detailed below.
2. Jimple View: Soot converts Java code to an intermediate representation called Jimple, which it then analyzes. To show analysis writers how the analysis handles Jimple code, we introduced a Jimple View that shows the analyzed code in Jimple format. The Jimple code is not editable, because it is automatically generated by Soot. Similar to the Java Editor, the Jimple View offers navigation functionalities to other views. (A sketch of how Jimple can be produced programmatically follows.)
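For readers unfamiliar with where Jimple comes from, the following hedged sketch uses Soot's standard API to load a class and print each concrete method body in Jimple, roughly the text the Jimple View renders. The classpath and class name are placeholders, not values from VISUFLOW.

// Hedged sketch: convert a Java class to Jimple with Soot and print it.
import soot.Scene;
import soot.SootClass;
import soot.SootMethod;
import soot.options.Options;

public class JimpleDump {
    public static void main(String[] args) {
        Options.v().set_prepend_classpath(true);
        Options.v().set_allow_phantom_refs(true); // tolerate unresolved dependencies
        Scene.v().setSootClassPath("path/to/classes"); // placeholder path
        SootClass c = Scene.v().loadClassAndSupport("com.example.Target"); // hypothetical class
        c.setApplicationClass();
        Scene.v().loadNecessaryClasses();

        for (SootMethod m : c.getMethods()) {
            if (m.isConcrete()) {
                // retrieveActiveBody() triggers the bytecode-to-Jimple conversion.
                System.out.println(m.retrieveActiveBody());
            }
        }
    }
}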
3. Unit Breakpoints: In the Jimple View, analysis writers can set special breakpoints called Unit Breakpoints that stop the execution of the analysis at a given unit (i.e., Jimple statement). Once the execution stops, VISUFLOW highlights the Jimple statement being currently analyzed, and the user may step through the intermediate representation of the analyzed code to debug their analysis. By jointly using the stepping functionalities of the Unit Breakpoints and the Java Editor's breakpoints, analysis writers can step through both code bases at the same time, without needing to write complex conditional breakpoints in the standard Eclipse debugger (see the sketch below for the workaround this replaces).
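For contrast, a sketch of the workaround Unit Breakpoints replace: without tool support, analysis writers commonly call a guard like this from their flow functions and attach an ordinary Java breakpoint to the marker line. The matched statement text is made up for illustration.

// A hand-rolled "unit breakpoint": fragile string matching against the
// Jimple statement, with a plain Java breakpoint set on the marker line.
import soot.Unit;

public class ManualUnitBreakpoint {
    static void checkUnit(Unit unit) {
        if (unit.toString().contains("$r2 = virtualinvoke")) { // illustrative match
            System.out.println("reached unit of interest"); // Java breakpoint goes here
        }
    }
}

A Unit Breakpoint achieves the same stop declaratively, without editing and re-running the analysis code.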
4. Graph View: By default, this view displays the call graph of the analyzed code. The user can explore the control flow graph (CFG) of a particular method by selecting its node in the call graph. One can also navigate between the CFGs of different methods through navigation menus as shown in Figure 2. On the edges of the CFGs, VISUFLOW displays the information propagated by the analysis. Such access to the intermediate results of the analysis allows users to follow the data flows and locate miscalculations more easily. When stepping through the Jimple code, VISUFLOW highlights the corresponding unit in the Graph View. Tooltips give access to more information about the different units.
The Graph View is intended to give the user a better understanding of the structure of the analyzed code, while also providing a more visual approach to debugging. It combines different information about the analysis, the analyzed code, and the intermediate results, thus providing an easy way to keep track of multiple things without cluttering the interface with code. These characteristics also cause the graphs to act as navigational hubs from which the user can jump to the different views based on what is observed in the graph to gather more information. The graph information, retrieved from the data model, is displayed using the open source framework GraphStream [2], which scales to graphs with a large number of nodes.
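As a rough illustration of the rendering layer, this sketch builds a GraphStream graph from a Soot CFG, with one node per Jimple unit and one directed edge per control-flow successor. It is a minimal stand-in for VISUFLOW's Graph View under our own naming, and the actual view additionally annotates edges with the propagated analysis facts; GraphStream attribute APIs vary slightly between versions.

// Hedged sketch: render a method's CFG with GraphStream.
import org.graphstream.graph.Graph;
import org.graphstream.graph.implementations.SingleGraph;
import soot.Body;
import soot.Unit;
import soot.toolkits.graph.BriefUnitGraph;

public class CfgRenderer {
    public static void render(Body body) {
        BriefUnitGraph cfg = new BriefUnitGraph(body);
        Graph view = new SingleGraph(body.getMethod().getSignature());

        for (Unit u : body.getUnits()) {
            view.addNode(id(u)).setAttribute("ui.label", u.toString());
        }
        for (Unit u : body.getUnits()) {
            for (Unit succ : cfg.getSuccsOf(u)) {
                // One directed edge per control-flow successor; edge ids must be unique.
                view.addEdge(id(u) + "->" + id(succ), id(u), id(succ), true);
            }
        }
        view.display(); // opens GraphStream's interactive viewer
    }

    private static String id(Unit u) {
        return Integer.toString(System.identityHashCode(u));
    }
}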
5. Results View: The intermediate results are also shown in the Results View, along with additional information such as editable tags that the developer can use to mark specific units. This view also contains searching and filtering functionalities, which are especially helpful for methods with a large number of statements.
6. Unit Inspection View: This view enables analysis writers who are unfamiliar with Soot and Jimple to inspect Jimple statements and see how they are constructed. Since Soot analyses manipulate Jimple statements, understanding Jimple units helps analysis writers design appropriate analysis rules, or judge the correctness of existing ones.
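A hedged sketch of the kind of breakdown such an inspection can show, using only Soot's public Unit API: the concrete statement class plus the values the statement defines and uses. The output formatting is our own, not VISUFLOW's.

// Hedged sketch: print the structure of a single Jimple unit.
import soot.Unit;
import soot.ValueBox;

public class UnitInspector {
    public static void inspect(Unit unit) {
        // e.g. "JAssignStmt" for "$r1 = new java.lang.StringBuilder"
        System.out.println("Kind: " + unit.getClass().getSimpleName());
        for (ValueBox def : unit.getDefBoxes()) {
            System.out.println("  defines: " + def.getValue()
                    + " : " + def.getValue().getType());
        }
        for (ValueBox use : unit.getUseBoxes()) {
            System.out.println("  uses:    " + use.getValue()
                    + " : " + use.getValue().getType());
        }
    }
}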
7. Synchronization and Navigation: VISUFLOW allows users to synchronize the Unit Inspection View, the Graph View, and the Results View so that statements that are selected in one view are also shown in the other views. In order to connect the different views, we enriched VISUFLOW with navigation features that allow developers to switch between views through drop-down menus, as illustrated in Figure 2. The synchronization functionalities provide a smooth navigation between all views that does not disrupt the analysis writer by forcing them to manually look for a specific statement when switching between views.
The dierent views and breakpoints of V are imple-
mented as an extension of the standard Eclipse UI components. To
populate the views with analysis information, V maintains
an internal data model of the intermediate representation of the an-
alyzed code. The tool hooks into Eclipse’s builder and uses Eclipse’s
OSGi-Event Model to populate and update this data model at every
change in the code base. The intermediate results of the analysis
are updated every time the analysis is re-run, using a Java agent to
instrument the analysis to collect the information at runtime.
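The paper does not show the agent itself; below is a minimal sketch of the standard java.lang.instrument mechanism it refers to. Where the comment indicates, VISUFLOW would rewrite the analysis classes to report each computed fact to the IDE; that logic is omitted here as it is not described in the paper.

// Hedged sketch of the Java-agent hook (registered via
// java -javaagent:analysis-agent.jar, with Premain-Class set in the manifest).
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

public class AnalysisAgent {
    public static void premain(String args, Instrumentation inst) {
        inst.addTransformer(new ClassFileTransformer() {
            @Override
            public byte[] transform(ClassLoader loader, String className,
                    Class<?> classBeingRedefined, ProtectionDomain domain,
                    byte[] classfileBuffer) {
                // Rewrite the analysis' flow functions here (e.g., with a
                // bytecode library) to report each computed fact at runtime.
                // Returning null leaves the class unchanged.
                return null;
            }
        });
    }
}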
4 EVALUATION
To evaluate the usefulness of VISUFLOW, we conducted a user study with 20 participants. We prepared faulty analyses that do not produce correct results for a given piece of analyzed code. Each participant debugged two such analyses, each containing three errors, and used VISUFLOW for one analysis, and the standard Eclipse debugging environment (hereafter named Eclipse) for the other. Half of the participants used VISUFLOW first, and the other half Eclipse first. We recorded how many errors the participants could identify and fix, how long they spent using each feature of the tools, and asked them to fill a comparative questionnaire and to discuss their impressions of the two tools. The results are available online [10].
Table 1 shows that the IDE features that were most used by the participants include many of VISUFLOW's features (i.e., the Graph View, the Jimple View, the breakpoints, and the Results View). With VISUFLOW, participants spent 25.6% less time using the Java Editor, and 44.4% less time stepping through the code. Instead, they spent this time using the Graph View and the Results View. This shows that graphs, special breakpoints, and access to the intermediate representation and to the intermediate results are desirable features in a debugging environment for static analyses, confirming the findings of our survey. Moreover, Eclipse's breakpoints editor was used 88.2% less often in VISUFLOW than in Eclipse. We attribute this to the special breakpoint features in VISUFLOW, which relieve users from defining complex conditional breakpoints.
Overall, participants identied 25% and xed 50% more errors
with V than with Eclipse. For Task 1, they identied and
xed 1.4
×
more errors, and for Task 2, they identied 1.1
×
and xed
91
ICSE ’18 Companion, May 27-June 3, 2018, Gothenburg, Sweden Lisa Nguyen ang Do, Stefan Krüger, Patrick Hill, Karim Ali,
Eric Bodden
Table 1: Main features of V and Eclipse that partic-
ipants used, and the average time spent using each feature.
V Eclipse
#users Time (s) #users Time (s)
Java Editor 14 486 14 653
Graph View 14 201 n/a n/a
Jimple View 11 58 12 60
Breakpoints / Stepping 11 174 11 313
Variable Inspection 3 78 8 67
Results View 8 50 n/a n/a
Console 5 24 7 40
Drop Frame 5 12 3 5
Breakpoints View 3 13 2 110
Unit View 3 7 n/a n/a
1.6
×
more errors. Out of the 20 participants in our study, 12 partici-
pants are used to debugging their analyses using Eclipse. However,
7 of them still found and xed more errors using V.
In general, VISUFLOW was positively received by the participants of our user study. In the questionnaire, they gave it a Net Promoter Score [12] of 9.1/10 compared to Eclipse (standard deviation σ = 1.1), and 8.3/10 compared to their own coding environment (σ = 1.7). In the questionnaire and interviews, participants stressed the usefulness of the Graph, Jimple, and Results View, and VISUFLOW's special breakpoints, especially in understanding both the analysis and the analyzed code, and how they interacted: "I think [VISUFLOW] is helpful because of the linkage between the Java code, the Jimple code and the graphic visualization: all that I had to keep in my mind [earlier]."
5 RELATED WORK
While we are not aware of any debugging tool that targets writers of static analyses as VISUFLOW does, a few tools have been published that offer at least subsets of VISUFLOW's features. By means of integrated Eclipse views, the software-analysis platform Atlas [5] is able to visualize data-flows through a given program. Path Projection [6] supports users of static analysis in understanding error messages and locating the related lines of code. However, neither of these tools targets analysis developers, and therefore, they do not have the comprehensive feature set that VISUFLOW provides for debugging static analyses.
VISUFLOW is tightly integrated into Eclipse. Consequently, its users can use all of Eclipse's integrated debugging functionalities for Java, such as breakpoints and stepping. While these are general-purpose debugging features that lack the necessary focus on static analysis, they still prove useful in any scenario involving debugging. VISUFLOW does not support more complex paradigms for debugging such as delta [15], omniscient [8], and interrogative debugging [7]. We leave the exploration of how VISUFLOW could benefit from their integration to future work.
6 CONCLUSION
We presented VISUFLOW, a debugging environment for static analysis. Through a user study, we demonstrated that the features in VISUFLOW, designed to help visualize the internal computations of the analysis, help analysis writers identify and fix more errors than with the standard Eclipse IDE. In future work, we plan to explore better visualizations for the analysis of large code bases, especially in terms of collapsible graphs and quick response to user modifications in either of the code bases. VISUFLOW instantiates several of the debugging features identified in our survey for Soot-based, static data-flow analysis. It would be interesting to explore how to adapt the features for other types of static analysis. Other features suggested in the survey (e.g., omniscient debugging and quick updates) are not currently offered by VISUFLOW. We plan to explore adding those features to VISUFLOW in the future. The anonymized answers to the survey and user study are available online [10]. VISUFLOW is open source [10], and we welcome contributions under the Apache 2.0 license [1].
ACKNOWLEDGEMENTS
We thank Henrik Niehaus, Shashank Basavapatna Subramanya, Kaarthik Rao Bekal Radhakrishna, Zafar Habeeb Syed, Nishitha Shivegowda, Yannick Kouotang Signe, and Ram Muthiah Bose Muthian for their work on the implementation of VISUFLOW. This research was supported by a Fraunhofer Attract grant as well as the Heinz Nixdorf Foundation. This work has also been partially funded by the DFG as part of project E1 within the CRC 1119 CROSSING, and was supported by the Natural Sciences and Engineering Research Council of Canada.
REFERENCES
[1] 2017. Apache License 2.0. https://www.apache.org/licenses/LICENSE-2.0.
[2] 2017. GraphStream. http://graphstream-project.org/.
[3] Eric Bodden. 2012. Inter-procedural Data-flow Analysis with IFDS/IDE and Soot. In International Workshop on State of the Art in Java Program Analysis (SOAP). 3–8. https://doi.org/10.1145/2259051.2259052
[4] Cristiano Calcagno and Dino Distefano. 2011. Infer: An Automatic Program Verifier for Memory Safety of C Programs. In NASA Formal Methods (NFM) (Lecture Notes in Computer Science), Vol. 6617. 459–465.
[5] Tom Deering, Suresh Kothari, Jeremias Sauceda, and Jon Mathews. 2014. Atlas: A New Way to Explore Software, Build Analysis Tools. In ICSE Companion 2014. ACM, New York, NY, USA, 588–591. https://doi.org/10.1145/2591062.2591065
[6] Yit Phang Khoo, Jeffrey S. Foster, Michael Hicks, and Vibha Sazawal. 2008. Path Projection for User-centered Static Analysis Tools. In PASTE '08. ACM, New York, NY, USA, 57–63. https://doi.org/10.1145/1512475.1512488
[7] Andrew Jensen Ko and Brad A. Myers. 2004. Designing the Whyline: A Debugging Interface for Asking Questions About Program Behavior. In CHI 2004, Vienna, Austria, April 24–29, 2004. 151–158.
[8] Bil Lewis. 2003. Debugging Backwards in Time. CoRR cs.SE/0310016 (2003). http://arxiv.org/abs/cs.SE/0310016
[9] Lisa Nguyen Quang Do, Karim Ali, Benjamin Livshits, Eric Bodden, Justin Smith, and Emerson Murphy-Hill. 2017. Just-in-time Static Analysis. In ISSTA 2017. ACM, New York, NY, USA, 307–317.
[10] Lisa Nguyen Quang Do, Stefan Krüger, Patrick Hill, Karim Ali, and Eric Bodden. 2017. VisuFlow. https://blogs.uni-paderborn.de/sse/tools/visuflow-debugging-static-analysis/.
[11] Lisa Nguyen Quang Do, Stefan Krüger, Patrick Hill, Karim Ali, and Eric Bodden. 2018. Debugging Static Analysis. Technical Report. arXiv:cs.SE/1801.04894
[12] Frederick F. Reichheld. 2003. The One Number You Need to Grow. Harvard Business Review 81, 12 (2003), 46–55.
[13] Haihao Shen, Jianhong Fang, and Jianjun Zhao. 2011. EFindBugs: Effective Error Ranking for FindBugs. In International Conference on Software Testing, Verification and Validation (ICST). 299–308.
[14] Raja Vallée-Rai, Etienne Gagnon, Laurie J. Hendren, Patrick Lam, Patrice Pominville, and Vijay Sundaresan. 2000. Optimizing Java Bytecode Using the Soot Framework: Is It Feasible? In CC. 18–34. https://doi.org/10.1007/3-540-46423-9_2
[15] Andreas Zeller and Ralf Hildebrandt. 2002. Simplifying and Isolating Failure-Inducing Input. IEEE Trans. Software Eng. 28, 2 (2002), 183–200.