Sofya: Supporting Rapid Development of Dynamic Program Analyses for Java
ABSTRACT Dynamic analysis is an increasingly important means of supporting software validation and maintenance. To date, developers of dynamic analyses have used low-level instrumentation and debug interfaces to realize their analyses. Many dynamic analyses, however, share common high-level requirements, e.g., capture of program data state as well as events, and efficient and accurate event capture in the presence of threading. We present SOFYA, an infrastructure designed to provide high-level, efficient, concurrency-aware support for building analyses that reason about rich observations of program data and events. It provides a layered, modular architecture, which has been successfully used to rapidly develop and evaluate a variety of demanding dynamic program analyses. In this paper, we describe the SOFYA framework and the challenges it addresses, and we survey several such analyses.
This paper is posted at DigitalCommons@University of Nebraska - Lincoln.
Sofya: Supporting Rapid Development of Dynamic Program Analyses for Java∗
Alex Kinneer, Matthew B. Dwyer, Gregg Rothermel
Department of Computer Science and Engineering
University of Nebraska - Lincoln
1. Introduction
A wide variety of techniques reported in the literature, e.g., [2, 6], use observations collected during actual runs of Java programs to perform analyses for verification and validation, and new techniques are reported frequently. Most of these techniques are sensitive to both the accuracy of the program observations (how faithfully the reporting of observed events reflects the actual ordering of those events in the monitored program) and the efficiency with which those observations can be delivered. Analyses that receive incorrectly ordered events may produce wrong results, and analyses that cannot process a large volume of events efficiently may simply run too slowly to be of any use. We discuss common problems encountered in implementing such analyses and describe how the SOFYA dynamic analysis infrastructure addresses those problems.
∗This work was supported in part by the National Science Foundation through awards 0429149, 0444167, 0454203, and 0541263.
Specification of program observations. Developers need to specify the program observations relevant to their analysis. It is generally accepted that modifying source code by hand for each analysis is too costly and error-prone. Intermediary technologies, on the other hand, may present complex APIs that are difficult for the analyst to learn to use effectively, or that may not allow the analyst to describe the observation and the associated payload data, e.g., receiver object identity, needed for the analysis. SOFYA provides an expressive but simple declarative language for describing observations and associated payloads, and it automates the generation of code to capture them at run time.
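To make the notion of a declarative observation with an attached payload concrete, the following Java sketch models such a description as plain data. It is not SOFYA's description language and does not reproduce its syntax; `ObservationSpec`, `Kind`, `Payload`, and `selects` are hypothetical names chosen only for illustration.

```java
import java.util.EnumSet;
import java.util.Set;

class ObservationSpecDemo {
    /** Hypothetical kinds of program events an analysis might ask for. */
    enum Kind { METHOD_CALL, FIELD_READ, FIELD_WRITE }

    /** Hypothetical payload items that can accompany an observation. */
    enum Payload { RECEIVER_IDENTITY, ARGUMENTS, THREAD_ID }

    /** A declarative description: what to observe, where, and which data to attach. */
    static final class ObservationSpec {
        final Kind kind;
        final String classPrefix;
        final Set<Payload> payload;

        ObservationSpec(Kind kind, String classPrefix, EnumSet<Payload> payload) {
            this.kind = kind;
            this.classPrefix = classPrefix;
            this.payload = EnumSet.copyOf(payload); // defensive copy
        }

        /** True if an event of the given kind in the given class is requested. */
        boolean selects(Kind candidateKind, String className) {
            return kind == candidateKind && className.startsWith(classPrefix);
        }
    }

    public static void main(String[] args) {
        // Ask for method calls in java.util classes, with the receiver's identity.
        ObservationSpec spec = new ObservationSpec(
                Kind.METHOD_CALL, "java.util.", EnumSet.of(Payload.RECEIVER_IDENTITY));
        System.out.println(spec.selects(Kind.METHOD_CALL, "java.util.ArrayList")); // prints "true"
        System.out.println(spec.selects(Kind.FIELD_WRITE, "java.util.ArrayList")); // prints "false"
    }
}
```

The point of the sketch is only that the analyst states what is wanted as data, leaving the capture code to be generated by a tool.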
Efficient event capture. Using instrumentation and debugger connections to capture program observations introduces overhead. The literature reports multiple overhead reduction techniques, but our experience indicates that developers do not generally apply these best practices in their analysis implementations. SOFYA relieves analysis developers of the need to work with complex libraries and tools, such as the Byte Code Engineering Library (BCEL), and enables the reuse of robust, efficient, and concurrency-safe strategies for the capture of a broad set of observations. It also provides several novel performance enhancements, such as support for dynamically modifying the set of active observations during analysis.
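The idea of changing the set of active observations while the program runs can be sketched as follows. This is a minimal illustration under our own assumptions, not SOFYA's mechanism: `ActiveObservations`, `EventKind`, and `probe` are hypothetical names, and a counter stands in for delivery of the event to an analysis.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

class ActiveObservationsDemo {
    /** Hypothetical event kinds; the names are illustrative only. */
    enum EventKind { METHOD_CALL, FIELD_WRITE, LOCK_ACQUIRE }

    /** Thread-safe set of currently enabled kinds, consulted by every probe. */
    static final class ActiveObservations {
        private final Set<EventKind> enabled = ConcurrentHashMap.newKeySet();
        private final AtomicLong delivered = new AtomicLong();

        void enable(EventKind kind)  { enabled.add(kind); }
        void disable(EventKind kind) { enabled.remove(kind); }

        /** Called at an instrumented site; does nothing when the kind is disabled. */
        void probe(EventKind kind) {
            if (enabled.contains(kind)) {
                delivered.incrementAndGet(); // stand-in for handing the event to an analysis
            }
        }

        long delivered() { return delivered.get(); }
    }

    public static void main(String[] args) {
        ActiveObservations obs = new ActiveObservations();
        obs.enable(EventKind.FIELD_WRITE);
        obs.probe(EventKind.FIELD_WRITE);   // delivered
        obs.probe(EventKind.METHOD_CALL);   // ignored: never enabled
        obs.disable(EventKind.FIELD_WRITE);
        obs.probe(EventKind.FIELD_WRITE);   // ignored: disabled during the run
        System.out.println("delivered = " + obs.delivered()); // prints "delivered = 1"
    }
}
```

An analysis that has learned all it needs about one kind of event can switch it off this way, so later occurrences cost only a set lookup.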
Accurate event capture. Concurrency can lead to hard-to-find bugs; consequently, many validation and verification techniques are aimed at detecting concurrency-related errors. Without additional synchronization, which slows the program, bytecode instrumentation cannot guarantee that the order of observed events corresponds to the order in which those events occurred in the program. Synchronization introduced by instrumentation can also interfere with the natural scheduling of threads in a monitored program, leading to questionable results from analyses investigating the effects of thread ordering. While an efficient and correct solution to this problem is difficult to achieve, SOFYA includes a number of strategies that reduce the risk of imprecise reporting without losing efficiency.
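One generic way to make the reported order explicit is to stamp each event with a number drawn from a shared atomic counter, then sort by stamp on the receiving side. The sketch below shows that technique only; it is not presented as one of SOFYA's strategies, and `Recorder`, `Event`, and `run` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicLong;

class OrderedCaptureDemo {
    /** One captured event: global stamp, originating worker, per-worker index. */
    static final class Event {
        final long seq;
        final int worker;
        final int localIndex;

        Event(long seq, int worker, int localIndex) {
            this.seq = seq;
            this.worker = worker;
            this.localIndex = localIndex;
        }
    }

    /** Stamps events with a globally unique, increasing sequence number. */
    static final class Recorder {
        private final AtomicLong nextSeq = new AtomicLong();
        private final ConcurrentLinkedQueue<Event> log = new ConcurrentLinkedQueue<>();

        void record(int worker, int localIndex) {
            long seq = nextSeq.getAndIncrement(); // lock-free, no blocking of the caller
            log.add(new Event(seq, worker, localIndex));
        }

        /** Returns all events sorted by their stamp. */
        List<Event> orderedLog() {
            List<Event> events = new ArrayList<>(log);
            events.sort((a, b) -> Long.compare(a.seq, b.seq));
            return events;
        }
    }

    /** Runs the given number of worker threads and returns the stamped event log. */
    static List<Event> run(int workers, int eventsPerWorker) {
        Recorder recorder = new Recorder();
        List<Thread> threads = new ArrayList<>();
        for (int w = 0; w < workers; w++) {
            final int worker = w;
            Thread t = new Thread(() -> {
                for (int i = 0; i < eventsPerWorker; i++) {
                    recorder.record(worker, i);
                }
            });
            threads.add(t);
            t.start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return recorder.orderedLog();
    }

    public static void main(String[] args) {
        List<Event> events = run(4, 1000);
        System.out.println("captured " + events.size() + " events"); // prints "captured 4000 events"
    }
}
```

The stamps are unique and preserve each thread's own order, but taking a stamp and performing the observed operation are two separate steps. Another thread can run between them, so stamp order may still differ from the true order of the operations. That residual gap is the accuracy risk the paragraph above describes, and closing it completely requires synchronization that perturbs scheduling.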
29th International Conference on Software Engineering (ICSE'07 Companion)
0-7695-2892-9/07 $20.00 © 2007
Digital Object Identifier: 10.1109/ICSECOMPANION.2007.68
Publication Year: 2007, Page(s): 51-52
Event processing. Many analyses need to distinguish between events occurring on different object instances. Since it is generally not possible to know the identity or even the number of instances of an object statically, this presents a real challenge to analysis developers. SOFYA implements a publish-subscribe architecture that supports flexible filtering and routing of observations. Streams of correlated observations, such as those sharing the same receiver object, are routed to subscribing analysis components. These streams can be generated and re-routed dynamically as the program under analysis executes. SOFYA’s standard architecture allows modularization of event processing in specific analyses while supporting flexible creation and combination of analysis components on the fly.
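Routing by receiver object can be sketched in a few lines of Java. The sketch assumes events are dispatched on a single thread and uses hypothetical names throughout (`ReceiverRouter`, `Observation`, `traceByReceiver`); it is not SOFYA's API. A new per-receiver stream is created the first time a receiver is seen, which is what makes the number of instances irrelevant at analysis-writing time.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Function;

class ReceiverRoutingDemo {
    /** An observation tagged with the object it occurred on. */
    static final class Observation {
        final Object receiver;
        final String description;

        Observation(Object receiver, String description) {
            this.receiver = receiver;
            this.description = description;
        }
    }

    /** Routes each observation to a subscriber dedicated to its receiver object. */
    static final class ReceiverRouter {
        // Keyed by object identity, not equals(), so equal-but-distinct objects stay apart.
        private final Map<Object, Consumer<Observation>> streams = new IdentityHashMap<>();
        private final Function<Object, Consumer<Observation>> subscriberFactory;

        ReceiverRouter(Function<Object, Consumer<Observation>> subscriberFactory) {
            this.subscriberFactory = subscriberFactory;
        }

        void publish(Observation o) {
            // Create the per-receiver stream on first sight, then deliver.
            streams.computeIfAbsent(o.receiver, subscriberFactory).accept(o);
        }

        int streamCount() { return streams.size(); }
    }

    /** Collects, for each receiver, the descriptions of its observations in order. */
    static Map<Object, List<String>> traceByReceiver(List<Observation> events) {
        Map<Object, List<String>> traces = new IdentityHashMap<>();
        ReceiverRouter router = new ReceiverRouter(receiver -> {
            List<String> trace = new ArrayList<>();
            traces.put(receiver, trace);
            return obs -> trace.add(obs.description);
        });
        for (Observation o : events) {
            router.publish(o);
        }
        return traces;
    }

    public static void main(String[] args) {
        Object a = new ArrayList<String>();
        Object b = new ArrayList<String>(); // equals(a), but a different instance
        List<Observation> events = new ArrayList<>();
        events.add(new Observation(a, "open"));
        events.add(new Observation(b, "open"));
        events.add(new Observation(a, "close"));
        Map<Object, List<String>> traces = traceByReceiver(events);
        System.out.println(traces.get(a)); // prints "[open, close]"
        System.out.println(traces.get(b)); // prints "[open]"
    }
}
```

Each per-receiver subscriber could be, for example, a small state machine checking a protocol on one object, which keeps the analysis logic free of bookkeeping about which instance an event belongs to.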
Many new dynamic analyses reported in the literature are evaluated with a specialized implementation, often described as a prototype, even though they frequently share event capture requirements with existing techniques. This leads to redundant and inefficient tools, which in our experience often suffer from common errors and shortcomings. It also inhibits the comparative evaluation of techniques. SOFYA addresses this by providing a common framework that frees developers from the burden of repeatedly dealing with these challenges so that they can rapidly implement and evaluate novel analysis techniques.
2. Sofya Architecture
Sofya’s event capture components are organized into a
layered architecture, which is presented in detail at . The
top layers present a programmatic publish/subscribe API targeted at “client” program analyses. This layered architecture factors out different aspects of event capture and dispatching so that they can evolve over time to track technology advances or be customized for a specific analysis without affecting existing analysis clients. We provide a brief summary of each layer.
Layer 1. Provides information to guide the activities of other layers, e.g., processing of observables defined in the Event Description Language (EDL). Sofya uses EDL to specify the events, and associated data values, to be captured from a class of “semantic” events such as method invocation, field read/write, and lock acquire/release.
Layer 2. Capture of observations is achieved either through instrumentation, defined in this layer, or through debugger interface support (Layer 3). Sofya provides highly optimized and robust bytecode instrumentors in this layer, built using BCEL, to capture various program observations.
Layer 3. Provides communication mechanisms to transfer information from instrumentation and debugger interfaces to event dispatchers (Layer 4). Sofya runs monitored programs in their own virtual machine, which prevents event processing from interfering with program behavior.
Layer 4. Implements event dispatchers, the publishers of captured program events. EDL can be used to ensure that only selected events are delivered to a given analysis client.
Layer 5. Provides splitters, filters, and routers to manipulate event streams published by event dispatchers. A splitter breaks a single event stream into multiple streams based on some criterion, such as thread ID. Filters are used to select particular events of interest out of an event stream.
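The splitter and filter roles described for Layer 5 can be illustrated with a small Java sketch. It operates on an in-memory list and not on a live event stream, and `Event`, `filter`, and `splitByThread` are hypothetical names, not Sofya's classes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

class StreamOpsDemo {
    /** A captured event: the thread it occurred on and what kind of event it was. */
    static final class Event {
        final long threadId;
        final String kind;

        Event(long threadId, String kind) {
            this.threadId = threadId;
            this.kind = kind;
        }
    }

    /** Filter: keeps only the events of interest, in their original order. */
    static List<Event> filter(List<Event> stream, Predicate<Event> ofInterest) {
        List<Event> kept = new ArrayList<>();
        for (Event e : stream) {
            if (ofInterest.test(e)) {
                kept.add(e);
            }
        }
        return kept;
    }

    /** Splitter: breaks one stream into one stream per thread ID, preserving order. */
    static Map<Long, List<Event>> splitByThread(List<Event> stream) {
        Map<Long, List<Event>> perThread = new LinkedHashMap<>();
        for (Event e : stream) {
            perThread.computeIfAbsent(e.threadId, id -> new ArrayList<>()).add(e);
        }
        return perThread;
    }

    public static void main(String[] args) {
        List<Event> stream = new ArrayList<>();
        stream.add(new Event(1, "lock-acquire"));
        stream.add(new Event(2, "field-write"));
        stream.add(new Event(1, "lock-release"));
        stream.add(new Event(2, "lock-acquire"));

        // Compose the two steps: keep lock events, then split them by thread.
        List<Event> lockEvents = filter(stream, e -> e.kind.startsWith("lock-"));
        Map<Long, List<Event>> perThread = splitByThread(lockEvents);
        System.out.println(perThread.get(1L).size()); // prints "2"
        System.out.println(perThread.get(2L).size()); // prints "1"
    }
}
```

Because each step takes a stream and yields streams, the steps compose in either order, which is the property that lets analysis components be combined freely.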
Most dynamic program analyses can be rapidly implemented using the services provided by layers 4 and 5, thus benefiting from the carefully engineered efficiency and correctness of the lower layers without incurring the difficulty or cost of implementing that functionality.
3. Experience and Conclusions
We have used SOFYA to implement a wide variety of dynamic analyses, including multi-lockset race detection, vector-clock happens-before analysis, dynamic escape analysis, atomicity, variants of sequencing property inference analyses, and a number of sophisticated variants of temporal property conformance checkers. Our experience across these 6 distinct classes of analyses, for which we have implemented a total of 13 different variants, can be contrasted with our experience over the past several years building dynamic analyses directly on low-level libraries like BCEL. Analyses built using SOFYA can be implemented more quickly and are at least competitive in terms of overhead. We believe that SOFYA’s standard interfaces and flexible publish-subscribe approach for connecting analysis components will improve analysis developer productivity by enabling the creation and reuse of high-level analysis building block components. SOFYA supports analysis developers by providing an efficient and well-engineered framework for rapidly implementing, evaluating, and comparing dynamic program analysis techniques. The SOFYA website provides current information on the development of SOFYA and allows free access to current versions of SOFYA.
References
M. B. Dwyer, A. Kinneer, and S. Elbaum. Adaptive online program analysis. In Int’l. Conf. Softw. Eng., 2007 (to appear).
K. Havelund and G. Roşu. An overview of the runtime verification tool Java PathExplorer. Formal Meth. Sys. Design,
H. Nishiyama. Detecting data races using dynamic escape
and Tech. Symp., pages 127–138, 2004.
R. O’Callahan and J.-D. Choi. Hybrid dynamic data race detection. In Symp. Princ. Prac. Par. Prog., 2003.
L. Wang and S. D. Stoller. Runtime analysis of atomicity for multi-threaded programs. IEEE Trans. Softw. Eng., 32:93–110, Feb 2006.
W. Weimer and G. Necula. Mining temporal specifications for error detection. In Conf. Tools Alg. Constr. Anal. Sys., pages 461–476, April 2005.