Sofya: Supporting Rapid Development of Dynamic Program Analyses for Java
ABSTRACT Dynamic analysis is an increasingly important means of supporting software validation and maintenance. To date, developers of dynamic analyses have used low-level instrumentation and debug interfaces to realize their analyses. Many dynamic analyses, however, share multiple common high-level requirements, e.g., capture of program data state as well as events, and efficient and accurate event capture in the presence of threading. We present SOFYA, an infrastructure designed to provide high-level, efficient, concurrency-aware support for building analyses that reason about rich observations of program data and events. It provides a layered, modular architecture, which has been successfully used to rapidly develop and evaluate a variety of demanding dynamic program analyses. In this paper, we describe the SOFYA framework and the challenges it addresses, and survey several such analyses.
Computer Science and Engineering, Department of
CSE Conference and Workshop Papers
University of Nebraska - Lincoln Year
This paper is posted at DigitalCommons@University of Nebraska - Lincoln.
Sofya: Supporting Rapid Development of Dynamic Program Analyses for Java∗
Alex Kinneer, Matthew B. Dwyer, Gregg Rothermel
Department of Computer Science and Engineering
University of Nebraska - Lincoln
1. Introduction
A wide variety of techniques reported in the literature, e.g., [2, 6], use observations collected during actual runs of Java programs to perform analyses for verification and validation; new techniques are being reported frequently. Most of these techniques are sensitive to both the accuracy of the program observations (how faithfully the reporting of observed events reflects the actual ordering of those events in the monitored program) and the efficiency with which those observations can be delivered. Analyses that receive incorrectly ordered events may produce wrong results, and analyses that cannot process a large volume of events efficiently may simply run too slowly to be of any use. We discuss common problems encountered in implementing such analyses and describe how the SOFYA dynamic analysis infrastructure addresses those problems.
∗This work was supported in part by the National Science Foundation through awards 0429149, 0444167, 0454203, and 0541263.
Specification of program observations. Developers need to specify the program observations relevant to their analysis. It is generally accepted that modifying source code by hand for each analysis is too costly and error-prone. Intermediary technologies, on the other hand, may present complex APIs that are difficult for the analyst to learn to use effectively, or that may not allow the analyst to describe the observation and the associated payload data, e.g., receiver object identity, needed for the analysis. SOFYA provides an expressive but simple declarative language for describing observations and associated payloads, and automates the generation of code to capture them at run-time.
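
To make the idea of a declarative observation specification concrete, the following is a minimal Java sketch. It is not EDL and does not use the SOFYA API; `ObservationSpec`, `EventKind`, `Payload`, and the `com.example.` class prefix are hypothetical names introduced purely for illustration. The sketch only shows the general shape of such a description: the analyst states which kind of event, in which classes, and with which payload data, and leaves the capture mechanism to the infrastructure.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch: this is NOT EDL syntax and NOT the SOFYA API.
final class ObservationSpecSketch {

    /** Kinds of "semantic" events an analysis might request (illustrative). */
    enum EventKind { METHOD_CALL, FIELD_READ, FIELD_WRITE, LOCK_ACQUIRE, LOCK_RELEASE }

    /** Payload data that may accompany an observation (illustrative). */
    enum Payload { RECEIVER_ID, THREAD_ID, ARGUMENTS }

    /** One declarative request: which events, in which classes, with which payload. */
    static final class ObservationSpec {
        final EventKind kind;
        final String classPrefix;
        final Set<Payload> payloads;

        ObservationSpec(EventKind kind, String classPrefix, Set<Payload> payloads) {
            this.kind = kind;
            this.classPrefix = classPrefix;
            // Defensive copy; EnumSet.copyOf rejects empty non-EnumSet collections.
            this.payloads = payloads.isEmpty()
                    ? EnumSet.noneOf(Payload.class)
                    : EnumSet.copyOf(payloads);
        }

        /** True if an event of the given kind in the given class should be captured. */
        boolean selects(EventKind candidate, String className) {
            return kind == candidate && className.startsWith(classPrefix);
        }
    }

    public static void main(String[] args) {
        ObservationSpec spec = new ObservationSpec(
                EventKind.LOCK_ACQUIRE,
                "com.example.",
                EnumSet.of(Payload.RECEIVER_ID, Payload.THREAD_ID));
        System.out.println(spec.selects(EventKind.LOCK_ACQUIRE, "com.example.Account")); // true
        System.out.println(spec.selects(EventKind.FIELD_READ, "com.example.Account"));   // false
    }
}
```

In a design like this, an instrumentor would consult `selects` to decide where to insert probes, so the analysis never touches bytecode directly.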
Efficient event capture. Using instrumentation and debugger connections to capture program observations introduces overhead. The literature reports multiple overhead reduction techniques, but our experience indicates that developers do not generally apply these best practices in their analysis implementations. SOFYA relieves analysis developers of the need to work with complex libraries and tools, such as the Byte Code Engineering Library (BCEL), and enables the reuse of robust, efficient, and concurrency-safe strategies for the capture of a broad set of observations. It also provides several novel performance enhancements, such as support for dynamically modifying the set of active observations during analysis.
Accurate event capture. Concurrency can lead to hard-to-find bugs; consequently, many validation and verification techniques are aimed at detecting concurrency-related errors. Without additional synchronization, which slows the program, bytecode instrumentation cannot guarantee that the order of observed events corresponds to the order in which those events occurred in the program. Synchronization introduced by instrumentation can, however, interfere with the natural scheduling of threads in a monitored program, leading to questionable results from analyses investigating the effects of thread ordering. While an efficient and correct solution to this problem is difficult to achieve, SOFYA includes a number of strategies that reduce the risk of imprecise reporting without losing efficiency.
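
The ordering problem can be illustrated with a small Java sketch. This is a generic technique, not a description of SOFYA's strategies: each event is stamped with a number drawn from a shared atomic counter, so that events buffered out of order can later be sorted into a single consistent order. The names `Recorder`, `Event`, and `inStampOrder` are hypothetical. Note the limitation the paragraph above describes: the stamp orders the calls to `record`, not the program operations themselves, so a thread can still be preempted between performing an operation and recording it.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of sequence-stamped event capture; not SOFYA code.
final class OrderedCaptureSketch {

    static final class Event {
        final long seq;
        final String thread;
        final String description;

        Event(long seq, String thread, String description) {
            this.seq = seq;
            this.thread = thread;
            this.description = description;
        }
    }

    static final class Recorder {
        private final AtomicLong nextSeq = new AtomicLong();
        private final ConcurrentLinkedQueue<Event> buffer = new ConcurrentLinkedQueue<>();

        /** Called from instrumented code; lock-free, so it does not block other threads. */
        void record(String description) {
            long seq = nextSeq.getAndIncrement();
            // A thread may be preempted here, so buffer order can differ from stamp order.
            buffer.add(new Event(seq, Thread.currentThread().getName(), description));
        }

        /** Returns a snapshot of the recorded events sorted by their stamps. */
        List<Event> inStampOrder() {
            List<Event> events = new ArrayList<>(buffer);
            events.sort(Comparator.comparingLong((Event e) -> e.seq));
            return events;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Recorder recorder = new Recorder();
        Runnable work = () -> {
            for (int i = 0; i < 3; i++) {
                recorder.record("step " + i);
            }
        };
        Thread a = new Thread(work, "A");
        Thread b = new Thread(work, "B");
        a.start();
        b.start();
        a.join();
        b.join();
        // The interleaving of A and B varies from run to run.
        for (Event e : recorder.inStampOrder()) {
            System.out.println(e.seq + " " + e.thread + " " + e.description);
        }
    }
}
```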
29th International Conference on Software Engineering (ICSE'07 Companion)
0-7695-2892-9/07 $20.00 © 2007
Digital Object Identifier: 10.1109/ICSECOMPANION.2007.68
Publication Year: 2007 , Page(s): 51 - 52
Event processing. Many analyses need to distinguish between events occurring on different object instances. Since it is generally not possible to know the identity or even the number of instances of a class statically, this presents a real challenge to analysis developers. SOFYA implements a publish-subscribe architecture that supports flexible filtering and routing of observations. Streams of correlated observations, such as those sharing the same receiver object, are routed to subscribing analysis components. These streams can be generated and re-routed dynamically as the program under analysis executes. SOFYA’s standard architecture allows modularization of event processing in specific analyses while supporting flexible creation and combination of analysis components on the fly.
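
Routing correlated observations to per-object subscribers can be sketched in a few lines of Java. The code below is an illustrative assumption, not the SOFYA publish-subscribe API: `ReceiverRouter`, `Event`, and the numeric `receiverId` are hypothetical. A new subscriber is created the first time a receiver is seen, which reflects the point that the number of instances is not known statically.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical sketch of per-receiver event routing; not the SOFYA API.
final class ReceiverRoutingSketch {

    static final class Event {
        final long receiverId; // assumed unique identifier of the receiver object
        final String name;

        Event(long receiverId, String name) {
            this.receiverId = receiverId;
            this.name = name;
        }
    }

    /** Routes each event to the subscriber for its receiver, creating one on demand. */
    static final class ReceiverRouter {
        private final Function<Long, Consumer<Event>> subscriberFactory;
        private final Map<Long, Consumer<Event>> streams = new HashMap<>();

        ReceiverRouter(Function<Long, Consumer<Event>> subscriberFactory) {
            this.subscriberFactory = subscriberFactory;
        }

        void publish(Event event) {
            streams.computeIfAbsent(event.receiverId, subscriberFactory).accept(event);
        }

        int streamCount() {
            return streams.size();
        }
    }

    public static void main(String[] args) {
        Map<Long, List<String>> traces = new HashMap<>();
        ReceiverRouter router = new ReceiverRouter(id -> {
            List<String> trace = new ArrayList<>();
            traces.put(id, trace);
            return event -> trace.add(event.name);
        });
        router.publish(new Event(1L, "open"));
        router.publish(new Event(2L, "open"));
        router.publish(new Event(1L, "close"));
        System.out.println(traces.get(1L)); // [open, close]
        System.out.println(traces.get(2L)); // [open]
    }
}
```

Each per-receiver subscriber sees only its own object's events, so a property such as "open precedes close" can be checked independently per instance.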
Many new dynamic analyses reported in the literature are evaluated with a specialized implementation, even though they often share event capture requirements with existing techniques; such implementations are often described as prototypes. This leads to redundant and inefficient tools, which in our experience often suffer from common errors and shortcomings. It also inhibits the comparative evaluation of techniques. SOFYA addresses this by providing a common framework that frees developers from the burden of repeatedly dealing with these challenges so that they can rapidly implement and evaluate novel analysis techniques.
2. Sofya Architecture
Sofya’s event capture components are organized into a layered architecture, which is presented in detail elsewhere. The top layers present a programmatic publish/subscribe API targeted at “client” program analyses. This layered architecture factors out different aspects of event capture and dispatching so that they can evolve over time to track technology advances or be customized for a specific analysis without affecting existing analysis clients. We provide a brief summary of each layer.
Layer 1. Provides information to guide the activities of other layers, e.g., processing of observables defined in the Event Description Language (EDL). Sofya uses EDL to specify the events, and associated data values, to be captured from a class of “semantic” events, such as method invocation, field read/write, and lock acquire/release.
Layer 2. Capture of observations is achieved either through instrumentation, defined in this layer, or through debugger interface support (Layer 3). Sofya provides highly optimized and robust bytecode instrumentors in this layer, built using BCEL, to capture various program observations.
Layer 3. Provides communication mechanisms to transfer information from instrumentation and debugger interfaces to event dispatchers (Layer 4). Sofya runs monitored programs in their own virtual machine, which prevents event processing from interfering with program behavior.
Layer 4. Implements event dispatchers, the publishers of captured program events. EDL can be used to ensure that only selected events are delivered to a given analysis client.
Layer 5. Provides splitters, filters, and routers to manipulate event streams published by event dispatchers. A splitter breaks a single event stream into multiple streams based on some criterion, such as thread ID. Filters are used to select particular events of interest out of an event stream.
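
The splitter and filter roles described for Layer 5 can be sketched as two small Java functions. This is a simplified, batch-style illustration under assumed names (`splitByThread`, `filter`, `Event`), not the Sofya components themselves, which operate on live event streams.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Predicate;

// Hypothetical sketch of a splitter and a filter; not the Sofya API.
final class StreamOpsSketch {

    static final class Event {
        final long threadId;
        final String kind;

        Event(long threadId, String kind) {
            this.threadId = threadId;
            this.kind = kind;
        }
    }

    /** Splitter: breaks one stream into one stream per thread ID, preserving order. */
    static Map<Long, List<Event>> splitByThread(List<Event> stream) {
        Map<Long, List<Event>> perThread = new TreeMap<>();
        for (Event e : stream) {
            perThread.computeIfAbsent(e.threadId, id -> new ArrayList<>()).add(e);
        }
        return perThread;
    }

    /** Filter: keeps only the events of interest, preserving order. */
    static List<Event> filter(List<Event> stream, Predicate<Event> ofInterest) {
        List<Event> kept = new ArrayList<>();
        for (Event e : stream) {
            if (ofInterest.test(e)) {
                kept.add(e);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Event> stream = Arrays.asList(
                new Event(1L, "call"),
                new Event(2L, "lock"),
                new Event(1L, "lock"));
        Map<Long, List<Event>> perThread = splitByThread(stream);
        System.out.println(perThread.get(1L).size());                         // 2
        System.out.println(filter(stream, e -> e.kind.equals("lock")).size()); // 2
    }
}
```

Because both operations take and return plain event lists, they compose: a client could filter for lock events first and then split the result by thread.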
Most dynamic program analyses can be rapidly implemented using the services provided by layers 4 and 5, thus benefiting from the carefully engineered efficiency and correctness of the lower layers without incurring the difficulty or cost of implementing that functionality.
3. Experience and Conclusions
We have used SOFYA to implement a wide variety of dynamic analyses, including multi-lockset race detection, vector-clock happens-before analysis, dynamic escape analysis, atomicity, variants of sequencing property inference analyses, and a number of sophisticated variants of temporal property conformance checkers. Our experience across these 6 distinct classes of analyses, for which we have implemented a total of 13 different variants, can be contrasted with our experience building dynamic analyses directly on low-level libraries like BCEL over the past several years. Analyses built using SOFYA can be implemented more quickly and are at least competitive in terms of overhead. We believe that SOFYA’s standard interfaces and flexible publish-subscribe approach for connecting analysis components will improve analysis developer productivity by enabling the creation and reuse of high-level analysis building block components. SOFYA supports analysis developers by providing an efficient and well-engineered framework for rapidly implementing, evaluating, and comparing dynamic program analysis techniques. The SOFYA website provides current information on the development of SOFYA, and allows free access to current versions of SOFYA.
References
M. B. Dwyer, A. Kinneer, and S. Elbaum. Adaptive online program analysis. In Int’l. Conf. Softw. Eng., 2007 (to appear).
K. Havelund and G. Roşu. An overview of the runtime verification tool Java PathExplorer. Formal Meth. Sys. Design,
H. Nishiyama. Detecting data races using dynamic escape
and Tech. Symp., pages 127–138, 2004.
R. O’Callahan and J.-D. Choi. Hybrid dynamic data race detection. In Symp. Princ. Prac. Par. Prog., 2003.
L. Wang and S. D. Stoller. Runtime analysis of atomicity for multi-threaded programs. IEEE Trans. Softw. Eng., 32:93–110, Feb 2006.
W. Weimer and G. Necula. Mining temporal specifications for error detection. In Conf. Tools Alg. Constr. Anal. Sys., pages 461–476, April 2005.