Abstracting Runtime Heaps for Program Understanding
Mark Marron¹, Cesar Sanchez¹,², Zhendong Su³, Manuel Fahndrich⁴
¹IMDEA Software Institute  ²CSIC  ³UC Davis  ⁴Microsoft Research
{mark.marron, cesar.sanchez}@imdea.org, su@ucdavis.edu, maf@microsoft.com
Abstract.
Modern programming environments provide extensive support for
inspecting, analyzing, and testing programs based on the algorithmic structure of
a program. Unfortunately, support for inspecting and understanding runtime data
structures during execution is typically much more limited. This paper provides a
general purpose technique for abstracting and summarizing entire runtime heaps.
We describe the abstract heap model and the associated algorithms for transforming
a concrete heap dump into the corresponding abstract model as well as algorithms
for merging, comparing, and computing changes between abstract models. The
abstract model is designed to emphasize high-level concepts about heap-based data
structures, such as shape and size, as well as relationships between heap structures,
such as sharing and connectivity. We demonstrate the utility and computational
tractability of the abstract heap model by building a memory profiler. We then use
this tool to check for, pinpoint, and correct sources of memory bloat from a suite
of programs from DaCapo.
1 Introduction
Modern programming environments provide excellent support for visualizing and debugging code, but inspecting and understanding the high-level structure of the data manipulated at runtime by said code is typically not well supported. Visualizing entire runtime heap graphs is a non-trivial problem, as the number of nodes and edges is typically so large that displaying these graphs directly, even with excellent graph visualization tools, results in useless jumbles of nodes and edges. As a result, little of interest can be gleaned from such visualizations.
In this paper, we propose an abstract domain for runtime heap graphs that captures many fundamental properties of data structures on the heap, such as shape, connectivity, and sharing, but abstracts away other often less useful details. The abstract heap graphs we compute are both small enough to visualize and navigate, and at the same time precise enough to capture essential information useful in interactive debugging and memory profiling scenarios. Further, the abstract heaps can be computed efficiently from a single concrete heap and then merged/compared with other abstract heap graphs, from across a set of program runs or from multiple program points, in order to get an even more general view of the heap configurations that occur during program execution.
Example. Figure 1(a) shows a heap snapshot of a simple program that manipulates expression trees. An expression tree consists of binary nodes for Add, Sub, and Mult, and leaf nodes for Constants and Variables. The local variable exp (rectangular box)
(a) A Concrete Heap. (b) Corresponding Abstract Heap.
Fig. 1. A concrete heap and corresponding abstraction.
points to an expression tree consisting of 4 interior binary expression objects, 2 Var, and 2 Const objects. Local variable env points to an array representing an environment of Var objects that are shared with the expression tree.
Figure 1(b) shows the abstract heap produced by our tools from this concrete heap with the default visualization mode.¹ The abstraction summarizes the concrete objects into three distinct summary nodes in the abstract heap graph: (1) an abstract node representing all interior recursive objects in the expression tree (Add, Mult, Sub), (2) an abstract node representing the two Var objects, and (3) an abstract node representing the two Const objects. Specific details about the order and branching structure of expression nodes are absent in the abstraction, but other more general properties are still present. For example, the fact that there is no sharing or cycles among the interior expression nodes is apparent in the abstract graph by looking at the self-edge representing the pointers between objects in the interior of the expression tree. The label tree{l,r} on the self-edge expresses that pointers stored in the l and r fields of the objects in this region form a tree structure (i.e., no sharing and no cycles).
The abstract graph maintains another useful property of the expression tree, namely that no Const object is referenced from multiple expression objects. On the other hand, several expression objects might point to the same Var object. The abstract graph shows possible sharing using wide orange colored edges (if color is available), whereas normal edges indicate non-sharing pointers. The abstract graph shows pointer nullity via full vs. dashed lines; in our example all pointers, except in the environment array, are non-null.
Rudimentary information on the number of objects represented by each node is encoded in the shading. Nodes that always abstract a single object are given a white background while nodes which represent multiple objects are shaded (silver if color is available). Size information of arrays and other containers is encoded by annotating the type label with the container size (Var[3] to indicate an array is of length 3).
¹ Additional information can be obtained by hovering over the nodes/edges or by restyling for a specific task as in our case studies in section 6.
Overview. This paper addresses the problem of turning large concrete runtime heaps into compact abstract heaps while retaining many interesting properties of the original heap in the abstraction. Our abstraction is safe in the sense that properties stated on the abstract heap graph also hold in the corresponding concrete heaps. To achieve this abstraction safety, we adopt the theory for the design of abstract domains developed in abstract interpretation [7,25]. The theory of abstract interpretation provides a general framework for (1) defining an abstract domain and relating it to possible concrete program states and (2) a method for taking an abstract domain and computing an over-approximation of the collecting semantics for a given program as a static analysis. The static analysis component of the abstract interpretation framework is not relevant here, as we are interested in abstracting runtime heaps. However, the framework for constructing the abstract domains, as well as the properties of operations for comparing (⊑) and merging (⊔) abstract domain elements, allows us to formally describe the relationship of our abstract heap graphs to their concrete counterparts, and to obtain safe operations for comparing and summarizing heaps from different program points or different program runs in a semantically meaningful way. These guarantees provide confidence that all inferences made by examining the abstract model are valid.
Our abstract heap domain encodes a fixed set of heap properties identified in previous work on static heap analysis [5,10,20] that are fundamental properties of heaps and can be computed efficiently. These properties include the summarization of recursive and composite data structures, the assignment of shape information to these structures, and the injectivity of fields (given two distinct objects, does the field f in each object point to a distinct target). The abstraction is also able to provide information on the number and types of objects in the various structures, as well as nullity information. Our focus on a fixed set of heap properties (as opposed to user defined properties) enables the abstraction to be computed efficiently in time O((Ob + Pt) log(Ob)), where Ob is the number of objects and Pt is the number of pointers in the concrete heap.
The contributions of this paper are:
- The abstract domain for heap graphs and its concretization function formalizing the safe relationship to concrete heaps.
- An efficient algorithm for computing the abstraction and algorithms for comparing and joining abstract heaps.
- Graphical representations of abstract heap graphs that allow on-demand collapsing or expansion of sub-structures, allowing a form of semantic zoom [8] from a very abstract view of the heap down to the level of individual objects.
- The construction of a general purpose heap memory profiler and analysis tool that augments the basic abstraction with specialized support for profiling and identifying common memory problems in a program.
- A qualitative evaluation of the visualization and memory profiler in tracking down and identifying solutions to memory inefficiencies in a range of programs (up to a 25% reduction in memory use).
2 Abstract Heap Graph
We begin by formalizing concrete program heaps and the relevant properties of concrete
heaps that will be captured by the abstraction. Later, we define the abstract heap graph
and formally relate the abstraction to its concrete heap counterparts using a concretization
(γ) function from the framework of abstract interpretation.
2.1 Concrete Heaps
For the purposes of this paper, we model the runtime state of a program as an environment, mapping variables to values, and a store, mapping addresses to values. We refer to an instance of an environment together with a store as a concrete heap. Formally, a concrete heap is a labeled directed graph (root, null, Ob, Pt, Ty), where the nodes are formed by the set of heap objects (Ob) and the edges (Pt) correspond to pointers. We assume a distinguished heap object root ∈ Ob whose fields are the variables from the environment. This representation avoids dealing with distinct sets of variable locations and makes the formalization more uniform. We also assume a distinguished object null among Ob to model null pointers. The set of pointers Pt ⊆ Ob × Ob × Label connects a source object to a target object with a pointer label from Label. These labels are either a variable name (if the source object is root), a field name (if the source object is a heap object), or an array index (if the source object is an array). Finally, Ty : Ob → Type is a map that assigns a concrete program type to each object. We assume the concrete set of types in Type contains at least object types and array types. We use the notation o1 →p o2 to indicate that object o1 refers to o2 via pointer label p.
A region of memory C ⊆ Ob \ {null, root} is a subset of the concrete heap objects, not containing the root node or null. It is handy to define the set of pointers P(C1, C2) crossing from a region C1 to a region C2 as:

    P(C1, C2) = { o1 →p o2 ∈ Pt | o1 ∈ C1, o2 ∈ C2 }
2.2 Concrete Heap Properties
We now formalize the set of concrete properties of objects, pointers, and entire regions of the heap that we later use to create the abstract heap graph.

Type. The set of types associated with a region C is the union of all types of the objects in the region: {Ty(o) | o ∈ C}.

Cardinality. The cardinality of a region C is the number of objects in the region, |C|.

Nullity. A pointer o1 → o2 is a null pointer if o2 = null and a non-null pointer if o2 ≠ null.

Injectivity. Given two regions C1 and C2, we say that pointers labeled p from C1 to C2 are injective, written inj(C1, C2, p), if for all pairs of pointers o1 →p t1 and o2 →p t2 drawn from P(C1, C2), o1 ≠ o2 ⇒ t1 ≠ t2. In words, the pointers labeled p from two distinct objects o1 and o2 point to distinct objects t1 and t2.
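As an illustration, inj(C1, C2, p) can be decided by one pass over the pointer set. The following minimal C# sketch is our own, not the paper's implementation; it mirrors the computation described later in subsection 3.2 and rejects the pointer set as soon as some target is referenced twice:

```csharp
using System.Collections.Generic;

static class InjectivityCheck
{
    // inj(C1, C2, p): no two distinct pointers labeled p may share a target.
    // Each element of pointersP is one (source, target) pair from P(C1, C2)
    // carrying label p; object identity is reference equality.
    public static bool IsInjective(IEnumerable<(object Src, object Tgt)> pointersP)
    {
        var seenTargets = new HashSet<object>();
        foreach (var (_, tgt) in pointersP)
            if (!seenTargets.Add(tgt))
                return false; // a target referenced twice => not injective
        return true;
    }
}
```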
Shape. We characterize regions of memory C by shape using standard graph theoretic notions of trees and general graphs. For additional precision, we consider the shape of subgraphs formed from C and P(C,C)|L, i.e., the subgraph consisting of objects from C and pointers with labels l ∈ L only. This way, we can describe, for example, that a tree structure with parent pointers is still a tree if we only consider the left and right pointers, but not the parent pointers.

The predicate any(C, L) is simply true for any graph. We use it only to clarify shapes in visualizations that don't satisfy the more restrictive tree property.

The predicate tree(C, L) holds if P(C,C)|L is acyclic and the subgraph P(C,C)|L does not contain any cross edges.
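For illustration, tree(C, L) can be decided in linear time by checking the two conditions directly. The following C# sketch is our own (accessor types are assumptions, not the paper's code); it rejects the subgraph on a repeated predecessor (a cross edge) and then peels leaves to detect cycles:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ShapeCheck
{
    // tree(C, L): the subgraph P(C,C)|L must be acyclic and contain no
    // cross edges, i.e., no object in C has two L-predecessors inside C.
    public static bool IsTree(
        HashSet<object> region,
        Func<object, IEnumerable<(string Label, object Target)>> pointers,
        HashSet<string> labels)
    {
        var inDeg = region.ToDictionary(o => o, _ => 0);
        foreach (var o in region)
            foreach (var (lab, t) in pointers(o))
                if (labels.Contains(lab) && region.Contains(t) && ++inDeg[t] > 1)
                    return false;                 // cross edge: sharing inside C

        // With every in-degree <= 1, peeling in-degree-0 objects (Kahn's
        // algorithm) removes everything iff the subgraph is acyclic.
        var work = new Queue<object>(region.Where(o => inDeg[o] == 0));
        int removed = 0;
        while (work.Count > 0)
        {
            var o = work.Dequeue();
            removed++;
            foreach (var (lab, t) in pointers(o))
                if (labels.Contains(lab) && region.Contains(t) && --inDeg[t] == 0)
                    work.Enqueue(t);
        }
        return removed == region.Count;           // leftovers lie on a cycle
    }
}
```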
2.3 Heap Graph Abstraction
An abstract heap graph is an instance of storage shape graphs [5]. More precisely, the abstract heap graphs used in this paper are tuples

    (root, null, Ob#, Pt#, Ty#, Cd#, Ij#, Sh#)

where Ob# is a set of abstract nodes (each of which abstracts a region of the concrete heap), and Pt# ⊆ Ob# × Ob# × Label# is a set of graph edges, each of which abstracts a set of pointers. Edges are annotated with labels from Label#, which consists of the field labels and the special label []. The special label [] abstracts the indices of all array or container elements (i.e., array smashing).

We distinguish a root node in Ob# for modeling the variable environment as fields on root. Another distinguished node null is used to represent the null pointer. The remaining parts of an abstract heap (Ty#, Cd#, Ij#, Sh#) capture abstract properties of the heap graph. Ty# : Ob# → 2^Type maps abstract nodes to the set of types of the concrete nodes represented by the abstraction. Cd# : Ob# → Interval represents the cardinality of each abstracted region: Cd# maps each abstract node n to a numerical interval [l, u] ∈ Interval, where the lower bound l is a natural number and u is a natural number or ∞. The abstract injectivity Ij# : Pt# → bool expresses whether the set of pointers represented by an abstract edge is injective. Finally, the abstract shape Sh# is a set of tuples (n, L, s) ∈ Ob# × 2^Label# × {tree, any} indicating the shape s of a region represented by n with edges restricted to L.
2.4 Abstraction Relation
We are now ready to formally relate the abstract heap graph to its concrete counterparts by specifying which heaps are in the concretization of an abstract heap:

    (root, null, Ob, Pt, Ty) ∈ γ(root, null, Ob#, Pt#, Ty#, Cd#, Ij#, Sh#) ⇔
        ∃µ. Embed(µ, Ob, Pt, Ob#, Pt#)
            ∧ Typing(µ, Ob, Ty, Ob#, Ty#) ∧ Counting(µ, Ob, Ob#, Cd#)
            ∧ Injective(µ, Pt, Pt#, Ij#) ∧ Shape(µ, Pt, Pt#, Sh#)
A concrete heap is an instance of an abstract heap if there exists an embedding µ : Ob → Ob# satisfying the graph embedding, typing, counting, injectivity, and shape relations between the graphs. The auxiliary predicates are defined as follows.

    Embed(µ, Ob, Pt, Ob#, Pt#) ⇔ µ(root) = root ∧ µ(null) = null
        ∧ ∀ o1 →p o2 ∈ Pt. ∃l. µ(o1) →l µ(o2) ∈ Pt# ∧ p ∈ γL(l)

The embed predicate makes sure that all edges of the concrete graph are present in the abstract graph, connecting corresponding abstract nodes, and that the edge label in the abstract graph encompasses the concrete edge label. The embedding mapping µ must also map the special objects root and null to their exact abstract counterparts.

    Typing(µ, Ob, Ty, Ob#, Ty#) ⇔ ∀o ∈ Ob. Ty(o) ∈ Ty#(µ(o))

The typing relation guarantees that the type Ty(o) of every concrete object o is in the set of types Ty#(µ(o)) of the abstract node µ(o) of o.

    Counting(µ, Ob, Ob#, Cd#) ⇔ ∀n ∈ Ob#. |µ⁻¹(n)| ∈ Cd#(n)

The counting relation guarantees that for each abstract node n, the set of concrete nodes µ⁻¹(n) abstracted by n has a cardinality in the numeric interval Cd#(n).

    Injective(µ, Pt, Pt#, Ij#) ⇔
        ∀(n1, n2, l) ∈ Pt#. Ij#(n1, n2, l) ⇒ ∀p ∈ γL(l). inj(µ⁻¹(n1), µ⁻¹(n2), p)

The injectivity relation guarantees that every pointer set marked as injective corresponds to injective pointers between the concrete source and target regions of the heap.

    Shape(µ, Pt, Pt#, Sh#) ⇔ ∀(n, L, tree) ∈ Sh#. tree(µ⁻¹(n), γL(L))

Finally, the shape relation guarantees that for every abstract shape tuple (n, L, s), the concrete subgraph µ⁻¹(n) abstracted by node n restricted to labels L satisfies the corresponding concrete shape predicate s (tree, and implicitly any).
2.5 Visual Representation of Abstract Heap Graphs
In the iconography for our abstract graph visualizations (the screen shots in Figure 1(a), Figure 1(b), and section 6), we leverage a number of conventions to convey information.

An edge (root, o, p) whose source is the root node represents the content of variable p. Instead of drawing a root node with such edges, we simply draw a variable node p and an unlabeled edge to o. Thus, the root node is never drawn, as it does not appear as the target of any edge in concrete or abstract graphs.

The set of abstract types of an abstract node is represented as the label of the abstract node. Shape information is represented as labels on the recursive self edges of abstract nodes. An abstract node with cardinality 1 is represented by a white background. Other cardinalities are represented with shaded abstract nodes.

We do not draw explicit edges which only point to null. If an edge is associated with a label that contains both pointers to null and pointers to other heap objects, we fold the possibility into the edge by using a dashed edge instead of a full edge. Finally, injective edges are represented with normal thin edges, whereas non-injective edges are represented by wide edges (and, if color is available, are also highlighted in orange).
3 Computing the Abstraction
This section describes the computation of the abstract graph from a given concrete heap. The transformation is performed in three phases: (1) recursive data structures are identified and collapsed based on identifying cycles in the type definitions, (2) nodes that represent objects in the same logical heap region, based on equivalent edges originating from the same abstract node, are merged, and finally (3) abstract properties like cardinality, injectivity, and shape are computed for the abstract edges and nodes.
3.1 Partition (µ) Computation
Initially, we associate with each concrete object oi an abstract partition ni representing an equivalence class using a Tarjan union-find structure. The mapping µ from concrete objects to abstract partitions is given at any point in time by µ(oi) = ecr(ni), i.e., by the equivalence class representative of the original ni associated with oi. The union-find structure maintains the reverse mapping µ⁻¹, providing the set of concrete objects abstracted by a node. The abstract type map Ty# can be maintained efficiently in the union-find structure as well.
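A minimal sketch of such a partition structure follows (our own illustration; union-by-rank is elided for brevity, so the usual balanced-union discipline is assumed for the complexity bounds quoted below):

```csharp
using System;
using System.Collections.Generic;

sealed class Partition
{
    Partition parent;                                          // null => this node is the ecr
    public readonly HashSet<Type> Types = new HashSet<Type>(); // per-partition piece of Ty#
    public int Count = 1;                                      // exact cardinality of the region

    // ecr(n): follow parent links with path compression.
    public Partition Ecr() => parent == null ? this : parent = parent.Ecr();

    // Union two partitions, maintaining Ty# and the object count on the ecr.
    public static Partition Union(Partition a, Partition b)
    {
        a = a.Ecr(); b = b.Ecr();
        if (a == b) return a;
        b.parent = a;
        a.Types.UnionWith(b.Types);
        a.Count += b.Count;
        return a;
    }
}
```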
Figure 2(a) shows the initial state of these equivalence partitions for our example from Figure 1(a) (one partition per object, plus the roots, and a special partition for null). Each node is labeled with its partition id and the types of the objects in that partition.

The first abstraction identifies parts of the heap graph that represent unbounded depth recursive data structures. The basic approach consists of examining the type information in the program and the heap connectivity properties [2,19,9] and ensures that any heap graph produced has a finite depth. We say types τ1 and τ2 are recursive (τ1 ∼ τ2) if they are part of the same recursive type definition.
Definition 1 (Same Data Structure Objects). Two distinct objects o1, o2 are part of the same data structure if there is a reference o1 →p o2 in the heap and the types of the two objects are in the same recursive type definition, Ty(o1) ∼ Ty(o2).
The recursive components are thus identified by visiting each pointer oi → oj in the heap; if oi and oj are in the same data structure according to Definition 1, then we union the corresponding abstract nodes ni and nj.
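Continuing the sketch above, this first merge pass might look as follows (pointers, sameRecursiveTypeDef, and partitionOf are assumed inputs, not the paper's API):

```csharp
using System;
using System.Collections.Generic;

static class RecursiveMerge
{
    // First merge pass (Definition 1): for each pointer o1 ->p o2, union
    // the partitions when Ty(o1) ~ Ty(o2), i.e., the two object types occur
    // in the same recursive type definition.
    public static void MergeRecursiveStructures(
        IEnumerable<(object Src, string Label, object Tgt)> pointers,
        Func<object, object, bool> sameRecursiveTypeDef,
        IReadOnlyDictionary<object, Partition> partitionOf)
    {
        foreach (var (src, _, tgt) in pointers)
            if (sameRecursiveTypeDef(src, tgt))
                Partition.Union(partitionOf[src], partitionOf[tgt]);
    }
}
```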
Figure 2(b) shows the result of merging Same Data Structure Nodes on the initial partitions shown in Figure 2(a). The algorithm identified objects 1, 2, 4, 5 (the Add, Sub, and Mult objects from the interior of the expression tree) as being part of the recursive data structure and replaced them with a single representative summary node.
Next we group objects based on predecessor partitions. The motivation for this abstraction can be seen in Figure 2(b), where the Var objects in partitions 7 and 8 represent "variables in the environment". There is no need to distinguish them, as they are both referenced from the environment array. Similarly, the two constant objects referenced from the recursive component both represent "constants in the expression tree".
Definition 2 (Equivalent on Abstract Predecessors). Given two pointers o1 →l o2 and o1′ →l′ o2′ where µ(o1) = µ(o1′), we say that their target nodes are equivalent whenever the labels agree, l = l′, and the target nodes have some types in common, i.e., Ty#(µ(o2)) ∩ Ty#(µ(o2′)) ≠ ∅.
(a) Initial Partition. (b) Merge Same Data Structure. (c) Merge Predecessors.
Fig. 2. Steps in abstraction computation.
The algorithm for grouping equivalent objects is based on a worklist, where merging two partitions may create new opportunities for merging. The worklist consists of pointers that may have equivalent target objects. When processing a pointer from the worklist we check if we need to merge any partitions and, as needed, we merge these partitions. Finally, all pointers incident to the merged partitions are added to the worklist. Due to the properties of the Tarjan union-find algorithm, each pointer can enter the worklist at most log(N) times, where N is the number of abstract partitions that can be merged, and E is the number of pointers. Thus the complexity of this step is O(E log(N)).
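A hedged sketch of this loop follows (Pointer, Mu, AbstractTypes, OutPointers, and IncidentPointers are hypothetical stand-ins for the union-find lookups and pointer indexes, not the paper's code):

```csharp
// Second merge pass (Definition 2): two pointers with the same label from
// the same abstract source partition whose targets share a type force a
// merge of the target partitions; merging re-enqueues incident pointers.
var worklist = new Queue<Pointer>(allPointers);
while (worklist.Count > 0)
{
    var p = worklist.Dequeue();
    foreach (var q in OutPointers(Mu(p.Src)))            // same abstract source node
    {
        if (q.Label != p.Label || Mu(q.Tgt) == Mu(p.Tgt)) continue;
        if (AbstractTypes(Mu(p.Tgt)).Overlaps(AbstractTypes(Mu(q.Tgt))))
        {
            var merged = Partition.Union(Mu(p.Tgt), Mu(q.Tgt));
            foreach (var r in IncidentPointers(merged))
                worklist.Enqueue(r);                     // may enable further merges
        }
    }
}
```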
Figure 2(c) shows the result of performing the required merge operations on the partitions from Figure 2(b). The algorithm has merged the Var regions into a new summary region (since the objects represented by partitions 7 and 8 in Figure 2(b) are referred to from the same array). Similarly, the Const partitions from Figure 2(b) have been merged, as they are both stored in the same recursive structure (the expression tree).

Figure 2(c) differs from the abstract graph in Figure 1(b), where there is only one edge between the expression tree node and the variables node. The reason is that, despite the underlying abstraction being a multi-graph, our visualization application collapses multi-edges, as they frequently lead to poor graph layouts and rarely provide useful information to the developer. Also, note that there are explicit references to null and that these were not merged, since we associate no types with the null object.
3.2 Abstract Property Computation
Type, Cardinality, and Nullity. The abstract type map Ty# has already been computed as part of the union-find operation on abstract nodes. Similarly, the union-find operation computes the exact cardinality, which results in a precise interval value [i, i] if a node abstracts exactly i objects. The nullity information is represented as explicit edges to the null abstract object.

Injectivity. The injectivity information for an abstract edge n1 →l n2 is computed by iterating over all pointers from objects oi represented by n1 to objects oj represented by n2 with label p compatible with l. We determine if every concrete target object is referenced at most once, in which case the abstract edge is injective. Otherwise, the edge is not injective.
Shape. The fundamental observation that enables interesting shape predicates to be produced for the abstract graphs is that the shape properties are restricted to the subgraphs represented by an abstract node. In addition, we allow the examination of a variety of further subgraphs by restricting the set of labels considered in the subgraph. Restricting the label set allows us, e.g., to determine that the {l,r} edges in a tree actually form a tree, even though there are also parent pointers p, which if included would allow no interesting shape property to be determined. Selecting the particular subsets of edge labels to consider in the subgraph selection is based on heuristics. We can start with all labels to get an overall shape and use that computation to guess which labels to throw out and try again. For small sets of labels, all combinations can be tried.
After partitioning the heap as shown in Figure 2(c), the final map for the objects is:

    µ⁻¹ = [ n1 ↦ {o1, o2, o4, o5},  n3 ↦ {o3, o6},  n7 ↦ {o7, o8},  n9 ↦ {o9} ]

Thus, for Figure 1(b) we determine the abstract edge representing the cross partition pointer set n1 →l n7 is not injective, since it abstracts the two concrete pointers o4 →l o7 and o5 →l o7, which both refer to the same Var object o7. On the other hand, since the two Const objects o3, o6 are distinct, the algorithm will determine that the edge representing the cross partition pointer set n1 →r n3 is injective. The shape computation for the node representing partition 1 requires a traversal of the four objects. As there are no cross or back edges, the layout for this is tree{l,r}.
4 Merge and Comparison Operations
Many program analysis and understanding tasks require the ability to (1) accumulate abstract graphs and (2) compare abstract graphs (both from the same program execution and across executions), for example to support computing differences in the heap state during profiling activities or to compute likely heap invariants. Because the graphs being compared may come from different executions, we cannot simply track object identities and use them to control the merge and compare operations. Thus, the definitions must be based entirely on the abstract graph structure.
4.1 Compare
Formally, the order between two abstract graphs, g1 ⊑ g2, can be defined via our abstraction relation from subsection 2.4 as:

    g1 ⊑ g2 ⇔ ∀h. h ∈ γ(g1) ⇒ h ∈ γ(g2)

However, this is not directly computable. Instead, we implement an O(E) time approximation of this relation that first determines the structural equality of the abstract graphs by computing an isomorphism, followed by an implication check that all abstract edge and node properties in g2 cover the equivalent node and edge properties of g1.

To efficiently compute the subgraph isomorphism between g1 and g2 we use a property of the abstract graphs established by Definition 2. From this definition we know that every pair of out edges from a node either differ in the label or have the same label but non-overlapping sets of types in the nodes they refer to. Thus, to compute an isomorphism between two graphs we can simply start by pairing the local and global roots and then, from each pair, match up edges based on their label and type sets, leading to new pairings. This either results in an isomorphism map, or it results in a pair of nodes reachable from the roots along the same path that have incompatible edges. Any such edge differences can then be reported. With the subgraph isomorphism φ, we define the ordering relation:
    g1 ⊑φ g2 ⇔ ∀n ∈ Ob#1. Ty#1(n) ⊆ Ty#2(φ(n))
        ∧ ∀n ∈ Ob#1. Cd#1(n) ⊑ Cd#2(φ(n))
        ∧ ∀φ(e) ∈ Pt#2. Ij#2(φ(e)) ⇒ Ij#1(e)
        ∧ ∀(φ(n), L2, s2) ∈ Sh#2. ∃(n, L1, s1) ∈ Sh#1. L2 ⊆ L1 ∧ s1 ⊑ s2

Note how abstract shape predicates are contra-variant in the label set L. In other words, if a shape property holds for the subgraph based on L1, then it holds for the smaller subgraph based on the smaller set L2.
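The following C# fragment sketches the pairing walk (our own illustration, assumed to live inside a method that returns either the completed map φ or a difference report; AbsNode, OutEdges, TyA, and Difference are hypothetical names):

```csharp
// Build φ outward from the paired roots: by Definition 2, at most one
// out-edge of b can match an out-edge of a on (label, type-set overlap),
// so the pairing is deterministic or fails with a reportable difference.
var phi = new Dictionary<AbsNode, AbsNode> { [g1.Root] = g2.Root };
var work = new Queue<(AbsNode A, AbsNode B)>();
work.Enqueue((g1.Root, g2.Root));
while (work.Count > 0)
{
    var (a, b) = work.Dequeue();
    foreach (var ea in g1.OutEdges(a))
    {
        var eb = g2.OutEdges(b).FirstOrDefault(e =>
            e.Label == ea.Label && TyA(e.Target).Overlaps(TyA(ea.Target)));
        if (eb == null) return Difference(a, ea);          // unmatched edge in g2
        if (!phi.ContainsKey(ea.Target))
        {
            phi[ea.Target] = eb.Target;
            work.Enqueue((ea.Target, eb.Target));          // new pairing to explore
        }
        else if (phi[ea.Target] != eb.Target) return Difference(a, ea);
    }
}
```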
4.2 Merge
The merge operation takes two abstract graphs and produces a new abstract graph that is an over-approximation of all the concrete heap states that are represented by the two input graphs. In the standard abstract interpretation formulation this is typically the least element that is an over-approximation of both models. However, to simplify the computation we do not enforce this property (formally, we define an upper approximation instead of a join). Our approach is to leverage the existing definitions from the abstraction function in the following steps.

Given two abstract heap graphs g1 and g2 of the form gi = (rooti, nulli, Ob#i, Pt#i, Ty#i, Cd#i, Ij#i, Sh#i), we can define the graph g3 that is the result of their merge as follows. First we produce the union of the two graphs by simply adding all nodes and edges from both graphs. Once we have taken the union of the two graphs, we merge the variable/static roots that have the same names. Then we use Definition 1 and Definition 2 to zip down the graph, merging nodes and edges until no more changes occur. During the merge we build up two mappings η1 : g1 → g3 and η2 : g2 → g3 from nodes (edges) in the original graphs, g1 and g2 respectively, to the nodes (edges) in the merged graph. Using these mappings, we define upper approximations of all the graph properties:
    Ty#3(n) = ∪_{n1 ∈ η1⁻¹(n)} Ty#1(n1) ∪ ∪_{n2 ∈ η2⁻¹(n)} Ty#2(n2)

    Cd#3(n) = ( Σ_{n1 ∈ η1⁻¹(n)} Cd#1(n1) ) ⊔ ( Σ_{n2 ∈ η2⁻¹(n)} Cd#2(n2) )

    Ij#3(e) = ( |η1⁻¹(e)| = |{ n2 | n1 →l n2 ∈ η1⁻¹(e) }| )
            ∧ ( |η2⁻¹(e)| = |{ n2 | n1 →l n2 ∈ η2⁻¹(e) }| )
            ∧ ⋀_{e1 ∈ η1⁻¹(e)} Ij#1(e1) ∧ ⋀_{e2 ∈ η2⁻¹(e)} Ij#2(e2)
The set of types associated with the result is just the union of all types abstracted by the node in both graphs. The cardinality is more complicated to compute: we compute the abstract sums over intervals from all nodes abstracted from the input graphs separately, and then join the resulting intervals (or, depending on the application, widen as defined in [7]). Injectivity is the logical conjunction of the injectivity of all the source edges, provided that all the edges in the respective graphs that are merged had different target nodes (the equality of the edge and target sets). When merging two injective edges from the same graph we cannot guarantee that the resulting set of edges is injective, in the case that they target the same node; if we encounter this we conservatively assume the result edge is not injective.
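The interval arithmetic assumed here is the standard one; as a sketch (with a null upper bound encoding ∞):

```csharp
using System;

static class IntervalOps
{
    // [l1,u1] + [l2,u2] = [l1+l2, u1+u2]; a null upper bound encodes infinity.
    public static (int Lo, int? Hi) Sum((int Lo, int? Hi) a, (int Lo, int? Hi) b) =>
        (a.Lo + b.Lo, a.Hi == null || b.Hi == null ? (int?)null : a.Hi + b.Hi);

    // Join is the smallest interval containing both arguments.
    public static (int Lo, int? Hi) Join((int Lo, int? Hi) a, (int Lo, int? Hi) b) =>
        (Math.Min(a.Lo, b.Lo),
         a.Hi == null || b.Hi == null ? (int?)null : Math.Max(a.Hi.Value, b.Hi.Value));
}
```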
For computing the shape predicates we need to take into account not only the shape properties of the original graphs, but also the connectivity among the input nodes that map to the same node in the joined graph. We define a very conservative check for treeness during the merge:

    treeµ(n, L, µ, g) ⇔ |Pt#g[µ⁻¹(n), L]| ≤ 1 ∧ ∀n′ ∈ µ⁻¹(n). ∃L′ ⊇ L. (n′, L′, tree) ∈ Sh#g

where Pt#g[µ⁻¹(n), L] is the subgraph of Pt#g made up of the nodes that map to n under µ and the non-self² edges incident to them, restricted to labels L. Note that tree can only be inferred if at most one node is in the partition from each graph and the node represents a tree. The abstract shape for a merged node in the graph can be defined as:

    (n, L, tree) ∈ Sh#3 ⇔ treeµ(n, L, µ, g1) ∧ treeµ(n, L, µ, g2)

Since this operation is based on the same congruence closure operation as the abstraction operation (plus a linear amount of work to compute the needed properties), the merge operation can be computed in O(E log(N)) time.

² Self-edges need not be considered as they are already accounted for in the shape.
5 Additional Reduced and Interactive Views
While the abstract heap graph presented thus far produces models that scale in size with the number of logical regions in the program (independently of heap size and loosely correlated with the number of types used in the program), the graphs are often still too large to visualize and explore effectively. A second issue, particularly in a debugger scenario, is that after identifying a region of interest the developer wants to zoom into a more detailed view of the objects that make up the region.

While the DGML viewer [11] we use is quite effective at zooming, slicing, and navigating through large graphs, we can directly address the above two issues by providing additional support for zooming between abstraction levels: the developer can zoom incrementally from a very high level view based on dominators in the abstract heap graph, defined in subsection 2.3, all the way down to individual objects in the concrete heap without losing track of the larger global context of the heap structure³.

³ In a way that is similar to the semantic zoom of [8].

Given an abstract heap graph we can compute dominator information in a fairly standard way [24]. We deviate slightly, since we want to ensure that interesting nodes, which are directly pointed to by variables, and nodes that are immediate neighbors of these nodes, remain expanded. In our experience this heuristic seems to strike a nice balance between collapsing large portions of the graph, to aid in quickly getting a general overview of the heap, and preserving structure around local variables, which are frequently of particular interest and for which we want extra detail. This can be done by simply asserting that all of the nodes we want to keep expanded do not have any non-self dominators (equivalently, by ignoring all in-edges to these nodes during the dominator computation). Using our modified dominator computation we can replace every node n (which has not been marked interesting) and all of the nodes nd1, ..., ndk that n dominates with a single reduced node. This simple transformation results in a substantial reduction in the size of the graph while preserving much of the large scale heap structure and, since we can track the set of abstract graph nodes that each reduced node corresponds to, we can move easily between the two views of the heap. Furthermore, since the notions of domination and ownership [6] are closely related, this reduction has a natural relation with the developer's concept of ownership encapsulation of heap structures. This view is conceptually similar to the approach taken in [22,21], although the dominator construction is on the abstract graph, where data structures have already been identified and grouped, instead of on the concrete heap graph.
Individual Object Zoom. When looking at a graph that represents an abstraction of a single heap state (e.g., in an interactive debugger) it is very useful to be able to zoom down from the level of individual regions to examine the individual objects that make up a region. One approach for this is to simply expand a node in the abstract graph into the concrete object graph it represents. However, for large structures (e.g., a list with 2000 entries) this can produce an intractably large graph. An alternative is to mark individual objects as interesting and then implement the abstraction function such that these objects are always represented as distinct nodes (i.e., never merged). Then, as the user drills down into a data structure, similar to what is done in existing debuggers, we can recompute the abstraction for the data structure that is being explored, marking the appropriate nodes as interesting so they can be individually inspected.
6 Implementation and Evaluation
To evaluate the utility of our abstraction, we examine (1) the cost of computing abstract heaps from realistically sized heaps in real programs, (2) the feasibility of visualizing the abstract graphs, and (3) whether the abstract graphs produced are precise enough for understanding the program's behavior and to identify and correct various defects.
We implemented the algorithms⁴ for computing and manipulating abstract heap graphs in C#. In order to visualize the resulting graphs we use the DGML [11] graph format and the associated viewer in Visual Studio 2010. This graph format and viewer support conditional style macros to control changes between the levels of abstraction described in this paper, and to perform selective highlighting of nodes/edges with given properties. For example, we can highlight edges that represent non-injective pointers, or we can apply a heat-color map to the nodes based on the amount of memory the objects they represent are using.

⁴ Code available online at http://heapdbg.codeplex.com and a web accessible demo is available at http://rise4fun.com/HeapDbg.
In order to evaluate the utility of the abstraction in the inspection and understanding of heap related problems (and in their solutions) we implemented a memory profiler tool. This profiler rewrites a given .Net assembly with sampling code to monitor memory use and to compute heap snapshots, along with the associated abstractions, at points where memory use is at a high point in the execution profile. The rewriter is based on the Common Compiler Infrastructure (CCI) [4] framework. As performing full heap abstractions at each method call would be impractical, we use a per-method randomized approach with an exponential backoff based on the total reachable heap size (as reported by the GC). If we detect that the program may have entered a new phase of computation (the reachable heap size grows or shrinks by a factor of 1.5× from the previous threshold), then we begin actively taking and abstracting heap snapshots. A snapshot of the heap is the portion reachable from the parameters of a single method call and from static roots. Depending on the size of the snapshot relative to previously seen heaps, we either save the snapshot as likely capturing some interesting heap state or discard it and increase the random backoff for the method that produced it. This use of random backoff sampling based on GC reported memory use and snapshot size results in a program that outputs between 2 and 10 snapshots from a program execution, and execution is around 20× to 100× slower than the uninstrumented program. We compared the results obtained by sampling uniformly at random and found that, in addition to having a much larger overhead, the uniform sampling approach produced results that were no more useful for memory debugging than the backoff sampling approach.
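A sketch of the phase-change trigger follows (the 1.5× factor is from the text; GC.GetTotalMemory is the actual .Net API, while the surrounding names are hypothetical):

```csharp
using System;

sealed class SnapshotTrigger
{
    long threshold = GC.GetTotalMemory(false); // GC-reported reachable bytes

    // Called from the injected per-method sampling stub: snapshots are only
    // considered when the live heap grows or shrinks by 1.5x the threshold.
    public void OnSample(Action takeAndAbstractSnapshot)
    {
        long live = GC.GetTotalMemory(false);
        if (live * 2 > threshold * 3 || live * 3 < threshold * 2)
        {
            threshold = live;
            takeAndAbstractSnapshot(); // hypothetical profiler entry point
        }
    }
}
```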
In order to help the developer quickly identify structures of interest we implemented a number of simple post-processing operations on the abstract graphs which allow the DGML viewer to flag nodes (regions) of the heap that display common types of poor memory utilization [23]. The properties we identify are percentage of memory used, small object identification, sparse or small containers, and over-factored classes. The memory percentage property uses a heat map, coloring any nodes that contain more than 5%, 15%, or 25% of the heap respectively. The small object property highlights any nodes where the object overheads (assumed to be 4 bytes per object) are more than half the actual data stored in the objects. The poor collection utilization property highlights nodes that represent regions which are containers and are either all very small (contain 3 or fewer elements) or more than half empty (over half the entries are null). While the first three properties are fairly standard, the final property, over-factored classes, is a less well known issue. We consider a structure over-factored if (1) there exists a node n that consists of small objects and (2) n has a single incoming edge that is injective (i.e., each object represented by the node n is uniquely owned by another object). These two features appear commonly when the objects represented by the node n could be merged with the objects that have the unique pointers to them (i.e., the class definitions can be merged) or when the objects represented by n could be better implemented as value types (i.e., structs in C#). The Face[] and Point objects in the raytracer study, subsection 6.1, are an example of this.
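As a sketch, the over-factored test reduces to a simple predicate over the abstract graph (AbsNode, IsSmallObjectNode, and InEdges are hypothetical helpers; the small-object test is the 4-byte-header metric described above):

```csharp
using System.Linq;

static class OverFactoredFlag
{
    // A node is flagged over-factored when its objects are small and it has
    // exactly one incoming edge, which is injective (unique ownership).
    public static bool IsOverFactored(AbsNode n) =>
        IsSmallObjectNode(n) &&
        InEdges(n).Count() == 1 &&
        InEdges(n).Single().IsInjective;
}
```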
From the viewpoint of a userspace tool, the handling of types provided by the base class or system libraries, e.g., the Base Class Library (BCL) for .Net or the java.* packages in Java, is an important consideration. For userspace applications the internal structure of, say, FileStream or StringBuilder is not interesting. We identify these objects by simply examining the namespace of the type and treat them as single opaque objects. However, some classes in these libraries have features that are relevant to userspace code even though the details of the internal representation are not of particular interest. Examples of these types are List<T> or Dictionary<K,V>, which we treat as ideal algebraic data structures, showing the links to the contained elements but still treating the internal implementations as opaque.

For this paper we converted raytracer from SPEC JVM98 [31] and six programs from the DaCapo suite [3] to .Net bytecode using the ikvm compiler [13]⁵. As the DaCapo suite contains a number of large and complex programs, we also timed the extraction, comparison, and merge operations on each heap snapshot that was taken.

⁵ Unfortunately, ikvm is not able to process the remaining DaCapo benchmarks.
6.1 Raytracer: Extended Case Study
In this section we study the raytracer program from SPEC JVM98. The program implements a raytracer which renders a user provided scene. Using this example, we illustrate how the heap visualization looks for a well known program, and how the information can be used in a debugging type scenario to investigate memory use.

Running this program in the heap profiler, we obtain as one of the snapshots an abstract heap from the entry of the shade method. This abstract heap represents 168K objects (a total of 4MB of memory). Applying the heap graph abstraction followed by the dominator reduction produces the visualization shown in Figure 3. This figure shows the entire heap structure for the render while preserving most structural features of interest. In this heap we see the root nodes this, tree, and eyeRay representing the argument variables to the method. The this variable refers to a Scene object. This object has a field octree that represents a space decomposition tree structure which is also referred to by the tree argument variable. The larger nodes with the chevron are dominator reduced nodes that represent multiple dominated regions and can be expanded to inspect the internal structure in more detail.
The raytracer octree space decomposition structure is represented by the dominator reduced node labeled #20. It is directly noticeable that there are pointers from this data structure to ObjNode objects, represented by node #7. The shape tree{nextLink} of node #7 indicates that this is a list (a tree with out-degree 1). The list in turn contains shapes (SphereObj, TriangleObj, ...) that are in the associated quadrants of the space decomposition structure. This list is used to enumerate all the shapes that appear in a given quadrant. There are also references from objects in the space decomposition tree structure to the dominator reduced node #19, which contains more information on the composite structure of Face objects.
Memory Use. Memory usage is an important concern for many applications. There are many reasons why an application may use more memory than is really required. Common problems in object-oriented, garbage collected languages are leaks [14], where unused objects are still reachable, and bloat [23], where encapsulation and layering have added excessive overhead. Ideally a programmer would like to see what types are using the most memory and where these objects are being used. Our visualization uses the conditional styling support in the DGML renderer to color nodes based on the percentage of total used live memory. Enabling this coloring results in the dominator reduced node representing the Face structures (node #19) being colored.

Fig. 3. Debugger snapshot of Shade method in the Scene class.
Fig. 4. Expanded face dominator node.
Node #19 represents a large amount of memory: 107K objects representing nearly half of the total live heap. By expanding the node #19 we get the graph (Figure 4) representing the internal structure of the dominator reduced node. This reveals node $48, abstracting a region of 18K Face objects, node $23, abstracting a region of 18K Point[] arrays, and node $49, abstracting a region of 72K Point objects. The raytracer program is known to have poor memory health [23], in the sense that it exhibits a high rate of object overhead associated with a large number of very small objects. The Point objects here are a major factor in that.

At first glance it may not be clear how to reduce the overhead of these Point objects. However, turning on the over-factored highlighting or inspecting the injectivity information in Figure 4 provides additional guidance. The edge from node $23 to node $49, representing all the pointers stored in the arrays, is shown as a normal edge, not shaded and wide. Therefore, the set of pointers abstracted by the edge is injective and each index of each array points to a unique Point object. Given this likely ownership relation and the fact that all of the arrays are of length 4, it seems that flattening the Face data structure would reduce memory use substantially (i.e., this satisfies our conditions for being an over-factored structure).

By studying the source code for the Face class we can see that these ownership and length properties do in fact hold universally. Thus, we can flatten each Point[4] and its associated Point objects into a float[12]. This transformation eliminates one object header per Point object (at 4 bytes each) and the 4 pointers stored in the Point[4] (at 4 bytes per pointer). Given that we have 72K Point objects and 18K Point[] arrays, this change works out to 0.6MB of savings, or 18% of the total live heap. Using similar reasoning we could further flatten the float[12] arrays into the Face implementations for another 0.22MB of savings, or another 6% of the live heap. These two refactorings then represent a 24% reduction in the total live heap size.
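For illustration, the first flattening might look like the following sketch (class and field names are our own, not the benchmark's source):

```csharp
// Before: each Face owns a Point[4] whose entries each own a Point,
// costing one object header per Point plus four pointers per array.
sealed class Point { public float X, Y, Z; }
sealed class FaceBefore { public Point[] Verts = new Point[4]; }

// After: the four Points are inlined into one float[12]; index 3*v + c is
// coordinate c of vertex v. This removes 4 headers and 4 pointers per Face,
// roughly 0.6MB over the 18K Face / 72K Point instances measured above.
sealed class FaceAfter
{
    public float[] Coords = new float[12];
    public float Coord(int vertex, int c) => Coords[3 * vertex + c];
}
```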
This case study shows how the multi-level abstraction allows the developer to navigate around the heap at the desired level of detail and zoom in and out of specific areas of interest, all while maintaining the larger context. This ability to put the problem in context and interactively explore the heap is critical to aiding the developer in quickly identifying the source of a problem, understanding the larger context, and thus being confident in formulating a remedy.
6.2 Evaluation With Profiler
A number of papers have identified and explored memory use issues in the DaCapo benchmark suite. Hence, we decided to evaluate the effectiveness of the abstraction techniques described in this paper by using our profiling tool to analyze several programs from the DaCapo suite for memory utilization issues.

After running the profiler we inspected the output abstract graphs to find nodes (regions) that contained potential problems and then determined what (if anything) could be done to resolve the issues, or whether the memory use appeared appropriate. This was done via manual inspection of the graph, the use of the heap inspection and highlighting tools in the profiler, and inspection of the associated source code. In all cases at most 7 nodes were colored by the profiler tools, and the total time to inspect the graph, identify the relevant structures, inspect the associated source code, and determine if the memory use was generally appropriate was always less than 10 minutes. Also, as we had not previously worked with the code, sometimes we needed to spend additional time to understand more about the intent of the classes and their structure in order to fully determine if and how the code could be successfully refactored. This was particularly important when multiple classes/subclasses were used to build recursive data structures. However, this inspection never required more than an additional 15 to 20 minutes.
Antlr. For the Antlr benchmark, the tool reports one of the larger heaps being reachable from a method in the JavaCodeGenerator class. Inspecting this heap with our visualization, with the memory use heat map turned on, we were able to quickly identify one dominator node as containing around 72% of the reachable memory. This region was dominated by a set of RuleSymbols, each of which stores information representing various aspects of the parser. Further inspection did not reveal any obvious memory use problems or obvious areas where data structures could be refactored to substantially improve memory utilization. These findings match those of previous studies of the benchmark, which is not known to have any reported memory leaks and is reported to have good utilization of memory (in particular, [23] reports a good health score).
Chart. For the Chart benchmark our tool reports the largest heaps being reachable from a method in the JFreeChart class. Our highlighting tools indicate that a region of potential interest, Figure 5, is dominated by a set of XYSeries objects. Expanding this dominator node shows that the memory is being used by a large number of XYDataItem objects and the Double objects they own (similar to the case in the raytracer case study). By hovering over these objects we saw that they consume about 3MB of heap space. The actual data contained in these objects (in particular the Double objects) is small compared to the object overhead, and there is an ownership relation between each of the XYDataItem objects and the Double objects. This indicates that we could inline these structures to save space. An inspection of the XYDataItem class shows that it declares the x/y fields as Number types to allow for some level of polymorphism, so we need to subclass our new flattened classes to allow for storing both integer and floating point x, y pairs. This refactoring results in a savings of around 1MB, which is around 25% of the total live memory at this point in the program. To the best of our knowledge this memory issue has not been reported in previous work.
Fig. 5. Chart Memory Use
FOP. For the fop benchmark the tool reports the largest heap being reachable from a method in the Page class. The highlighted region consists of a large number of objects that contain various parts of the document to render, for example the WordArea and TableArea objects. After some inspection of the source code we concluded that the data structure was not particularly amenable to refactoring. As reported in [14], we note that the data structure is needed later in the computation and thus is not a leak.
PMD. For the pmd program, our tool reports one of the larger heaps as occurring in the JavaParser class. The section highlighted by the memory utilization coloring uses over 10MB of memory and consists of a data structure which is a tree via the children field and container offsets, along with a parent back pointer on the parent field. This data structure represents the AST of the program that is being analyzed. Hovering over the node reports that it represents more than 50 types (with names like ASTExpression and ASTPrimitiveType) that all inherit from the SimpleNode class. On inspection we see that this base class has many data fields (line numbers, the children array, the parent field, etc.), to which the subclasses add additional AST specific information. Given this structure we did not see any obviously poor memory use or memory leaks. This appears to contradict [23], which reports a high rate of object header overhead in this benchmark. However, in this case the overhead is actually encoding important structural information about the AST that is being processed. This demonstrates how the visualization can be used to augment the information provided by existing analysis tools.
6.3 Computational Costs
Table 1 contains information on the sizes of the largest abstract representations produced during the runs of the profiler and the cost of extracting and comparing these abstract heap graphs. The first column lists the benchmark and the second column the number of objects in the largest concrete heap snapshot that was encountered. The following columns are the size of the largest abstract heap graph produced for any heap snapshot (AbsNode), and the size of the corresponding dominator reduced representation from section 5 (Reduced). Some of these sizes seem to be at (or beyond) the upper end of what can be conveniently visualized. However, our experience in subsection 6.1 shows that the combination of the conditional graph styles, the ability to zoom between levels of detail, and the navigational tools provided by the DGML viewer made inspecting and understanding the relevant parts of the graphs quite easy.
Bench Objects AbsNode Reduced AbsTime EqTime MergeTime
raytracer 168K 48 21 1.37s 0.04s 0.11s
antlr 12K 606 201 0.41s 0.03s 0.11s
chart 189K 198 110 3.22s 0.09s 0.21s
fop 120K 531 150 2.67s 0.11s 0.41s
luindex 2K 87 36 0.50s 0.01s 0.02s
pmd 178K 146 28 4.11s 0.09s 0.15s
xalan 40K 451 127 2.42s 0.07s 0.17s
Table 1. Max graph sizes and timings.
The next issue we wanted to evaluate was the computational cost of performing the abstraction, comparison, and merge operations. The AbsTime, EqTime, and MergeTime columns show the maximum time taken to abstract a concrete heap during the profiler run and to merge/compare it with previously taken snapshots.

The current abstraction implementation creates a complete shadow copy of the concrete heap during abstraction. Despite this large constant time overhead, the cost of computing the abstractions is quite tractable. The running time scales very closely to the asymptotic complexity of O(E log(N)). The current implementation computes the abstraction inside the process that is instrumented, so it was not possible to precisely measure the exact memory overhead of the abstraction operations. However, using the difference in the total memory consumed by the process as reported by the system monitor indicates a factor of 40× increase in memory use (never exceeding 800MB).
7 Related Work
Developing debugger support for the program heap is an ongoing active research area. The work in [33] outlines many of the basic issues that arise when attempting to visualize concrete program heaps, and [27] presents some abstractions to help alleviate some of these issues. There is a large body of work on techniques to improve the efficiency and effectiveness of debugging [32,17,18,29,12,26]. Work in [1] takes the same general approach as this work but focuses on the interactive aspects of visualizing the heap, in particular allowing the developer to inspect individual objects in a larger structure.
Work by Mitchell et al. [22,21] has a number of similarities to the work in this paper. Both approaches use a set of grouping heuristics to partition structures in the heap and then extract information about the partitions, but the partitioning strategy and information extracted differ substantially. Our work uses recursive structures and predecessor ownership to identify equivalence classes of objects/data while [22,21] focus on dominator relations between objects. We note that this results in the same asymptotic cost as the work in this paper. Given this difference of grouping heuristics there is also a natural difference in what type of information is extracted. In particular, the abstraction in this paper is designed to aid programmer understanding of the structure and connectivity of various heap structures, and so it explicitly extracts information on shape, edge injectivity, pointer nullity, and container sizes, in addition to information on the sizes of various data structures. While some of these properties can, in some cases, be reconstructed using fanout and object count information, the majority of the information computed in [22,21] focuses on the specific task of identifying memory inefficiencies in large Java programs.
There is a substantial amount of work on the development of heap models for use in static program analysis [2,10,16,30]. Whereas program analysis is concerned with computability and obtaining enough precision at reasonable cost, the main challenge in abstracting runtime heaps is to obtain very small models that can be visualized, while retaining many useful properties of the original heap. We believe though that insights in static heap analysis can inform the abstractions of runtime heaps and vice versa. For example, it would be interesting to provide programmers with more control over the abstractions produced via instrumentation predicates [2,30]. The approach in [16] uses a less descriptive model than the one presented in this paper; for example, it does not consider information such as injectivity or shape. Work in [28,15] uses a related idea of taking a concrete heap from a C/C++ or Java program and inferring the types [28] or basic shapes [15] of heap structures.
8 Conclusion
This paper introduces a new runtime technique for program understanding, analysis, and debugging. The abstraction of heap graphs presented attempts to construct a very small representation of the runtime heap in order to allow effective visualization and navigation, while retaining crucial high-level properties of the abstracted heap, such as edge relations and the shape of various subgraphs. The construction of the abstraction ensures that the abstract graph is a safe representation of the concrete heap, allowing the programmer (or other tools) to confidently reason about the state of the memory by looking at the abstract representation. Our benchmarks and case studies demonstrate that abstract heap graphs can be efficiently computed, contain interesting information on the heap structure, and provide valuable information for identifying and correcting memory use related defects. Given the utility of the abstraction in this task we believe there are a number of other applications, including thread races, refactoring for parallelism, or interactive debugging, where this type of abstraction and understanding would be useful.
Acknowledgments
We would like to thank Peli de Halleux for setting up the online demo code at RiSE4fun,
Chris Lovett for his help with DGML, and Todd Mytkowicz for his insight into some of
the code in the DaCapo suite.
References
1. E. Aftandilian, S. Kelley, C. Gramazio, N. Ricci, S. Su, and S. Guyer. Heapviz: interactive heap visualization for program understanding and debugging. In SOFTVIS, 2010.
2. J. Berdine, C. Calcagno, B. Cook, D. Distefano, P. O'Hearn, T. Wies, and H. Yang. Shape analysis for composite data structures. In CAV, 2007.
3. S. Blackburn, R. Garner, C. Hoffman, A. Khan, K. McKinley, R. Bentzur, A. Diwan, D. Feinberg, D. Frampton, S. Guyer, M. Hirzel, A. Hosking, M. Jump, H. Lee, J. Moss, A. Phansalkar, D. Stefanović, T. VanDrunen, D. von Dincklage, and B. Wiedermann. The DaCapo benchmarks: Java benchmarking development and analysis (2006-mr2). In OOPSLA, 2006.
4. Common Compiler Infrastructure. http://ccimetadata.codeplex.com.
5. D. Chase, M. Wegman, and K. Zadeck. Analysis of pointers and structures. In PLDI, 1990.
6. D. Clarke, J. Potter, and J. Noble. Ownership types for flexible alias protection. In OOPSLA, 1998.
7. P. Cousot and R. Cousot. Systematic design of program analysis frameworks. In POPL, 1979.
8. R. DeLine and K. Rowan. Code canvas: zooming towards better development environments. In ICSE, 2010.
9. A. Deutsch. Interprocedural may-alias analysis for pointers: Beyond k-limiting. In PLDI, 1994.
10. R. Ghiya and L. Hendren. Is it a tree, a DAG, or a cyclic graph? A shape analysis for heap-directed pointers in C. In POPL, 1996.
11. DGML Specification. http://schemas.microsoft.com/vs/2009/dgml.
12. T. Hill, J. Noble, and J. Potter. Scalable visualizations of object-oriented systems with ownership trees. Journal of Visual Languages and Computing, 2002.
13. ikvm. http://www.ikvm.net/.
14. M. Jump and K. McKinley. Cork: dynamic memory leak detection for garbage-collected languages. In POPL, 2007.
15. M. Jump and K. McKinley. Dynamic shape analysis via degree metrics. In ISMM, 2009.
16. C. Lattner, A. Lenharth, and V. S. Adve. Making context-sensitive points-to analysis with heap cloning practical for the real world. In PLDI, 2007.
17. B. Liblit, M. Naik, A. Zheng, A. Aiken, and M. Jordan. Scalable statistical bug isolation. In PLDI, 2005.
18. C. Liu, X. Yan, L. Fei, J. Han, and S. Midkiff. Sober: statistical model-based bug localization. SIGSOFT, 30(5), 2005.
19. R. Manevich, E. Yahav, G. Ramalingam, and M. Sagiv. Predicate abstraction and canonical abstraction for singly-linked lists. In VMCAI, 2005.
20. M. Marron, M. Méndez-Lojo, M. Hermenegildo, D. Stefanovic, and D. Kapur. Sharing analysis of arrays, collections, and recursive structures. In PASTE, 2008.
21. N. Mitchell. The runtime structure of object ownership. In ECOOP, 2006.
22. N. Mitchell, E. Schonberg, and G. Sevitsky. Making sense of large heaps. In ECOOP, 2009.
23. N. Mitchell and G. Sevitsky. The causes of bloat, the limits of health. In OOPSLA, 2007.
24. S. Muchnick. Advanced Compiler Design and Implementation. Morgan Kaufmann, 1997.
25. F. Nielson, H. Nielson, and C. Hankin. Principles of Program Analysis. Springer-Verlag New York, Inc., 1999.
26. W. D. Pauw and G. Sevitsky. Visualizing reference patterns for solving memory leaks in Java. In ECOOP, 1999.
27. S. Pheng and C. Verbrugge. Dynamic data structure analysis for Java programs. In ICPC, 2006.
28. M. Polishchuk, B. Liblit, and C. Schulze. Dynamic heap type inference for program understanding and debugging. In POPL, 2007.
29. A. Potanin, J. Noble, and R. Biddle. Snapshot query-based debugging. In ASWEC, 2004.
30. S. Sagiv, T. Reps, and R. Wilhelm. Parametric shape analysis via 3-valued logic. In POPL, 1999.
31. Standard Performance Evaluation Corporation. JVM98 Version 1.04, August 1998. http://www.spec.org/jvm98.
32. A. Zeller. Isolating cause-effect chains from computer programs. In FSE, 2002.
33. T. Zimmermann and A. Zeller. Visualizing memory graphs. In Software Visualization, 2001.