Dynamic Class Loading in the Java Virtual Machine

Sheng Liang, Gilad Bracha
Sun Microsystems Inc.
901 San Antonio Road, CUP02-302
Palo Alto, CA 94303

Abstract

Class loaders are a powerful mechanism for dynamically
loading software components on the Java platform. They
are unusual in supporting all of the following features:
laziness, type-safe linkage, user-defined extensibility, and multiple
communicating namespaces.
We present the notion of class loaders and demonstrate
some of their interesting uses. In addition, we discuss how to
maintain type safety in the presence of user-defined dynamic
class loading.

1 Introduction

In this paper, we investigate an important feature of the
Java virtual machine: dynamic class loading. This is the
underlying mechanism that provides much of the power of
the Java platform: the ability to install software components
at runtime. An example of a component is an applet that is
downloaded into a web browser.
While many other systems [16] [13] also support some
form of dynamic loading and linking, the Java platform is
the only system we know of that incorporates all of the
following features:
1. Lazy loading. Classes are loaded on demand. Class
loading is delayed as long as possible, reducing memory
usage and improving system response time.
2. Type-safe linkage. Dynamic class loading must not
violate the type safety of the Java virtual machine.
Dynamic loading must not require additional run-time
checks in order to guarantee type safety. Additional
link-time checks are acceptable, because these checks
are performed only once.
3. User-definable class loading policy. Class loaders are first-class
objects. Programmers have complete control of
dynamic class loading. A user-defined class loader can,
for example, specify the remote location from which
the classes are loaded, or assign appropriate security
attributes to classes loaded from a particular source.
4. Multiple namespaces. Class loaders provide separate
namespaces for different software components. For
example, the HotJava™ browser loads applets from
different sources into separate class loaders. These
applets may contain classes of the same name, but the
classes are treated as distinct types by the Java virtual
machine.

To appear in the 13th Annual ACM SIGPLAN Conference
on Object-Oriented Programming Systems, Languages,
and Applications (OOPSLA’98), Vancouver, BC, Canada,
October 1998.

In contrast, existing dynamic linking mechanisms do
not support all of these features. Although most operating
systems support some form of dynamically linked libraries, such
mechanisms are targeted toward C/C++ code, and are not
type-safe. Dynamic languages such as Lisp [13], Smalltalk
[6], and Self [21] achieve type safety through additional
run-time checks, not link-time checks.
The main contribution of this paper is to provide the
first in-depth description of class loaders, a novel concept
introduced by the Java platform. Class loaders existed in
the first version of the Java Development Kit (JDK 1.0). The
original purpose was to enable applet class loading in the
HotJava browser. Since that time, the use of class loaders
has been extended to handle a wider range of software
components such as server-side components (servlets) [11],
extensions [10] to the Java platform, and JavaBeans [8]
components. Despite the increasingly important role of class
loaders, the underlying mechanism has not been adequately
described in the literature.
A further contribution of this paper is to present a
solution to the long-standing type safety problem [20] with
class loaders. Early versions (1.0 and 1.1) of the JDK
contained a serious flaw in class loader implementation.
Improperly written class loaders could defeat the type safety
guarantee of the Java virtual machine. Note that the type
safety problem did not pose any immediate security risks,
because untrusted code (such as a downloaded applet) was
not allowed to create class loaders. Nonetheless, application
programmers who had the need to write custom class loaders
could compromise type safety inadvertently. Although the
issue had been known for some time, it remained an open
problem in the research community whether a satisfactory
solution existed. For example, earlier discussions centered
around whether the lack of type safety was a fundamental
limitation of user-definable class loaders, and whether we
would have to limit the power of class loaders, give up
lazy class loading, or introduce additional dynamic type-checking
at runtime. The solution we present in this paper,
which has been implemented in JDK 1.2, solves the type
safety problem while preserving all of the other desirable
features of class loaders.
We assume the reader has basic knowledge of the Java
programming language [7]. The remainder of this paper
is organized as follows: We first give a more detailed
introduction to class loaders. Applications of class loaders
are discussed in section 3. Section 4 describes the type safety
problems that may arise due to the use of class loaders, and
their solutions. Section 5 relates our work to other research.
Finally, we present our conclusions in section 6.
The purpose of class loaders is to support dynamic loading
of software components on the Java platform. The unit of
software distribution is a class. Classes are distributed using
a machine-independent, standard, binary representation
known as the class file format [15]. The representation of an
individual class is referred to as a class file. Class files are
produced by Java compilers, and can be loaded into any Java
virtual machine. A class file does not have to be stored in an
actual file; it could be stored in a memory buffer, or obtained
from a network stream.
The Java virtual machine executes the byte code stored
in class files. Byte code sequences, however, are only part
of what the virtual machine needs to execute a program. A
class file also contains symbolic references to fields, methods,
and names of other classes. Consider, for example, a class
declared as follows:
The class file representing contains a symbolic reference
to class
. Symbolic references are resolved at link time to
actual class types. Class types are reified first-class objects in
the Java virtual machine. A class type is represented in user
code as an object of class
. In order to resolve a
symbolic reference to a class, the Java virtual machine must
load the class file and create the class type.
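The listing that accompanied this paragraph is not reproduced here. As a minimal sketch of the same idea, the following uses hypothetical names (`ClassObjectDemo`, `Gadget`, `Holder`) that are assumptions for illustration and not taken from the paper. The class file of `Holder` carries a symbolic reference to `Gadget`, and the resulting class type is observable in user code as an object of class `java.lang.Class`.

```java
class ClassObjectDemo {
    /** Hypothetical class that Holder refers to symbolically. */
    static class Gadget {}

    /** Holder's class file contains a symbolic reference to Gadget. */
    static class Holder {
        Object make() {
            return new Gadget();   // the reference is resolved by the virtual machine on use
        }
    }

    /** Returns the name of the class type of the object that Holder creates. */
    static String createdTypeName() {
        Object o = new Holder().make();
        Class<?> type = o.getClass();   // the class type as a first-class object
        return type.getName();
    }

    public static void main(String[] args) {
        System.out.println(createdTypeName());
        System.out.println(new Holder().make().getClass() == Gadget.class);
    }
}
```

The second line printed by `main` compares two `Class` references with `==`, which works because a class type is represented by a single `Class` object per defining loader.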
The Java virtual machine uses class loaders to load class
files and create class objects. Class loaders are ordinary
objects that can be defined in Java code. They are instances
of subclasses of the class , shown in Figure 1.
We have omitted the methods that are not directly relevant
to this presentation. The
method
takes a class name as argument, and returns a
object
that is the run-time representation of a class type. The
methods
, and
will be described later.

(Throughout this paper, we use the term class generically to denote both
classes and interfaces.)

Figure 1: The class

Figure 2: Class loaders in a web browser
(figure labels: applet class loaders for Applet 1 through Applet n; application class loader for the browser code; system class loader for the system classes, e.g., java.lang.String)

In the above example, assume that
is loaded by the
class loader . is referred to as ’s defining loader. The Java
virtual machine will use
to load classes referenced by .
Before the virtual machine allocates an object of class
, it
must resolve the reference to If has not yet been loaded,
the virtual machine will invoke the
method of ’s
class loader,
to load
Once has been loaded, the virtual machine can resolve
the reference and create an object of class .
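A small sketch of how the defining loader can be observed from user code, assuming only the standard `Class.getClassLoader()` method. The names `DefiningLoaderDemo`, `Referrer`, and `Referenced` are hypothetical. In this sketch the same loader happens to define both classes; with delegation, the referenced class could instead be defined by a different loader than the one that initiated its loading.

```java
class DefiningLoaderDemo {
    /** Hypothetical class referenced from Referrer. */
    static class Referenced {}

    /** Creating a Referenced object forces the reference to be resolved. */
    static class Referrer {
        Referenced create() {
            return new Referenced();
        }
    }

    /** True if both classes report the same defining loader. */
    static boolean sameDefiningLoader() {
        ClassLoader referrerLoader = Referrer.class.getClassLoader();
        ClassLoader referencedLoader = new Referrer().create().getClass().getClassLoader();
        return referrerLoader == referencedLoader;
    }

    public static void main(String[] args) {
        System.out.println("defining loader of Referrer: " + Referrer.class.getClassLoader());
        System.out.println("same defining loader: " + sameDefiningLoader());
    }
}
```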
A Java application may use several different kinds of class
loaders to manage various software components. For example,
Figure 2 shows how a web browser written in Java may
use class loaders.
This example illustrates the use of two types of class
loaders: user-defined class loaders and the system class
loader supplied by the Java virtual machine. User-defined
class loaders can be used to create classes that originate from
user-defined sources. For example, the browser application
creates class loaders for downloaded applets. We use a
separate class loader for the web browser application itself.
All system classes (such as
) are loaded into
the system class loader. The system class loader is supported
directly by the Java virtual machine.
(We use the notation to refer to an instance method defined in
class
, although this is not legal syntax in the Java programming language.)
The arrows in the figure indicate the delegation relation-
ship between class loaders. A class loader
can ask another
loader
to load a class on its behalf. In such a case,
delegates to . For example, applet and application class
loaders delegate all system classes to the system class loader.
As a result, all system classes are shared among the applets
and the application. This is desirable because type safety
would be violated if, for example, applet and system code
had a different notion of what the type
was.
Delegating class loaders allow us to maintain namespace
separation while still sharing a common set of classes. In
the Java virtual machine, a class type is uniquely determined
by the combination of the class name and class loader. Applet
and application class loaders delegate to the system class
loader. This guarantees that all system class types, such
as , are unique. On the other hand, a class
named
loaded in applet 1 is considered a different type
from a class named
in applet 2. Although these two
classes have the same name, they are defined by different
class loaders. In fact, these two classes can be completely
unrelated. For example, they may have different methods or
fields.
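The namespace separation described above can be sketched as follows. This is an illustration under stated assumptions, not code from the paper: `NamespaceDemo`, `Widget`, and `IsolatingLoader` are hypothetical names, the sketch assumes Java 9 or later (for `InputStream.readAllBytes`), and it assumes the compiled class file of `Widget` is reachable as a resource through the parent loader. Each `IsolatingLoader` defines `Widget` itself and delegates every other name, so two loaders yield two distinct class types with the same name.

```java
import java.io.IOException;
import java.io.InputStream;

class NamespaceDemo {
    /** Hypothetical class that each applet-like loader defines for itself. */
    public static class Widget {}

    /** Defines the target class itself; delegates every other name to its parent. */
    static class IsolatingLoader extends ClassLoader {
        private final String target;

        IsolatingLoader(String target, ClassLoader parent) {
            super(parent);
            this.target = target;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve)
                throws ClassNotFoundException {
            if (!name.equals(target)) {
                return super.loadClass(name, resolve);   // shared classes, e.g. java.lang.Object
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    byte[] bytes = readClassFile(name);
                    c = defineClass(name, bytes, 0, bytes.length);
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }

        /** Reads the class file as a resource of the parent loader (an assumption of this sketch). */
        private byte[] readClassFile(String name) throws ClassNotFoundException {
            String path = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(path)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                return in.readAllBytes();
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Loads Widget through two separate loaders and reports whether the types differ. */
    static boolean sameNameDistinctTypes() {
        try {
            ClassLoader parent = NamespaceDemo.class.getClassLoader();
            String name = Widget.class.getName();
            Class<?> a = new IsolatingLoader(name, parent).loadClass(name);
            Class<?> b = new IsolatingLoader(name, parent).loadClass(name);
            return a.getName().equals(b.getName()) && a != b && a != Widget.class;
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("same name, distinct types: " + sameNameDistinctTypes());
    }
}
```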
Classes from one applet cannot interfere with classes in
another, because applets are loaded in separate class load-
ers. This is crucial in guaranteeing Java platform security.
Likewise, because the browser resides in a separate class
loader, applets cannot access the classes used to implement
the browser. Applets are only allowed to access the standard
Java API exposed in the system classes.
The Java virtual machine starts up by creating the application
class loader and using it to load the initial browser
class. Application execution starts in the public class method
of the initial class. The invocation of this
method drives all further execution. Execution of instructions
may cause loading of additional classes. In this
application, the browser also creates additional class loaders
for downloaded applets.
The garbage collector unloads applet classes that are no
longer referenced. Each class object contains a reference to
its defining loader; each class loader refers to all the classes it
defines. This means that, from the garbage collector’s point
of view, classes are strongly connected with their defining
loader. Classes are unloaded when their defining loader is
garbage-collected.
We now walk through the implementation of a simple class
loader. As noted earlier, all user-defined class loader classes
are subclasses of
. Subclasses of can
override the definition of , thus providing a user-defined
loading policy. Here is a class loader that looks up
classes in a given directory:
The public constructor simply records
the directory name. In the definition of
, we use
the method to check whether the class has
already been loaded. (Section 4.1 will give a more precise
description of the method.) If
returns , the class has not yet been loaded. We then delegate
to the system class loader by calling
If
the class we are trying to load is not a system class, we call
a helper method
to read in the class file.
After we have read in the class file, we pass it to the
method. The method constructs the
run-time representation of the class from the class file. Note
that the
method synchronizes on the class loader
object so that multiple threads may not load the same class
at the same time.
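The original listing is missing from this text. The following is a sketch that follows the steps just described, not a reproduction of the paper's code. The class name `DirectoryClassLoader` and the helper `readClassFile` are hypothetical; the calls `findLoadedClass`, `findSystemClass`, and `defineClass` are existing protected methods of `java.lang.ClassLoader`.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class DirectoryClassLoader extends ClassLoader {
    private final String directory;

    public DirectoryClassLoader(String directory) {
        this.directory = directory;   // the constructor only records where to look
    }

    @Override
    public synchronized Class<?> loadClass(String name) throws ClassNotFoundException {
        // Step 1: has this loader already loaded the class?
        Class<?> c = findLoadedClass(name);
        if (c != null) {
            return c;
        }
        // Step 2: delegate to the system class loader first.
        try {
            return findSystemClass(name);
        } catch (ClassNotFoundException e) {
            // not a system class; fall through and look in the directory
        }
        // Step 3: read the class file and construct the run-time class from it.
        try {
            byte[] data = readClassFile(name);
            return defineClass(name, data, 0, data.length);
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }

    /** Hypothetical helper: maps a class name to a file under the directory. */
    private byte[] readClassFile(String name) throws IOException {
        Path file = Paths.get(directory, name.replace('.', '/') + ".class");
        return Files.readAllBytes(file);
    }

    /** Loading a system class through this loader yields the shared system type. */
    static boolean delegatesSystemClasses() {
        try {
            return new DirectoryClassLoader("no-such-directory")
                    .loadClass("java.lang.String") == String.class;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    /** A class found neither by the system loader nor in the directory is reported missing. */
    static boolean reportsMissingClass() {
        try {
            new DirectoryClassLoader("no-such-directory").loadClass("NoSuchClass12345");
            return false;
        } catch (ClassNotFoundException e) {
            return true;
        }
    }
}
```

In current JDKs the usual practice is to override `findClass` and leave the delegation in `loadClass` untouched; the sketch overrides `loadClass` only because that is the structure the text describes.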
When one class loader delegates to another class loader, the
class loader that initiates the loading is not necessarily the
same loader that completes the loading and defines the class.
Consider the following code segment:
Instances of the class delegate the loading
of to the system loader. Consequently,
is defined by the system loader, even though
loading was initiated by
.
Definition 2.1 Let
be the result of . is the
defining loader of or equivalently, defines
Definition 2.2 Let be the result of . is an
initiating loader of
or equivalently, initiates loading of

Figure 3: Class redirects to a new version of
class
(figure labels: Server, old Service, new Service)

In the Java virtual machine, every class
is permanently
associated with its defining loader. It is
’s defining loader
that initiates the loading of any class referenced by
.
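The distinction between initiating and defining loaders can be observed with a short sketch. The class name `InitiatingLoaderDemo` is hypothetical; the loader is an anonymous subclass of `java.lang.ClassLoader` that adds no behavior and therefore delegates everything to its parent.

```java
class InitiatingLoaderDemo {
    /** True if the loader that initiated loading is not the one that defined the class. */
    static boolean initiatorDiffersFromDefiner() {
        ClassLoader parent = InitiatingLoaderDemo.class.getClassLoader();
        ClassLoader initiating = new ClassLoader(parent) {};   // delegates every request
        try {
            Class<?> c = initiating.loadClass("java.lang.String");
            ClassLoader defining = c.getClassLoader();   // the loader that defined the class
            return c == String.class && defining != initiating;
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("initiator differs from definer: " + initiatorDiffersFromDefiner());
    }
}
```

Here the anonymous loader is an initiating loader of `java.lang.String`, while the class was defined further up the delegation chain.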
In this section, we give a few examples that demonstrate the
power of class loaders.
It is often desirable to upgrade software components in a
long-running application such as a server. The upgrade must
not require the application to shut down and restart.
On the Java platform, this ability translates to reloading
a subset of the classes already loaded in a running virtual
machine. It corresponds to the schema evolution [3] problem,
which could be rather difficult to solve in general. Here are
some of the difficulties:
There may be live objects that are instances of a class
we want to reload. These objects must be migrated to
conform to the schema of the new class. For example,
if the new version of the class contains a different set
of instance fields, we must somehow map the existing
set of instance field values to fields in the new version
of the class.
Similarly, we may have to map the static field values
to a different set of static fields in the reloaded version
of the class.
The application may be executing a method that belongs
to a class we want to reload.
We do not address these problems in this paper. Instead,
we show how it is sometimes possible to bypass them using
class loaders. By organizing software components in separate
class loaders, programmers can often avoid dealing with
schema evolution. Instead, new classes are loaded by a
separate loader.
Figure 3 illustrates how a
class can dynamically
redirect the service requests to a new version of the
class. The key technique is to load the server class, old service
class, and new service class into separate class loaders. For
example, we can define
using the class
introduced in the last section.
The method redirects all incoming
requests to a
object stored in a private field. It uses
the Java Core Reflection API [9] to invoke the method
on the
object. In addition, the
method allows a new version of the class to be
dynamically loaded, replacing the existing object.
Callers of
supply the location of the new
class files. Further requests will be redirected to the new
object referenced by
.
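The redirection through reflection can be sketched as follows. This is a simplified illustration, not the paper's listing: the names `ReflectiveServer`, `ServiceV1`, `ServiceV2`, and the method name `run` are assumptions, and the step that loads the new service class through a fresh class loader is left out so that the reflective dispatch stands on its own.

```java
import java.lang.reflect.Method;

class ReflectiveServer {
    /** Held as Object so that this class has no symbolic reference to any service class. */
    private Object service;

    ReflectiveServer(Object initialService) {
        this.service = initialService;
    }

    /** Replaces the service object; later requests go to the new one. */
    void updateService(Object newService) {
        this.service = newService;
    }

    /** Forwards the request by looking up a public method named "run" reflectively. */
    String processRequest(String request) {
        try {
            Method run = service.getClass().getMethod("run", String.class);
            return (String) run.invoke(service, request);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("service has no usable run(String)", e);
        }
    }

    /** Two hypothetical, unrelated service versions. */
    public static class ServiceV1 {
        public String run(String request) {
            return "v1:" + request;
        }
    }

    public static class ServiceV2 {
        public String run(String request) {
            return "v2:" + request;
        }
    }

    public static void main(String[] args) {
        ReflectiveServer server = new ReflectiveServer(new ServiceV1());
        System.out.println(server.processRequest("ping"));
        server.updateService(new ServiceV2());
        System.out.println(server.processRequest("ping"));
    }
}
```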
To make reloading possible, the
class must not
directly refer to the
class:
Once the class resolves the symbolic reference to
a class, it will contain a hard link to that class type.
An already-resolved reference cannot be changed. The type
conversion in the last line of the method
will fail for new versions of returned from the class
loader.
Reflection allows the
class to use the class
without a direct reference. Alternatively,
and
classes can share a common interface or superclass:
Dispatching through an interface is typically more efficient
than reflection. The interface type itself must not be
reloaded, because the
class can refer to only one
type. The method must return a
class that implements the same
every time.
After we call the method, all future requests
will be processed by the new
class. The old
class, however, may not have finished processing some of
the earlier requests. Thus two
classes may coexist
for a while, until all uses of the old class are complete, all
references to the old class are dropped, and the old class is
unloaded.
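The interface-based variant can be sketched under a few assumptions: the names `ReloadDemo`, `Service`, `ServiceImpl`, `ReloadingLoader`, and `Server` are hypothetical, Java 9 or later is assumed, and the class file of `ServiceImpl` is assumed to be reachable as a resource through the parent loader. For simplicity each "new version" is defined from the same class file; what the sketch shows is that every update produces a new class type behind a shared, never-reloaded interface.

```java
import java.io.IOException;
import java.io.InputStream;

class ReloadDemo {
    /** Shared interface: always loaded by the parent loader, never reloaded. */
    public interface Service {
        String run();
    }

    /** Hypothetical implementation that each fresh loader defines again. */
    public static class ServiceImpl implements Service {
        public ServiceImpl() {}
        public String run() {
            return "request handled";
        }
    }

    static final String IMPL = ServiceImpl.class.getName();

    /** Defines ServiceImpl itself; everything else, including Service, is delegated. */
    static class ReloadingLoader extends ClassLoader {
        ReloadingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve)
                throws ClassNotFoundException {
            if (!name.equals(IMPL)) {
                return super.loadClass(name, resolve);
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    String path = name.replace('.', '/') + ".class";
                    try (InputStream in = getParent().getResourceAsStream(path)) {
                        if (in == null) {
                            throw new ClassNotFoundException(name);
                        }
                        byte[] bytes = in.readAllBytes();
                        c = defineClass(name, bytes, 0, bytes.length);
                    } catch (IOException e) {
                        throw new ClassNotFoundException(name, e);
                    }
                }
                return c;
            }
        }
    }

    static class Server {
        private Service service;

        Server() {
            updateService();
        }

        /** Loads the implementation through a fresh loader and swaps it in. */
        void updateService() {
            try {
                ClassLoader loader = new ReloadingLoader(ReloadDemo.class.getClassLoader());
                Object fresh = loader.loadClass(IMPL).getDeclaredConstructor().newInstance();
                service = (Service) fresh;   // succeeds because Service is the shared type
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }

        String processRequest() {
            return service.run();   // ordinary interface dispatch, no reflection
        }

        Class<?> serviceClass() {
            return service.getClass();
        }
    }
}
```

Once nothing refers to an older implementation object, its class, or its loader, all three become eligible for garbage collection together.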
A class loader can instrument the class file before making the
call. For example, in the example,
we can insert a call to change the contents of the class file:
An instrumented class file must be valid according to
the Java virtual machine specification [15]. The virtual
machine will apply all the usual checks (such as running
the byte code verifier) to the instrumented class file. As
long as the class file format is obeyed, the programmer has
a great deal of freedom in modifying the class file. For
example, the instrumented class file may contain new byte
code instructions in existing methods, new fields, or new
methods. It is also possible to delete existing methods, but
the resulting class file might not link with other classes.
The instrumented class file must define a class of the
same name as the original class file. The
method
should return a class object whose name matches the name
passed in as the argument. (Section 4.1 explains how this
rule is enforced by the virtual machine.)
A class loader can only instrument the classes it defines,
not the classes delegated to other loaders. All user-defined
class loaders should first delegate to the system class loader;
thus system classes cannot be instrumented through class
loaders. User-defined class loaders cannot bypass this restriction
by trying to define system classes themselves. If,
for example, a class loader defines its own
class, it
cannot pass an object of that class to a Java API that expects
a standard
object. The virtual machine will catch and
report these type errors (see section 4 for details).
Class file instrumentation is useful in many circumstances.
For example, an instrumented class file may contain
profiling hooks that count how many times a certain method
is executed. Resource allocation may be monitored and
controlled by substituting references to certain classes with
references to resource-conscious versions of those classes
[19]. A class loader may be used to implement parameterized
classes, expanding and tailoring the code in a class file
for each distinct invocation of a parametric type [1].
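A toy illustration of instrumenting a class file before it is defined. All names here (`InstrumentDemo`, `Greeter`, `InstrumentingLoader`, `instrument`) are hypothetical, Java 9 or later is assumed, and the class file of `Greeter` is assumed to be reachable as a resource through the parent loader. The transformation is deliberately naive: it overwrites one string constant with another of equal length, which keeps the class file well formed. Realistic instrumentation would parse the class file rather than pattern-match bytes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.function.Supplier;

class InstrumentDemo {
    /** Hypothetical class whose class file is rewritten before it is defined. */
    public static class Greeter implements Supplier<String> {
        public Greeter() {}
        public String get() {
            return "greeting-v1";
        }
    }

    static class InstrumentingLoader extends ClassLoader {
        private final String target;

        InstrumentingLoader(String target, ClassLoader parent) {
            super(parent);
            this.target = target;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve)
                throws ClassNotFoundException {
            if (!name.equals(target)) {
                return super.loadClass(name, resolve);   // delegated classes are not touched
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    byte[] bytes = instrument(readClassFile(name));
                    c = defineClass(name, bytes, 0, bytes.length);   // usual checks still apply
                }
                return c;
            }
        }

        /** Toy transformation: replace one string constant by another of equal length. */
        static byte[] instrument(byte[] classFile) {
            byte[] from = "greeting-v1".getBytes(StandardCharsets.US_ASCII);
            byte[] to = "greeting-v2".getBytes(StandardCharsets.US_ASCII);
            byte[] out = classFile.clone();
            for (int i = 0; i + from.length <= out.length; i++) {
                if (matches(out, i, from)) {
                    System.arraycopy(to, 0, out, i, to.length);
                }
            }
            return out;
        }

        private static boolean matches(byte[] data, int offset, byte[] pattern) {
            for (int j = 0; j < pattern.length; j++) {
                if (data[offset + j] != pattern[j]) {
                    return false;
                }
            }
            return true;
        }

        private byte[] readClassFile(String name) throws ClassNotFoundException {
            String path = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(path)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                return in.readAllBytes();
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Loads the instrumented Greeter and returns what its get() method now produces. */
    static String instrumentedGreeting() {
        try {
            String name = Greeter.class.getName();
            ClassLoader loader =
                    new InstrumentingLoader(name, InstrumentDemo.class.getClassLoader());
            Object greeter = loader.loadClass(name).getDeclaredConstructor().newInstance();
            return (String) ((Supplier<?>) greeter).get();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The instrumented class keeps the name of the original, as the text requires, and the uninstrumented `Greeter` defined by the parent loader is unaffected.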
The examples presented so far have demonstrated the usefulness
of multiple delegating class loaders. As we will
see, however, ensuring type-safe linkage in the presence of
class loaders requires special care. The Java programming
language relies on name-based static typing. At compile
time, each static class type corresponds to a class name. At
runtime, class loaders introduce multiple namespaces. A
run-time class type is determined not by its name alone, but
by a pair: its class name and its defining class loader. Hence,
namespaces introduced by user-defined class loaders may
be inconsistent with the namespace managed by the Java
compiler, jeopardizing type safety.
The method may return different class types for a
given name at different times. To maintain type safety, the
virtual machine must be able to consistently obtain the same
class type for a given class name and loader. Consider, for
example, the two references to class
in the following code:
If ’s class loader were to map the two occurrences of
into different class types, the type safety of the method call
to inside would be compromised.
The virtual machine cannot trust any user-defined
method to consistently return the same type for a given
name. Instead, it internally maintains a loaded class cache. The
loaded class cache maps class names and initiating loaders
to class types. After the virtual machine obtains a class from
the method, it performs the following operations:
The real name of the class is checked against the name
passed to the
method. An error is raised
if
returns a class that does not have the
requested name.
If the name matches, the resulting class is cached in the
loaded class cache. The virtual machine never invokes
the
method with the same name on the same
class loader more than once.
The method introduced in
section 2 performs a lookup in the loaded class cache.
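The two operations on the loaded class cache can be modeled in a few lines. This is a simulation in plain Java under stated assumptions, not virtual machine code: `LoadedClassCacheSketch`, `ClassType`, and `resolve` are hypothetical names, loaders are represented by string identifiers, a user-defined loading method is represented by a `Function`, and `LinkageError` merely stands in for whatever error a virtual machine would raise.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class LoadedClassCacheSketch {
    /** Stand-in for a run-time class type; it only records the name the class really has. */
    static final class ClassType {
        final String realName;
        ClassType(String realName) {
            this.realName = realName;
        }
    }

    /** Maps "initiating loader/class name" to the class type obtained. */
    private final Map<String, ClassType> cache = new HashMap<>();
    int loaderInvocations;

    ClassType resolve(String loaderId, String name, Function<String, ClassType> loadClass) {
        String key = loaderId + "/" + name;
        ClassType cached = cache.get(key);
        if (cached != null) {
            return cached;   // the loader is never asked twice for the same name
        }
        loaderInvocations++;
        ClassType result = loadClass.apply(name);
        if (!result.realName.equals(name)) {
            // the real name is checked against the requested name
            throw new LinkageError("loader returned " + result.realName + " for " + name);
        }
        cache.put(key, result);
        return result;
    }

    /** A loader that answers with a class of a different name is rejected. */
    static boolean rejectsMisnamedClass() {
        LoadedClassCacheSketch vm = new LoadedClassCacheSketch();
        try {
            vm.resolve("L1", "C", n -> new ClassType("D"));
            return false;
        } catch (LinkageError expected) {
            return true;
        }
    }
}
```

Even a loader that would return a fresh class type on every call is consulted only once per name, so the same name and initiating loader always map to the same class type.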
We now describe the type safety problems that can arise with
delegating class loaders. The problem has been known for
some time. The first published account was given by Vijay
Saraswat [20].
Notation 4.1 We will represent a class type using the notation
, where denotes the name of the class, denotes the
class’s defining loader, and denotes the loader that initiated
class loading. When we do not care about the defining loader, we
use a simplified notation
to denote that is the initiating
loader of
. When we do not care about the initiating loader, we
use the simplified notation to denote that is defined by
.
Note that if
delegates to , then = .
We will now give an example that demonstrates the type
safety problem. In order to make clear which class loaders
are involved, we use the above notation where class names
would ordinarily appear.
is defined by . As a result, is used to initiate
the loading of the classes
and referenced
inside
defines However, delegates the
loading of to , which then defines
Because is defined by , will use
to initiate the loading of As it happens, defines a
different type expects an instance of
to be returned by However, actually
returns an instance of
, which is a completely
different class.
This is an inconsistency between the namespaces of
and . If this inconsistency goes undetected, it allows one
type to be forged as another type using delegating class
loaders. To see how this type safety problem can lead to
undesirable behaviors, suppose the two versions of
are defined as follows:
Class is now able to reveal a private field of an
instance of
and forge a pointer from an integer
value:
We can access the private field in a
instance because the field is declared to be public in
. We are also able to forge an integer field
in the
instance as an integer array, and deref-
erence a pointer that is forged from the integer.
The underlying cause of the type-safety problem was the
virtual machine’s failure to take into account that a class type
is determined by both the class name and the defining loader.
Instead, the virtual machine relied on the Java programming
language notion of using class names alone as types during
type checking. The problem has since been corrected, as
described below.
A straightforward solution to the type-safety problem is to
uniformly use both the class’s name and its defining loader
to represent a class type in the Java virtual machine. The
only way to determine the defining loader, however, is to
actually load the class through the initiating loader. In the
example in the previous section, before we can determine
whether
’s call to is type-safe, we must first
load
in both and , and see whether we obtain
the same defining loader. The shortcoming of this approach
is that it sacrifices lazy class loading.
Our solution preserves the type safety of the straightforward
approach, but avoids eager class loading. The key idea
is to maintain a set of loader constraints that are dynamically
updated as class loading takes place. In the above example,
instead of loading
in and , we simply record
a constraint that . If is later
loaded by
or , we will need to verify that the existing
set of loader constraints will not be violated.
What if the constraint
is introduced after
is loaded by both and ? It is
too late to impose the constraint and undo previous class
loading.
We must therefore take both the loaded class cache and
loader constraint set into account at the same time. We need
to maintain the invariant: Each entry in the loaded class cache
satisfies all the loader constraints. The invariant is maintained
as follows:
Every time a new entry is about to be added to the
loaded class cache, we verify that none of the existing
loader constraints will be violated. If the new entry
cannot be added to the loaded class cache without
violating one of the existing loader constraints, class
loading fails.
Every time a new loader constraint is added, we
verify that all loaded classes in the cache satisfy the
new constraint. If a new loader constraint cannot
be satisfied by all loaded classes, the operation that
triggered the addition of the new loader constraint
fails.
Let us see how these checks can be applied to the previous
example. The first line of the
method causes the virtual
machine to generate the constraint
.
If and have already loaded the class when we
generate this constraint, an exception will immediately be
raised in the program. Otherwise, the constraint will be successfully
recorded. Assuming
loads
first, an exception will be raised when tries to load
later on.
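The invariant and its two checks can be modeled as a small simulation. This is a sketch under stated assumptions, not the virtual machine's implementation: `LoaderConstraintSketch` and its methods are hypothetical names, loaders and class types are represented by strings such as "L1" and "Shared@L1", and constraints are checked pairwise only, without propagating them transitively across three or more loaders.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LoaderConstraintSketch {
    /** Loaded class cache: "loader/name" -> identifier of the class type obtained. */
    private final Map<String, String> loaded = new HashMap<>();
    /** Constraints {name, loaderA, loaderB}: name must denote one type in both loaders. */
    private final List<String[]> constraints = new ArrayList<>();

    private static String key(String loader, String name) {
        return loader + "/" + name;
    }

    /** Adds a constraint after checking it against everything already loaded. */
    void addConstraint(String name, String loaderA, String loaderB) {
        String typeA = loaded.get(key(loaderA, name));
        String typeB = loaded.get(key(loaderB, name));
        if (typeA != null && typeB != null && !typeA.equals(typeB)) {
            throw new LinkageError("constraint on " + name + " contradicts loaded classes");
        }
        constraints.add(new String[] {name, loaderA, loaderB});
    }

    /** Adds a cache entry after checking it against every existing constraint. */
    void recordLoaded(String loader, String name, String type) {
        for (String[] c : constraints) {
            if (!c[0].equals(name)) {
                continue;
            }
            String other = c[1].equals(loader) ? c[2] : c[2].equals(loader) ? c[1] : null;
            String otherType = (other == null) ? null : loaded.get(key(other, name));
            if (otherType != null && !otherType.equals(type)) {
                throw new LinkageError(name + " would violate a loader constraint");
            }
        }
        loaded.put(key(loader, name), type);
    }

    /** Constraint first, then two conflicting loads: the second load must fail. */
    static boolean lateLoadRejected() {
        LoaderConstraintSketch vm = new LoaderConstraintSketch();
        vm.addConstraint("Shared", "L1", "L2");
        vm.recordLoaded("L1", "Shared", "Shared@L1");
        try {
            vm.recordLoaded("L2", "Shared", "Shared@L2");
            return false;
        } catch (LinkageError expected) {
            return true;
        }
    }

    /** Two conflicting loads first: adding the constraint afterwards must fail. */
    static boolean lateConstraintRejected() {
        LoaderConstraintSketch vm = new LoaderConstraintSketch();
        vm.recordLoaded("L1", "Shared", "Shared@L1");
        vm.recordLoaded("L2", "Shared", "Shared@L2");
        try {
            vm.addConstraint("Shared", "L1", "L2");
            return false;
        } catch (LinkageError expected) {
            return true;
        }
    }
}
```

Neither check forces a class to be loaded: a constraint is only recorded, and it is compared against the cache when one side or the other is eventually loaded, which is how lazy loading is kept.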
We now state the rules for generating constraints. These
correspond to situations when one class type may be referred
to by another class. When two such classes are defined in
different loaders, there are opportunities for inconsistencies
across namespaces.
If references a field:
fieldname
declared in class , then we generate the constraint:
If references a method:
methodname
declared in class , then we generate the constraints:
If overrides a method:
methodname
declared in class , then we generate the constraints:
The constraint set indicates
that
must be loaded as the same class type in and
, and in and . Even if, during the execution of the
program, is never loaded by , distinct versions of
could not be loaded by and .
If the loader constraints are violated, a
exception will be thrown. Loader constraints are removed
from the constraint set when the corresponding class loader
is garbage-collected.
Saraswat [20] has suggested another approach to maintaining
type safety in the presence of delegating class loaders. That
proposal differs from ours in that it suggests that method
overriding should also be based upon dynamic types rather
than static (name-based) types. Saraswat’s idea is appealing,
in that it uses the dynamic concept of type uniformly from
link time onwards.
The following code illustrates the differences between his
model and ours:
Assume that and define different versions of
Saraswat considers the methods in and
to have different type signatures: takes an argument
of type whereas takes an argument of
type
. As a consequence, is not considered
to override
in this model.
In our model, if is loaded by a linkage error
results at the point where
is called. The behavior in
Saraswat’s model is very similar: a
results.
The difference in approach becomes apparent when
is loaded by In our model, when is loaded by
the call to would invoke A linkage error would be
raised when code2 attempted to access any fields or methods
of In Saraswat’s model the call to executes
(that is, does not override ).
We believe it is better to fail in this case than to silently
run code that was not meant to be executed. A programmer’s
expectation when writing the classes and above
is that
does override in accordance with
the semantics of the Java programming language. These
expectations are violated in Saraswat’s proposal.
Saraswat also suggests a modification to the class loader
API that would allow the virtual machine to determine
the run-time type of a symbolic reference without actually
loading it. This is necessary in order to implement his
proposal without the penalty of excessive class loading. We
believe it would be worth exploring this idea independently
of the other aspects of Saraswat’s proposal.
Other proposals have also focused on changing the pro-
tocol of the
class, or subdividing its functionality
among several classes. Such changes typically reduce the
expressive power of class loaders.
Class loaders can be thought of as a reflective hook into the
system’s loading mechanism. Reflective systems in other
object-oriented languages [6, 14] have provided users the
opportunity to modify various aspects of system behavior.
One could use such mechanisms to provide user-extensible
class loading; however, we are not aware of any such
experiments.
Some Lisp dialects [17] and some functional languages
[2] have a notion of first-class environments, which support
multiple namespaces similar to those discussed in this paper.
Dean [5] [4] has discussed the problem of type safety in
class loaders from a theoretical perspective. He suggests a
deep link between class loading and dynamic scoping.
Jensen et al. [12] recently proposed a formalization of
dynamic class loading in the Java virtual machine. Among
other findings, the formal approach confirmed the type
safety problem with class loaders.
Roskind [18] has put in place link-time checks to ensure
class loader type safety in Netscape’s Java virtual machine
implementation. The checks he implemented are more eager
and strict than ours.
The Oberon/F system [16] (now renamed Component
Pascal) allows dynamic loading and type-safe linkage of
modules. However, the dynamic loading mechanism is not
under user control, nor does it provide multiple namespaces.
Dynamically linked libraries have been supported by
many operating systems. These mechanisms typically do
not provide type-safe linkage.
We have presented the notion of class loaders in the Java
platform. Class loaders combine four desirable features:
lazy loading, type-safe linkage, multiple namespaces, and
user extensibility. Type safety, in particular, requires special
attention. We have shown how to preserve type safety
without restricting the power of class loaders.
Class loaders are a simple yet powerful mechanism that
has proven to be extremely valuable in managing software
components.
The authors wish to thank Drew Dean, Jim Roskind, and
Vijay Saraswat for focusing our attention on the type safety
problem, and for many valuable exchanges.
We owe a debt to David Connelly, Li Gong, Benjamin
Renaud, Roland Schemers, Bill Shannon, and many of our
other colleagues at Sun Java Software for countless discussions
on security and class loaders. Arthur van Hoff first
conceived of class loaders.
Bill Maddox, Marianne Mueller, Nicholas Sterling, David
Stoutamire, and the anonymous reviewers for OOPSLA’98
suggested numerous improvements to this paper.
Finally, we thank James Gosling for creating the Java
programming language.
[1] Ole Ages e n, Stephen N. Freund, and John C. Mitchell.
Adding type parameterization to the Java language.
In Proc. of the ACM Conf. on Object-Oriented Program-
ming, Systems, Languages and Applications, pages 49–65,
October 19 97.
[2] Andrew W. Appe l and David B. MacQueen. Standard
ML of New Jersey. In J. Maluszy ´nski and M. Wirsing,
editors, Programming Language Implementation and Logic
Programming, pages 1–13. Springer-Verlag, August 1991.
Lecture Notes in Computer Science 528.
[3] Gilles Barbedette. Schema modifications in the LISP
persistent object-oriented language. In European
Conference on Object-Oriented P rogramming, pages 77–96,
July 1991.
[4] Drew Dean, 1997. Private communication.
[5] Drew Dean. The security of static typing with dynamic
linking. In Fourth ACM Conference on Computer and
Communications Security, pages 18–27, April 1997.
[6] A. Goldberg and D. Robson. Smalltalk-80: the Language
and Its Implementation. Addison-Wesley, 1983.
[7] James Gosling, Bill Joy, and Guy Steele. The Java
Language Specification. Addison-Wesley, Reading, Massachusetts,
1996.
[8] JavaSoft, Sun Microsystems, Inc. JavaBeans Components
API for Java, 1997. JDK 1.1 documentation, available at
.
[9] JavaSoft, Sun Microsystems, Inc. Reflection,
1997. JDK 1.1 documentation, available at
.
[10] JavaSoft, Sun Microsystems, Inc. The Java Extensions
Framework, 1998. JDK 1.2 documentation, available at
.
[11] JavaSoft, Sun Microsystems, Inc. Servlet,
1998. JDK 1.2 documentation, available at
.
[12] Thomas Jensen, Daniel Le Metayer, and Tommy Thorn.
Security and dynamic class loading in Java: A formali-
sation. In Proceedings of IEEE International Conference on
Computer Languages, Chicago, Illinois, pages 4–15, May
1998.
[13] Sonya E. Keene. Object-Oriented Programming in Common
Lisp. Addison-Wesley, 1989.
[14] Gregor Kiczales, Jim des Rivieres, and Daniel G. Bobrow.
The Art of the Metaobject Protocol. MIT Press, Cambridge,
Massachusetts, 1991.
[15] Tim Lindholm and Frank Yellin. The Java Virtual Machine
Specification. Addison-Wesley, Reading, Massachusetts,
1996.
[16] Oberon Microsystems, Inc. Component Pas-
cal Language Report, 1997. Available at
.
[17] Jonathan A. Rees, Norman I. Adams, and James R.
Meehan. The T Manual, Fourth Edition. Department of
Computer Science, Yale University, January 1984.
[18] Jim Roskind, 1997. Private communication.
[19] Vijay Saraswat. Matrix design notes.
˜ .
[20] Vijay Saraswat. Java is not type-safe. available at
˜ , 1997.
[21] David Ungar and Randall Smith. SELF: The power of
simplicity. In Proc. of the ACM Conf. on Object-Oriented
Programming, Systems, Languages and Applications, Octo-
ber 1987.