College of William & Mary
Computer Science Department
June 9, 2006
Transparent Optimization of Grid Server Selection With Real-Time Passive
Marcia Zangrilli and Bruce B. Lowekamp
Grid services have tremendously simplified the programming challenges in leveraging large-scale dis-
tributed computing. At the same time, the increased level of abstraction reduces the opportunities available
to the application for optimizing its performance by monitoring the system. In this paper we introduce
a monitoring grid services proxy, which transparently monitors network performance and selects between
several replica service providers. This approach provides optimized server selection without any modifica-
tion to or even awareness of the client application or service providers. We describe how we implement the
proxy and monitor the available bandwidth to the service providers using the Wren monitoring toolkit. We
present analysis indicating that our monitoring has negligible overhead. Finally, we demonstrate the practi-
cality of our approach by optimizing the server selection for INCOGEN’s VIBE, a bioinformatics workflow
application that uploads gene sequences for analysis by remote service providers.
We would like to thank INCOGEN for allowing us to use their VIBE software toolkit. In particular, we
would like to acknowledge the support provided by Jennifer Lowekamp and Dawn Cannan at INCOGEN.
This work was performed, in part, using computational facilities at the College of William and Mary which
were enabled by grants from Sun Microsystems, the National Science Foundation, and Virginia’s Common-
wealth Technology Research Fund.
Transparent Optimization of Grid Server Selection With Real-Time Passive
Marcia Zangrilli and Bruce B. Lowekamp
College of William and Mary
Williamsburg VA, USA
Grid services have tremendously simplified the programming
challenges in leveraging large-scale distributed computing.
At the same time, the increased level of abstraction
reduces the opportunities available to the application for
optimizing its performance by monitoring the system. In this
paper we introduce a monitoring grid services proxy, which
transparently monitors network performance and selects
between several replica service providers. This approach
provides optimized server selection without any modification
to or even awareness of the client application or service
providers. We describe how we implement the proxy and
monitor the available bandwidth to the service providers
using the Wren monitoring toolkit. We present analysis
indicating that our monitoring has negligible overhead. Finally,
we demonstrate the practicality of our approach by optimizing
the server selection for INCOGEN’s VIBE, a bioinformatics
workflow application that uploads gene sequences
for analysis by remote service providers.
One of the challenges in developing grid applications
is the difficulty in creating applications that can function
across both high latency networks and tightly coupled clus-
ters. Grid services have emerged as a means to help the
coordination of grid applications by providing a standard
interface between clients and services. Grid services help
reduce the programming complexities for clients and create
opportunities to help steer the decision of which service is
best for the client to invoke.
In this paper, we explore the use of a grid-service-aware
proxy server to provide transparent optimization services to
grid services applications. The proxy service offers a new
level of abstraction that hides the exact data or service
resource from the client. In such a system, the client is
configured with the location of the proxy as the address for all
services. The client is unaware of the proxy's behavior,
and all optimization decisions are transparent to both the
service provider and the client.
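The selection step described above can be illustrated with a minimal sketch. It is not the proxy implementation from this paper: the `Replica` class, the `select_replica` function, and the example URLs are hypothetical names introduced here, and the sketch assumes that a monitoring component has already recorded an available-bandwidth estimate for each replica provider.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    """One provider of a replicated service (hypothetical type)."""
    url: str
    # Latest available-bandwidth estimate in Mbit/s; None if not yet measured.
    available_bw_mbps: Optional[float] = None


def select_replica(replicas: List[Replica]) -> Replica:
    """Pick the replica with the highest measured available bandwidth.

    Falls back to the first configured replica when no measurement
    exists yet, so the client request can always be forwarded.
    """
    if not replicas:
        raise ValueError("no replicas configured")
    measured = [r for r in replicas if r.available_bw_mbps is not None]
    if not measured:
        return replicas[0]
    return max(measured, key=lambda r: r.available_bw_mbps)


# The client only knows the proxy address; the proxy chooses the target.
providers = [
    Replica("http://provider-a.example/service", 12.5),
    Replica("http://provider-b.example/service", 48.0),
    Replica("http://provider-c.example/service"),  # not yet measured
]
target = select_replica(providers)
```

Because the choice is made inside the proxy, neither the client nor the providers need to change when replicas are added or when measurements are updated.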
While adding proxies to grid services architectures may
remove some control from the client, our goal is to provide
the middleware with sufficient power to make the appropri-
ate performance optimizations without the need to involve
either the user or the client application unless qualitative
decisions–such as choosing between databases managed by
different groups–need to be made.
Network performance monitoring is critical for running
applications in complex grid environments, but obtaining
this information often requires the proxy to actively probe
the network for available bandwidth. The active approach
often produces accurate measurements, but it may cause
competition between application traffic and the measurement
traffic. Most of these active algorithms rely on UDP traffic to probe
the path for available bandwidth, whereas most grid
applications use TCP traffic. If UDP traffic is packet-shaped
differently than TCP traffic, measurements made using UDP
may not reflect the bandwidth available to TCP
applications. For available bandwidth measurements to be
useful to grid services proxies, they must be both accurate
and noninvasive.
Our solution is to incorporate passive Wren measurements
into the proxy's decision of which service to invoke. Wren is
less invasive than active probing because it uses passive traces
of existing application traffic to provide available bandwidth
measurements. Because Wren measures available bandwidth using
the application’s own traffic, the measurements provided by
Wren will reflect the bandwidth available to that application.
Wren will allow the proxy server to choose between dif-
ferent options for executing the same service, e.g., whether
to process data locally on a single server or to spend the
time uploading the dataset to a remote high-performance
cluster that offers the same service. These features will be
used to hide the complex details of service and data location
from the client application, while still exposing options
of different services and datasets to the client and the user.
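The local-versus-remote trade-off mentioned above can be made concrete with a rough cost model. This is a sketch under stated assumptions, not the decision rule used by the proxy in this paper: it assumes completion time is upload time plus compute time, that the compute time of each option is known in advance, and that the available bandwidth stays constant during the upload. All names (`ExecutionOption`, `estimated_seconds`, `choose_option`) and all numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ExecutionOption:
    """One way of executing the same service (hypothetical type)."""
    name: str
    compute_seconds: float
    # Available bandwidth to this provider in bits/s; None means the data
    # is already local to the provider and nothing has to be uploaded.
    upload_bw_bps: Optional[float] = None


def estimated_seconds(option: ExecutionOption, dataset_bytes: int) -> float:
    """Estimate completion time as upload time plus compute time."""
    if option.upload_bw_bps is None:
        return option.compute_seconds
    if option.upload_bw_bps <= 0:
        return float("inf")  # unreachable provider is never preferred
    return dataset_bytes * 8 / option.upload_bw_bps + option.compute_seconds


def choose_option(options: List[ExecutionOption], dataset_bytes: int) -> ExecutionOption:
    """Return the option with the smallest estimated completion time."""
    return min(options, key=lambda o: estimated_seconds(o, dataset_bytes))


options = [
    ExecutionOption("local-server", compute_seconds=600.0),
    ExecutionOption("remote-cluster", compute_seconds=60.0, upload_bw_bps=10e6),
]
small = choose_option(options, dataset_bytes=100_000_000)    # 80 s upload
large = choose_option(options, dataset_bytes=1_000_000_000)  # 800 s upload
```

With these illustrative numbers the remote cluster is preferred for the 100 MB dataset (about 140 s against 600 s), while the 1 GB dataset is processed locally because the upload alone would take about 800 s.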