-
ABSTRACT: We present the design and implementation of the Virtual Microscope, a software system employing a client/server architecture to provide a realistic emulation of a high-power light microscope. The system provides a form of completely digital telepathology, allowing simultaneous access to archived digital slide images by multiple clients. The main problem the system targets is storing and processing the extremely large quantities of data required to represent a collection of slides. The Virtual Microscope client software runs on the end user's PC or workstation, while database software for storing, retrieving and processing the microscope image data runs on a parallel computer or on a set of workstations at one or more potentially remote sites. We have designed and implemented two versions of the data server software. One implementation is a customization of a database system framework that is optimized for a tightly coupled parallel machine with attached local disks. The second implementation is component-based, and has been designed to accommodate access to and processing of data in a distributed, heterogeneous environment. We have also developed caching client software, implemented in Java, to achieve good response time and portability across different computer platforms. The performance results presented show that the Virtual Microscope system scales well, so that many clients can be adequately serviced by an appropriately configured data server.
IEEE Transactions on Information Technology in Biomedicine 01/2004; 7(4):230-48. · 1.68 Impact Factor
-
IEEE Transactions on Information Technology in Biomedicine. 01/2003; 7:230-248.
-
ABSTRACT: We present the design and implementation of the Virtual Microscope, a software system employing a client/server architecture to provide a realistic emulation of a high-power light microscope. The system provides a form of completely digital telepathology, allowing simultaneous access to archived digital slide images by multiple clients. The main problem the system targets is storing and processing the extremely large quantities of data required to represent a collection of slides. The Virtual Microscope client software runs on the end user's PC or workstation, while database software for storing, retrieving and processing the microscope image data runs on a parallel computer or on a set of workstations at one or more potentially remote sites. We have designed and implemented two versions of the data server software. One implementation is a customization of a database system framework that is optimized for a tightly coupled parallel machine with attached local disks. The second implementation is component-based, and has been designed to accommodate access to and processing of data in a distributed, heterogeneous environment. We have also developed caching client software, implemented in Java, to achieve good response time and portability across different computer platforms. The performance results presented show that the Virtual Microscope system scales well, so that many clients can be adequately serviced by an appropriately configured data server.
11/2002;
-
ABSTRACT: In this paper we are concerned with the efficient use of a collection of disk-based storage systems and computing platforms in a heterogeneous setting for retrieving and processing large scientific datasets. We demonstrate, in the context of a data-intensive visualization application, how heterogeneity affects performance and show a set of optimization techniques that can be used to improve performance in a component-based framework. In particular, we examine the application of parallelism via transparent copies of application components in the pipelined processing of data.
04/2002;
-
ABSTRACT: Applications that use collections of very large, distributed datasets have become an increasingly important part of science and engineering. With high-performance wide-area networks becoming more pervasive, there is interest in making collective use of distributed computational and data resources. Recent work has converged on the notion of the Grid, which attempts to uniformly present a heterogeneous collection of distributed resources. Current Grid research covers many areas, from low-level infrastructure issues to high-level application concerns. However, providing support for efficient exploration and processing of very large scientific datasets stored in distributed archival storage systems remains a challenging research issue.
04/2002;
-
Parallel Computing. 01/2002; 28:827-859.
-
ABSTRACT: Processing of data in many data analysis applications can be represented as an acyclic, coarse-grain data flow from data sources to the client. This paper is concerned with the scheduling of multiple data analysis operations, each of which is represented as a pipelined chain of processing on data. We define the scheduling problem for effectively placing components onto Grid resources, and propose two scheduling algorithms. Experimental results are presented using a visualization application.
Supercomputing, ACM/IEEE 2002 Conference; 01/2002
-
16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 15-19 April 2002, Fort Lauderdale, FL, USA, CD-ROM/Abstracts Proceedings; 01/2002
-
ABSTRACT: Large client-server data-intensive applications can place high demands on system and network resources. This is especially true when the connection between the client and server spans a wide-area internet link. In this paper, we consider changing the typical client-server architecture of a class of data-intensive applications. We show that, given sufficient common interest among multiple clients, our enhancements reduce the response time per client and reduce the amount of data sent across the wide-area link. In addition, we also see a reduction in server utilization, which helps to improve server scalability as more clients are added to the system.
03/2001;
-
ABSTRACT: Recent research on programming models for developing applications on the Grid has proposed component-based models as a viable approach, in which an application is composed of multiple interacting computational objects. We have been developing a framework, called filter-stream programming, for building data-intensive applications that query, analyze and manipulate very large datasets in a distributed environment. In this model, the processing structure of an application is represented as a set of processing units, referred to as filters. In this paper, we formulate the problem of scheduling instances of a filter group. A filter group is a set of filters collectively performing a computation for an application. In particular, we seek the answer to the following question: should a new instance be created, or an existing one reused? We experimentally investigate the effects on performance of instantiating multiple filter groups under varying application characteristics.
Future Generation Computer Systems. 03/2001;
-
Parallel Computing. 01/2001; 27:1457-1478.
-
3rd Annual International Workshop on Active Middleware Services (AMS 2001), 6 August 2001, San Francisco, CA, USA; 01/2001
-
ABSTRACT: Large client-server data-intensive applications can place high demands on system and network resources. This is especially true when the connection between the client and server spans a wide-area internet link. In this paper, we describe our experience changing the typical client-server architecture for a class of data-intensive applications. We show that, given sufficient common interest among multiple clients, our enhancements reduce the response time per client, the response-time variation and the amount of data sent across the wide-area link. In addition, we also see a reduction in server utilization, which helps to improve server scalability. Keywords: Wide-area, Proxy, Caching, Client-Server, Scalability. 1 INTRODUCTION. When designed for a client-server environment, image processing and image browsing applications can place high demands on the underlying system and network resources. For example, the Microsoft TerraServer archive of high resolution satellite imagery [18] currentl...
01/2000;
-
ABSTRACT: Large client-server data-intensive applications can place high demands on system and network resources. This is especially true when the connection between the client and server spans a wide-area internet link. In this paper, we describe our experience changing the typical client-server architecture for a class of data-intensive applications. We show that, given sufficient common interest among multiple clients, our enhancements reduce the response time per client, the response-time variation and the amount of data sent across the wide-area link. In addition, we also see a reduction in server utilization, which helps to improve server scalability.
01/2000;
-
Parallel Processing Letters. 01/1999; 9:173-195.
-
01/1999
-
ABSTRACT: We describe a framework, called DataCutter, that is designed to provide support for subsetting and processing of datasets in a distributed and heterogeneous environment. We illustrate the use of DataCutter with several data-intensive applications from diverse fields, and present experimental results.
Parallel Computing.
-
-