-
ABSTRACT: In this paper, we introduce a new data layout scheme that efficiently handles out-of-core axis-aligned slicing queries of very large multidimensional volumetric data. Slicing is a very useful dimension reduction tool that removes or reduces occlusion problems in visualizing 3D/4D volumetric data sets and that enables fast visual exploration of such data sets. We show that the data layouts based on typical space-filling curves are not optimal for out-of-core slicing queries and present a novel component-based data layout scheme for a specialized problem domain, in which it is only required to provide fast slicing at every k-th value, for any k > 1. Our component-based data layout scheme provides much faster processing for any axis-aligned slicing direction at every k-th value, k > 1, while requiring less cache memory and no replication of data. In addition, the data layout can be generalized to higher dimensions.
Scientific and Statistical Database Management, 2007. SSDBM '07. 19th International Conference on; 08/2007
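To see why the choice of layout matters for axis-aligned slicing, the sketch below uses a conventional row-major (x-fastest) layout, not the component-based layout of the paper, whose details the abstract does not give. The function names `linear_index` and `slice_offsets` are hypothetical and introduced only for illustration. It lists the linear offsets touched by a slice along each axis: a z-slice is one contiguous run, while an x-slice touches one element per row.

```python
def linear_index(x, y, z, nx, ny):
    """Offset of voxel (x, y, z) in a row-major (x-fastest) layout."""
    return x + nx * (y + ny * z)


def slice_offsets(axis, k, nx, ny, nz):
    """Sorted offsets of all voxels whose coordinate along `axis` equals k.

    axis: 0 for x, 1 for y, 2 for z (an assumption of this sketch).
    """
    ranges = [range(nx), range(ny), range(nz)]
    ranges[axis] = [k]  # pin the sliced coordinate
    return sorted(
        linear_index(x, y, z, nx, ny)
        for z in ranges[2]
        for y in ranges[1]
        for x in ranges[0]
    )
```

On a 4×4×4 volume, `slice_offsets(2, 1, 4, 4, 4)` is a single contiguous block of 16 offsets, whereas `slice_offsets(0, 1, 4, 4, 4)` is 16 offsets spaced 4 apart, which is the access pattern that makes out-of-core slicing expensive under a naive layout.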
-
ABSTRACT: We discuss a new efficient out-of-core multidimensional indexing structure, the information-aware 2^n-tree, for indexing very large multidimensional volumetric data. Building a series of (n-1)-dimensional indexing structures on n-dimensional data causes a scalability problem when resolution keeps growing in every dimension. However, building a single n-dimensional indexing structure can cause an indexing effectiveness problem compared to the former case. The information-aware 2^n-tree is an effort to maximize the indexing structure efficiency by ensuring that the subdivision of space has as similar coherence as possible along each dimension. It is particularly useful when the data distribution along each dimension consistently shows a different degree of coherence from that of every other dimension. Our preliminary results show that our new tree can achieve higher indexing structure efficiency than previous methods.
Scientific and Statistical Database Management, 2007. SSDBM '07. 19th International Conference on; 08/2007
-
ABSTRACT: We consider the problem of isosurface extraction and rendering for large scale time varying data. Such datasets have been appearing at an increasing rate especially from physics-based simulations, and can range in size from hundreds of gigabytes to tens of terabytes. We develop a new simple indexing scheme, which makes use of the concepts of the interval tree and the span space data structures. The new scheme enables isosurface extraction and rendering in I/O optimal time, using more compact indexing structure and more effective bulk data movement than the previous schemes. Moreover, our indexing scheme can be easily extended to a multiprocessor environment in which each processor has access to its own local disk. The resulting parallel algorithm is provably efficient and scalable. That is, it achieves load balancing across the processors independent of the isovalue, with almost no overhead in the total amount of work relative to the sequential algorithm. We conduct a large number of experimental tests on the University of Maryland Visualization Cluster using the Richtmyer-Meshkov instability dataset, and obtain results that consistently validate the efficiency and the scalability of our algorithm.
Parallel and Distributed Processing Symposium, 2006. IPDPS 2006. 20th International; 05/2006
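The indexing problem behind isosurface extraction can be stated compactly: a cell contributes to the isosurface for a given isovalue only if the isovalue lies between the cell's minimum and maximum scalar values, which is the point-in-interval query that interval trees and span space structures answer. The sketch below is a brute-force linear scan, not the indexing scheme of the paper; `active_cells` and the `(lo, hi)` tuple representation are assumptions made for illustration.

```python
def active_cells(cell_ranges, isovalue):
    """Indices of cells whose [lo, hi] scalar range contains the isovalue.

    cell_ranges: list of (lo, hi) pairs, one per cell (hypothetical format).
    This is an O(n) scan; an interval tree answers the same query
    without visiting every cell.
    """
    return [
        idx
        for idx, (lo, hi) in enumerate(cell_ranges)
        if lo <= isovalue <= hi
    ]
```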
-
ABSTRACT: Emerging technologies in high speed NAS, hierarchical storage management systems, and networked systems that virtualize interconnected storage over IP and fiber-channel networks, promise to consolidate distributed data stores onto large-scale professionally managed enterprise storage environments. We describe the software architecture of the PAWN (Producer - Archive Workflow Network) environment that enables scalable, reliable marshalling and organization of distributed data into such enterprise storage environments. PAWN was initially developed to capture the core elements required for long term preservation of digital objects as identified by researchers in the digital library and archiving communities. In this paper, we show how PAWN can be extended to enable multiple clients at a number of distributed sites to prepare, organize, and bulk transfer large scale data onto clusters of servers that securely verify the integrity of the data, register the metadata, and store the data into an enterprise storage environment. PAWN allows detailed description, auditing, and organization of the data, and hence will allow for efficient management, access, and disaster recovery. The basic software components are based on open standards and web technologies, and hence are platform independent.
Mass Storage Systems and Technologies, 2005. Proceedings. 22nd IEEE / 13th NASA Goddard Conference on; 05/2005
-
ABSTRACT: We consider the problem of querying large scale multidimensional time series data to discover events of interest, test and validate hypotheses, or to associate temporal patterns with specific events. Large amounts of multidimensional time series data are currently available, and this type of data is growing at a fast rate due to the current trends in collecting time series of business, scientific, demographic, and simulation data. The ability to explore such collections interactively, even at a coarse level, will be critical in discovering the information and knowledge embedded in such collections. We develop indexing techniques and search algorithms to efficiently handle temporal range value querying of multidimensional time series data. Our indexing uses linear space data structures that enable the handling of queries very efficiently, invoking in the worst case a logarithmic number of queries to single time slices. We also show that our algorithm is ideally suited for parallel implementation on clusters of processors achieving a linear speedup in the number of available processors. A particularly simple data structure with provably good bounds is also presented for the case when the number of multidimensional objects is relatively small. These techniques improve significantly over previous techniques for either the serial or the parallel case, and are evaluated by extensive experimental results that confirm their superior performance.
Scientific and Statistical Database Management, 2004. Proceedings. 16th International Conference on; 07/2004
-
J. JaJa
ABSTRACT: This article introduces the basic Quicksort algorithm and gives a
flavor of the richness of its complexity analysis. The author also
provides a glimpse of some of its generalizations to parallel algorithms
and computational geometry
Computing in Science and Engineering 02/2000; · 1.42 Impact Factor
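For readers unfamiliar with the algorithm, a minimal sketch of basic Quicksort follows. It is not taken from the article; it uses a random pivot and a three-way split into new lists for clarity rather than the in-place partitioning usually used in practice.

```python
import random


def quicksort(items):
    """Return a new sorted list using randomized Quicksort."""
    if len(items) <= 1:
        return list(items)
    pivot = random.choice(items)
    # Three-way split keeps duplicates of the pivot out of the recursion.
    less = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    greater = [x for x in items if x > pivot]
    return quicksort(less) + equal + quicksort(greater)
```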
-
ABSTRACT: We introduce a new optimal prefix computation algorithm on linked lists which builds upon the sparse ruling set approach of Reid-Miller and Blelloch. Besides being somewhat simpler and requiring nearly half the number of memory accesses, we can bound our complexity with high probability instead of merely on average. Moreover, whereas Reid-Miller and Blelloch (1996) targeted their algorithm for implementation on a vector multiprocessor architecture, we develop our algorithm for implementation on the symmetric multiprocessor architecture (SMP). These symmetric multiprocessors dominate the high-end server market and are currently the primary candidate for constructing large scale multiprocessor systems. Our prefix computation algorithm was implemented in C using POSIX threads and run on four symmetric multiprocessors: the IBM SP-2 (High Node), the HP-Convex Exemplar (S-Class), the DEC AlphaServer, and the Silicon Graphics Power Challenge. We ran our code using a variety of benchmarks which we identified to examine the dependence of our algorithm on memory access patterns. For some problems, our algorithm actually matched or exceeded the performance of the standard sequential solution using only a single thread. Moreover, in spite of the fact that the processors must compete for access to main memory, our algorithm still achieved scalable performance with up to 16 processors, which was the largest platform available to us.
Parallel and Distributed Processing, 1999. 13th International and 10th Symposium on Parallel and Distributed Processing, 1999. 1999 IPPS/SPDP. Proceedings; 05/1999
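The problem itself is easy to state sequentially. The sketch below shows a plain sequential prefix-sum traversal of a linked list stored as arrays, which is the kind of baseline the abstract compares against; it is not the sparse-ruling-set parallel algorithm. The array representation (`values`, `succ`, with `-1` marking the end of the list) and the name `list_prefix_sums` are assumptions made for illustration.

```python
def list_prefix_sums(values, succ, head):
    """Prefix sums along a linked list stored in arrays.

    values[i] is the value at node i, succ[i] is the index of the next
    node (or -1 at the tail), and head is the index of the first node.
    prefix[i] is the sum of all values from the head up to and including i.
    """
    prefix = [0] * len(values)
    running = 0
    node = head
    while node != -1:
        running += values[node]
        prefix[node] = running
        node = succ[node]  # pointer chasing: memory access follows list order
    return prefix
```

With `values = [10, 20, 30, 40]`, `succ = [2, -1, 3, 1]` and `head = 0`, the list order is 0, 2, 3, 1, so successive nodes are not adjacent in memory, which is why memory access patterns dominate the cost.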
-
ABSTRACT: A novel indexing scheme is described to catalogue satellite data
on a pixel basis. The objective of this research is to develop an
efficient methodology to archive, retrieve and process satellite data,
so that data products can be generated to meet the specific needs of
individual scientists. When requesting data, users can specify the
spatial and temporal resolution, geographic projection, choice of
atmospheric correction, and the data selection methodology. The data
processing is done in two stages. Satellite data is calibrated,
navigated and quality flags are appended in the initial processing. This
processed data is then indexed and stored. Secondary processing such as
atmospheric correction and projection are done after a user requests the
data to create custom made products. Dividing the processing into
two stages saves time, since the basic processing tasks such as
navigation and calibration which are common to all requests are not
repeated when different users request satellite data. The indexing
scheme described can be extended to allow fusion of data sets from
different sensors
Geoscience and Remote Sensing Symposium, 1999. IGARSS '99 Proceedings. IEEE 1999 International; 02/1999
-
ABSTRACT: A recent initiative by NASA has resulted in the formation of a
federation of Earth science data partners. These Earth Science
Information Partners (ESIPs) have been tasked with creating novel Earth
science data products and services as well as distributing new and
existing data sets to the Earth science community and the general
public. The University of Maryland established its ESIP activities with
the creation of the Global Land Cover Facility (GLCF). This joint effort
of the Institute for Advanced Computer Studies (UMIACS) and the
Department of Geography has developed an operational data archiving and
distribution system aimed at advancing current land cover research
efforts. The success of the GLCF is tied closely to assessing user needs
as well as the timely delivery of data products to the research
community. This paper discusses the development and implementation of a
web-based interface that allows users to query the authors' data
holdings and perform user requested processing tasks on demand. The GLCF
takes advantage of a scaleable, high performance computing architecture
for the manipulation of very large remote sensing data sets and the
rapid spatial indexing of multiple format data types. The user interface
has been developed with the cooperation of the Human-Computer
Interaction Laboratory (HCIL) and demonstrates advances in spatial and
temporal querying tools as well as the ability to overlay multiple
raster and vector data sets. Their work provides one perspective
concerning how critical earth science data may be handled in the near
future by a coalition of distributed data centers
Geoscience and Remote Sensing Symposium, 1999. IGARSS '99 Proceedings. IEEE 1999 International; 02/1999
-
ABSTRACT: The authors describe three models for retrieving information
related to the scattering of light on the Earth's surface. Using these
models, they've developed algorithms for the IBM SP2 that efficiently
retrieve this information
IEEE Computational Science and Engineering 11/1998;
-
ABSTRACT: The authors have used high performance computing techniques to
implement three different algorithms to model the Bidirectional
Reflectance Distribution Function (BRDF) over land from AVHRR data.
AVHRR data from the Pathfinder project has a spatial resolution of 8 km,
and four years of data (1983-1986) were used in this study. Two of the
models are statistical models, where the coefficients are derived from a
set of directional reflectances for each solar zenith angle by curve
fitting using a least square routine. The third model is semi-empirical,
and the coefficients are derived by model inversion and numerical
iteration. The semi-empirical model is computationally more expensive
compared to the other two. One of the statistical models describes
surface BRDF as a continuous temporal function using Fourier techniques.
Analysis of the standard errors between observed and modeled
reflectances from the temporal model show that the errors were larger in
higher latitudes, probably due to interannual variations in surface
conditions caused by changing snow cover in these areas. Results from
the other two models are similar. The results from this study are
expected to provide valuable inputs into BRDF retrieval algorithms
proposed for future Earth Observation System (EOS) instruments
Geoscience and Remote Sensing, 1997. IGARSS '97. Remote Sensing - A Scientific Vision for Sustainable Development., 1997 IEEE International; 09/1997
-
ABSTRACT: An operational atmospheric correction algorithm for Thematic Mapper (TM) imagery has been developed for both sequential and parallel computer environments considering both aerosol and molecular scattering and absorption. The aerosol optical depth is estimated from the image itself using the dark object approach on a moving-window basis, and the surface reflectance is then retrieved by searching lookup tables that are created using a numerical radiative transfer code. The dark object pixels are identified and their surface reflectance estimated using TM channel 7 (2.1 μm). A variety of techniques are employed to improve computational efficiency. This method is validated by measured aerosol optical depth and extensive visual evaluations accompanied by statistical analysis. Results indicate that the approach is highly stable and useful for both qualitative imagery interpretation (haze removal) and quantitative analysis. Future research activities are also highlighted. The computer codes are available to the general scientific community.
Journal Of Geophysical Research-Atmospheres. 01/1997; 102(D14):17173-17186.
-
ABSTRACT: The objective of atmospheric correction is to retrieve the surface
reflectance from remotely sensed imagery by removing the atmospheric
effects. We introduce an efficient algorithm to estimate the optical
characteristics of the TM imagery and to remove the atmospheric effects
from it. Our algorithm introduces a set of techniques to significantly
improve the quality of the retrieved images. We pay particular
attention to the computational efficiency of the algorithm, thereby
allowing us to correct large TM images quickly. We also provide a
parallel implementation of our algorithm and show its portability and
scalability on several parallel machines
Parallel Processing, 1996., Proceedings of the 1996 International Conference on; 09/1996
-
J. JaJa
ABSTRACT: A fundamental problem in parallel computing is to design
high-level, architecture independent, algorithms that execute
efficiently on general purpose parallel machines. The aim is to be able
to achieve portability and high performance simultaneously. A key to
accomplishing this is the existence of a computation model that can
bridge the gap between the high level programming models and the
underlying hardware models. There are currently two factors that make
this fundamental problem more tractable. The first is the emergence of a
dominant parallel architecture consisting of a number of powerful
microprocessors interconnected by either a proprietary interconnect, or
a standard off-the-shelf interconnect (such as an ATM switch). The
second factor is the emergence of standards, such as the message passing
standard MPI, for which efficient implementations are either available
or about to appear on most machines. Our recent work has exploited these
two developments by developing a methodology based on (1) a simple
computation model for the current MIMD platforms that incorporates
communication cost into the complexity of the algorithms, and (2) a SPMD
programming model that makes effective use of communication primitives.
We describe our approach for validating the computation model based on
extensive experimentation and the development of benchmarks, and discuss
its extension to the emerging clusters of Symmetric Multiprocessors
(SMPs) architecture
Parallel Processing, 1996. Proceedings of the 1996 ICPP Workshop on Challenges for; 09/1996
-
ABSTRACT: Presents efficient and portable implementations of a useful image enhancement process, the symmetric neighborhood filter (SNF), and an image segmentation technique which makes use of the SNF and a variant of the conventional connected components algorithm which we call δ-connected components. We use efficient techniques for distributing and coalescing data as well as efficient combinations of task and data parallelism. The image segmentation algorithm makes use of an efficient connected components algorithm based on a novel approach for parallel merging. The algorithms have been coded in Split-C and run on a variety of platforms, including the Thinking Machines CM-5, IBM SP-1 and SP-2, Cray Research T3D, Meiko Scientific CS-2, Intel Paragon, and workstation clusters. Our experimental results are consistent with the theoretical analysis (and provide the best known execution times for segmentation, even when compared with machine-specific implementations). Our test data include difficult images from the Landsat Thematic Mapper (TM) satellite data
Parallel Processing Symposium, 1996., Proceedings of IPPS '96, The 10th International; 05/1996
-
ABSTRACT: A common statistical problem is that of finding the median element in a set of data. This paper presents a fast and portable parallel algorithm for finding the median given a set of elements distributed across a parallel machine. In fact, our algorithm solves the general selection problem that requires the determination of the element of rank i, for an arbitrarily given integer i. Practical algorithms needed by our selection algorithm for the dynamic redistribution of data are also discussed. Our general framework is a distributed memory programming model enhanced by a set of communication primitives. We use efficient techniques for distributing, coalescing, and load balancing data as well as efficient combinations of task and data parallelism. The algorithms have been coded in SPLIT-C and run on a variety of platforms, including the Thinking Machines CM-5, IBM SP-1 and SP-2, Cray Research T3D, Meiko Scientific CS-2, Intel Paragon, and workstation clusters. Our experimental results illustrate the scalability and efficiency of our algorithms across different platforms and improve upon all the related experimental results known to the authors.
Parallel Processing Symposium, 1996., Proceedings of IPPS '96, The 10th International; 05/1996
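The selection problem mentioned above (find the element of rank i) can be illustrated with a sequential randomized quickselect. This is only a single-machine sketch of the problem being solved, not the parallel algorithm of the paper, and it omits the data redistribution and load balancing steps the abstract describes. The names `select` and `median`, the 1-based rank convention, and the choice of the lower median for even-sized inputs are assumptions of this sketch.

```python
import random


def select(items, i):
    """Return the element of rank i (1-based, smallest is rank 1)."""
    if not 1 <= i <= len(items):
        raise ValueError("rank out of range")
    pool = list(items)
    while True:
        pivot = random.choice(pool)
        less = [x for x in pool if x < pivot]
        equal = [x for x in pool if x == pivot]
        if i <= len(less):
            pool = less  # answer is among the smaller elements
        elif i <= len(less) + len(equal):
            return pivot  # rank falls on the pivot value
        else:
            # discard everything up to and including the pivot value
            i -= len(less) + len(equal)
            pool = [x for x in pool if x > pivot]


def median(items):
    """Lower median: the element of rank ceil(n / 2)."""
    return select(items, (len(items) + 1) // 2)
```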
-
ABSTRACT: The varied features of the earth's surface each reflect sunlight
and other wavelengths of solar radiation in a highly specific way. This
principle provides the foundation for the science of satellite based
remote sensing. A vexing problem confronting remote sensing researchers,
however, is that the reflected radiation observed from remote locations
is significantly contaminated by atmospheric particles. These aerosols
and molecules scatter and absorb the solar photons reflected by the
surface in such a way that only part of the surface radiation can be
detected by a sensor. The article discusses the removal of atmospheric
effects due to scattering and absorption, i.e., atmospheric correction.
Atmospheric correction algorithms basically consist of two major steps.
First, the optical characteristics of the atmosphere are estimated.
Various quantities related to the atmospheric correction can then be
computed by radiative transfer algorithms, given the atmospheric optical
properties. Second, the remotely sensed imagery is corrected by
inversion procedures that derive the surface reflectance. We focus on
the second step, describing our work on improving the computational
efficiency of the existing atmospheric correction algorithms. We discuss
a known atmospheric correction algorithm and then introduce a
substantially more efficient version which we have devised. We have also
developed a parallel implementation of our algorithm
IEEE Computational Science and Engineering 02/1996;
-
S. Liang,
L. Davis,
J. Townshend,
R. Chellappa,
R. DeFries,
R. Dubayah,
S. Goward, J. JaJa,
S. Krishnamachar,
N. Roussopoulos,
J. Saltz,
H. Samet,
T. Shock,
M. Srinivasan
ABSTRACT: A comprehensive and highly interdisciplinary research program is
being carried out to investigate global land cover dynamics in
heterogeneous parallel computing environments. The problems
addressed include atmospheric correction, mixture modeling, image
classification by Markovian random fields and by segmentation, global
image/map databases, object oriented parallel programming and
parallel I/O. During the initial two years of the project, significant
progress has been made in all of these areas
Geoscience and Remote Sensing Symposium, 1995. IGARSS '95. 'Quantitative Remote Sensing for Science and Applications', International; 08/1995
-
ABSTRACT: Develops efficient algorithms for low and intermediate level image
processing on the scan line array processor, a SIMD machine consisting
of a linear array of cells that processes images in a scan line fashion.
For low level processing, the authors present algorithms for block DFT,
block DCT, convolution, template matching, shrinking, and expanding
which run in real-time. By real-time, the authors mean that, if the
required processing is based on neighborhoods of size m×m, then
the output lines are generated at a rate of O(m) operations per line and
a latency of O(m) scan lines, which is the best that can be achieved on
this model. The authors also develop an algorithm for median filtering
which runs in almost real-time at a cost of O(m log m) time per scan
line and a latency of [m/2] scan lines. For intermediate level
processing, the authors present optimal algorithms for translation,
histogram computation, scaling, and rotation. The authors also develop
efficient algorithms for labelling the connected components and
determining the convex hulls of multiple figures which run in O(n log n)
and O(n log^2 n) time, respectively. The latter algorithms are
significantly simpler and easier to implement than those already
reported in the literature for linear arrays
IEEE Transactions on Pattern Analysis and Machine Intelligence 02/1995; · 4.91 Impact Factor
-
ABSTRACT: This article introduces scalable data parallel algorithms for image processing. Focusing on Gibbs and Markov random field model representation for textures, we present parallel algorithms for texture synthesis, compression, and maximum likelihood parameter estimation, currently implemented on Thinking Machines CM-2 and CM-5. The use of fine-grained, data parallel processing techniques yields real-time algorithms for texture synthesis and compression that are substantially faster than the previously known sequential implementations. Although current implementations are on Connection Machines, the methodology presented enables machine-independent scalable algorithms for a number of problems in image processing and analysis.
IEEE Transactions on Image Processing 02/1995; 4(10):1456-60. · 3.04 Impact Factor