
Proceedings of the Advanced Research Workshop on High Performance Computing Technology and Applications (HPC 2000)

A Cluster Architecture for Image Storage and Retrieval
O. Kao, G.R. Joubert
Department of Computer Science, Technical University of Clausthal
Julius-Albert-Strasse 4, 38678 Clausthal-Zellerfeld, Germany
Tel: +49 5323 727157; Fax: +49 5323 727149
okao@informatik.tu-clausthal.de
Multimedia archival and retrieval involve the computation and communication of huge,
permanently increasing amounts of data. Digital photo and video technologies can be found in
the commercial, research and home areas, and a lot of pictorial information is available on the
internet. The scope of document management systems, digital libraries, photo archives in
hospitals, authorities and companies, and satellite image collections is growing rapidly.
Digital TV as a mass product, video indexing and retrieval with key frames, and images as
results of different experiments are the wave of the near future. These and many other
applications produce petabytes of image material per year. Although we still cannot process it
properly, we should at least create a suitable infrastructure to collect the images and store
them in a way that allows testing and integration of future retrieval methods. In this paper we
discuss the advantages and disadvantages of cluster architectures as a platform for efficient
image retrieval.
As a first step we analyse the memory requirements and the complexity of archival and
retrieval algorithms in order to define essential properties of the architecture to be designed.
Usually the memory complexity is described with tables giving a general measure, e.g. X
Mbyte per colour or halftone image. But the raw image is only a small part of the description
data being generated and inserted into the database. The extracted features may require k
times more memory than the raw data, where k is the number of features. The number, type and
composition of the extracted features depend on the application area, creation time, time and
space complexity of the corresponding algorithms, etc.
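The memory estimate above can be illustrated with a short calculation. The following is only a sketch: the function name, the parameter `feature_to_raw_ratio`, and all numbers in the example are hypothetical and do not come from the paper. It assumes, as a rough reading of the text, that each extracted feature occupies about as much memory as the raw image, so that k features add about k times the raw data volume.

```python
def archive_size_bytes(num_images, raw_bytes_per_image, num_features,
                       feature_to_raw_ratio=1.0):
    """Estimate storage for raw images plus their extracted features.

    Assumption (not from the paper): every feature needs
    feature_to_raw_ratio times the size of the raw image, so
    num_features features need num_features * ratio times the raw volume.
    """
    raw_total = num_images * raw_bytes_per_image
    feature_total = raw_total * num_features * feature_to_raw_ratio
    return raw_total + feature_total


# Hypothetical archive: one million images of 1 MB each, five features per image.
total = archive_size_bytes(1_000_000, 1_000_000, 5)
print(f"{total / 1e12:.1f} TB in total, of which {1_000_000 * 1_000_000 / 1e12:.1f} TB is raw data")
```

Under these assumptions the raw images account for only one sixth of the stored data, which is why a per-image size table alone understates the requirements.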
Clusters of SMPs seem to be suitable, because they combine the SMP programming model with
huge memory capacities and data distribution over the nodes. The proposed architecture
consists of querying and computing nodes and of a media server as a central component. The
images are stored on the local disks of the cluster SMP nodes. Each node performs the
operations on its local set and sends the results to the media server, which selects the k best
hits and presents them to the user. Performance measurements based on a prototype are in
progress.
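The division of work between the computing nodes and the media server can be sketched as follows. This is a minimal single-process illustration, not the prototype's implementation: the function names, the dictionary layout of a node's local image set, and the toy integer "features" are all assumptions made for the example. Here k denotes the number of hits returned to the user.

```python
import heapq


def local_search(node_images, query, distance, k):
    """Computing node: rank only the images stored on this node's local disk.

    node_images maps an image id to its (hypothetical) feature value.
    Returns up to k (distance, image_id) pairs, best first.
    """
    scored = ((distance(query, feature), image_id)
              for image_id, feature in node_images.items())
    return heapq.nsmallest(k, scored)


def merge_hits(per_node_hits, k):
    """Media server: combine the partial results and keep the k best hits."""
    all_hits = (hit for hits in per_node_hits for hit in hits)
    return heapq.nsmallest(k, all_hits)


def abs_distance(a, b):
    """Toy distance between two scalar features (an assumption for the demo)."""
    return abs(a - b)


# Three hypothetical nodes, each holding its own part of the archive.
nodes = [
    {"a": 10, "b": 90},
    {"c": 40, "d": 55},
    {"e": 52},
]
query_feature = 50
partial = [local_search(images, query_feature, abs_distance, k=2) for images in nodes]
best = merge_hits(partial, k=2)
print(best)  # [(2, 'e'), (5, 'd')]
```

Because each node returns at most k candidates, the media server only has to merge a small number of results regardless of how many images each node stores.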
Keywords: image retrieval, cluster, feature classification