Conference Proceeding

A parallel architecture for meaning comparison

Dept. of Comput. Sci. & Eng., Texas A&M Univ., College Station, TX, USA
05/2010; DOI: 10.1109/IPDPS.2010.5470371. In proceedings of: Parallel & Distributed Processing (IPDPS), 2010 IEEE International Symposium on
Source: IEEE Xplore

ABSTRACT In this paper we present a fine-grained parallel architecture that performs meaning comparison using vector cosine similarity (dot product). Meaning comparison assigns a similarity value to two objects (e.g., text documents) based on how similar their meanings, each represented as a vector, are to each other. The novelty of our design is its fine-grained parallelism, which is not exploited in available hardware-based dot product processor designs and cannot be achieved in traditional server-class processors such as the Intel Xeon. We compare the performance of our design against that of available hardware-based dot product processors as well as a server-class processor running optimum software code that performs the same computation. We show that our hardware design can achieve a speedup of 62,000 times compared to an available hardware design, and a speedup of 8866 times with 33% (1.5 times) less power consumption compared to software code running on an Intel Xeon processor, for 1024 basis vectors. Our design can significantly reduce the number of servers required for similarity comparison in a distributed search engine. It can thus enable reductions in energy consumption, investment, operational costs and floor area in search engine data centers. This design can also be deployed for other applications that require fast dot product computation.
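The similarity measure named in the abstract can be illustrated with a minimal software sketch. This is plain sequential Python, not the paper's hardware architecture; the function names and the example vectors are assumptions chosen for illustration. It computes the cosine similarity of two meaning vectors as their dot product divided by the product of their lengths.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    norm_u = math.sqrt(dot(u, u))
    norm_v = math.sqrt(dot(v, v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot(u, v) / (norm_u * norm_v)


# Hypothetical meaning vectors over four basis dimensions.
doc_a = [1.0, 2.0, 0.0, 1.0]
doc_b = [2.0, 4.0, 0.0, 2.0]  # same direction as doc_a
doc_c = [0.0, 0.0, 3.0, 0.0]  # orthogonal to doc_a

sim_ab = cosine_similarity(doc_a, doc_b)
sim_ac = cosine_similarity(doc_a, doc_c)
```

If the vectors are normalized to unit length ahead of time, the division disappears and the comparison reduces to a single dot product. The elementwise multiplications in that dot product are independent of one another, which is the kind of work a parallel design can spread across many units.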

  • Source
    Conference Proceeding: Optimizing a Semantic Comparator Using CUDA-enabled Graphics Hardware
    ABSTRACT: Emerging semantic search techniques require fast comparison of large "concept trees". This paper addresses the challenges involved in fast computation of the similarity between two large concept trees using a CUDA-enabled GPGPU co-processor. We propose efficient techniques for this computation using fast hash computations, membership tests using Bloom filters and parallel reduction. We show how a CUDA-enabled, mass-produced GPU can form the core of a semantic comparator for better semantic search. We measure the run time, power and energy consumed for similarity computation on two platforms: (1) a traditional server-class Intel x86 processor and (2) CUDA-enabled graphics hardware. Results show a 4x speedup with a 78% overall energy reduction over sequential processing approaches. Our design can significantly reduce the number of servers required in a distributed search engine data center and can bring an order of magnitude reduction in energy consumption, operational costs and floor area.
    Semantic Computing (ICSC), 2011 Fifth IEEE International Conference on; 10/2011
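The Bloom filter membership test mentioned in this abstract can be sketched as follows. This is a sequential Python illustration under stated assumptions, not the paper's CUDA implementation: the `BloomFilter` class, its sizes, the use of SHA-256 as the hash, and the idea of flattening a concept tree into a list of concept strings are all hypothetical choices made for the example.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, item):
        # Derive num_hashes bit positions by salting the item with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        return all(self.bits[pos] for pos in self._positions(item))


def shared_count(concepts_a, concepts_b):
    """Estimate how many concepts of B also occur in A (may overcount slightly)."""
    bloom = BloomFilter()
    for concept in concepts_a:
        bloom.add(concept)
    return sum(1 for concept in concepts_b if bloom.might_contain(concept))


# Hypothetical flattened concept trees.
tree_a = ["search", "engine", "energy", "server"]
tree_b = ["server", "energy", "graphics"]
overlap = shared_count(tree_a, tree_b)
```

Each membership test is independent of the others, so the tests can run in parallel and their results can be summed with a parallel reduction; the loop in `shared_count` is the sequential equivalent of that step.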

Keywords

available hardware design
distributed search engine
dot product processor designs
dot product processors
energy consumption
fine grained parallel architecture
fine grained parallelism
floor area
hardware design
Intel Xeon processor
Meaning comparison
optimum software code
performs meaning comparison
power consumption
require fast dot product computation
server class processor
similarity comparison
similarity value
traditional server class processors
vector cosine similarity