Parallel sequence alignment in limited space

Abstract

Sequence comparison with affine gap costs is a problem that is readily parallelizable on simple single-instruction, multiple-data stream (SIMD) parallel processors using only constant space per processing element. Unfortunately, the twin problem of sequence alignment, finding the optimal character-by-character correspondence between two sequences, is more complicated. While the innovative O(n^2)-time and O(n)-space serial algorithm has been parallelized for multiple-instruction, multiple-data stream (MIMD) computers with only a communication-time slowdown, typically O(log n), it is not suitable for hardware-efficient SIMD parallel processors with only local communication. This paper proposes several methods of computing sequence alignments with limited memory per processing element. The algorithms are also well-suited to serial implementation. The simpler algorithms feature, for an arbitrary integer L, a factor of L slowdown in exchange for reducing space requirements from O(n) to O(L·n^(1/L)) per processing element. Using this result (with L = log n, so that n^(1/L) is a constant), we describe an O(n log n) parallel time algorithm that requires O(log n) space per processing element on O(n) SIMD processing elements with only a mesh or linear interconnection network.
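The space/time trade-off described in the abstract can be illustrated with a serial sketch of the two-level case (L = 2). This is not the paper's algorithm: it assumes a linear gap cost instead of affine gaps, checkpoints whole rows of the dynamic-programming table rather than per-processing-element state, and the function name `checkpoint_align` and its scoring parameters are hypothetical. The forward pass stores only every k-th row (k about √n); the traceback recomputes one block of k rows at a time, so roughly 2√n rows are held instead of n, at the cost of computing each row about twice.

```python
import math

def checkpoint_align(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment with linear gap cost, holding ~2*sqrt(len(a)) DP rows."""
    n, m = len(a), len(b)
    k = max(1, math.isqrt(n))  # checkpoint spacing (assumption: about sqrt(n))

    def next_row(prev, i):
        # Compute DP row i from DP row i-1.
        row = [prev[0] + gap]
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            row.append(max(prev[j - 1] + s, prev[j] + gap, row[j - 1] + gap))
        return row

    # Forward pass: keep only row 0 and every k-th row as checkpoints.
    row = [j * gap for j in range(m + 1)]
    checkpoints = {0: row}
    for i in range(1, n + 1):
        row = next_row(row, i)
        if i % k == 0:
            checkpoints[i] = row
    score = row[m]

    # Backward pass: rebuild one block of rows from its checkpoint, trace
    # through it, then move on to the block above.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0:
        start = ((i - 1) // k) * k  # nearest checkpoint row above row i
        block = {start: checkpoints[start]}
        for r in range(start + 1, i + 1):
            block[r] = next_row(block[r - 1], r)
        while i > start:
            s = match if j > 0 and a[i - 1] == b[j - 1] else mismatch
            if j > 0 and block[i][j] == block[i - 1][j - 1] + s:
                out_a.append(a[i - 1]); out_b.append(b[j - 1])
                i -= 1; j -= 1
            elif block[i][j] == block[i - 1][j] + gap:
                out_a.append(a[i - 1]); out_b.append("-")
                i -= 1
            else:
                out_a.append("-"); out_b.append(b[j - 1])
                j -= 1
    while j > 0:  # leftover prefix of b aligns against gaps
        out_a.append("-"); out_b.append(b[j - 1])
        j -= 1
    return score, "".join(reversed(out_a)), "".join(reversed(out_b))
```

Applying the same idea recursively inside each block is what gives the general L-level trade-off: each extra level shrinks the stored state further and adds one more recomputation pass.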
... As far as is known, the checkpoint paradigm was introduced, for use together with the standard DP algorithm in parallel implementations, by Grice et al. (1995). It was considered for serial implementations by Grice et al. (1997), extended for use with profile HMMs by Tanas and ...
... In [14] the authors present a cache-oblivious divide-and-conquer algorithm for multicores, where the cost matrix is divided into four quadrants to be solved recursively, and the diagonal quadrants are solved in parallel. Other prior research on edit distance has focused on parallelization targeting both MIMD [20] and SIMD [21] architectures. Vectorization of the sequence alignment problem by reshaping the cost matrix has been described in [22]. ...
... [4] More about Smith-Waterman algorithm, a preliminary implementation of Smith-Waterman algorithm using a new chip multiprocessor architecture with multiple Digital Signal Processors (DSP) on a single chip leading to high performance at low cost. [5] Several methods of computing sequence alignments with limited memory per processing element on SIMD processing elements were presented in [6]. And this paper [7] improves alignment times by either reducing the alignment sensitivity or by developing specialized hardware. ...
Conference Paper
This paper presents a methodology that assists the compiler in optimizing ClustalW, the most widely used tool for aligning multiple text-based protein or nucleotide sequences in bioinformatics. Our goal is to minimize latency and maximize execution throughput for MT-ClustalW, the multithreaded ClustalW from our previous work. As a result, the optimized MT-ClustalW is able to fully utilize the machine resources and achieves higher throughput on multicore computers. The experimental results show that our methodology helps the compiler optimize the code better than compiler optimization alone and runs over 2 times faster than the sequential ClustalW. Finally, we analyze the overall result with Amdahl's Law.
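For readers unfamiliar with the Amdahl's Law analysis mentioned at the end of this abstract, a minimal sketch follows. The function name `amdahl_speedup` is hypothetical, and any parallel fraction or worker count passed to it is illustrative only, not a measurement from the paper.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Upper bound on speedup when only part of a program runs in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part is unchanged; the parallel part is divided among workers.
    return 1.0 / (serial_fraction + parallel_fraction / workers)
```

Under this model, a program that is 75% parallelizable cannot exceed a 4x speedup however many cores are added, which is why the serial fraction dominates the analysis of a multithreaded tool.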
Conference Paper
In this paper, we demonstrate the ability of spatial architectures to significantly improve both runtime performance and energy efficiency on edit distance, a broadly used dynamic programming algorithm. Spatial architectures are an emerging class of application accelerators that consist of a network of many small and efficient processing elements that can be exploited by a large domain of applications. In this paper, we utilize the dataflow characteristics and inherent pipeline parallelism within the edit distance algorithm to develop efficient and scalable implementations on a previously proposed spatial accelerator. We evaluate our edit distance implementations using a cycle-accurate performance and physical design model of a previously proposed triggered instruction-based spatial architecture in order to compare against real performance and power measurements on an x86 processor. We show that when chip area is normalized between the two platforms, it is possible to get more than a 50× runtime performance improvement and over 100× reduction in energy consumption compared to an optimized and vectorized x86 implementation. This dramatic improvement comes from leveraging the massive parallelism available in spatial architectures and from the dramatic reduction of expensive memory accesses through conversion to relatively inexpensive local communication.
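The pipeline parallelism this abstract refers to comes from the dependency structure of the edit distance table: every cell on an anti-diagonal depends only on the two preceding anti-diagonals, so all cells of one anti-diagonal could be computed at the same time. The serial sketch below makes that order explicit. It is an illustration of the wavefront idea only, not the accelerator implementation from the paper, and the name `edit_distance_wavefront` is hypothetical.

```python
def edit_distance_wavefront(a, b):
    """Levenshtein distance computed one anti-diagonal (i + j = d) at a time."""
    n, m = len(a), len(b)
    prev2 = {}        # anti-diagonal d-2, keyed by row index i
    prev1 = {0: 0}    # anti-diagonal d-1; starts as the single cell (0, 0)
    for d in range(1, n + m + 1):
        cur = {}
        # Cells on this anti-diagonal are mutually independent: a parallel
        # machine could evaluate every iteration of this loop simultaneously.
        for i in range(max(0, d - m), min(n, d) + 1):
            j = d - i
            if i == 0:
                cur[i] = j                     # top border: j insertions
            elif j == 0:
                cur[i] = i                     # left border: i deletions
            else:
                substitute = prev2[i - 1] + (a[i - 1] != b[j - 1])
                delete = prev1[i - 1] + 1      # from cell (i-1, j)
                insert = prev1[i] + 1          # from cell (i, j-1)
                cur[i] = min(substitute, delete, insert)
        prev2, prev1 = prev1, cur
    return prev1[n]
```

Only two anti-diagonals are live at any step, which is why the computation maps naturally onto a chain of processing elements that pass values to their neighbours.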
Article
Kestrel is a programmable linear array processor designed for sequence analysis. Among other features, Kestrel includes an 8-bit word, a single-cycle add-and-minimize instruction, a multiplier and efficient communication using shared registers. This paper describes Kestrel's functional units in detail, and examines each of their effects on system performance. With functional prototype chips completed, we will assemble a full single-board Kestrel array, with 512 processing elements on eight chips, in early 1998.
Article
Hidden Markov models (HMMs) are a highly effective means of modeling a family of unaligned sequences or a common motif within a set of unaligned sequences. The trained HMM can then be used for discrimination or multiple alignment. The basic mathematical description of an HMM and its expectation-maximization training procedure is relatively straightforward. In this paper, we review the mathematical extensions and heuristics that move the method from the theoretical to the practical. We then experimentally analyze the effectiveness of model regularization, dynamic model modification and optimization strategies. Finally it is demonstrated on the SH2 domain how a domain can be found from unaligned sequences using a special model type. The experimental work was completed with the aid of the Sequence Alignment and Modeling software suite.
Conference Paper
Kestrel is a programmable linear systolic array processor designed for sequence analysis. Among other features, Kestrel includes an 8-bit word, a single-cycle add-and-minimize instruction, and efficient communication using systolic shared registers. This paper describes Kestrel's functional units in detail, and examines each of their effects on system performance. With prototypes currently in progress, we expect to complete a full Kestrel array, with between 512 and 1024 processing elements, by 1997.
Conference Paper
Sequence comparison, a vital research tool in computational biology, is based on a simple O(n^2) algorithm that easily maps to a linear array of processors. This paper reviews and compares high-performance sequence analysis on general-purpose supercomputers and single-purpose, reconfigurable, and programmable co-processors. The difficulty of comparing hardware from published performance figures is also noted.
Article
This paper presents a linear systolic array for quantifying the similarity between two strings over a given alphabet. The architecture is a parallel realization of a standard dynamic programming algorithm. Also introduced is a novel encoding scheme which minimizes the number of bits required to represent a state in the computation, significantly reducing the size of a processor. An nMOS prototype, to be used in searching genetic databases for DNA strands which closely match a target sequence, is being implemented. Preliminary results indicate that it will perform hundreds to thousands of times faster than a minicomputer.
Article
This paper gives a formal definition of the biological concept of evolutionary distance and an algorithm to compute it. For any set S of finite sequences of varying lengths this distance is a real-valued function on S × S, and it is shown to be a metric under conditions which are wide enough to include the biological application. The algorithm, introduced here, lends itself to computer programming and provides a method to compute evolutionary distance which is shorter than the other methods currently in use.
Article
The problem of finding a longest common subsequence of two strings has been solved in quadratic time and space. An algorithm is presented which will solve this problem in quadratic time and in linear space.
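The quadratic-time, linear-space approach this abstract describes is commonly presented as a divide-and-conquer scheme: split the first string in half, use two linear-space score passes (one forward, one on the reversed inputs) to find where an optimal solution crosses the split, then recurse on the two halves. The sketch below follows that textbook formulation. The function names `lcs_lengths` and `hirschberg_lcs` are hypothetical, and the code may differ in detail from the published algorithm.

```python
def lcs_lengths(a, b):
    """Last row of the LCS-length table for a versus b, in O(len(b)) space."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev

def hirschberg_lcs(a, b):
    """One longest common subsequence of a and b, using linear working space."""
    if not a or not b:
        return ""
    if len(a) == 1:
        return a if a in b else ""
    mid = len(a) // 2
    # LCS lengths of the top half of a against every prefix of b ...
    left = lcs_lengths(a[:mid], b)
    # ... and of the bottom half of a against every suffix of b.
    right = lcs_lengths(a[mid:][::-1], b[::-1])
    # Split b where the two halves together give the longest result.
    k = max(range(len(b) + 1), key=lambda j: left[j] + right[len(b) - j])
    return hirschberg_lcs(a[:mid], b[:k]) + hirschberg_lcs(a[mid:], b[k:])
```

Each level of recursion does work proportional to the area of the remaining subproblems, which halves per level, so total time stays quadratic while no full table is ever stored.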
Article
We present a parallel algorithm for computing an optimal sequence alignment in efficient space. The algorithm is intended for a message-passing architecture with one-dimensional-array topology. The algorithm computes an optimal alignment of two sequences of lengths M and N in O((M+N)^2/P) time and O((M+N)/P) space per processor, where the number of processors is P >= max(M, N). Thus, when P = max(M, N) it achieves linear speedup and requires constant space per processor. Some experimental results on an Intel hypercube are provided.