
Real-Time Pattern Recognition by Massively Parallel Systolic Processors

Abstract

Three VMEbus-controlled systolic array processors for high-speed pattern recognition are described. Within 3 µs to 20 µs, ill-defined straight and curved lines and circles are identified. The first processor uses direct matching of image pixels with a set of programmable masks. It is fully pipelined and scalable in the image size and the number of masks. Testing of a prototype began in autumn 1993. The second processor is based on a parallel Hough transform that histograms all possible coordinate pairs (radius of curvature, starting angle). The number of well-defined patterns is counted using the Euler relation. A prototype of the processor with 1,120 processing elements (PEs) implemented by programmable gate arrays is under test. The full-scale system will consist of 26,880 PEs. The third processor consists of 28,160 PEs assembled on 1 m² PC boards filled with 2 µm CMOS custom VLSI chips containing 8×8 PEs each. The fully programmable processor has been in operation since spring 19...
... For high-speed implementations we need to utilise fully parallel solutions with strong pipelining of operations. FPGA devices can be used efficiently for strongly parallel implementations of algorithms, such as the massively parallel systolic processor system for image processing at Mannheim University [1]. FPGAs also suit pipelined structures reasonably well, because typically the best logic capacity is achieved in applications where the amounts of combinational and sequential logic are approximately equal. ...
Article
In this paper we present our studies on implementing complex DSP and telecom systems in FPGAs. We analyse the suitability of FPGA device architectures for implementing complex algorithms, using the Viterbi algorithm as an in-depth case study. Different architectural strategies are discussed and analysed, with special emphasis on practical FPGA implementations. Speed, easy routability and minimisation of inter-chip communication are used as design criteria. A Viterbi decoder with constraint length seven was designed and simulated with VHDL in the Synopsys and Mentor tool environments and then implemented on four Xilinx 4028EX devices using a trace-back based architecture. Partitioning aspects of the decoding algorithm are also presented and analysed.
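As a rough software illustration of the trace-back decoding mentioned above, the following sketch implements a hard-decision Viterbi decoder for constraint length seven. The rate (1/2) and the generator polynomials (171 and 133 octal) are assumptions chosen because they are widely used for this constraint length; the abstract does not state which code was implemented, and the function names are hypothetical.

```python
K = 7                     # constraint length, as stated in the abstract
G = (0o171, 0o133)        # ASSUMED rate-1/2 generators (not given in the source)
N_STATES = 1 << (K - 1)   # 64 trellis states


def parity(value):
    """Return the XOR of all bits of value."""
    return bin(value).count("1") & 1


def encode(bits):
    """Convolutionally encode a bit list, starting from the all-zero state."""
    state, out = 0, []
    for b in bits:
        reg = (b << (K - 1)) | state          # newest bit sits in the MSB
        out.extend(parity(reg & g) for g in G)
        state = reg >> 1                      # shift: oldest bit drops out
    return out


def viterbi_decode(symbols):
    """Hard-decision Viterbi decoding with a full trace-back at the end."""
    inf = float("inf")
    metric = [0] + [inf] * (N_STATES - 1)     # encoder starts in state 0
    history = []                              # survivor (prev_state, bit) per step
    for t in range(len(symbols) // 2):
        received = symbols[2 * t:2 * t + 2]
        new_metric = [inf] * N_STATES
        back = [None] * N_STATES
        for s in range(N_STATES):
            if metric[s] == inf:
                continue                      # state not reachable yet
            for b in (0, 1):
                reg = (b << (K - 1)) | s
                expected = [parity(reg & g) for g in G]
                # add-compare-select on the Hamming branch metric
                cost = metric[s] + sum(e != r for e, r in zip(expected, received))
                nxt = reg >> 1
                if cost < new_metric[nxt]:
                    new_metric[nxt] = cost
                    back[nxt] = (s, b)
        metric = new_metric
        history.append(back)
    # trace back from the best final state to recover the input bits
    state = min(range(N_STATES), key=metric.__getitem__)
    decoded = []
    for back in reversed(history):
        state, bit = back[state]
        decoded.append(bit)
    decoded.reverse()
    return decoded
```

A message padded with K-1 zero tail bits, encoded with `encode` and passed through `viterbi_decode`, should come back unchanged even if a single channel bit is flipped. A hardware design would typically use a bounded trace-back depth instead of storing the whole history as this sketch does.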
Article
A systolic processor has been developed that executes a parallel Hough transform. The system has been tailored to a specific pattern recognition task, the identification of particle tracks in the r,φ projection of the OPAL jet chamber. For all well-defined tracks the starting angle and the radius of curvature are computed in 3.3 µs. The system consists of a Hough transform processor that identifies the tracks and an Euler processor that counts their number by applying the Euler relation to the thresholded result of the Hough transform. For one sector of the detector a prototype system has been realized with 21 XILINX chips. It consists of 35×32 processing elements. The full-scale system will use 26,880 processing elements. The processor can easily be adapted to different generalized Hough transforms and various detector geometries. The prototype has been functionally tested with OPAL test data sets. No deviations from the offline simulation have been found. The prototype operate...
Article
Systolic arrays belong to the class of pipelined array architectures where many identical processing elements (PEs) are interconnected locally so that data can be passed from all PEs to their respective neighbors synchronously and in parallel. In principle, all of them perform the same basic operation on their current operands in one clock cycle. At the University of Mannheim a systolic processor array is under development that is specialized for a specific pattern recognition task with extreme high-speed requirements. The system is a systolic processor for the identification of circular particle tracks in a 2D projection. For each well-defined track, the starting angle and the radius of curvature are computed in less than 5 µs. The system consists of a Hough transform processor that determines well-defined tracks, and an Euler processor that counts their number by applying the Euler relation to the thresholded result of the Hough transform. A systolic processor consisting of 35×32 processin...
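The lock-step, neighbor-to-neighbor data flow described above can be illustrated with a small cycle-by-cycle simulation. The sketch below is not the Mannheim design: it assumes a simple linear array that computes an FIR filter (a standard textbook use of systolic arrays), with a hypothetical function name and register layout. Each PE holds one weight, input samples move right through two registers per PE, and partial sums move right through one register per PE, so every sum meets the correct samples on its way through the array.

```python
def systolic_fir(weights, samples):
    """Simulate a linear systolic array that computes the full convolution.

    Assumed layout (illustrative only): PE i stores weights[i]; x values
    advance one PE every two cycles, partial sums advance one PE per cycle.
    """
    k = len(weights)
    x_regs = [[0, 0] for _ in range(k)]   # two-stage x delay inside each PE
    y_regs = [0] * k                      # one partial-sum register per PE
    n_out = len(samples) + k - 1          # length of the full convolution
    outputs = []
    for t in range(n_out + k - 1):        # extra cycles flush the pipeline
        x_in = samples[t] if t < len(samples) else 0
        new_x, new_y = [], []
        for pe in range(k):
            # each PE reads only from its left neighbor's registers
            x_left = x_in if pe == 0 else x_regs[pe - 1][1]
            y_left = 0 if pe == 0 else y_regs[pe - 1]
            # same multiply-accumulate in every PE, every clock cycle
            new_y.append(y_left + weights[pe] * x_left)
            new_x.append([x_left, x_regs[pe][0]])
        # all registers update simultaneously at the clock edge
        x_regs, y_regs = new_x, new_y
        if t >= k - 1:                    # first valid sum leaves after k cycles
            outputs.append(y_regs[-1])
    return outputs
```

For example, `systolic_fir([1, 2, 3], [1, 0, 0, 1])` yields the same values as a direct convolution of the two sequences. New state is computed only from the previous cycle's registers, which is what makes the simulated array synchronous.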
Article
A parallel Hough transform processor has been developed that identifies circular particle tracks in a 2-D projection of the OPAL jet chamber. The system consists of a Hough transform processor that determines well-defined tracks, and an Euler processor that counts their number by applying the Euler relation to the thresholded result of the Hough transform. A prototype of a systolic processor has been built that handles one sector of the jet chamber. It consists of 35×32 processing elements that are loaded into 21 programmable gate arrays (XILINX). This processor runs at a clock rate of 40 MHz. It has been tested offline with about 1000 original OPAL events. No deviations from the offline simulation were found. A trigger efficiency of 93% is obtained. The prototype, together with the associated drift time measurement unit, has been installed at the OPAL detector at the LEP (Large Electron Positron Collider), and 100 k events have been sampled to evaluate the system under detector conditions.
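One common way to count blobs in a thresholded histogram using only local operations is the bit-quad form of the Euler relation: every 2×2 neighborhood is classified, and the Euler number (connected regions minus holes) follows from the pattern counts. The sketch below assumes this formulation; the abstract does not say how the Euler processor evaluates the relation, and the function names and threshold are hypothetical. If the thresholded peaks contain no holes, the Euler number equals the number of tracks.

```python
def euler_number(binary, connectivity=8):
    """Euler number (regions minus holes) of a 0/1 image via 2x2 bit quads."""
    rows, cols = len(binary), len(binary[0])

    def px(r, c):
        # pixels outside the image count as background
        return binary[r][c] if 0 <= r < rows and 0 <= c < cols else 0

    q1 = q3 = qd = 0
    for r in range(-1, rows):
        for c in range(-1, cols):
            a, b = px(r, c), px(r, c + 1)
            d, e = px(r + 1, c), px(r + 1, c + 1)
            total = a + b + d + e
            if total == 1:
                q1 += 1                   # quad with exactly one set pixel
            elif total == 3:
                q3 += 1                   # quad with exactly three set pixels
            elif total == 2 and a == e:
                qd += 1                   # two set pixels on a diagonal
    sign = -2 if connectivity == 8 else 2
    return (q1 - q3 + sign * qd) // 4


def count_tracks(histogram, threshold):
    """Threshold a Hough histogram, then count hole-free blobs."""
    binary = [[1 if v >= threshold else 0 for v in row] for row in histogram]
    return euler_number(binary)
```

Because each quad is evaluated independently of all others, the counts can be formed in parallel by local processing elements and then summed, which is what makes this style of counting attractive for array hardware.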
Article
In the context of research and development activities for future hadron colliders, competitive implementations of real-time algorithms for feature extraction have been made on various forms of commercial pipelined and parallel architectures. The algorithms used for benchmarking serve for decision making and are relatively complex; they are required to run with a repetition rate of 100 kHz on data sets of kilobyte size. Results are reported and discussed in detail. Among the commercially available architectures, pipelined image processing systems can compete with custom-designed architectures. General-purpose processors with systolic mesh connectivity can also be used. Massively parallel systems of the SIMD type (many processors executing the same program on different data) are less suitable in the presently marketed form.
Article
A parallel Hough transform processor has been developed that identifies circular particle tracks in a 2-D projection of the OPAL jet chamber. The high-speed requirements imposed by the eight bunch crossing model of the LEP (Large Electron Positron Collider) could be fulfilled by computing the starting angle and the radius of curvature for each well-defined track in less than 4 μs. The system consists of a Hough transform processor that determines well-defined tracks and an Euler processor that counts their number by applying the Euler relation to the thresholded result of the Hough transform. A systolic processor consisting of 35×32 processing elements, which handles one sector of the jet chamber, has been realized with 21 XILINX chips. It has been functionally tested with 981 original OPAL test events, and no deviations from the off-line simulation have been found. A trigger efficiency of 93% has been obtained. The prototype runs at a clock rate of 40 MHz.
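The histogramming step can be sketched in software as follows. The sketch assumes that tracks are circles through the origin, so a hit at polar coordinates (r, φ) on a track with starting angle φ0 and signed curvature κ = 1/R satisfies r = 2R·sin(φ − φ0), which gives φ0 = φ − asin(κr/2). The binning, the curvature values and the function names are hypothetical; the abstract does not give the actual detector geometry or bin layout.

```python
import math


def hough_tracks(hits, kappas, n_phi):
    """Histogram hits (r, phi) over (curvature, starting angle) bins.

    Assumes circles through the origin: phi0 = phi - asin(kappa * r / 2).
    """
    hist = [[0] * n_phi for _ in kappas]
    for r, phi in hits:
        for i, kappa in enumerate(kappas):
            s = 0.5 * kappa * r
            if abs(s) > 1.0:
                continue                  # hit lies beyond this circle's diameter
            phi0 = (phi - math.asin(s)) % (2.0 * math.pi)
            j = int(phi0 / (2.0 * math.pi) * n_phi) % n_phi
            hist[i][j] += 1               # one vote per hit and curvature bin
    return hist


def peak(hist):
    """Return (votes, curvature_index, angle_index) of the fullest bin."""
    return max(
        (votes, i, j)
        for i, row in enumerate(hist)
        for j, votes in enumerate(row)
    )
```

Hits that belong to the same track vote for the same (curvature, starting angle) bin, so a track appears as a peak in the histogram. Thresholding the histogram then leaves one blob per well-defined track. Every hit and every curvature bin is processed independently, which is why the computation maps naturally onto a two-dimensional array of processing elements.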
Article
The characteristic features of the OPAL jet chamber are described, such as the mechanical and electrical structure, the readout electronics, the gas system, and the laser setup for calibration. Results on spatial resolution and on particle identification power, obtained with a full scale prototype chamber, are presented which show that in the final chamber the anticipated performance can be reached.
Article
The design features of the OPAL jet chamber are described, such as the mechanical and electrical structure, the high-voltage system, the readout electronics, the gas system and the laser setup for calibration. Special emphasis is put on the experience gained during the construction and running-in phase of all 24 sectors of the chamber. Results from the first operation are given, during which cosmic ray data were taken and a laser mapping of the entire chamber volume has been performed.
First Experiences with a Systolic Trigger Processor for RICH Detectors
  • R Baur
Baur R., Gläß J., Männer R.: First Experiences with a Systolic Trigger Processor for RICH Detectors; in: Watanabe Y., Abe F. (Eds.): Computing in High Energy Physics '91, Frontier Science Series; Vol. 3, Tokyo (1991) 211-218.
The LEP Machine Design
  • G Plass
  • E Picasso
Plass G., Picasso E.: The LEP Machine Design;