Conference Proceeding

An efficient VLSI architecture for 2-D convolution with quadrant symmetric kernels

Dept. of Electr. & Comput. Eng., Old Dominion Univ., Norfolk, VA, USA;
06/2005; DOI:10.1109/ISVLSI.2005.15 ISBN: 0-7695-2365-X pp. 303-304 In proceedings of: VLSI, 2005. Proceedings. IEEE Computer Society Annual Symposium on
Source: IEEE Xplore

ABSTRACT A high-performance digital architecture for computing 2-D convolution that exploits the quadrant symmetry of the kernels is proposed in this paper. Pixels in the four quadrants of the kernel region with respect to an image pixel are considered simultaneously when computing the partial products of the convolution sum. A novel data-handling strategy that identifies the pixels to be fed to the different processing elements helps reduce the data storage requirements of the circuitry. The new design results in a 75% reduction in multipliers and a 50% reduction in adders compared with the conventional systolic architecture. The proposed architecture can perform convolution with a 14×14 kernel at a rate of 57 frames per second on 1024×1024 images in a Xilinx Virtex 2v2000ff896-4 FPGA.
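The multiplier saving described in the abstract can be illustrated in software. The sketch below is not the paper's hardware design; it only assumes that a quadrant-symmetric kernel of size 2M×2M is fully determined by one M×M quadrant, so the four mirrored pixels can be summed first and multiplied by their shared coefficient once. The function names (`build_quadrant_symmetric_kernel`, `direct_output`, `folded_output`) and the test data are hypothetical, chosen for illustration. Only multiplications are counted; the adder count depends on the architecture and is not modeled here.

```python
def build_quadrant_symmetric_kernel(quadrant):
    """Mirror an M x M quadrant into a 2M x 2M quadrant-symmetric kernel.

    quadrant[i][j] is the coefficient at row distance i and column
    distance j from the kernel centre (an assumed indexing convention).
    """
    top = [row[::-1] + row for row in quadrant]  # mirror left-right
    return top[::-1] + top                       # mirror top-bottom


def direct_output(window, kernel):
    """One output pixel computed directly: one multiply per kernel tap."""
    size = len(kernel)
    acc, mults = 0, 0
    for r in range(size):
        for c in range(size):
            acc += window[r][c] * kernel[r][c]
            mults += 1
    return acc, mults


def folded_output(window, quadrant):
    """Same output pixel, adding the four mirrored pixels before multiplying."""
    m = len(quadrant)
    acc, mults = 0, 0
    for i in range(m):
        for j in range(m):
            up, down = m - 1 - i, m + i
            left, right = m - 1 - j, m + j
            pixel_sum = (window[up][left] + window[up][right]
                         + window[down][left] + window[down][right])
            acc += pixel_sum * quadrant[i][j]  # one multiply serves four taps
            mults += 1
    return acc, mults


if __name__ == "__main__":
    M = 7  # a 7x7 quadrant gives the 14x14 kernel size quoted in the abstract
    quadrant = [[(i + 1) * (j + 2) for j in range(M)] for i in range(M)]
    kernel = build_quadrant_symmetric_kernel(quadrant)
    window = [[(3 * r + 5 * c) % 11 for c in range(2 * M)] for r in range(2 * M)]

    direct_value, direct_mults = direct_output(window, kernel)
    folded_value, folded_mults = folded_output(window, quadrant)

    assert direct_value == folded_value
    print("multiplications per output pixel:", direct_mults, "vs", folded_mults)
```

For a 14×14 kernel the direct form uses 196 multiplications per output pixel and the folded form uses 49, which matches the 75% reduction stated in the abstract. The quoted rate of 57 frames per second on 1024×1024 images corresponds to 57 × 1024 × 1024 = 59,768,832 output pixels per second.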

  • Conference Proceeding: An efficient programmable 2-D convolver chip
    ABSTRACT: This paper proposes a new real-time 2-D convolution filter architecture to reduce the product of the hardware complexity and propagation delay. To meet the real-time image-processing requirement, several commercial 2-D convolver chips have many parallel multipliers, which occupy a large VLSI area. The implemented chip uses only one special shift-and-accumulator block instead of nine parallel multipliers. Hence the chip size is reduced by more than 50% relative to existing 2-D convolvers. Moreover, because a finite state machine controls the input data sequences, the proposed chip does not require the row buffers that commercial chips use to store three rows of image data. We used the SOG cell library (KG60K). The implemented filter chip consists of only 3,893 gates, operates at 125 MHz and can meet the real-time image processing requirement, i.e., the standard of ITU-R BT.601.
    Circuits and Systems, 1998. ISCAS '98. Proceedings of the 1998 IEEE International Symposium on;
  • Conference Proceeding: An advanced programmable 2D-convolution chip for real time image processing
    ABSTRACT: An advanced defect-tolerant systolic array implementation of the 2-D convolution algorithm for real-time image processing applications is presented. The chip differs from available convolution chips in the maximum kernel size of 256 taps, the ability to convolve one video signal with up to four independent coefficient masks, support of adaptive filtering, on-chip delay lines, and implemented special processing of frame borders. Defect-tolerance measures, e.g., reconfiguration techniques, are implemented in order to enhance yield and reliability, especially for future large area implementations.
    Circuits and Systems, 1991., IEEE International Symposium on; 07/1991
  • Article: Reconfigurable pipelined 2-D convolvers for fast digital signal processing
    ABSTRACT: In order to make software applications simpler to write and easier to maintain, a software digital signal-processing library that performs essential signal- and image-processing functions is an important part of every digital signal processor (DSP) developer's toolset. In general, such a library provides high-level interface and mechanisms, therefore, developers only need to know how to use algorithms, not the details of how they work. Complex signal transformations then become function calls, e.g., C-callable functions. Considering the two-dimensional (2-D) convolver function as an example of great significance for DSP's, this paper proposes to replace this software function by an emulation on a field-programmable gate array (FPGA) initially configured by software programming. Therefore, the exploration of the 2-D convolver's design space will provide guidelines for the development of a library of DSP-oriented hardware configurations intended to significantly speed up the performance of general DSP processors. Based on the specific convolver, and considering operators supported in the library as hardware accelerators, a series of tradeoffs for efficiently exploiting the bandwidth between the general-purpose DSP and accelerators are proposed. In terms of implementation, this paper explores the performance and architectural tradeoffs involved in the design of an FPGA-based 2-D convolution coprocessor for the TMS320C40 DSP microprocessor available from Texas Instruments Incorporated. However, the proposed concept is not limited to a particular processor.
    IEEE Transactions on Very Large Scale Integration (VLSI) Systems 10/1999; · 1.22 Impact Factor

Keywords

50% reduction
convolution sum
data storage requirements
different processing elements
four quadrants
image pixel
kernel region
kernels
new design results
performance digital architecture
Pixels
Xilinx