Conference Paper

A Two Level Architecture for High Throughput DCT-Processor and Implementing on FPGA

DOI: 10.1109/ReConFig.2010.67
Conference: Reconfigurable Computing and FPGAs (ReConFig), 2010 International Conference on
Source: IEEE Xplore

ABSTRACT Frequency analysis using the discrete cosine transform (DCT) is used in a wide variety of algorithms, such as image processing algorithms. This paper proposes a new high-throughput architecture for a DCT processor. The system has a two-level architecture that uses parallelism and pipelining, and it has been synthesized on a Xilinx Virtex-5 FPGA. Synthesis results show that the system runs at 150 MHz. Applying the DCT to each 8×8 block of an image takes 67 clock cycles. In other words, applying the DCT to each pixel takes approximately one clock cycle (67/64 ≈ 1.05).
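The per-pixel figure in the abstract can be checked with a short calculation. The sketch below is illustrative only: the function name `dct_throughput` is hypothetical, and it assumes that 8×8 blocks are processed back to back, with a new block completing every 67 cycles and no idle cycles in between. The abstract does not state this, so the resulting pixel rate is an estimate rather than a reported result.

```python
def dct_throughput(clock_hz, cycles_per_block, block_size=8):
    """Return (cycles per pixel, pixels per second) for a block-based DCT.

    Assumes one block_size x block_size block completes every
    cycles_per_block clock cycles, with no gaps between blocks.
    """
    pixels_per_block = block_size * block_size
    cycles_per_pixel = cycles_per_block / pixels_per_block
    pixels_per_second = clock_hz * pixels_per_block / cycles_per_block
    return cycles_per_pixel, pixels_per_second


# Figures taken from the abstract: 150 MHz clock, 67 cycles per 8x8 block.
cycles_per_pixel, pixels_per_second = dct_throughput(150e6, 67)
print(f"{cycles_per_pixel:.3f} cycles per pixel")       # 1.047
print(f"{pixels_per_second / 1e6:.1f} Mpixels/s")       # 143.3
```

Under that assumption, the design would sustain roughly 143 million pixels per second, which is consistent with the abstract's "approximately one clock cycle per pixel".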

  • ABSTRACT: This paper presents two high-performance FPGA architectures for 2D DCT computation in Ultra High Definition video coding systems. Both architectures use Distributed Arithmetic instead of traditional multipliers to perform the necessary multiplications. The first architecture uses 105 clock cycles to transform an 8×8 block and reaches a rate of up to 206 million samples per second at 338.5 MHz, while the second requires 65 cycles per 8×8 block and achieves 252 million samples per second at 256 MHz. Both architectures were implemented in VHDL and realized on a Xilinx Virtex-7 FPGA.
    Digital Signal Processing (DSP), 2013 18th International Conference on; 01/2013
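
As a consistency check on the figures in the abstract above (an added calculation, assuming one 8×8 block of 64 samples completes every 105 or 65 cycles with no idle cycles between blocks), the sample rates work out to:

```latex
\[
\frac{64 \times 338.5\ \text{MHz}}{105} \approx 206.3 \times 10^{6}\ \text{samples/s},
\qquad
\frac{64 \times 256\ \text{MHz}}{65} \approx 252.1 \times 10^{6}\ \text{samples/s}.
\]
```

These values agree with the quoted 206 and 252 when the rates are read in millions of samples per second.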

