Conference Paper

A configurable and programmable motion estimation processor for the H.264 video codec

Electron. Eng. Dept., Bristol Univ., Bristol
DOI: 10.1109/FPL.2008.4629923
In proceedings of: International Conference on Field Programmable Logic and Applications, 2008 (FPL 2008)
Source: IEEE Xplore

ABSTRACT This work presents a programmable, configurable motion estimation processor for the H.264 video coding standard, capable of handling the processing requirements of high definition (HD) video and suitable for FPGA implementation. The programmable aspect of the processor follows the ASIP (application-specific instruction set processor) approach, with an instruction set targeted at accelerating block matching motion estimation algorithms. Configurability refers to the ability to optimize the microarchitecture for the selected algorithm and performance requirements by varying the number and type of execution units at compile time.
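As a rough illustration of the kind of block matching computation such an instruction set is meant to accelerate, the sketch below performs an exhaustive (full-search) match of one block using the sum of absolute differences (SAD) as the cost. This is a minimal sketch under stated assumptions, not the processor's algorithm or instruction set: the function names `sad` and `full_search`, the use of SAD as the matching cost, and the frame layout (lists of rows of integer luma values) are all assumptions made for illustration.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` whose
    top-left corner is (bx, by) and the block of `ref` displaced by (dx, dy).
    Hypothetical helper, not taken from the paper."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Test every displacement within +/- search_range and return
    (best_dx, best_dy, best_sad), or None if no candidate fits.
    Candidates that fall outside the reference frame are skipped."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x0, y0 = bx + dx, by + dy
            if x0 < 0 or y0 < 0 or x0 + n > width or y0 + n > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            # Strict comparison keeps the first candidate found on ties.
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best
```

Fast block matching algorithms visit only a subset of these candidate displacements, so the inner SAD computation stays the same while the search pattern changes.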

  • ABSTRACT: This paper presents a flexible and scalable motion estimation processor capable of supporting the processing requirements for high-definition (HD) video using the H.264 Advanced Video Codec, which is suited for FPGA implementation. Unlike most previous work, our core is optimized to execute all existing fast block matching algorithms, which we show to match or exceed the inter-frame prediction performance of traditional full-search approaches at the HD resolutions commonly in use today. Using our development tools, such algorithms can be described using a C-style syntax which is compiled into our custom instruction set. We show that different HD sequences exhibit different characteristics which necessitate a flexible and configurable solution when targeting embedded applications. This is supported in our core and toolset by allowing designers to modify the number of functional units to be instantiated. All processor instances remain binary compatible so recompilation of the motion estimation algorithm is not required. Due to this optimization process, it is possible to match the processing requirements of the selected motion estimation algorithm to the hardware microarchitecture leading to a very efficient implementation.
    IEEE Transactions on Very Large Scale Integration (VLSI) Systems 01/2012; 20:437-448. · 1.22 Impact Factor
  • ABSTRACT: This paper proposes Integer-pel Motion Estimation (IME) specific instructions and their hardware architecture for a Motion Estimation Specific Instruction-set Processor (MESIP). With pattern information using the pixel distance, the proposed IME instruction efficiently supports fast search algorithms. The proposed MESIP has been verified with the Synopsys Processor Designer and implemented with the Design Compiler using the IBM 90 nm process technology. The gate count is about 25.5K gates for each Processing Element Group (PEG), which has 128 SAD PEs. The total hardware size is about 453K gates and the operating frequency is 188 MHz for real-time 1080p video at 30 frames per second. MESIP reduces the hardware size by about 26% and the number of operation cycles by about 18% compared with the prior version of MESIP, and is comparable to existing ASICs.
    Circuits and Systems (ISCAS), 2012 IEEE International Symposium on; 01/2012
  • ABSTRACT: This paper proposes an efficient loop accelerator for a Motion Estimation Specific Instruction-set Processor (MESIP). ME algorithms by nature contain multiple complex loop operations. To support efficient hardware (HW) loop operations, this paper introduces four loop instructions and their specific HW architecture. The simulation results show that the proposed loop accelerator reduces average instruction cycles by about 29% for ME early-termination schemes compared with a typical implementation that combines compare and conditional jump instructions. The proposed loop accelerator of MESIP can significantly reduce the number of program memory accesses and the power consumption. Hence, it is well suited to low-power and flexible ME implementations.
    Circuits and Systems (ISCAS), 2013 IEEE International Symposium on; 01/2013