A high-performance/low-latency vector rotational CORDIC architecture based on extended elementary angle set and trellis-based searching schemes

Dept. of Electr. Eng., Nat. Central Univ., Chung-li, Taiwan
IEEE Transactions on Circuits and Systems II Analog and Digital Signal Processing 10/2003; DOI: 10.1109/TCSII.2003.816923
Source: IEEE Xplore

ABSTRACT The coordinate rotational digital computer (CORDIC) algorithm is a well-known iterative method for the computation of vector rotation. For applications that require forward rotation (or vector rotation) only, the angle recoding (AR) technique provides a relaxed approach to speed up the operation of the CORDIC algorithm. In this paper, we further apply the concept of AR technique to extend the elementary angle set in the microrotation phase. This technique is called the extended elementary-angle set (EEAS) scheme. The proposed EEAS scheme provides a more flexible way of decomposing the target rotation angle in CORDIC operation, and its quantization error performance is better than the AR technique. Meanwhile, to solve the optimization problem encountered in the EEAS scheme, we also proposed a novel search algorithm, called the trellis-based searching (TBS) algorithm. Compared with the greedy algorithm used in the conventional AR technique, the proposed TBS algorithm yields apparent signal-to-quantization-noise ratio (SQNR) improvement. Moreover, in the scaling phase of the EEAS-based CORDIC algorithm, we suggest a novel scaling operation, called Extended Type-II (ET-II) scaling operation. The ET-II scaling operation applies the same design concepts as the EEAS scheme. It results in much smaller quantization error than conventional Type-I scaling operation in the numerical approximation of scaling factor. By combining the aforementioned new schemes, the proposed EEAS-based CORDIC algorithm can improve the overall SQNR performance by up to 25 dB compared with previous works. Also, given the same target SQNR performance, we require only about 66% iteration number in the iterative CORDIC structure, or use 66% hardware complexity in the parallel CORDIC structure compared with conventional AR technique. Hence, high-performance/low-latency CORDIC very large-scale integration architectures can be achieved without degrading the SQNR performance.

  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: Mixed-scaling-rotation (MSR) coordinate rotation digital computer (CORDIC) is an attractive approach to synthesizing complex rotators. This paper presents the fixed-point error analysis and parameter selections of MSR-CORDIC with applications to the fast Fourier transform (FFT). First, the fixed-point mean squared error of the MSR-CORDIC is analyzed by considering both the angle approximation error and signal round-off error incurred in the finite precision arithmetic. The signal to quantization noise ratio (SQNR) of the output of the FFT synthesized using MSR-CORDIC is thereafter estimated. Based on these analyses, two different parameter selection algorithms of MSR-CORDIC are proposed for general and dedicated MSR-CORDIC structures. The proposed algorithms minimize the number of adders and word-length when the SQNR of the FFT output is constrained. Design examples show that the FFT designed by the proposed method exhibits a lower hardware complexity than existing methods.
    IEEE Transactions on Signal Processing 12/2012; 60(12):6245-6256. · 2.81 Impact Factor
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: Rotation of vectors through fixed and known angles has wide applications in robotics, digital signal processing, graphics, games, and animation. But, we do not find any optimized coordinate rotation digital computer (CORDIC) design for vector-rotation through specific angles. Therefore, in this paper, we present optimization schemes and CORDIC circuits for fixed and known rotations with different levels of accuracy. For reducing the area- and time-complexities, we have proposed a hardwired pre-shifting scheme in barrel-shifters of the proposed circuits. Two dedicated CORDIC cells are proposed for the fixed-angle rotations. In one of those cells, micro-rotations and scaling are interleaved, and in the other they are implemented in two separate stages. Pipelined schemes are suggested further for cascading dedicated single-rotation units and bi-rotation CORDIC units for high-throughput and reduced latency implementations. We have obtained the optimized set of micro-rotations for fixed and known angles. The optimized scale-factors are also derived and dedicated shift-add circuits are designed to implement the scaling. The fixed-point mean-squared-error of the proposed CORDIC circuit is analyzed statistically, and strategies for reducing the error are given. We have synthesized the proposed CORDIC cells by Synopsys Design Compiler using TSMC 90-nm library, and shown that the proposed designs offer higher throughput, less latency and less area-delay product than the reference CORDIC design for fixed and known angles of rotation. We find similar results of synthesis for different Xilinx field-programmable gate-array platforms.
    IEEE Transactions on Very Large Scale Integration (VLSI) Systems 02/2013; 21(2):217-228. · 1.22 Impact Factor
  • [Show abstract] [Hide abstract]
    ABSTRACT: This paper focuses on developing an area efficient hyperbolic Coordinate Rotation Digital Computer (CORDIC) algorithm with performance improvement. The algorithm eliminates the need of scale factor calculation in the Range of Convergence (ROC). At the same time the range of convergence offered is higher than the conventional CORDIC ROC in the hyperbolic rotation mode. Being the only kind of algorithm in hyperbolic rotation with sign sequence μ = 1 always, one complete operation requires just 5 iterations. Thus the pipelined implementation has 5 stages which provides a 50% increase in throughput in comparison to conventional CORDIC. As far as the area improvement is considered, 16-bit processor can be realized using 56% less number of full adders required by Flat-CORDIC. The x and y datapath are based on series expansion of hyperbolic functions. The complete algorithm design along with pipelined architecture implementation is detailed.
    Journal of Signal Processing Systems 01/2013; 70(1). · 0.55 Impact Factor