Parallel merged multiplier-accumulator coprocessor optimized for digital filters
In an attempt to improve the speed of VLSI signal processing systems, a new architecture for a high-speed multiply–accumulate (MAC) unit optimized for digital filters is proposed. This unit is designed as a coprocessor for the LEON2 RISC processor [LEON2 Processor; 2005 [Online]. ]. In this work, four parallel MAC units with two dual-port coefficient register-files, a three-port general register-file and a control unit are included in the coprocessing block. With the existence of four parallel units, several SIMD format instructions have been added to LEON2 instruction set. Each MAC unit has two 16-bit inputs, 32-bit output register and a programmable round-saturate block. The MAC unit uses a new architecture which embeds the accumulate module within the partial products summation tree of the multiplier with minimum overhead. A central control unit controls inputs of the four MACs and loading of the output registers. Our experimental results demonstrate a high performance in implementation of digital filters at elevated speeds of up to 33 millions of input samples per second in a 0.18μm technology.