-
ABSTRACT: The design and implementation details of a 4-bit time-interleaved successive approximation register (SAR) analog-to-digital converter (ADC) for UWB applications are presented in this paper. The major contribution of this paper is a novel digital-to-analog converter (DAC) architecture that reduces the area required for capacitors by a factor of three, while the maximum error due to mismatch between capacitors is reduced by 33% compared to the architecture reported in the literature. The ADC is implemented in 0.18 µm CMOS technology and has a total power consumption of 17.6 mW at a sampling frequency of 500 MS/s for an input swing of 1 V peak to peak. The proposed SAR ADC gives an SNDR of 23.7 dB, an SFDR of 31.5 dB, and a THD of -32.2 dB at the Nyquist rate. The proposed ADC enables the input swing to be increased by 25% while maintaining the same figure of merit as a SAR ADC reported in the literature.
Quality Electronic Design, 2009. ASQED 2009. 1st Asia Symposium on; 08/2009
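As a rough illustration of the successive approximation principle named in the abstract above, the following sketch runs a 4-bit binary search against an ideal DAC. It is a behavioural model only: the function name `sar_convert`, the ideal DAC transfer, and the 1 V reference are assumptions for illustration and do not represent the paper's time-interleaved circuit or its capacitor DAC architecture.

```python
def sar_convert(vin, vref=1.0, bits=4):
    """Behavioural SAR conversion: test one bit per step, MSB first.

    Assumes an ideal DAC whose output for a code is code * vref / 2**bits.
    """
    code = 0
    for bit in reversed(range(bits)):
        trial = code | (1 << bit)          # tentatively set this bit
        dac_out = trial * vref / (1 << bits)
        if vin >= dac_out:                 # comparator decision
            code = trial                   # keep the bit
    return code


if __name__ == "__main__":
    for v in (0.0, 0.3, 0.5, 0.99):
        print(v, format(sar_convert(v), "04b"))
```

Each conversion takes exactly `bits` comparator decisions, which is why SAR converters are commonly time-interleaved when high sampling rates are needed.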
-
ABSTRACT: In this paper, novel ideas are proposed for designing and implementing the pipelined MB-OFDM UWB transmitter digital backend modules on an FPGA for a data rate of 200 Mbps. The digital backend modules are the scrambler, convolutional encoder, puncturer, interleaver, QPSK mapper, and OFDM block. The most critical block is the OFDM block because it contains a 128-point IFFT that has to work at a speed of 528 MHz. This is achieved in the proposed OFDM module by using a modified radix-2^4 SDF algorithm with extensive pipelining of LPM, without using a parallel architecture. In this way, the 528 MHz speed can be obtained with minimum hardware. The hardware complexity has also been significantly reduced by using constant coefficient canonical signed digit (CSD) multipliers, and accuracy has been improved by maintaining an internal word length of 13 bits, which is 7 bits more than the input. For the interleaver, the initial problem was the number of registers required when it is designed using bit mapping, which leads to thousands of registers in use. In the proposed interleaver, two different RAM banks working in tandem with different write and read addresses and clock rates are used to provide optimum results. The implementation has been performed on an ALTERA STRATIX III EP3SL50F484C2 FPGA, and the results obtained comply with the IEEE 802.15.3a standard.
Computing, Communication and Networking, 2008. ICCCn 2008. International Conference on; 01/2009
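To illustrate why canonical signed digit (CSD) multipliers reduce hardware, the sketch below recodes a constant into digits from {-1, 0, +1} with no two adjacent nonzero digits, then multiplies using only shifts and adds/subtracts. The helper names `to_csd` and `csd_multiply` are hypothetical, and this is a software illustration of the general CSD idea, not the paper's FPGA implementation.

```python
def to_csd(n):
    """Return the CSD digits of a non-negative integer, least significant first."""
    digits = []
    while n != 0:
        if n & 1:
            d = 2 - (n & 3)   # +1 when n % 4 == 1, -1 when n % 4 == 3
            n -= d            # clears the low bit (and forces the next digit to 0)
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits


def csd_multiply(x, digits):
    """Multiply x by the constant encoded in `digits` using shifts and add/subtract."""
    acc = 0
    for shift, d in enumerate(digits):
        if d == 1:
            acc += x << shift
        elif d == -1:
            acc -= x << shift
    return acc


if __name__ == "__main__":
    print(to_csd(7))                    # 7 = 8 - 1
    print(csd_multiply(13, to_csd(7)))  # 13 * 7
```

For the constant 7, plain binary needs three nonzero digits (4 + 2 + 1) while CSD needs two (8 - 1), so one adder is saved; the saving grows for constants with long runs of ones.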
-
ABSTRACT: In this paper, a novel scheme is proposed for the implementation of FPGA-based digital systems using an asynchronous pipelining technique. To control the asynchronous data flow between stages, an intelligent controller is designed that decides the delay of each stage depending on the magnitude of the input data (Data Dependent Delay). The intelligent controller was first designed using the NIOS II soft-core embedded processor in an ALTERA EP2C20F484C7 device. In this approach, however, the maximum operating frequency is limited by the excess logic elements consumed by the microcontroller and by the sequential execution of the C code. Hence, only the function of the NIOS processor that controls the asynchronous data flow has been retained and implemented as equivalent hardware, INTASYCON (INTelligent ASYnchronous CONtroller), using a hardware description language, and the speed of the circuit was evaluated. To verify the efficacy of the proposed approach, an 8×8 Braun array multiplier is implemented as external logic to the INTASYCON. The INTASYCON processor calculates the completion time of each stage (based on the logic depth) and accordingly activates the respective dual-edge-triggered flip-flops to transfer data from one stage to the next. This approach consumes less power and also avoids the need for global clock signals and their consequences, such as skew problems.
SoC Design Conference, 2008. ISOCC '08. International; 12/2008
-
ABSTRACT: Wave-pipelining enables a digital circuit to be operated at a higher frequency. In the literature, only trial-and-error and
manual procedures are adopted for the choice of the optimum value of clock frequency and clock skew between the input and
output registers of wave-pipelined circuits. One of the major contributions of this paper is the proposal for automating the
above procedure. A second contribution is the study of how logic depths determine the superiority of wave-pipelining over
pipelining with regard to power dissipation. For the study and implementation of wave-pipelined circuits, filters using the
distributed arithmetic algorithm are considered. In this paper, two automation schemes are proposed for the implementation
of the wave-pipelined filters on both Xilinx and Altera field programmable gate arrays (FPGAs). In the first scheme, a self-tuning
finite state machine (FSM) is used to choose the clock skew and clock period for I/O registers between the wave-pipelined
blocks. In the second approach, an on-chip soft-core processor is used to choose the clock skew and clock period. To test
the efficacy of the schemes proposed, filters with different taps are implemented using three schemes: synchronous pipelining,
sub-optimal wave-pipelining and no pipelining (i.e. using neither synchronous pipelining nor wave-pipelining). From the implementation
results, it is observed that wave-pipelined distributed arithmetic (DA) filters are faster by a factor of 1.31–1.61 compared
to non-pipelined DA filters. The synchronous pipelined DA filters are in turn faster by a factor of 1.73–3.27 compared to
the wave-pipelined DA filters. The increased speeds are achieved in the pipelined filters at the cost of an increase in the
number of slices by 15–33% and in the number of registers by 350–530%. To compare the power dissipation, both pipelined and
wave-pipelined DA filters are tested by operating them at the same frequency. For medium logic depths, the wave-pipelined
DA filters dissipate less power than pipelined filters.
Circuits Systems and Signal Processing 05/2008; 27(3):261-276. · 0.82 Impact Factor
-
ABSTRACT: Asynchronous pipelined circuits have many potential advantages over their synchronous equivalents, including lower power consumption and design reuse without compromising speed. In this paper, a new technique, the "SOPC based Asynchronous Pipelining Technique" (SOPC: System On Programmable Chip), is used for designing and implementing FPGA-based low-power VLSI systems. In this approach, the soft-core processor available within the FPGA is used to generate the control signals that govern the asynchronous data flow, in addition to its regular work as a processor. Importantly, the NIOS processor is also used to validate the results by comparing them with a known set of test vectors. This type of verification and validation of digital systems using NIOS provides higher speed and does not require any external hook-up wires or I/O cards. To verify the efficacy of the proposed approach, an 8-tap DCT using the Winograd algorithm is implemented as external logic to the NIOS processor. The intermediate data between asynchronous pipelined stages are latched using multiplexer-based latches. The completion of each stage is signalled to the NIOS processor using interrupts. In turn, the NIOS processor generates the control signals to pass on the intermediate data stored in the multiplexer-based latches. The designed system has been implemented in a STRATIX EP1S25F780C5 FPGA SOPC kit. The results are validated using the same NIOS processor. In the proposed system, intermediate data are stored in multiplexer-based latches instead of pipeline registers. Hence this approach obtains the speed of a pipelined DCT with comparably lower power consumption. This approach also avoids the need for global clock signals and their consequences, such as skew problems.
Microelectronics, 2007. ICM 2007. International Conference on; 01/2008
-
ABSTRACT: In this paper, techniques for efficient implementation of field-programmable gate-array (FPGA)-based wave-pipelined (WP) multipliers, accumulators, and filters are presented. A comparison of the performance of WP and pipelined systems has been made. Major contributions of this paper are development of an on-chip clock generation scheme which permits finer tuning of the frequency, a synthesis technique that reduces the area and latency by 25%, a placement utility that results in 10%-40% increase in speed and proposal of an interleaving scheme for filters that reduces the number of multipliers required by 50%. WP multipliers of size 2 × 6 and the filters using them are found to be 11% faster and require lower power than those using pipelined multipliers. Filters with higher order WP multipliers also operate with lower power at the cost of speed. The delay-register products of such filters are found to be about 60% lower than those using the pipelined multipliers. The paper also outlines applications of these techniques for the Spartan II FPGAs and a self-tuning scheme for optimizing the speed.
IEEE Transactions on Very Large Scale Integration (VLSI) Systems 08/2005; · 1.22 Impact Factor
-
ABSTRACT: In this paper, schemes for computation of the 2D DWT of 32 × 32 subimages using both the lifting and DAA techniques with a Baugh-Wooley multiplier (BWM) are proposed and implemented on Xilinx XC2S150PQ208-5 FPGAs. The implementation results show that the lifting scheme with BWM requires about 20% less area and is 1.55 times faster than that using a conventional 2s complement multiplier (C2M). For larger word sizes, the DAA with BWM is found to be 1.2 times faster than that using C2M. An overlap method for processing a 128 × 128 image using subimages of size 32 × 32 is proposed and implemented. The 2D DWT of the image is also computed using a C program. The LL1 components of the image obtained using all the above schemes are found to match the original image well. FPGA implementation of higher-level 2D DWT is in progress.
TENCON 2003. Conference on Convergent Technologies for Asia-Pacific Region; 11/2003
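The lifting idea referred to in the abstract above can be sketched in one dimension as a split, predict, and update sequence. The example uses the integer Haar wavelet purely as an assumption, since the abstract does not state which wavelet filter was used, and the function names are hypothetical. A 2D DWT would apply the same steps along rows and then columns.

```python
def haar_lift_forward(x):
    """One level of integer Haar lifting on an even-length list of ints."""
    even, odd = x[0::2], x[1::2]                            # split
    detail = [o - e for e, o in zip(even, odd)]             # predict odd from even
    approx = [e + (d >> 1) for e, d in zip(even, detail)]   # update even samples
    return approx, detail


def haar_lift_inverse(approx, detail):
    """Undo haar_lift_forward exactly by reversing the lifting steps."""
    even = [a - (d >> 1) for a, d in zip(approx, detail)]
    odd = [d + e for e, d in zip(even, detail)]
    x = [0] * (2 * len(even))
    x[0::2], x[1::2] = even, odd
    return x


if __name__ == "__main__":
    samples = [5, 7, 2, 9, 4, 4, 8, 1]
    a, d = haar_lift_forward(samples)
    print(a, d)
    print(haar_lift_inverse(a, d) == samples)
```

Because each lifting step is undone by the same expression with the opposite sign, the transform is exactly invertible in integer arithmetic, which is one reason lifting is attractive for hardware.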
-
ABSTRACT: A new parallel/serial convolver scheme with wavepipelining is
proposed first. The design of the wavepipelined (WP) convolver using
FPGAs is considered next. Convolvers with and without wavepipelining are
implemented using Xilinx XC4006E FPGAs for convolving two sequences each
with 8 bit accuracy and sequence length 8. The convolver without
wavepipelining requires 125 CLBs and permits a minimum sampling period
of 176 nsec. The WP convolver requires 217 CLBs and permits a minimum
sampling period of 92 nsec. Further the multipliers in the WP convolver
do not require the latches and an ASIC for a large WP convolver can
result in significant savings in area and power. Finally three schemes
for increasing the sampling rate of the WP convolver are suggested.
TENCON 2000. Proceedings; 02/2000
-
Neurology India 10/1999; 47(3):247-8. · 0.96 Impact Factor
-
Neurology India 07/1999; 47(2):157. · 0.96 Impact Factor
-
ABSTRACT: In ATM networks, several schemes have been proposed to shape the traffic in order to minimize network congestion and increase channel throughput. A completely satisfactory solution has not yet been obtained because of the conflicting requirements of accommodating complex variable bit rate traffic and supporting real-time call admission and congestion control mechanisms. The shift register traffic shaper (SRTS) scheme proposed by Radhakrishnan, Raghavan and Agrawala (see Comp. Net. & ISDN systems, Elsevier, vol. 28, p.453-69, 1996) performs better than the LB mechanism by incorporating multiple windows that accommodate different degrees of burstiness in the traffic. However, the parameters of the SRTS scheme are insensitive to the congestion level at the ATM node. In this paper, a two-pronged approach is suggested and studied to combat congestion. The ANN monitors the congestion level at the ATM node and generates the control signals to the sources. The SRTS scheme modifies its parameters in response to this signal to keep the congestion under control. Depending upon the control signal received, the SRTS minimizes the congestion either by splitting the sources contending for network access into groups or by modifying the window parameters of individual SRTS. To validate these approaches, an ANN is simulated and the optimum weight vector is obtained. An ATM node with traffic from sources with SRTS is simulated and the loss probability is obtained under different output burstiness. The results obtained confirm the effectiveness of staggering and of modifying the parameters of SRTS.
TENCON '98. 1998 IEEE Region 10 International Conference on Global Connectivity in Energy, Computer, Communication and Control; 02/1998
-
ABSTRACT: In this paper, a hybrid scheme is proposed for the implementation of two-level 2D DWT using the lifting scheme. Two filter banks consisting of two filters each are used for the implementation of both horizontal and vertical filters. Each of the filter banks is implemented using the lifting scheme. The individual lifting blocks are implemented using wavepipelining and are interconnected with registers. An automation procedure is proposed for tuning the parameters of the wavepipelined circuit. The multipliers in the lifting blocks are implemented using the Baugh-Wooley constant coefficient multiplier scheme (BW-KCM). To verify the efficacy of the proposed scheme, both one-level and two-level 2D DWT schemes for subimages of size 32×32 are implemented on a Xilinx XC2S150 device. The results for the hybrid scheme are compared with those obtained using non-pipelined and pipelined approaches. For the one-level 2D DWT, the hybrid scheme requires the same area but is faster than the non-pipelined scheme by a factor of 1.4. The pipelined scheme using the pipelined BW-PKCM is faster than the hybrid scheme by a factor of 1.2, at the cost of an increase in the number of registers by a factor of 2.73. The delay-power product of the hybrid scheme is lower than that of the pipelined scheme by a factor of 2. The two-level 2D DWT is implemented for both the pipelined and non-pipelined schemes. The implementation of the hybrid scheme for two-level 2D DWT is in progress. The technique proposed in this paper is also applicable to ASICs and to FPGAs from other vendors.