Ishaan L. Dalal’s research while affiliated with University of Texas at Austin and other places


Publications (14)


A PARALLELIZED QUASI-MONTE CARLO ALGORITHM FOR THE EXTRACTION OF PARTIAL INDUCTANCES IN IC INTERCONNECT STRUCTURES
  • Article

January 2010 · 23 Reads

D Stefan · I Dalal · [...]

This paper presents a parallelized Quasi-Monte Carlo algorithm for the extraction of partial inductances in IC interconnect structures. Quasi-random numbers provide superior convergence over pseudo-random numbers, and results are presented for three quasi-random number sequences: Halton, Sobol and Niederreiter. The algorithm has been parallelized, and almost linear scaling with parallelization is achieved.


FPGA-based SoC for real-time network intrusion detection using counting bloom filters

April 2009 · 93 Reads · 25 Citations

Computers face an ever-increasing number of threats from hackers, viruses and other malware; effective Network Intrusion Detection (NID) before a threat affects end-user machines is critical for both financial and national security. As the number of threats and network speeds increase (over 1 gigabit/sec), users of conventional software-based NID methods must choose between protection and higher data rates. To address this shortcoming, we have designed a hardware-based NID system-on-a-chip (SoC) using data structures called Counting Bloom Filters (CBFs). Our design has extremely high throughput (up to 3.3 gigabits/sec), can successfully detect and mitigate known threats, and is, to our knowledge, the only CBF-based NID SoC to be implemented on a Virtex 4 FPGA. In this project, we present the first optimized CBF-based NID FPGA SoC implemented on a Virtex 4 FPGA; our design is scalable through further parallelization and, to our knowledge, is one of the highest-throughput NID systems in existence.
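For readers unfamiliar with the data structure named in this abstract, the following is a minimal software sketch of a counting Bloom filter. It is not the authors' hardware design: the class name, the counter count, the number of hashes and the use of salted SHA-256 are all assumptions chosen for readability (an FPGA design would likely use far cheaper hash functions).

```python
import hashlib


class CountingBloomFilter:
    """Counting Bloom filter: an array of small counters instead of single bits.

    Counters (rather than bits) allow items to be removed again, e.g. when a
    threat signature is retired. All parameters here are illustrative.
    """

    def __init__(self, num_counters=1024, num_hashes=4):
        self.m = num_counters
        self.k = num_hashes
        self.counters = [0] * num_counters

    def _indexes(self, item: bytes):
        # Derive k counter indexes by salting SHA-256 with the hash number.
        # This is an assumption for the sketch, not the paper's hash scheme.
        for i in range(self.k):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: bytes) -> None:
        for idx in self._indexes(item):
            self.counters[idx] += 1

    def remove(self, item: bytes) -> None:
        # Only decrement when the item appears present, so no counter
        # can drop below zero.
        if item in self:
            for idx in self._indexes(item):
                self.counters[idx] -= 1

    def __contains__(self, item: bytes) -> bool:
        # May return a false positive, never a false negative.
        return all(self.counters[idx] > 0 for idx in self._indexes(item))
```

In an intrusion-detection setting, known signatures would be inserted with `add`, and packet substrings tested with `in`; a hit only indicates a probable match and would normally be confirmed by an exact comparison.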


Low Discrepancy Sequences for Monte Carlo Simulations on Reconfigurable Platforms

August 2008 · 80 Reads · 52 Citations

Low-discrepancy sequences, also known as “quasi-random” sequences, are numbers that are better equidistributed in a given volume than pseudo-random numbers. Evaluation of high-dimensional integrals is commonly required in scientific fields as well as other areas (such as finance), and is performed by stochastic Monte Carlo simulations. Simulations which use quasi-random numbers can achieve faster convergence and better accuracy than simulations using conventional pseudo-random numbers. Such simulations are called Quasi-Monte Carlo. Conventional Monte Carlo simulations are increasingly implemented on reconfigurable devices such as FPGAs due to their inherently parallel nature. This has not been possible for Quasi-Monte Carlo simulations because, to our knowledge, no low-discrepancy sequences have been generated in hardware before. We present FPGA-optimized scalable designs to generate three different common low-discrepancy sequences: Sobol, Niederreiter and Halton. We implement these three generators on Virtex-4 FPGAs with varying degrees of fine-grained parallelization, although our ideas can be applied to a far broader class of sequences. We conclude with results from the implementation of an actual Quasi-Monte Carlo simulation for extracting partial inductances from integrated circuits.
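As a rough software illustration of the idea in this abstract, the sketch below generates a two-dimensional Halton sequence (via the radical-inverse construction) and uses it for a toy Quasi-Monte Carlo integration next to a pseudo-random estimate. The integrand (area of a quarter disc), the bases (2, 3), the sample count and the function names are assumptions for the example; the paper's FPGA generators and its inductance-extraction integrand are not reproduced here.

```python
import random


def halton(index: int, base: int) -> float:
    """Radical inverse of `index` in `base`: one coordinate of a Halton point."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton_point(index: int, bases=(2, 3)):
    """A multi-dimensional Halton point uses one coprime base per dimension."""
    return tuple(halton(index, b) for b in bases)


def quarter_disc_fraction(points) -> float:
    """Fraction of points in [0,1]^2 that fall inside the unit quarter disc."""
    points = list(points)
    inside = sum(1 for x, y in points if x * x + y * y <= 1.0)
    return inside / len(points)


n = 4096  # illustrative sample count

# Quasi-Monte Carlo estimate of pi (index 0 is skipped because it maps to the origin).
qmc_pi = 4 * quarter_disc_fraction(halton_point(i) for i in range(1, n + 1))

# Conventional Monte Carlo estimate with a seeded pseudo-random generator.
rng = random.Random(0)
mc_pi = 4 * quarter_disc_fraction((rng.random(), rng.random()) for _ in range(n))

print("QMC estimate:", qmc_pi)
print("MC estimate: ", mc_pi)
```

For a fixed number of samples, the quasi-random estimate is typically closer to the true value, which is the convergence advantage the abstract refers to.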


On the fast generation of long-period pseudorandom number sequences

June 2008 · 48 Reads · 4 Citations

Monte Carlo simulations and other scientific applications that depend on random numbers are increasingly implemented in parallel configurations in programmable hardware. High-quality pseudo-random number generators (PRNGs), such as the Mersenne Twister, are based on binary linear recurrence equations. They have extremely long periods (more than 2^1024 numbers generated before the entire sequence repeats) and well-proven statistical properties. Many software implementations of such ‘long-period’ PRNGs exist, but hardware implementations are rare. We develop optimized, resource-efficient parallel architectures for long-period PRNGs that generate multiple independent streams by exploiting the underlying algorithm as well as hardware-specific architectural features.


A hardware framework for the fast generation of multiple long-period random number streams

February 2008 · 44 Reads · 24 Citations

Stochastic simulations and other scientific applications that depend on random numbers are increasingly implemented in a parallelized manner in programmable logic. High-quality pseudo-random number generators (PRNGs), such as the Mersenne Twister, are often based on binary linear recurrences and have extremely long periods (more than 2^1024). Many software implementations of such PRNGs exist, but hardware implementations are rare. We have developed an optimized, resource-efficient parallel framework for this class of random number generators that exploits the underlying algorithm as well as FPGA-specific architectural features. The framework also incorporates fast “jump-ahead” capability for these PRNGs, allowing simultaneous, independent sub-streams to be generated in parallel by partitioning one long-period pseudo-random sequence. We demonstrate parallelized implementations of three types of PRNGs – the 32-, 64- and 128-bit SIMD Mersenne Twister – on Xilinx Virtex-II Pro FPGAs. Their area/throughput performance is impressive: for example, compared clock-for-clock with a previous FPGA implementation, a “two-parallelized” 32-bit Mersenne Twister uses 41% fewer resources. It can also scale to 350 MHz for a throughput of 22.4 Gbps, which is 5.5x faster than the older FPGA implementation and 7.1x faster than a dedicated software implementation. The quality of generated random numbers is verified with the standard statistical test batteries diehard and TestU01. We also present two real-world application studies with multiple RNG streams: the Ziggurat method for generating normal random variables and a Monte Carlo photon-transport simulation. The availability of fast long-period random number generators with multiple streams accelerates hardware-based scientific simu-
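The “jump-ahead” idea mentioned in this abstract can be sketched in software for any generator whose state update is linear over GF(2). The example below uses Marsaglia's 32-bit xorshift as a small stand-in (an assumption made to keep the sketch short; it is not the Mersenne Twister and not the authors' hardware method). Because one step is a linear map T on the 32 state bits, advancing k steps equals applying T^k, which square-and-multiply computes in O(log k) matrix products. The function names and the stride are hypothetical.

```python
MASK = 0xFFFFFFFF  # keep the state within 32 bits


def step(x: int) -> int:
    """One xorshift32 update (shifts 13, 17, 5); linear over GF(2)."""
    x ^= (x << 13) & MASK
    x ^= x >> 17
    x ^= (x << 5) & MASK
    return x


def apply(matrix, v: int) -> int:
    """Multiply a GF(2) matrix (list of 32 column words) by the bit vector v."""
    out, i = 0, 0
    while v:
        if v & 1:
            out ^= matrix[i]  # add (XOR) column i when bit i of v is set
        v >>= 1
        i += 1
    return out


def multiply(a, b):
    """Matrix product a*b: column i of the result is a applied to column i of b."""
    return [apply(a, col) for col in b]


def transition_power(k: int):
    """Compute T^k by square-and-multiply, where T is the one-step matrix."""
    result = [1 << i for i in range(32)]      # identity matrix
    base = [step(1 << i) for i in range(32)]  # columns of T: images of unit vectors
    while k:
        if k & 1:
            result = multiply(base, result)
        base = multiply(base, base)
        k >>= 1
    return result


def jump(state: int, k: int) -> int:
    """Return the state reached after k steps without iterating k times."""
    return apply(transition_power(k), state)


# Partition one sequence into four sub-streams whose start points are
# `stride` steps apart (seed and stride are illustrative values).
seed, stride = 2463534242, 1 << 20
substream_starts = [jump(seed, i * stride) for i in range(4)]
```

Each sub-stream can then be advanced independently with `step`. For generators with very large states, such as the Mersenne Twister, a dense matrix of this kind becomes impractical, and jump-ahead is usually done with polynomial arithmetic instead.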



An Economical Multichannel Integrated Receiver/Reconstruction System for MRI

July 2006 · 7 Reads

A design is presented for a multichannel integrated receiver/reconstruction system for MRI that alleviates the concerns associated with commercial solutions regarding cost, scalability and the ability to access intermediate data. Up to 16 MR signals are sampled directly at RF, with digital down-conversion and image reconstruction performed by a single-chip field-programmable gate array (FPGA). Imaging researchers can readily access data from any point in the processing chain over a network, and the design can scale modularly in blocks of 8n or 16n channels.


A Low-Cost Scalable Multichannel Digital Receiver for Magnetic Resonance Imaging

February 2006 · 20 Reads · 4 Citations

Conference proceedings: ... Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Conference

Commercial receivers used for parallel MR imaging often present researchers with hurdles such as high cost-per-channel, low scalability for multiple coils and non-accessibility to intermediate data for research. A novel low-cost multichannel digital receiver for use with MR scanners has been developed to alleviate these concerns. MR signals from up to 16 coils are bandpass sampled at RF, with all subsequent downconversion performed on a single-chip Field-Programmable Gate Array (FPGA). Downconverted information is buffered and can be downloaded over a network or onto flash memory for image reconstruction. Arrays with more than 16 coils can easily scale by using more than one of these economical receivers. A 2-channel prototype has been designed, built and tested successfully with a combination of real-world signals and simulated MR data.
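The signal chain described in this abstract (bandpass sampling at RF followed by digital downconversion) can be illustrated with a toy single-tone example. The sketch below is not the receiver's actual design: the carrier frequency, sample rate, amplitude and function names are hypothetical values in arbitrary frequency units, and a plain average stands in for the low-pass/decimation filter a real downconverter would use.

```python
import cmath
import math


def alias_frequency(carrier: float, sample_rate: float) -> float:
    """Frequency at which an undersampled (bandpass-sampled) carrier appears."""
    folded = carrier % sample_rate
    return min(folded, sample_rate - folded)


def downconvert(samples, mix_freq: float, sample_rate: float) -> complex:
    """Mix real samples with a complex exponential, then average.

    The average acts as a crude low-pass filter that keeps only the
    component shifted to 0 Hz.
    """
    total = 0j
    for n, x in enumerate(samples):
        total += x * cmath.exp(-2j * math.pi * mix_freq * n / sample_rate)
    return total / len(samples)


# Hypothetical numbers, not taken from the paper: the carrier lies above the
# sample rate, so it is deliberately undersampled and folds down to a low frequency.
carrier, sample_rate, amplitude = 70.0, 40.0, 0.8
samples = [
    amplitude * math.cos(2 * math.pi * carrier * n / sample_rate)
    for n in range(64)
]

f_alias = alias_frequency(carrier, sample_rate)
baseband = downconvert(samples, f_alias, sample_rate)

# A real tone splits its energy between +f and -f, hence the factor of 2.
recovered_amplitude = 2 * abs(baseband)
print(f_alias, recovered_amplitude)
```

Only the magnitude is recovered here; depending on which Nyquist zone the carrier falls in, undersampling can invert the spectrum, which a practical receiver would have to account for when recovering phase.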


A reconfigurable FPGA-based 16-channel front-end for MRI

January 2006 · 11 Reads · 13 Citations

Circuits, Systems and Computers, 1977. Conference Record. 1977 11th Asilomar Conference on

Parallel MRI acquisitions are generally reconstructed off-line on PCs and computer clusters. Building upon an existing multichannel digital receiver, we present an innovative single-FPGA front-end that performs real-time 2D-FFT reconstruction from arrays of up to 16 coils. Partial reconfiguration enables rapid switching of FPGA modules for maximal flexibility and lower hardware cost. After an acquisition has been buffered, more complicated parallel MRI reconstruction techniques can replace receiver logic. Alternatively, the front-end can be used as a hardware-accelerated reconstruction engine with other receivers or for PCs. As proof-of-concept for real-time non-Cartesian reconstruction, a reconfigurable module for next-neighbor regridding of spiral MRI is also demonstrated.


An Area-efficient Hardware Implementation of the CryptMT Stream Cipher

14 Reads

Fast stream ciphers are used extensively for encrypted data transmission in mobile networks and over multi-gigabit links. CryptMT, a recently proposed stream cipher, is one of the final candidates for standardization by the European Union’s eSTREAM project. Cryptanalysis of CryptMT has discovered no attacks or vulnerabilities thus far. We present the first hardware implementation of CryptMT on a field-programmable gate array (FPGA). On the Xilinx Virtex-2 Pro FPGA, throughputs of up to 16 Gbits/s can be obtained while using minimal logic resources (378 slices). This is highly area-efficient compared to implementations of ciphers such as AES and RC4. Possibilities for parallelization and scaling to higher throughputs are also discussed.


Citations (9)


... For a specific application this data analysis unit can be made on dedicated unit such as a FPGA [39]. However, it is also common to use a computer to process the data as it gives more flexibility to implement further data processing. ...

Reference:

Portable nuclear magnetic resonance spectroscopy probe
A Low-Cost Scalable Multichannel Digital Receiver for Magnetic Resonance Imaging
  • Citing Conference Paper
  • August 2006

... The reconstruction time of GRAPPA increases quadratically with the number of channels [8]. Large computation and memory requirements due to higher channel count also limit the efficiency and scalability of GRAPPA and other pMRI techniques on the computational platforms such as FPGAs and GPUs [9][10][11][12]. ...

A reconfigurable FPGA-based 16-channel front-end for MRI
  • Citing Conference Paper
  • January 2006

Circuits, Systems and Computers, 1977. Conference Record. 1977 11th Asilomar Conference on

... To satisfy such computation-hungry applications effectively, different platforms are used. These platforms may consist of computation cores, general purpose central processing unit (CPU), general purpose graphics processing unit (GPU), field programmable gate arrays (FPGAs), and combination of those [4][5][6][7][8][9][10][11]. Each technology has its advantages and limitations. ...

A Reconfigurable Real-time Reconstruction Engine for Parallel MRI
  • Citing Article

... This coding method combines AAC and Spectral Band Replication (harmonic redundancy in the frequency domain) and since its second version it includes the parametric stereo feature. The codec can be very usefully implemented for the compression of streaming data [1]. It can operate at very low bit-rates and is mainly used for internet radio streaming. ...

A Real-time AAC-type Audio Codec on the 16-bit dsPIC Architecture
  • Citing Article

... As a result, Intrusion Detection systems implement the Bloom Filter to efficiently and effectively identify possible security attacks. Hence, using an enhanced Bloom Filter is very important to protect the billions of devices linked to the IoT from malicious attacks [25,26]. ...

FPGA-based SoC for real-time network intrusion detection using counting bloom filters
  • Citing Conference Paper
  • April 2009

... In quasi-random or low-discrepancy sampling, the position of sampling points is based on low-discrepancy sequences (also called quasi-random or subrandom sequences). These sequences represent numbers that are better equidistributed than pseudo-random numbers (Dalal et al., 2008). To construct higher-dimension low discrepancy, as in the case of two-dimensional sampling design, several one-dimensional sequences are combined in a component-wise manner, that is, that the x and y coordinates of a two-dimensional area are constructed by pairing consecutive numbers of two different low discrepancy series in an [0,1] × [0,1] space and then adjusted to the actual spatial extent of the area to be sampled. ...

On the fast generation of long-period pseudorandom number sequences
  • Citing Conference Paper
  • June 2008

... The generation of random numbers involves the following repeated steps: first, the state of the PRNG is updated according to Equation (1); then, Equation (2) extracts the random number from the state . To guarantee statistical randomness, existing FPGA-based PRNGs [10,32,51] usually implement the state transition with a large state space, requiring block RAMs (BRAMs) in the FPGAs to be used as storage for state processing. The output stage usually includes bitwise operations such as truncation or permutation to increase the unpredictability of the sequence. ...

A hardware framework for the fast generation of multiple long-period random number streams
  • Citing Conference Paper
  • February 2008

... In some of these methods, computation-intensive image processing is realized by a host computer that is equipped with multi-core general-purpose central processing units (CPUs) or graphics processing units (GPUs) [4,5]. In others, special image processing boards are built upon digital signal processing (DSP) chips [6,7,8] that include application-specific integrated circuit (ASIC) or Field Programmable Gate Array (FPGA) chips [10-19]. Among these hardware devices, an FPGA offers the characteristics that are most desirable and suitable for real-time data processing. ...

A Low-Cost Scalable Multichannel Digital Receiver for Magnetic Resonance Imaging
  • Citing Article
  • February 2006

Conference proceedings: ... Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Conference

... An improvement on the Monte-Carlo sampling methodologies is the low discrepancy sampling (LDS) methods, coupled with the QMC algorithm, as shown in [30,31]. Discrepancy is a measure of the deviation of sampled points from the uniform distribution [32]. Consider a number of points N_R from a sequence {θ_i}, for i = 1, …, N, in an n-dimensional rectangle R centred upon an origin 0, whose sides are parallel to the coordinate axis, which is a subset of I^n: R ⊆ I^n, where R is attached with a measure. ...

Low Discrepancy Sequences for Monte Carlo Simulations on Reconfigurable Platforms
  • Citing Conference Paper
  • August 2008