R. Narayanan

Northwestern University, Evanston, IL, USA

Publications (6)

  • Conference Proceeding: Poster: A lung cancer mortality risk calculator based on SEER data
    ABSTRACT: We analyze the lung cancer data available from the SEER program for developing survival prediction models using data mining techniques. The prototype mortality risk calculator developed as a result of this study is available at info.eecs.northwestern.edu:8080/CancerMortalityRiskCalculator.
    Computational Advances in Bio and Medical Sciences (ICCABS), 2011 IEEE 1st International Conference on; 03/2011
  • Conference Proceeding: pFANGS: Parallel high speed sequence mapping for Next Generation 454-roche Sequencing reads
    ABSTRACT: Millions of DNA sequences (reads) are generated by Next Generation Sequencing machines every day. There is a need for high performance algorithms to map these sequences to the reference genome to identify single nucleotide polymorphisms or rare transcripts to fulfill the dream of personalized medicine. In this paper, we present a high-throughput parallel sequence mapping program, pFANGS. pFANGS is designed to find all the matches of a query sequence in the reference genome while tolerating a large number of mismatches or insertions/deletions. pFANGS partitions the computational workload and data among all the processes and employs load-balancing mechanisms to ensure better process efficiency. Our experiments show that, with 512 processors, we are able to map approximately 31 million 454/Roche queries of length 500 each to a reference human genome per hour, allowing 5 mismatches or insertions/deletions at full sensitivity. We also report and compare the performance results of two alternative parallel implementations of pFANGS: a shared memory OpenMP implementation and an MPI-OpenMP hybrid implementation.
    Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW), 2010 IEEE International Symposium on; 05/2010
  • Conference Proceeding: Design and Implementation of an FPGA Architecture for High-Speed Network Feature Extraction
    ABSTRACT: Network feature extraction involves the storage and classification of network packet activity. Although primarily employed in network intrusion detection systems, feature extraction is also used to determine various other aspects of a network's behavior such as total traffic and average connection size. Current software methods used for extraction of network features fail to meet the performance requirements of next-generation high-speed networks. In this paper, we propose an FPGA-based reconfigurable architecture for feature extraction of large high-speed networks. Our design makes use of parallel rows of hash functions and sketch tables in order to process network packets at a very high throughput. We present a detailed description of our architecture and its implementation on a Xilinx Virtex-II Pro FPGA board, and provide cycle-accurate timing results for feature extraction of input networking benchmark data. Our results demonstrate real-world throughputs of as high as 3.32 Gbps, with speedups reaching 18x when compared to an equivalent software implementation.
    Field-Programmable Technology, 2007. ICFPT 2007. International Conference on; 01/2008
  • Conference Proceeding: An FPGA Implementation of Decision Tree Classification
    ABSTRACT: Data mining techniques are a rapidly emerging class of applications that have widespread use in several fields. One important problem in data mining is classification, which is the task of assigning objects to one of several predefined categories. Among the several solutions developed, decision tree classification (DTC) is a popular method that yields high accuracy while handling large datasets. However, DTC is a computationally intensive algorithm, and as data sizes increase, its running time can stretch to several hours. In this paper, we propose a hardware implementation of decision tree classification. We identify the compute-intensive kernel (Gini score computation) in the algorithm, and develop a highly efficient architecture, which is further optimized by reordering the computations and by using a bitmapped data structure. Our implementation on a Xilinx Virtex-II Pro FPGA platform (with 16 Gini units) provides up to 5.58x performance improvement over an equivalent software implementation.
    Design, Automation & Test in Europe Conference & Exhibition, 2007. DATE '07; 05/2007
  • Conference Proceeding: An Architectural Characterization Study of Data Mining and Bioinformatics Workloads
    ABSTRACT: Data mining is the process of automatically finding implicit, previously unknown, and potentially useful information from large volumes of data. Advances in data extraction techniques have resulted in a tremendous increase in the input data size of data mining applications. Data mining systems, on the other hand, have been unable to maintain the same rate of growth. Therefore, there is an increasing need to understand the bottlenecks associated with the execution of these applications on modern architectures. In this paper, we present MineBench, a publicly available benchmark suite containing fifteen representative data mining applications belonging to various categories: classification, clustering, association rule mining and optimization. First, we highlight the uniqueness of data mining applications. Subsequently, we evaluate the MineBench applications on an 8-way shared memory (SMP) machine and analyze important performance characteristics such as L1 and L2 cache miss rates, branch misprediction rates
    Workload Characterization, 2006 IEEE International Symposium on; 11/2006
  • Article: MineBench: A Benchmark Suite for Data Mining Workloads
    ABSTRACT: Data mining constitutes an important class of scientific and commercial applications. Recent advances in data extraction techniques have created vast data sets, which require increasingly complex data mining algorithms to sift through them to generate meaningful information. The disproportionately slower rate of growth of computer systems has led to a sizeable performance gap between data mining systems and algorithms. The first step in closing this gap is to analyze these algorithms and understand their bottlenecks. With this knowledge, current computer architectures can be optimized for data mining applications. In this paper, we present MineBench, a publicly available benchmark suite containing fifteen representative data mining applications belonging to various categories such as clustering, classification, and association rule mining. We believe that MineBench will be of use to those looking to characterize and accelerate data mining workloads.
    IEEE Workload Characterization Symposium; 10/2006
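
The decision tree classification abstract above names Gini score computation as the compute-intensive kernel. As a rough illustration of what that kernel computes, here is a minimal software sketch using the standard textbook definition of Gini impurity. It is not the FPGA architecture from the paper, and the function names gini and gini_split are hypothetical, chosen only for this example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a collection of class labels: 1 - sum(p_i ** 2).

    Textbook definition; not taken from the paper's hardware design.
    """
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def gini_split(left, right):
    """Weighted Gini impurity of a binary split into two partitions."""
    n = len(left) + len(right)
    if n == 0:
        return 0.0
    return (len(left) / n) * gini(left) + (len(right) / n) * gini(right)


if __name__ == "__main__":
    # A split that separates the classes perfectly scores lower (better)
    # than one that leaves both partitions mixed.
    print(gini_split(["a", "a"], ["b", "b"]))
    print(gini_split(["a", "b"], ["a", "b"]))
```

A decision tree builder evaluates a score like this for every candidate split at every node, which is why the abstract singles it out as the part worth accelerating.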