Jignesh M. Patel

University of Wisconsin–Madison, Madison, Wisconsin, United States

Publications (100) · 63.6 Total Impact

  • Jason Power · Yinan Li · Mark D. Hill · Jignesh M. Patel · David A. Wood
    ABSTRACT: Analytic database workloads are growing in data size and query complexity. At the same time, computer architects are struggling to continue the meteoric increase in performance enabled by Moore's Law. We explore the impact of two emerging architectural trends that may help continue the Moore's Law performance trend for analytic database workloads, namely 3D die-stacking and tight accelerator-CPU integration, specifically with GPUs. GPUs have evolved from fixed-function units to programmable discrete chips, and are now integrated with CPUs in most manufactured chips. Past efforts to use GPUs for analytic query processing have not had widespread practical impact, but it is time to re-examine and re-optimize database algorithms for massively data-parallel architectures. We argue that high-throughput data-parallel accelerators are likely to play a big role in future systems, as they can be easily exploited by database systems and are becoming ubiquitous. Using the simple scan primitive as an example, we create a starting point for this discussion. We project the performance of both CPUs and GPUs in emerging 3D systems and show that the high-throughput data-parallel architecture of GPUs is more efficient in these future systems. We show that if database designers embrace emerging 3D architectures, there is potentially an order-of-magnitude gain in performance and energy efficiency.
    Article · May 2015 · ACM SIGMOD Record
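The scan primitive that the abstract above uses as its running example can be reduced to evaluating a predicate on every value of a column and collecting a result bitmap. The sketch below is purely illustrative; the function name and the bitmap encoding are assumptions, not the paper's code:

```python
def scan(column, predicate):
    # Evaluate a predicate over every value of one column and return a
    # bitmap of qualifying positions -- the core of an analytic scan.
    result = 0
    for i, value in enumerate(column):
        if predicate(value):
            result |= 1 << i
    return result

# Example: positions 1 and 3 hold values below 10.
mask = scan([10, 3, 25, 7], lambda v: v < 10)   # 0b1010
```

Data-parallel hardware speeds this up because each position can be tested independently of the others.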

  • Article · Aug 2014
  •
    ABSTRACT: Exploring the inherent technical challenges in realizing the potential of Big Data.
    Article · Jul 2014 · Communications of the ACM
  • Article: WideTable
    Yinan Li · Jignesh M. Patel

    Article · Jun 2014
  •
    ABSTRACT: Data center design is a tedious and expensive process. Recently, this process has become even more challenging as users of cloud services expect guaranteed levels of availability, durability, and performance. A new challenge for service providers is to find the most cost-effective data center design and configuration that will accommodate users' expectations on ever-changing workloads and constantly evolving hardware and software components. In this paper, we argue that data center design should become a systematic process. First, it should be done using an integrated approach that takes into account both the hardware and the software interdependencies, and their impact on users' expectations. Second, it should be performed in a "wind tunnel", which uses large-scale simulation to systematically explore the impact of a data center configuration on both the users' and the service providers' requirements. We believe that this is the first step towards systematic data center design, an exciting area for future research.
    Article · May 2014 · Proceedings of the VLDB Endowment
  • Spyros Blanas · Jignesh M. Patel
    ABSTRACT: High-performance analytical data processing systems often run on servers with large amounts of main memory. A common operation in such environments is combining data from two or more sources using some "join" algorithm. The focus of this paper is on studying hash-based and sort-based equi-join algorithms when the data sets being joined fully reside in main memory. We only consider a single node setting, which is an important building block for larger high-performance distributed data processing systems. A critical contribution of this work is in pointing out that in addition to query response time, one must also consider the memory footprint of each join algorithm, as it impacts the number of concurrent queries that can be serviced. Memory footprint becomes an important deployment consideration when running analytical data processing services on hardware that is shared by other concurrent services. We also consider the impact of particular physical properties of the input and the output of each join algorithm. This information is essential for optimizing complex query pipelines with multiple joins. Our key contribution is in characterizing the properties of hash-based and sort-based equi-join algorithms, thereby allowing system implementers and query optimizers to make a more informed choice about which join algorithm to use.
    Conference Paper · Oct 2013
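For readers unfamiliar with the hash-based equi-join the paper characterizes, a minimal single-threaded sketch is shown below. The names and tuple layout are assumptions for illustration; the paper's implementations are considerably more sophisticated:

```python
from collections import defaultdict

def hash_equi_join(build_rows, probe_rows, build_key, probe_key):
    # Build phase: hash the (ideally smaller) input on its join key.
    table = defaultdict(list)
    for row in build_rows:
        table[row[build_key]].append(row)
    # Probe phase: stream the other input and emit matching pairs.
    # The hash table built above is this algorithm's main memory
    # footprint -- the deployment consideration the paper highlights.
    return [(b, p)
            for p in probe_rows
            for b in table.get(p[probe_key], [])]
```

A sort-based equi-join would instead sort both inputs on the key and merge them, trading a different memory footprint and output ordering for the hash table.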
  • Craig Chasseur · Jignesh M. Patel
    ABSTRACT: Existing main memory data processing systems employ a variety of storage organizations and make a number of storage-related design choices. The focus of this paper is on systematically evaluating a number of these key storage design choices for main memory analytical (i.e. read-optimized) database settings. Our evaluation produces a number of key insights: First, it is always beneficial to organize data into self-contained memory blocks rather than large files. Second, both column-stores and row-stores display performance advantages for different types of queries, and for high performance both should be implemented as options for the tuple-storage layout. Third, cache-sensitive B+-tree indices can play a major role in accelerating query performance, especially when used in a block-oriented organization. Finally, compression can also play a role in accelerating query performance depending on data distribution and query selectivity.
    Article · Aug 2013 · Proceedings of the VLDB Endowment
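The row-store versus column-store trade-off described above can be made concrete with a toy pivot between the two tuple-storage layouts (the function and attribute names are illustrative, not from the paper):

```python
def rows_to_columns(rows, attr_names):
    # Pivot a row-major block (one tuple per entry) into a column-major
    # block (one array per attribute). A query that scans a single
    # attribute then reads only that array, which is why column-stores
    # win on narrow analytical scans while row-stores favor queries
    # that touch whole tuples.
    return {name: [row[i] for row in rows]
            for i, name in enumerate(attr_names)}

# Example: two tuples pivoted into per-attribute arrays.
cols = rows_to_columns([(1, "a"), (2, "b")], ["id", "tag"])
```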
  • Yinan Li · Jignesh M. Patel
    ABSTRACT: This paper focuses on running scans in a main memory data processing system at "bare metal" speed. Essentially, this means that the system must aim to process data at or near the speed of the processor (the fastest component in most system configurations). Scans are common in main memory data processing environments, and with state-of-the-art techniques it still takes many cycles per input tuple to apply simple predicates on a single column of a table. In this paper, we propose a technique called BitWeaving that exploits the parallelism available at the bit level in modern processors. BitWeaving operates on multiple bits of data in a single cycle, processing bits from different columns in each cycle. Thus, bits from a batch of tuples are processed in each cycle, allowing BitWeaving to drop the cycles per column to below one in some cases. BitWeaving comes in two flavors: BitWeaving/V, which resembles a columnar organization but at the bit level, and BitWeaving/H, which packs bits horizontally. In this paper we also develop the arithmetic framework that is needed to evaluate predicates using these BitWeaving organizations. Our experimental results show that both methods produce significant performance benefits over existing state-of-the-art methods, in some cases improving performance by over an order of magnitude.
    Conference Paper · Jun 2013
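The bit-level parallelism behind the horizontal flavor described above can be sketched with a classic SWAR (SIMD-within-a-register) trick: pack k-bit codes into one word with a spare "delimiter" bit per code, then evaluate a less-than predicate on all codes with a handful of word-level operations. This is an illustrative reconstruction of the general technique, not the paper's implementation:

```python
def pack(codes, k):
    # Pack k-bit codes into one integer, one code per (k+1)-bit slot,
    # leaving a zero "delimiter" bit above each code.
    word = 0
    for i, c in enumerate(codes):
        assert 0 <= c < (1 << k)
        word |= c << (i * (k + 1))
    return word

def less_than(X, Y, n, k):
    # Evaluate x_i < y_i for all n packed codes at once. Per slot,
    # (x | 2^k) - y keeps the delimiter bit set iff y <= x (no borrow
    # can cross slots), so complementing the result marks exactly the
    # slots where x < y.
    D = 0
    for i in range(n):
        D |= 1 << (i * (k + 1) + k)   # mask of delimiter-bit positions
    return ~((X | D) - Y) & D

# Example: compare four 3-bit codes against the constant 5 in one pass.
X = pack([3, 5, 0, 7], 3)
Y = pack([5, 5, 5, 5], 3)
hits = less_than(X, Y, 4, 3)   # delimiter bits set for codes 3 and 0
```

Each word-level operation here does the work of n per-tuple comparisons, which is how the cycles-per-column cost can fall below one.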
  •
    ABSTRACT: Data storage devices are getting "smarter." Smart flash storage devices (a.k.a. "Smart SSDs") are on the horizon; they package CPU processing and DRAM storage inside the device and make them available to run user programs. The focus of this paper is on exploring the opportunities and challenges associated with exploiting this functionality of Smart SSDs for relational analytic query processing. We have implemented an initial prototype of Microsoft SQL Server running on a Samsung Smart SSD. Our results demonstrate that significant performance and energy gains can be achieved by pushing selected query processing components inside the Smart SSDs. We also identify various changes that SSD device manufacturers can make to increase the benefits of using Smart SSDs for data processing applications, and suggest possible research opportunities for the database community.
    Conference Paper · Jun 2013
  • Jaeyoung Do · Donghui Zhang · Jignesh M. Patel · David J. DeWitt
    ABSTRACT: A promising use of flash SSDs in a DBMS is to extend the main memory buffer pool by caching selected pages that have been evicted from the buffer pool. Such a use has been shown to produce significant gains in the steady-state performance of the DBMS. One strategy for using the SSD buffer pool is to throw away the data in the SSD when the system is restarted (either when recovering from a crash or restarting after a shutdown), at the cost of a long “ramp-up” period to regain peak performance. One approach to eliminate this limitation is to use a memory-mapped file to store the SSD buffer table in order to be able to restore its contents on restart. However, this design can result in lower sustained performance, because every update to the SSD buffer table may incur an I/O operation to the memory-mapped file. In this paper we propose two new alternative designs. One design reconstructs the SSD buffer table using transactional logs. The other design asynchronously flushes the SSD buffer table and, upon restart, lazily verifies the integrity of the data cached in the SSD buffer pool. We have implemented these three designs in SQL Server 2012. For each design, both the write-through and write-back SSD caching policies were implemented. Using two OLTP benchmarks (TPC-C and TPC-E), our experimental results show that our designs shorten the peak-to-peak ramp-up interval by up to 3.8X with negligible loss in sustained performance; in contrast, the previous approach achieves a similar speedup but loses up to 54% of sustained performance.
    Conference Paper · Apr 2013
  • Article: WHAM
    Yinan Li · Jignesh M. Patel · Allison Terrell

    Article · Dec 2012 · ACM Transactions on Database Systems
  •
    ABSTRACT: Murine models are valuable instruments in defining the pathogenesis of diabetic nephropathy (DN), but they only partially recapitulate disease manifestations of human DN, limiting their utility. To define the molecular similarities and differences between human and murine DN, we performed a cross-species comparison of glomerular transcriptional networks. Glomerular gene expression was profiled in patients with early type 2 DN and in three mouse models (streptozotocin DBA/2, C57BLKS db/db, and eNOS-deficient C57BLKS db/db mice). Species-specific transcriptional networks were generated and compared with a novel network-matching algorithm. Three shared human–mouse cross-species glomerular transcriptional networks containing 143 (Human-DBA STZ), 97 (Human-BKS db/db), and 162 (Human-BKS eNOS−/− db/db) gene nodes were generated. Shared nodes across all networks reflected established pathogenic mechanisms of diabetes complications, such as elements of Janus kinase (JAK)/signal transducer and activator of transcription (STAT) and vascular endothelial growth factor receptor (VEGFR) signaling pathways. In addition, novel pathways not previously associated with DN and cross-species gene nodes and pathways unique to each of the human–mouse networks were discovered. The human–mouse shared glomerular transcriptional networks will assist DN researchers in selecting mouse models most relevant to the human disease process of interest. Moreover, they will allow identification of new pathways shared between mice and humans.
    Article · Nov 2012 · Diabetes
  •
    ABSTRACT: In this new era of "big data", traditional DBMSs are under attack from two sides. At one end of the spectrum, the use of document store NoSQL systems (e.g. MongoDB) threatens to move modern Web 2.0 applications away from traditional RDBMSs. At the other end of the spectrum, big data DSS analytics that used to be the domain of parallel RDBMSs is now under attack by another class of NoSQL data analytics systems, such as Hive on Hadoop. So, are the traditional RDBMSs, aka "big elephants", doomed as they are challenged from both ends of this "big data" spectrum? In this paper, we compare one representative NoSQL system from each end of this spectrum with SQL Server, and analyze the performance and scalability aspects of each of these approaches (NoSQL vs. SQL) on two workloads (decision support analysis and interactive data-serving) that represent the two ends of the application spectrum. We present insights from this evaluation and speculate on potential trends for the future.
    Article · Aug 2012 · Proceedings of the VLDB Endowment
  •
    ABSTRACT: Energy is a growing component of the operational cost for many "big data" deployments, and hence has become increasingly important for practitioners of large-scale data analysis who require scale-out clusters or parallel DBMS appliances. Although a number of recent studies have investigated the energy efficiency of DBMSs, none of these studies have looked at the architectural design space of energy-efficient parallel DBMS clusters. There are many challenges to increasing the energy efficiency of a DBMS cluster, including dealing with the inherent scaling inefficiency of parallel data processing, and choosing the appropriate energy-efficient hardware. In this paper, we experimentally examine and analyze a number of key parameters related to these challenges for designing energy-efficient database clusters. We explore the cluster design space using empirical results and propose a model that considers the key bottlenecks to energy efficiency in a parallel DBMS. This paper represents a key first step in designing energy-efficient database clusters, which is increasingly important given the trend toward parallel database appliances.
    Article · Aug 2012 · Proceedings of the VLDB Endowment
  • Spyros Blanas · Jignesh M. Patel
    ABSTRACT: We recently published a paper [2] that examines the design choices available to create a high-performance main-memory hash join algorithm. We experimentally evaluated four hash join variants on two different architectures, and we showed that an algorithm that does not do any partitioning on the input tables often outperforms the other, more complex partitioning-based join alternatives. Our claim is that in an environment with a single processor and multiple cores, the non-partitioning method has many advantages over the more complex methods that have been proposed before. If the memory access latency between different processors is non-uniform, partitioning will be more beneficial; the non-partitioning method could then be used as a building block for an efficient hash join algorithm for data that has been partitioned to each processor (NUMA node). A full exploration of hash join methods for NUMA environments is part of our future work. In our paper [2], one of the algorithms that we evaluated this simple hash join algorithm against was the radix-partitioned hash join [3]. We implemented the parallel radix-partitioned hash join algorithm that is described in [4]. During the conference, there were some questions regarding how efficient our radix join implementation is. Unfortunately, because of three differences
    Article · May 2012
  • Willis Lang · Srinath Shankar · Jignesh M. Patel · Ajay Kalhan
    ABSTRACT: As traditional and mission-critical relational database workloads migrate to the cloud in the form of Database-as-a-Service (DaaS), there is an increasing motivation to provide performance goals in Service Level Objectives (SLOs). Providing such performance goals is challenging for DaaS providers as they must balance the performance that they can deliver to tenants and the data center's operating costs. In general, aggressively aggregating tenants on each server reduces the operating costs but degrades performance for the tenants, and vice versa. In this paper, we present a framework that takes as input the tenant workloads, their performance SLOs, and the server hardware that is available to the DaaS provider, and outputs a cost-effective recipe that specifies how much hardware to provision and how to schedule the tenants on each hardware resource. We evaluate our method and show that it produces effective solutions that can reduce the costs for the DaaS provider while meeting performance goals.
    Article · Apr 2012
  • Avrilia Floratou · Jignesh M. Patel · Willis Lang · Alan Halverson
    ABSTRACT: The current computing trend towards cloud-based Database-as-a-Service (DaaS) as an alternative to traditional on-site relational database management systems (RDBMSs) has largely been driven by the perceived simplicity and cost-effectiveness of migrating to a DaaS. However, customers that are attracted to these DaaS alternatives may find that the range of different services and pricing options available to them adds an unexpected level of complexity to their decision making. Cloud service pricing models are typically 'pay-as-you-go', in which the customer is charged based on resource usage such as CPU and memory utilization. Thus, customers considering different DaaS options must take into account how the performance and efficiency of the DaaS will ultimately impact their monthly bill. In this paper, we show that the current DaaS model can produce unpleasant surprises. For example, the case study that we present in this paper illustrates a scenario in which a DaaS service powered by a DBMS that has a lower hourly rate actually costs more to the end user than a DaaS service that is powered by another DBMS that charges a higher hourly rate. Thus, what we need is a method for the end user to get an accurate estimate of the true costs that will be incurred without worrying about the nuances of how the DaaS operates. One potential solution to this problem is for DaaS providers to offer a new service called Benchmark as a Service (BaaS), wherein the user provides the parameters of their workload and SLA requirements and gets a price quote.
    Article · Jan 2012
  • Ning Zhang · Junichi Tatemura · Jignesh M. Patel · Hakan Hacıgümüş
    ABSTRACT: Data center operators face a bewildering set of choices when considering how to provision resources on machines with complex I/O subsystems. Modern I/O subsystems often have a rich mix of fast, high performing, but expensive SSDs sitting alongside with cheaper but relatively slower (for random accesses) traditional hard disk drives. The data center operators need to determine how to provision the I/O resources for specific workloads so as to abide by existing Service Level Agreements (SLAs), while minimizing the total operating cost (TOC) of running the workload, where the TOC includes the amortized hardware costs and the run time energy costs. The focus of this paper is on introducing this new problem of TOC-based storage allocation, cast in a framework that is compatible with traditional DBMS query optimization and query processing architecture. We also present a heuristic-based solution to this problem, called DOT. We have implemented DOT in PostgreSQL, and experiments using TPC-H and TPC-C demonstrate significant TOC reduction by DOT in various settings.
    Article · Dec 2011 · The VLDB Journal
  •
    ABSTRACT: A database system optimized for in-memory storage can support much higher transaction rates than current systems. However, standard concurrency control methods used today do not scale to the high transaction rates achievable by such systems. In this paper we introduce two efficient concurrency control methods specifically designed for main-memory databases. Both use multiversioning to isolate read-only transactions from updates but differ in how atomicity is ensured: one is optimistic and one is pessimistic. To avoid expensive context switching, transactions never block during normal processing, but they may have to wait before commit to ensure correct serialization ordering. We also implemented a main-memory optimized version of single-version locking. Experimental results show that while single-version locking works well when transactions are short and contention is low, performance degrades under more demanding conditions. The multiversion schemes have higher overhead but are much less sensitive to hotspots and the presence of long-running transactions.
    Article · Dec 2011 · Proceedings of the VLDB Endowment
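The multiversioning idea described above, isolating read-only transactions from concurrent updates, can be sketched as a timestamped version chain. This is a toy model under assumed names, not the paper's design:

```python
def visible_version(chain, read_ts):
    # chain: list of (begin_ts, end_ts, value) versions of one record,
    # with end_ts set to None for the current version. A read-only
    # transaction at timestamp read_ts sees the version that was valid
    # at that instant, so writers installing new versions never block
    # readers of old ones.
    for begin_ts, end_ts, value in chain:
        if begin_ts <= read_ts and (end_ts is None or read_ts < end_ts):
            return value
    return None  # the record did not exist at read_ts

# Example: an update at timestamp 20 closed version v1 and opened v2.
chain = [(10, 20, 'v1'), (20, None, 'v2')]
old = visible_version(chain, 15)   # a reader at ts 15 still sees 'v1'
```

The optimistic and pessimistic schemes in the paper differ in how update transactions validate and commit against such chains, not in how reads find a visible version.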
  • Avrilia Floratou · Sandeep Tata · Jignesh M. Patel
    ABSTRACT: Existing sequence mining algorithms mostly focus on mining for subsequences. However, a large class of applications, such as biological DNA and protein motif mining, require efficient mining of "approximate" patterns that are contiguous. The few existing algorithms that can be applied to find such contiguous approximate pattern mining have drawbacks like poor scalability, lack of guarantees in finding the pattern, and difficulty in adapting to other applications. In this paper, we present a new algorithm called FLexible and Accurate Motif DEtector (FLAME). FLAME is a flexible suffix-tree-based algorithm that can be used to find frequent patterns with a variety of definitions of motif (pattern) models. It is also accurate, as it always finds the pattern if it exists. Using both real and synthetic data sets, we demonstrate that FLAME is fast, scalable, and outperforms existing algorithms on a variety of performance metrics. In addition, based on FLAME, we also address a more general problem, named extended structured motif extraction, which allows mining frequent combinations of motifs under relaxed constraints.
    Article · Aug 2011 · IEEE Transactions on Knowledge and Data Engineering

Publication Stats

3k Citations
63.60 Total Impact Points

Institutions

  • 1996-2015
    • University of Wisconsin–Madison
      • Department of Computer Sciences
      Madison, Wisconsin, United States
  • 2009
    • Beijing Computational Science Research Center
      Beijing, China
  • 2000-2009
    • University of Michigan
      • Department of Electrical Engineering and Computer Science (EECS)
      • Division of Computer Science and Engineering
      Ann Arbor, Michigan, United States
  • 2007
    • Concordia University–Ann Arbor
      Ann Arbor, Michigan, United States