Luis Leopoldo Perez

Rice University, Houston, Texas, United States

Publications (7), 0 total impact

  • ACM Trans. Database Syst. 01/2011; 36:18.
  • ABSTRACT: Since the 1970s, database systems have been "compute-centric". When a computation needs data, it requests the data, and the data are pulled through the system. We believe that this is problematic for two reasons. First, requests for data naturally incur high latency as the data are pulled through the memory hierarchy. Second, it is difficult or impossible for multiple queries or operations that are interested in the same data to amortize the bandwidth and latency costs associated with their data access. In this paper, we describe a purely push-based research prototype database system called DataPath. DataPath is "data-centric". In DataPath, queries do not request data. Instead, data are automatically pushed onto processors, where they are then processed by any interested computation. We show experimentally on a multi-terabyte benchmark that this basic design principle makes for a very lean and fast database system.
    Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2010, Indianapolis, Indiana, USA, June 6-10, 2010; 01/2010
  • Luis Leopoldo Perez, Subi Arumugam, Christopher M. Jermaine
    Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2010, Indianapolis, Indiana, USA, June 6-10, 2010; 01/2010
  • ABSTRACT: Enterprises often need to assess and manage the risk arising from uncertainty in their data. Such uncertainty is typically modeled as a probability distribution over the uncertain data values, specified by means of a complex (often predictive) stochastic model. The probability distribution over data values leads to a probability distribution over database query results, and risk assessment amounts to exploration of the upper or lower tail of a query-result distribution. In this paper, we extend the Monte Carlo Database System to efficiently obtain a set of samples from the tail of a query-result distribution by adapting recent "Gibbs cloning" ideas from the simulation literature to a database setting.
    PVLDB. 01/2010; 3:782-793.
  • Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2008, Vancouver, BC, Canada, June 10-12, 2008; 01/2008
  • Conference Proceeding: The DBO database system.
    ABSTRACT: We demonstrate our prototype of the DBO database system. DBO is designed to facilitate scalable analytic processing over large data archives. DBO's analytic processing performance is competitive with other database systems; however, unlike any other existing research or industrial system, DBO maintains a statistically meaningful guess to the final answer to a query from start to finish during query processing. This guess may be quite accurate after only a few seconds or minutes, while answering a query exactly may take hours. This can result in significant savings in both user and computer time, since a user can abort a query as soon as he or she is happy with the guess' accuracy.
    Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2008, Vancouver, BC, Canada, June 10-12, 2008; 01/2008
  • ABSTRACT: We demonstrate our prototype of the DBO database system. DBO is designed to facilitate scalable analytic processing over large data archives. DBO's analytic processing performance is competitive with other database systems; however, unlike any other existing research or industrial system, DBO maintains a statistically meaningful guess to the final answer to a query from start to finish during query processing. This guess may be quite accurate after only a few seconds or minutes, while answering a query exactly may take hours. This can result in significant savings in both user and computer time, since a user can abort a query as soon as he or she is happy with the guess' accuracy.