The LSST Data Mining Research Agenda

11/2008; DOI:10.1063/1.3059074
Source: arXiv

ABSTRACT We describe features of the LSST science database that are amenable to scientific data mining, object classification, outlier identification, anomaly detection, image quality assurance, and survey science validation. The data mining research agenda includes: scalability (at petabyte scales) of existing machine learning and data mining algorithms; development of grid-enabled parallel data mining algorithms; designing a robust system for brokering classifications from the LSST event pipeline (which may produce 10,000 or more event alerts per night); multi-resolution methods for exploration of petascale databases; indexing of multi-attribute multi-dimensional astronomical databases (beyond spatial indexing) for rapid querying of petabyte databases; and more.
Comment: 5 pages. Presented at the "Classification and Discovery in Large Astronomical Surveys" meeting, Ringberg Castle, 14-17 October 2008.

  • ABSTRACT: The 20th-century philosophy of science began on a positivistic note. Its focal point was scientific explanation and the hypothetico-deductive (HD) framework of explanation was proposed as the standard of what is meant by “science.” The HD framework, its inductive and statistical variants, and other logic-based approaches to modeling scientific explanation were developed long before the dawn of the information age. Since that time, the volume of observational data and the power of high-performance computing have increased by several orders of magnitude and reshaped the practice and concept of science, and indeed, the philosophy of science. A new observational-inductive (OI) framework for scientific research is emerging due to recent developments in sensors, data systems, computers, and knowledge discovery techniques. We examine the nature of these changes and their impact on the question of what is meant by “science” after discussing five examples of the OI framework, and conclude that the HD and OI frameworks are complementary and synergistic.
    World Futures 01/2009; 65(1):61-75.
  • ABSTRACT: Future space missions and science programs will be massive data producers. The technology to produce large data volumes must be matched by technologies to process, analyze, and make use of the data flood, in order to reap the maximum engineering benefit and scientific return from those technology investments. In particular, the integration of data from multiple sources will be standard practice, both for operational decision-making and for scientific decision-making. We describe the application of the emerging e-Science paradigm and its related technologies to data-driven discovery in space missions of the future.
    Space Mission Challenges for Information Technology, IEEE International Conference on. 01/2006.
  • ABSTRACT: I provide an incomplete inventory of the astronomical variability that will be found by next-generation time-domain astronomical surveys. These phenomena span the distance range from near-Earth satellites to the farthest Gamma Ray Bursts. The surveys that detect these transients will issue alerts to the greater astronomical community; this decision process must be extremely robust to avoid a slew of “false” alerts, and to maintain the community's trust in the surveys. I review the functionality required of both the surveys and the telescope networks that will be following them up, and the role of VOEvents in this process. Finally, I offer some ideas about object and event classification, which will be explored more thoroughly by other articles in these proceedings. (© 2008 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)
    Astronomische Nachrichten 02/2008; 329(3):280 - 283. · 1.40 Impact Factor
