The Palomar‐Quest digital synoptic sky survey

Astronomische Nachrichten (Impact Factor: 1.12). 02/2008; 329(3):263 - 265. DOI: 10.1002/asna.200710948
Source: arXiv

ABSTRACT We describe briefly the Palomar-Quest (PQ) digital synoptic sky survey, including its parameters, data processing, status, and plans. Exploration of the time domain is now the central scientific and technological focus of the survey. To this end, we have developed a real-time pipeline for detection of transient sources.We describe some of the early results, and lessons learned which may be useful for other, similar projects, and time-domain astronomy in general. Finally, we discuss some issues and challenges posed by the real-time analysis and scientific exploitation of massive data streams from modern synoptic sky surveys. (© 2008 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)

  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: Time series photometry of 20 Cataclysmic Variables detected by the Catalina Real-time Transient Survey is presented. 14 of these systems have not been observed previously and only two have been examined in-depth. From the observations we determined 12 new orbital periods and independently found a further two. Eight of the CVs are eclipsing systems, five of which have eclipse depths of more than 0.9 mag. Included in the sample are six SU UMa systems (three of which show superhumps in our photometry), a polar (SSS1944-42) and one system (CSS1417-18) that displays an abnormally fast decline from outburst.
    Monthly Notices of the Royal Astronomical Society 10/2013; 437(1). · 5.23 Impact Factor
  • [Show abstract] [Hide abstract]
    ABSTRACT: New modes of discovery are enabled by the growth of data and computational resources (i.e., cyberinfrastructure) in the sciences. This cyberinfrastructure includes structured databases, virtual observatories (distributed data, as described in Section 20.2.1 of this chapter), high-performance computing (petascale machines), distributed computing (e.g., the Grid, the Cloud, and peer-to-peer networks), intelligent search and discovery tools, and innovative visualization environments. Data streams from experiments, sensors, and simulations are increasingly complex and growing in volume. This is true in most sciences, including astronomy, climate simulations, Earth observing systems, remote sensing data collections, and sensor networks. At the same time, we see an emerging confluence of new technologies and approaches to science, most clearly visible in the growing synergism of the four modes of scientific discovery: sensors-modeling-computing-data (Eastman et al. 2005). This has been driven by numerous developments, including the information explosion, development of large-array sensors, acceleration in high-performance computing (HPC) power, advances in algorithms, and efficient modeling techniques. Among these, the most extreme is the growth in new data. Specifically, the acquisition of data in all scientific disciplines is rapidly accelerating and causing a data glut (Bell et al. 2007). It has been estimated that data volumes double every year—for example, the NCSA (National Center for Supercomputing Applications) reported that their users cumulatively generated one petabyte of data over the first 19 years of NCSA operation, but they then generated their next one petabyte in the next year alone, and the data production has been growing by almost 100% each year after that (Butler 2008). The NCSA example is just one of many demonstrations of the exponential (annual data-doubling) growth in scientific data collections. In general, this putative data-doubling is an inevitable result of several compounding factors: the proliferation of data-generating devices, sensors, projects, and enterprises; the 18-month doubling of the digital capacity of these microprocessor-based sensors and devices (commonly referred to as "Moore’s law"); the move to digital for nearly all forms of information; the increase in human-generated data (both unstructured information on the web and structured data from experiments, models, and simulation); and the ever-expanding capability of higher density media to hold greater volumes of data (i.e., data production expands to fill the available storage space). These factors are consequently producing an exponential data growth rate, which will soon (if not already) become an insurmountable technical challenge even with the great advances in computation and algorithms. This technical challenge is compounded by the ever-increasing geographic dispersion of important data sources—the data collections are not stored uniformly at a single location, or with a single data model, or in uniform formats and modalities (e.g., images, databases, structured and unstructured files, and XML data sets)—the data are in fact large, distributed, heterogeneous, and complex. The greatest scientific research challenge with these massive distributed data collections is consequently extracting all of the rich information and knowledge content contained therein, thus requiring new approaches to scientific research. This emerging data-intensive and data-oriented approach to scientific research is sometimes called discovery informatics or X-informatics (where X can be any science, such as bio, geo, astro, chem, eco, or anything; Agresti 2003; Gray 2003; Borne 2010). This data-oriented approach to science is now recognized by some (e.g., Mahootian and Eastman 2009; Hey et al. 2009) as the fourth paradigm of research, following (historically) experiment/observation, modeling/analysis, and computational science.
    Advances in Machine Learning and Data Mining for Astronomy. 03/2012;
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: An automated, rapid classification of transient events detected in the modern synoptic sky surveys is essential for their scientific utility and effective follow-up using scarce resources. This presents some unusual challenges: the data are sparse, heterogeneous and incomplete; evolving in time; and most of the relevant information comes not from the data stream itself, but from a variety of archival data and contextual information (spatial, temporal, and multi-wavelength). We are exploring a variety of novel techniques, mostly Bayesian, to respond to these challenges, using the ongoing CRTS sky survey as a testbed. The current surveys are already overwhelming our ability to effectively follow all of the potentially interesting events, and these challenges will grow by orders of magnitude over the next decade as the more ambitious sky surveys get under way. While we focus on an application in a specific domain (astrophysics), these challenges are more broadly relevant for event or anomaly detection and knowledge discovery in massive data streams.

Full-text (2 Sources)

Available from
May 27, 2014