arXiv:0801.3005v1 [astro-ph] 21 Jan 2008
Astron. Nachr. / AN 329, No. 3 (2008) / Ref. Proc. “Hotwiring the Transient Universe”, eds. A. Allan, R. Seaman, J. Bloom
The Palomar-Quest Digital Synoptic Sky Survey
S.G. Djorgovski1,⋆, C. Baltay2, A.A. Mahabal1, A.J. Drake3, R. Williams3, D. Rabinowitz2, M.J. Graham3,
C. Donalek1, E. Glikman1, A. Bauer2, R. Scalzo2, N. Ellman2, and J. Jerke2
1Astronomy, MS 105-24, Caltech, Pasadena, CA 91125, USA
2Physics Dept., Yale University, New Haven, CT 06520, USA
3Center for Advanced Computing Research, MS 158-79, Caltech, Pasadena, CA 91125, USA
Received 01 Sep 2007, accepted 25 Dec 2007
Sky Surveys – Transients – Software Systems
We describe briefly the Palomar-Quest (PQ) digital synoptic sky survey, including its parameters, data processing, status,
and plans. Exploration of the time domain is now the central scientific and technological focus of the survey. To this end,
we have developed a real-time pipeline for detection of transient sources. We describe some of the early results, and lessons
learned which may be useful for other, similar projects, and time-domain astronomy in general. Finally, we discuss some
issues and challenges posed by the real-time analysis and scientific exploitation of massive data streams from modern
synoptic sky surveys.
1 A Brief Description of the PQ Survey
The Palomar-Quest (PQ) digital synoptic sky survey is a
collaborative project between groups at Yale University and
Caltech (Co-PIs: C. Baltay and S.G. Djorgovski), with an
extended network of collaborations with other groups worldwide,
including Indiana U. (M. Gebhard et al.), NCSA
(R. Brunner et al.), LBNL Nearby SN Factory (NSNF; S.
Perlmutter et al.), INAOE (Puebla, Mexico; L. Carrasco, O.
Lopez-Cruz et al.), EPFL (Switzerland; G. Meylan et al.),
and Caltech/JPL (M. Brown et al.). The data are obtained
at the Palomar Observatory’s Samuel Oschin telescope (the
48-inch Schmidt) using the QUEST-2 112-CCD, 161 Mpix
camera (Baltay et al. 2007). Approximately 45% of the telescope
time is used for the PQ survey. The survey started in the late
summer of 2003, and will finish in late 2008.
In the first phase of the survey, data were obtained in
the drift scan mode in 4.6° wide strips of constant Dec, in
the range −25° < δ < +25°, excluding the Galactic plane.
The total area coverage is ∼ 15,000 deg^2, with multiple
passes, ranging from a few to about 25, and typically 5–10
times, with time baselines ranging from hours to years.
There are some thin-strip gaps in the coverage, due to a
combination of inter-CCD gaps, bad CCDs, and a suboptimal
dithering strategy. The typical area coverage rate is up to
∼ 500 deg^2/night in 4 filters. The raw data rate is on average
∼ 70 GB per clear night. To date, about 25 TB of usable
data have been collected in the drift scan mode.
Data were obtained with two filter sets, Johnson UBRI
and Gunn/SDSS rizz, recently changed to griz. Effective
exposures are ∼ 150 sec / cos δ per pass. Typical estimated
limiting magnitudes are r_lim ≈ 21.5, i_lim ≈ 20.5, z_lim ≈
19.5, R_lim ≈ 22, and I_lim ≈ 21 mag, depending on the
seeing, lunar phase, etc. Coadding of ∼ 8 passes reaches the
are done independently at Yale and Caltech, mainly using
the overlap region with SDSS.
⋆ Corresponding author: Djorgovski, e-mail: firstname.lastname@example.org
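The 1/cos δ dependence of the effective exposure quoted above follows from the drift-scan geometry: the sky drifts across a CCD at a rate proportional to cos δ, so a source takes longer to cross the chip at higher |δ|. A minimal sketch, assuming the approximate equatorial value of 150 s from the text; the function name is hypothetical and is not part of any PQ software:

```python
import math

EQUATORIAL_EXPOSURE_S = 150.0  # approximate per-pass value quoted in the text


def drift_scan_exposure(dec_deg, t_equator_s=EQUATORIAL_EXPOSURE_S):
    """Effective drift-scan exposure (s) at declination dec_deg (degrees)."""
    return t_equator_s / math.cos(math.radians(dec_deg))


for dec in (0, 10, 25):
    print(f"dec = {dec:+3d} deg: ~{drift_scan_exposure(dec):.0f} s")
```

Over the declination range of the first survey phase (|δ| < 25°), the effective exposure therefore varies by only about 10%.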
In the second phase of the survey, which started in the
spring of 2007, data are obtained in the traditional point-
and-track mode, in a single, wide-band red filter (RG610),
with ∼ 10% of the time in the drift scan mode. The cover-
age and the cadence are optimized for the nearby supernova
search, in collaboration with the LBNL NSNF group, and a
search for dwarf planets, in collaboration with M. Brown.
Data are processed with several different pipelines, op-
timized for different scientific goals. This includes the Yale
pipeline (Andrews et al. 2007), which does the PSF fitting
and was designed for a search for gravitationally lensed
quasars; the Caltech data cleaning pipeline, used to remove
numerous instrumental artifacts present in the data; the Caltech
real-time pipeline for detection of transient events, as described below; the LBNL NSNF pipeline,
based on image subtraction and designed for detection of
nearby SNe; and a pipeline for an optimal coadding of im-
ages and detection of sources in them, now developed at
Caltech. Images and resulting catalogs are stored in multi-
ple locations, using a variety of databases.
PQ is the first major digital sky survey fully designed
and implemented in the Virtual Observatory (VO) era, and
it uses VO standards and protocols throughout. Public data
releases will also be made through VO-type interfaces. The
first public data release is imminent, pending the completion
of various data quality control and assessment tests.
The survey is feeding multiple scientific goals and projects.
The initial motivation was a search for > 10^5 QSOs,
using colors and variability, in order to discover > 100
strong gravitational lenses, and use them to constrain cos-
mology and/or history of mass assembly. Another project
was a search for high-z QSOs, to be used as probes of reionization
and early structure formation. Both of them are now
finally starting to yield results; the progress was slow due to
numerous problems with the data, all of which have been
solved, and will be documented in detail elsewhere. Our
principal scientific focus now is exploration of the time domain,
as described below.
Our main public outreach effort to date has been the creation
of the Griffith Observatory’s “Big Picture”, and the associated
website, http://bigpicture.caltech.edu. This exhibit
will be seen by millions of visitors, serving multiple educa-
tional roles in the years to come.
2 PQ Exploration of the Time Domain: Some Preliminary Results
With a data set covering nearly 40% of the entire sky, with
multiple passes reaching ∼ 21 mag each, and time baselines
ranging from minutes (between different CCDs) to hours
(repeated scans in the same night), days (within the same
lunation), months, and years, and (using the cross-matches
to DPOSS and SDSS catalogs) up to decades, PQ is in a
unique position to explore the time-variable sky in a systematic
fashion. For some early reports, see Graham et al. (2004),
Mahabal et al. (2004, 2005), or Djorgovski et al. (2006).
One major effort is a search for nearby (z ∼ 0.1) SNe
Ia, to be used as the low-z calibration of the Hubble dia-
gram. This project is led by the Yale group in collaboration
with the LBNL NSNF. To date, this effort has found a total
of about 500 SNe, about half of which were spectroscopically
confirmed, among them about 70 Type Ia SNe with
10 or more spectra taken, as well as a plethora of other SNe
(including some peculiar ones) and transients. All are published
in IAU Circulars, CBETs, and ATels. The work uses
an image subtraction technique in order to remove the
well-detected light of the host galaxies. The Caltech real-time pipeline
is now also starting to detect SNe, using a search for tran-
sients in the catalog domain.
We are now using the archives of our data to study sys-
tematically the variability of QSOs, and especially Blazars.
Some examples are shown in Fig. 1. The main goal is to
devise an algorithm based on colors and variability alone
to define a purely optically selected sample of Blazars, and
thus check on the selection effects in the traditional radio
and X-ray approaches. These sources may be the main contributors
to the extragalactic γ-ray background, a subject
of considerable interest with the upcoming launch of the
GLAST mission. They are also implicated as sources of
ultra-high energy cosmic rays (UHECR). These cosmic ac-
celerators can reach energies several orders of magnitude
higher than any predictable terrestrial accelerators. Their
census and detailed studies are thus of a considerable and
Fig. 1 Examples of 3 known Blazars, as seen in the PQ
data. The top row shows them in a relatively high state, the
bottom row in a relatively low state.
Our exploration of the archival PQ data has yielded a
large number of transients, operationally defined as PSF-like
sources detected in only one epoch, with no detectable
apparent motion between different CCDs in a single pass.
Subsequent studies have revealed counterparts for some of
them in deeper, coadded images. We believe that many of
them are probably asteroids caught near the stationary point
(see below). However, this has underscored the need to detect
and follow transients in real or near-real time, in order
to determine their physical nature.
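The operational definition above (sources detected in only one epoch) amounts to a positional cross-match between a new catalog and the baseline catalogs. The following is a minimal sketch of that idea, not the PQ pipeline itself: the 2 arcsec match radius, the function names, and the sample coordinates are assumptions made for illustration, and the small-angle formula ignores RA wrap-around at 0°/360°.

```python
import math


def separation_arcsec(ra1, dec1, ra2, dec2):
    """Small-angle separation in arcsec between two positions given in degrees."""
    dra = (ra1 - ra2) * math.cos(math.radians(0.5 * (dec1 + dec2)))
    return 3600.0 * math.hypot(dra, dec1 - dec2)


def find_new_sources(new_cat, baseline_cat, match_radius_arcsec=2.0):
    """Return detections in new_cat with no baseline counterpart within the radius.

    Catalogs are lists of (ra_deg, dec_deg) tuples. The search is brute force,
    O(N*M); a real pipeline would use a spatial index instead.
    """
    return [
        (ra, dec)
        for ra, dec in new_cat
        if all(
            separation_arcsec(ra, dec, bra, bdec) > match_radius_arcsec
            for bra, bdec in baseline_cat
        )
    ]


# Hypothetical positions: two known sources and three detections from a new scan.
baseline = [(150.0000, 10.0000), (150.0100, 10.0050)]
tonight = [(150.0001, 10.0001), (150.0100, 10.0050), (150.0200, 9.9900)]
candidates = find_new_sources(tonight, baseline)
print(candidates)
```

Only the third detection has no counterpart within the match radius, so it alone is returned as a candidate; in practice such candidates would still have to pass artifact and moving-object filters.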
We have thus developed a real-time pipeline, which is
now operational. The pipeline does the standard removal of
instrumental signatures, pushes the data through the Cal-
ments astrometry, compares the new catalogs to those from
the previous passes, finds newly detected sources, imple-
ments a number of software filters to eliminate the residual
instrumental artifacts, known asteroids or variables, moving
objects (uncatalogued asteroids), produces cutout images
and webpages for the candidate transients, and publishes
them using the VOEvent protocols and on VOEN website,
We typically do a ∼ 4-hour long scan, then re-scan the
same area again, with the real-time pipeline running. In a
typical half-night scan, we may detect a couple of million
sources, and about a thousand potential transients. Removal
of residual instrumental artifacts leaves a few hundred gen-
uine detections, nearly all of which are asteroids; of them,
typically only about half are previously catalogued;
the rest are largely removed after the second scan. The num-
procedures were improved. Over the past year or so, nearly
4800 events have been submitted, with an average rate of ∼
200 per night. About 85% of these were immediately clas-
sified as asteroids, and the majority of the remaining ones
are as well. Finally, there are only a few (< 10/night) ap-
follow-up observations to date show that they are a mixture
of SNe, AGN, probable flaring M dwarfs, and the rest are of
as yet unknown nature. Some are re-discovered on different
© 2008 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim
Fig. 2 An example of a transient detected with our real-time
pipeline, PQT 070519:143304+150707; see Drake et
al., ATel 1083. The top row are detection images in the g
and r bands; the bottom row are the comparison baseline
images. The source faded slowly, but got redder rapidly; it
may be a rare type of SN.
3 Some Lessons Learned
Combining the current PQ experiences with the older work
with DPOSS (see, e.g., Mahabal et al. 2005), we estimate
that in a single-pass snapshot survey there are ∼ 10^-2 astrophysical
transients/deg^2 down to ∼ 20 mag at high Galactic
latitudes. Many of them are known, highly variable types of
objects, where the “low state” is below the detection of the
baseline data, with variable stars of different kinds domi-
nating on the short time scales (∼ minutes to months), and
AGN (mainly Blazars and OVVs) dominating on the longer
time scales (years and longer). Some are a variety of stel-
lar explosions. Some may be as-yet unknown types of ob-
jects and phenomena, but real-time spectroscopic and other
follow-up is necessary in order to discover them.
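As a rough consistency check, the surface density estimated above can be combined with the coverage rate quoted in Sect. 1. This is only an order-of-magnitude sketch: it assumes a uniform transient density and uses the upper-end coverage rate of ∼ 500 deg^2/night, which the text quotes for 4-filter drift scans.

```python
TRANSIENT_DENSITY_PER_DEG2 = 1e-2  # astrophysical transients/deg^2 to ~20 mag (from the text)
AREA_PER_NIGHT_DEG2 = 500.0        # upper-end coverage rate from Sect. 1 (from the text)

expected_per_night = TRANSIENT_DENSITY_PER_DEG2 * AREA_PER_NIGHT_DEG2
print(f"~{expected_per_night:.0f} astrophysical transients per night")
```

The result, of order a few per night, is of the same order as the < 10/night and ∼ 10 transients/night figures quoted elsewhere in the text.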
We find that slow-moving asteroids are a principal contaminant
for optical surveys; there are ∼ 1 − 3 of them
per deg^2 down to ∼ 21 mag, depending very much on the
Ecliptic latitude; i.e., > 100 asteroids for each astrophysical
transient. A joint analysis for moving and variable objects is
necessary, and any type of synoptic sky survey data stream
can feed both scientific domains simultaneously. Improving
the existing catalogs of asteroids is an urgent task. At
least two epochs are needed in order to eliminate previously
unknown asteroids in any synoptic survey, and their base-
line will define the effective time resolution of any transient
search (we also note that at least 3 properly spaced epochs
are needed to compute even a rough orbit).
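The two-epoch requirement can be illustrated with a simple positional test: a candidate whose position shifts between the two scans by more than some tolerance is treated as a moving object. This is a minimal sketch under stated assumptions, not the filter used in the PQ pipeline; the 1 arcsec tolerance, the function name, and the sample positions are hypothetical.

```python
import math


def moved_between_epochs(pos1, pos2, max_shift_arcsec=1.0):
    """True if a detection shifted by more than max_shift_arcsec between epochs.

    pos1 and pos2 are (ra_deg, dec_deg) tuples; small-angle approximation.
    """
    (ra1, dec1), (ra2, dec2) = pos1, pos2
    dra = (ra2 - ra1) * math.cos(math.radians(0.5 * (dec1 + dec2)))
    shift_arcsec = 3600.0 * math.hypot(dra, dec2 - dec1)
    return shift_arcsec > max_shift_arcsec


# Hypothetical candidates: (first-epoch position, second-epoch position).
pairs = {
    "A": ((180.10000, 5.10000), (180.10000, 5.10000)),  # no measurable motion
    "B": ((180.10000, 5.10000), (180.10300, 5.10100)),  # shift of ~11 arcsec
}
labels = {
    name: "moving" if moved_between_epochs(p1, p2) else "stationary"
    for name, (p1, p2) in pairs.items()
}
print(labels)
```

The time between the two epochs sets the smallest apparent rate that such a test can detect, which is why an asteroid near its stationary point can pass as a transient when the baseline is only a few hours.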
The quality of the baseline or fiducial sky against which
current observations are compared is a key issue. It must
be deep, clean, complete, and wavelength-matched. Gen-
erating a standard, dynamically evolving, annotated, multi-
wavelength baseline sky may be a good community (VO)
project; we are developing a prototype from PQ and other
publicly available panoramic imaging data sets.
Achieving a high completeness (few real transients
missed) and a low contamination (few false alarms) is a
huge challenge. Interesting sources are discovered as out-
liers in some parameter space; problems with the data also
generate outliers in some parameter space. In a large data
set, most unlikely things will happen, and most of them are
bad. Robust and reliable data cleaning is a key requirement.
This is hard to do in a cutting-edge software system.
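The two figures of merit named above can be written down explicitly. The sketch below uses the common definitions (completeness as the recovered fraction of real transients, contamination as the false-alarm fraction of reported candidates); the function names and the nightly counts are made up for illustration and are not survey results.

```python
def completeness(n_recovered, n_missed):
    """Fraction of real transients that the pipeline recovers."""
    total_real = n_recovered + n_missed
    if total_real == 0:
        raise ValueError("no real transients in the sample")
    return n_recovered / total_real


def contamination(n_recovered, n_false):
    """Fraction of reported candidates that are false alarms."""
    total_reported = n_recovered + n_false
    if total_reported == 0:
        raise ValueError("no candidates reported")
    return n_false / total_reported


# Made-up counts for one night, for illustration only.
c = completeness(n_recovered=9, n_missed=1)
f = contamination(n_recovered=9, n_false=291)
print(f"completeness = {c:.2f}, contamination = {f:.2f}")
```

With these illustrative counts the sample is 90% complete yet 97% contaminated, which is the regime to be expected when false alarms outnumber real events by a large factor.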
Data systems (pipelines, archives, and analysis) and op-
erational procedures for synoptic sky surveys are subject to
a substantial tension between static and dynamic compo-
nents, including both real-time and subsequent (non-time-
critical) analysis and distribution, data ingestion, database
updating and recomputing, etc. This has implications both
for survey strategies and system architecture design.
Another key challenge is an automated classification of
events for prioritized follow-up, as discussed by Mahabal
et al. and Bloom et al. elsewhere in this volume. This will
certainly require use of machine learning tools, as described
by Vestrand et al. in this volume.
All of these challenges will grow much sharper, as the
data volume and data flux increase dramatically in upcoming
synoptic sky surveys. We are now dealing with data
streams of the order of 0.1 TB/night, and ∼ 10 transients/night.
On a time scale of ∼ 1 − 5 years, this will increase to ∼ 1
TB/night and ∼ 10^4 transients/night (e.g., PanSTARRS),
and on a time scale of ∼ 5 − 10 years, this will increase to
∼ 20 TB/night and ∼ 10^5 − 10^6 transients/night (LSST).
Development and testing of software, methodologies, and
operational and follow-up procedures is an urgent task, in
which surveys such as PQ can play an important role.
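The growth implied by the numbers above can be tabulated directly. In this sketch the data rates and transient rates are those quoted in the text (with the lower end of the LSST range), while the 8-hour observing night used to convert to a mean event spacing is an assumption, as are all names in the code.

```python
NIGHT_SECONDS = 8 * 3600  # assumed 8-hour night; not a number from the text

# (label, data rate in TB/night, transients/night), as quoted in the text.
# For LSST the lower end of the quoted 10^5 - 10^6 range is used.
STAGES = [
    ("PQ (current)", 0.1, 1e1),
    ("~1-5 yr (e.g., PanSTARRS)", 1.0, 1e4),
    ("~5-10 yr (LSST)", 20.0, 1e5),
]


def scaling_table(stages=STAGES, night_seconds=NIGHT_SECONDS):
    """Return growth factors relative to the first stage and mean event spacing."""
    _, tb0, n0 = stages[0]
    rows = []
    for label, tb, n in stages:
        rows.append({
            "label": label,
            "data_growth": tb / tb0,
            "transient_growth": n / n0,
            "seconds_per_transient": night_seconds / n,
        })
    return rows


for row in scaling_table():
    print(f"{row['label']:<28} data x{row['data_growth']:<6.0f} "
          f"transients x{row['transient_growth']:<8.0f} "
          f"one event every {row['seconds_per_transient']:.3g} s")
```

Under these figures the data volume grows by a factor of about 200, while the transient rate grows by a factor of 10^4 or more, so event classification and follow-up scale far more steeply than storage.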
Acknowledgements. We thank many collaborators who have made
essential contributions to the survey, and the staff of Palomar Ob-
servatory for their tireless efforts during the survey operations.
This work was supported in part by the NSF grants AST-0407448,
AST-0326524, and CNS-0540369, by the Ajax Foundation, and
other private donors. SGD acknowledges the stimulating atmosphere
of the Aspen Center for Physics. Finally, we thank the workshop
organizers for an excellent and productive meeting.
References
Andrews, P., et al.: 2007, PASP, in press (astro-ph/0703446)
Baltay, C., et al.: 2007, PASP, in press (astro-ph/0702590)
Djorgovski, S.G., et al.: 2006, in Proc. ICPR2006, eds. Y.Y. Tang
et al., IEEE Press, p. 856 (astro-ph/0608638)
Graham, M., et al. (the PQ Survey Team): 2004, in Proc. ADASS
XIII, eds. F. Ochsenbein et al., ASPCS 314, 14
Mahabal, A., et al.: 2004, in press (astro-ph/0408035)
Mahabal, A., et al. (the PQ Survey Team): 2005, in Proc. ADASS
XIV, eds. P. Shopbell et al., ASPCS 347, 604