The Chandra survey of the COSMOS field II: source detection and photometry
ABSTRACT The Chandra COSMOS Survey (C-COSMOS) is a large, 1.8 Ms Chandra program that covers the central contiguous ~0.92 deg^2 of the COSMOS field. C-COSMOS is the result of a complex tiling, with every position being observed in up to six overlapping pointings (four overlapping pointings in most of the central ~0.45 deg^2 area with the best exposure, and two overlapping pointings in most of the surrounding area, covering an additional ~0.47 deg^2). Therefore, the full exploitation of the C-COSMOS data requires a dedicated and accurate analysis focused on three main issues: 1) maximizing the sensitivity when the PSF changes strongly among different observations of the same source (from ~1 arcsec up to ~10 arcsec half power radius); 2) resolving close pairs; and 3) obtaining the best source localization and count rate. We present here our treatment of four key analysis items: source detection, localization, photometry, and survey sensitivity. Our final procedure consists of two steps: (1) a wavelet detection algorithm to find source candidates, and (2) a maximum likelihood Point Spread Function fitting algorithm to evaluate the source count rates and the probability that each source candidate is a fluctuation of the background. We discuss the main characteristics of this procedure, which was the result of detailed comparisons between different detection algorithms and photometry tools, calibrated with extensive and dedicated simulations. Comment: Accepted for publication in The Astrophysical Journal Supplement Series
arXiv:0910.2617v1 [astro-ph.IM] 14 Oct 2009
Draft version October 14, 2009
Preprint typeset using LATEX style emulateapj v. 03/07/07
THE CHANDRA SURVEY OF THE COSMOS FIELD II: SOURCE DETECTION AND PHOTOMETRY
S. Puccetti^1, C. Vignali^2,3, N. Cappelluti^4, F. Fiore^5, G. Zamorani^3, T. L. Aldcroft^6, M. Elvis^6, R. Gilli^3, T. Miyaji^7,8, H. Brunner^4, M. Brusa^4, F. Civano^6, A. Comastri^3, F. Damiani^9, A. Fruscione^4, A. Finoguenov^4,10, A. M. Koekemoer^11, V. Mainieri^12
Subject headings: X-rays; Surveys
1. INTRODUCTION
It is well known that X-ray surveys are an extremely efficient tool to select Active Galactic Nuclei (AGN). For example, in the XMM-Newton COSMOS survey, at the 0.5-2 keV limiting flux of 7 × 10^−16 erg s^−1 cm^−2, the AGN surface density is ∼1000 deg^−2 (Hasinger et al. 2007, Cappelluti et al. 2007), a factor 2-4 greater than the AGN surface density in the most recent deep optical surveys: 250 deg^−2 in COMBO-17 (Wolf et al. 2003) and 470 deg^−2 in the VVDS survey (Gavignaud et al. 2006). There are four main causes for the higher efficiency of X-ray surveys in finding AGN: 1) X-rays directly trace the super massive black hole (SMBH) accretion, while AGN classification through optical line spectroscopy may suffer from incompleteness and/or misidentifications; 2) AGN are the dominant X-ray population: most (∼ 80%) of the X-ray sources in deep and shallow surveys turn out to be AGN, unlike at optical wavelengths; 3) 0.5-10 keV X-rays (the typical Chandra and XMM-Newton energy band) are capable of penetrating column densities up to ∼10^24 cm^−2, allowing the selection of moderately obscured AGN; 4) low luminosity AGN are difficult to select in optical surveys, because their light is diluted in the host galaxy emission.
Electronic address: puccetti@asdc.asi.it
^1 ASI Science Data Center, via Galileo Galilei, 00044 Frascati, Italy
^2 Dipartimento di Astronomia, Università di Bologna, via Ranzani 1, Bologna, Italy
^3 INAF−Osservatorio Astronomico di Bologna, Via Ranzani 1, I–40127 Bologna, Italy
^4 Max Planck Institut für extraterrestrische Physik, Giessenbachstrasse 1, D–85748 Garching bei München, Germany
^5 INAF-OAR, via Frascati 33, Monteporzio, I00040, Italy
^6 Harvard-Smithsonian Center for Astrophysics, 60 Garden St., Cambridge, MA 02138 USA
^7 Instituto de Astronomía, Universidad Nacional Autónoma de México, Ensenada, México (mailing address: PO Box 439027, San Ysidro, CA, 92143-9027, USA)
^8 Center for Astrophysics and Space Sciences, University of California San Diego, Code 0424, 9500 Gilman Drive, La Jolla, CA 92093, USA
^9 INAF - Osservatorio Astronomico di Palermo, Piazza del Parlamento 1, I-90134 Palermo, Italy
^10 University of Maryland, Baltimore County, 1000 Hilltop Circle, Baltimore, MD 21250, USA
^11 Space Telescope Science Institute, 3700 San Martin Drive, Baltimore MD 21218 U.S.A.
^12 ESO, Karl-Schwarzschild-Strasse 2, D–85748 Garching, Germany
So far Chandra and XMM-Newton have performed several deep, pencil beam surveys and several shallower but wider surveys. Fig. 1 compares the flux limit and area coverage of the main Chandra and XMM-Newton surveys. This figure shows that the XMM-Newton COSMOS and Chandra-COSMOS (C-COSMOS, Elvis et al. 2009, Paper I hereafter) surveys are the deepest surveys over a large contiguous area. The coverage of larger areas at similar flux limits is today achieved only by serendipitous surveys using mostly non-contiguous areas (see e.g., CHAMP, Kim et al. 2004a, 2004b, Green et al. 2004).
The Cosmic Evolution Survey (COSMOS, Scoville et al. 2007) is aimed at studying the interplay between the Large Scale Structure (LSS) in the Universe and the formation of galaxies, dark matter, and AGN. The COSMOS field is located near the equator (10h, +02°) and covers ∼ 2 square degrees, as originally defined by the HST/ACS imaging (Koekemoer et al. 2007), with subsequent deep and extended multi-wavelength coverage overlapping this area. The size of COSMOS was chosen to sample LSS up to a linear size of about 50 Mpc h^−1 at z ∼ 1-2, where AGN and star formation in galaxies are expected to peak. To study the role of AGN in galaxy evolution the X-ray data are fundamental. Therefore, the central square degree of the COSMOS field has been the target of a Chandra ACIS-I, 1.8 Msec Very Large Program: the Chandra-COSMOS survey.
Fig. 1.— The 0.5-2 keV flux range vs. the area coverage for various surveys. The black solid lines represent a few serendipitous surveys: HELLAS2XMM (Baldi et al. 2002, symbol A), CHAMP (Kim et al. 2004a, 2004b, Green et al. 2004, symbol B), SEXSI (Harrison et al. 2003, symbol C), XMM-BSS (Della Ceca et al. 2004, symbol D), AXIS (Carrera et al. 2007, symbol E); the red dotted lines represent a few deep pencil beam surveys: CDFN (Brandt et al. 2001, Alexander et al. 2003, symbol F), CDFS (Giacconi et al. 2001, Luo et al. 2008, symbol G), XMM-Newton Lockman Hole (Worsley et al. 2004, Brunner et al. 2008, symbol H); the blue dotted lines represent a few wide shallow contiguous surveys: C-COSMOS (Elvis et al. 2009, symbol I), XMM-COSMOS (Hasinger et al. 2007, Cappelluti et al. 2007, 2009, symbol L), ELAIS-S1 (Puccetti et al. 2006, symbol M), E-CDF-S (Lehmer et al. 2005, symbol N), AEGIS-X (Laird et al. 2009, symbol O), SXDS (Ueda et al. 2008, symbol P). The black solid triangle represents the ROSAT all sky survey (RASS, Voges et al. 1999).
The C-COSMOS survey has a rather uniform effective exposure of ∼ 160 ksec over a large area (∼ 0.45 deg^2), thus reaching ∼ 3.5 times fainter fluxes than XMM-COSMOS in both the 0.5-2 keV and 2-7 keV bands. This flux limit is below the threshold where starburst galaxies become common in X-rays. The sharp Chandra Point Spread Function (PSF) allows nearly unambiguous identification of optical counterparts (Civano et al. 2009, hereafter Paper III). Chandra secures the identifications of X-ray sources down to faint optical magnitudes (i.e., I ∼ 26), with only ∼ 2% ambiguous identifications, significantly better than the ∼ 20% ambiguous identifications in XMM-Newton (Brusa et al. 2007).
The C-COSMOS survey has a complex tiling (see Fig. 2) in comparison to other X-ray surveys, in which the overlapping areas of the single pointings are small and with similar PSFs (see e.g., the Extended Groth Strip, AEGIS-X, Laird et al. 2009), or all the pointings are co-axial and nearly totally overlapping (see e.g., CDFS, Giacconi et al. 2001, Luo et al. 2008). In the C-COSMOS tiling, the pointings are strongly overlapping and not co-axial. While this ensures a very uniform sensitivity over most of the field, each source is observed with up to six different PSFs, requiring the development of an analysis procedure for data observed with this mixture of PSFs. The procedures presented in this paper are aimed at optimizing (1) source detection, (2) localization, (3) photometry, and (4) survey sensitivity. We have made detailed comparisons between different detection algorithms and photometry tools, testing them extensively on simulated data. We furthermore validate our results by detailed inspections of each single source candidate.
Our final analysis consists of two main steps:
1 A wavelet detection algorithm, PWDetect (Damiani et al. 1997), is first used to find source candidates. This algorithm is optimized to cleanly separate nearby sources, to detect point-like sources on top of extended emission, and to give the most accurate positions.
2 A maximum likelihood PSF fitting algorithm is then used to evaluate the source count rates and the probability that each source candidate is not a fluctuation of the background. We used the emldetect algorithm (Cappelluti et al. 2007 and references therein). emldetect works simultaneously with multiple overlapping pointings using PSFs appropriate to each one. This fitting method ensures accurate evaluation of the survey completeness and contamination, efficient deblending, and good photometry for close pairs, which may be partly blended even at the Chandra resolution.
As a third step, we also performed aperture photometry for each candidate X-ray source using 50%, 90%, and 95% encircled count fractions, using the PSFs appropriate to each observation. The aperture photometry is also used to check the results. Aperture photometry is preferable in all cases where the systematic errors introduced by PSF fitting are larger than the statistical errors, i.e., for bright sources (count rates ≥ 1 counts/ksec).
The survey sensitivity is limited by both the net (i.e.,
including vignetting) exposure time, and by the ac-
tual PSF with which a given region of the area is ob-
served. The latter issue is particularly relevant for the
C-COSMOS tiling. We have developed an algorithm
that evaluates the survey sensitivity at each position
on C-COSMOS using a parameterization of the Chan-
dra ACIS-I PSF and taking into account the mixture of
PSFs at each position. The resulting sensitivity maps
have been compared and validated with extensive simu-
lations.
The paper is organized as follows: in Sect. 2 we briefly present the C-COSMOS observations and data reduction; we describe the simulations in Sect. 3; how they were used to select the most efficient detection algorithm and the final source characterization procedure is described in Sect. 4; the completeness and reliability are shown in Sect. 5; in Sect. 6 we apply this procedure to the observed data; in Sect. 7 we present the calculation of survey sensitivity, the sky-coverage, and X-ray number counts using the simulated data. Finally, in Sect. 8 we compare C-COSMOS to a similar Chandra survey, i.e., AEGIS-X, and in Sect. 9 we give our conclusions.
2. OBSERVATIONS AND DATA REDUCTION
Fig. 2.— The final tiling of the C-COSMOS field, with a color
scale showing the number of the ACIS-I overlapping pointings, as
indicated in the color bar at the bottom of the figure.
We give here a brief description of the observations and data reduction. The full details are given in Paper I. The C-COSMOS field covers a contiguous area of ∼ 0.92 deg^2, centered at 10h 00m 18.91s, +02° 10′ 33.48″, near the center of the full COSMOS field. The survey is made up of 36 different heavily overlapping ACIS-I pointings, each with a mean exposure of ∼ 50 ksec, for a total exposure of 1.8 Msec. Twelve of the 36 pointings were scheduled as two or more separate observations, with very similar roll-angles, thus resulting in 49 observations in total. Fig. 2 shows the number of ACIS-I pointings per pixel. Note that the central ∼ 0.45 deg^2 area is covered by four to six overlapping pointings, while most of the outer ∼ 0.47 deg^2 area is covered by one to two overlapping pointings. As an example, Fig. 3 shows the image of the same source observed in four overlapping fields at different off-axis angles.
The 49 observations were processed using the standard CIAO 3.4 software tools^13 (Fruscione et al. 2006). Event files were cleaned of bad pixels, soft proton flares and cosmic-ray afterglows, and were brought to a common reference frame by matching the positions of bright X-ray sources with the optical positions of bright (18<I<23), point-like optical counterparts. The systematic shifts between the X-ray and optical positions are ∆ RA=0.04” and ∆ DEC=0.25” (see Paper I). Observations with the same aim points and consistent roll-angles were merged together, producing 36 event files, one for each independent pointing.
^13 http://cxc.harvard.edu/ciao/
The flux limits for source detection are influenced by three main factors: (1) net exposure time, (2) background per pixel, and (3) size of the source extraction region, which in turn depends on the size of the PSF at the given position. The Chandra ACIS-I on-axis PSF has a spatial resolution of 0.5” FWHM, equivalent to < 4-4.5 kpc at any redshift, and permits observations of up to ∼ Msec to be photon limited. The adopted tiling produces a rather homogeneous exposure time over the C-COSMOS field (i.e., ±12% in the central ∼ 0.45 deg^2 area) and a uniform background. In the vignetting-corrected exposure time we clearly distinguish two main peaks at 80 and 160 ksec (see Fig. 7 of Paper I). Fig. 4 shows the fraction of the C-COSMOS area with a given background per square arcsec in the three analyzed energy bands: 0.5-7 keV (full band, F), 0.5-2 keV (soft band, S), and 2-7 keV (hard band, H). We see two main peaks at 0.07 and 0.14 counts/arcsec^2 in the F band and at 0.02 and 0.04 counts/arcsec^2 in the S band, corresponding to the two main peaks of the exposure time distribution. These peaks correspond to a level of ∼ 2 and ∼ 4 counts in the F band, and ∼ 0.6 and ∼ 1.2 counts in the S band over an area of 3 arcsec radius, a typical source detection region for off-axis angles less than 5-6 arcmin. Even the area with the largest exposure time therefore has a relatively low background for point source detection; this is important for the detection of the faintest sources.
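The background levels quoted above can be checked with a few lines of arithmetic. The sketch below multiplies each surface-brightness peak by the area of a circular aperture of 3 arcsec radius; the function name is illustrative and is not part of the C-COSMOS analysis software.

```python
import math

def background_counts(surface_brightness, radius_arcsec=3.0):
    """Expected background counts inside a circular aperture.

    surface_brightness: background level in counts/arcsec^2
    radius_arcsec: aperture radius in arcsec (3 arcsec is the typical
    detection region quoted in the text for off-axis angles < 5-6 arcmin)
    """
    area = math.pi * radius_arcsec ** 2  # aperture area in arcsec^2
    return surface_brightness * area

# Background peaks quoted in the text for the full (F) and soft (S) bands
peaks = {"F": (0.07, 0.14), "S": (0.02, 0.04)}
for band, levels in peaks.items():
    for level in levels:
        counts = background_counts(level)
        print(f"{band} band: {level:.2f} counts/arcsec^2 -> {counts:.2f} counts")
```

The results are close to the quoted ∼ 2 and ∼ 4 counts in the F band and ∼ 0.6 and ∼ 1.2 counts in the S band.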
3. GENERATION OF SIMULATED DATA
Extensive simulations were performed in order to test
various source detection schemes. The simulations were
used (1) to test the reliability of the source position re-
construction, (2) to verify the count rate reconstruction,
and (3) to assess and validate the level of significance
of each detected source at each given detection thresh-
old and thus to evaluate the level of completeness of the
source list as a function of flux.
3.1. Creating the simulated input source catalog
In order to include realistic source clustering into the simulated data, we sampled particles from a COSMOS Mock galaxy catalog (V3.0) derived by Kitzbichler and White (2008). They made use of the Millennium Simulation (Springel et al. 2005), a very large simulation which follows the hierarchical growth of dark matter structures from redshift z=127 to the present. The simulation assumes the concordance ΛCDM cosmology and follows the trajectories of 2160^3 (∼ 10^10) particles in a periodic box 500 h^−1 Mpc on a side, using a special reduced-memory version of the GADGET-2 code (Springel et al. 2001; Springel 2005). The formation and evolution of the galaxy population is simulated by using a semi-analytical model (Croton et al. 2006, De Lucia & Blaizot 2007).
We randomly selected 10000 mock galaxies per square degree in the ad hoc redshift range 0.4 < z < 0.9 and i band magnitude range 17 < i < 26. The selected random sources in this redshift-magnitude range show the same angular correlation function (ACF) as the S band XMM-COSMOS sources (Miyaji et al. 2007), as shown in Fig. 5, not taking into account that the clustering strength could depend on the survey flux limit (Plionis et al. 2008). The agreement between the ACF of the random sample and the XMM-COSMOS sample is good down to the 0.5 arcminute scale. Below 0.5 arcmin, the uncertainties in the S band XMM-COSMOS ACF and in the other X-ray ACFs from the literature (see e.g., the Chandra Deep Field South, D’Elia et al. 2004) are too large to allow them to be sensibly compared with the one we derive. Each simulated galaxy was then assigned an S band flux, randomly drawn from the number weighted logN-logS relation of the AGN population synthesis model by Gilli et al. (2007). The corresponding minimum S band flux for the input particles was ∼ 3 × 10^−18 erg s^−1 cm^−2, which is a factor 100 below the detection limit of C-COSMOS. Hence background fluctuations due to unresolved faint sources are included in the simulations.
Fig. 3.— The image of the same source (i.e., source-id 50 in the C-COSMOS catalog presented in Paper I) observed in four overlapping fields at different off-axis angles. The contours are drawn at 90%, 50%, 25%, and 10% of the peak counts. The red circles centered on the position of the source have a radius of 2 arcsec. The panels are labeled Off-Axis = 7.9, 6.7, 4.9, and 3.1 arcmin.
Fig. 4.— Area fraction for a given background per square arcsec in the F band (solid blue histogram), S band (dashed green histogram), and H band (empty red histogram).
The S band flux of each source was then converted into an F band flux assuming a power-law spectrum with an energy index αE = 0.4 (see footnote 14). The simulated sources cover a 3 deg^2 sky area, which is enough to completely enclose the COSMOS field.
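As an illustration of the band conversion, the sketch below computes the ratio of the F band (0.5-7 keV) to the S band (0.5-2 keV) energy flux for an unabsorbed power law whose energy flux density scales as E^−αE. It neglects Galactic absorption and the instrument response, so it is only a rough cross-check and not necessarily the conversion adopted for the simulations; the function names are hypothetical.

```python
import math

def band_flux(e_lo, e_hi, alpha_e):
    """Integral of E**(-alpha_e) dE between e_lo and e_hi (arbitrary units)."""
    if math.isclose(alpha_e, 1.0):
        # Special case: the integrand is 1/E
        return math.log(e_hi / e_lo)
    p = 1.0 - alpha_e
    return (e_hi ** p - e_lo ** p) / p

def s_to_f_factor(alpha_e=0.4):
    """Ratio of 0.5-7 keV to 0.5-2 keV energy flux for a pure power law."""
    return band_flux(0.5, 7.0, alpha_e) / band_flux(0.5, 2.0, alpha_e)

factor = s_to_f_factor()
print(f"F/S flux ratio for alpha_E = 0.4: {factor:.2f}")
```

Under these simplifying assumptions the F band flux is about three times the S band flux.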
3.2. Creating the simulated X-ray event files
Using the MARX simulator^15 (version 4.2.1), we simulated a set of 49 Chandra ACIS-I pointings with the same exposure times, aim points, and roll-angles as the real C-COSMOS pointings (see Paper I). The simulated source list was fed into each simulated pointing and the net source counts F were recorded. This procedure returns 49 Chandra event files containing only source photons.
^14 fE ∝ E^−Γ, with Γ = αE + 1
^15 http://space.mit.edu/CXC/MARX/
Fig. 5.— The ACF of the sources selected from the COSMOS
Mock catalog for input to the C-COSMOS S band simulation (red
squares) compared with that of XMM-COSMOS (S band, blue
solid dots).
To include a background appropriate to each pointing we used the CXC compilation of blank sky fields^16. These blank fields lie at high Galactic latitude, away from soft bright features such as the North Polar Spur, and have a median exposure of ∼ 70 ks. Point-like and extended sources down to fluxes that would be detectable in each exposure have been excluded, and the individual exposures have been stacked into different blank sky files. We chose the stacked blank sky file appropriate for ACIS-I data at the epoch of our observations^17, filtered to keep only photons detected in VFAINT mode observations. This blank sky field has a total effective exposure of ∼ 1.5 Msec.
We then extracted 49 background event files by ran-
domly resampling the events out of the blank sky file
scaling by the exposure time of each observation. Faint
simulated sources with only a few counts would not be de-
tected and increase the background level by ∼ 5% at the
depth of the blank sky observations. Since these faint,
unresolved sources are already included in the blank sky
files, in order to avoid counting them twice, we removed
5% of the photons in each background event file.
The background files were then reprojected to the co-
ordinates of the real pointings by using the real aspect
solution files, and then combined with the corresponding
source event files. The final result is a set of 49 simulated
ACIS-I fields that closely mirror the actual 49 observa-
tions.
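The background construction described above can be sketched as follows. This is a minimal illustration, assuming the events are held in a simple Python list and drawn without replacement; the function name, the toy event tuples, and the choice of sampling scheme are assumptions, not the procedure actually coded for C-COSMOS.

```python
import random

def resample_background(blank_sky_events, blank_sky_exposure, obs_exposure,
                        unresolved_fraction=0.05, seed=None):
    """Draw a background event list for one observation.

    The number of events is scaled by the ratio of exposure times and then
    reduced by `unresolved_fraction` (5% in the text) so that faint sources
    already present in the blank sky file are not counted twice.
    Poisson fluctuations in the number of drawn events are ignored here.
    """
    rng = random.Random(seed)
    scale = obs_exposure / blank_sky_exposure
    n_expected = len(blank_sky_events) * scale * (1.0 - unresolved_fraction)
    n_draw = min(round(n_expected), len(blank_sky_events))
    # Assumption: events are drawn at random without replacement
    return rng.sample(blank_sky_events, n_draw)

# Toy blank sky file: 15000 events (x, y, energy) for a 1500 ks exposure
toy_rng = random.Random(1)
blank_sky = [(toy_rng.random(), toy_rng.random(), toy_rng.uniform(0.5, 7.0))
             for _ in range(15000)]
background = resample_background(blank_sky, 1500.0, 50.0, seed=2)
print(len(background))
```

In this toy case a 50 ks observation receives 15000 × (50/1500) × 0.95 = 475 background events.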
4. CHOOSING THE C-COSMOS SOURCE DETECTION AND CHARACTERIZATION PROCEDURE
In order to fully exploit the large and deep C-COSMOS coverage, particular care had to be devoted to maximizing areal coverage and producing uniform depth; C-COSMOS used a complex tiling, with four overlapping pointings in most of the central ∼ 0.45 deg^2 area with the best exposure, and two overlapping fields in most of the surrounding area, covering an additional ∼ 0.47 deg^2 (see Fig. 2). As a result, each source is observed at different off-axis angles, θi (i.e., the distance of the source position from the aim point in all overlapping fields), and thus with different PSFs. For some sources in the central area the number of different θi is as high as six. This mixture of PSFs requires addressing three main issues: (1) maximizing the sensitivity when the PSF changes so widely between different observations of the same source (from ∼ 1 arcsec to ∼ 10 arcsec half power radius); (2) maximizing the spatial resolution, in order to obtain the best source localization and effective deblending; (3) obtaining accurate photometry, even in cases of partly blended sources. To solve these issues a dedicated analysis procedure was developed, and the simulations were used to determine and validate it.
^16 http://cxc.harvard.edu/contrib/maxim/acisbg/
^17 http://cxc.harvard.edu/contrib/maxim/acisbg/data/acisi_D_01236_bg_evt_010205.fits
We tested sliding cell and wavelet algorithms to find and locate source candidates, and both PSF fitting and aperture photometry. In particular, we compared the results obtained using the SAS eboxdetect^18 and emldetect^19 tasks, used for the XMM-COSMOS survey (Cappelluti et al. 2009), with those obtained using the PWDetect code (Damiani et al. 1997) and CIAO wavdetect^20 (Freeman et al. 2002). We compared PWDetect and CIAO wavdetect on a data subset including 8 ACIS-I fields and found consistent results. We adopt PWDetect as the main wavelet algorithm because of its much faster processing time (a factor of 40-50) with respect to CIAO wavdetect.
^18 http://xmm.esac.esa.int/sas/8.0.0/eboxdetect/
^19 http://xmm.esac.esa.int/sas/8.0.0/emldetect/
^20 http://asc.harvard.edu/ciao/ahelp/wavdetect.html
4.1. PWDetect
The PWDetect code (Damiani et al. 1997) was originally developed for the analysis of ROSAT data, and was then adapted for the analysis of Chandra and XMM-Newton data. This method is particularly well suited for cases in which the PSF varies across the image, as for Chandra images, since PWDetect is based on the wavelet transform (WT) of the X-ray image, i.e., a convolution of the image with a “generating wavelet” kernel that depends on position and on a length scale, which is a free parameter. For the Chandra data, the length scale is varied from 0.35” to 16” in steps of a factor √2. This choice spans the range from the smallest to the largest (for large θi) Chandra PSFs. Both radial and azimuthal PSF variations are accounted for by PWDetect, which first assumes a Gaussian PSF and then corrects by a PSF shape factor, calibrated on both radial and azimuthal coordinates. PWDetect was run on each of the 36 event files with a low significance level of ∼ 10^−3, so as to include entries with just 5 source counts (i.e., to pick up most of the input sources). The catalogs of source candidates from overlapping fields were then merged. The off-axis angle θi is recorded and the source position measured at the smallest θi (i.e., with the best PSF) is adopted as the reported source position. If a candidate is not detected in one or more of the overlapping fields, the count rate is computed at the position of the source candidate and within a circle of radius Ri, corresponding to 90% of the encircled count fraction of the PSF (fpsf = 90%; see footnote 21) at θi, as calibrated by the CXC^22. Finally, a mean count rate, weighted by the count rate errors, is associated with each source. Analysis of the simulated data showed that all candidates with a wavelet size smaller than the PSF size and less than 5 counts are spurious detections. These were then excluded from the candidate catalogs.
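Two small pieces of the procedure above lend themselves to a short sketch: the ladder of wavelet length scales and the error-weighted mean count rate. The code below is illustrative only. In particular, the text says only that the mean is weighted by the count rate errors, so the inverse-variance weighting used here is an assumption, and the function names are hypothetical rather than PWDetect routines.

```python
import math

def wavelet_scales(smallest=0.35, largest=16.0, step=math.sqrt(2.0)):
    """Length scales (arcsec) from `smallest` up to `largest` in factor-sqrt(2) steps."""
    scales = []
    scale = smallest
    while scale <= largest:
        scales.append(scale)
        scale *= step
    return scales

def weighted_mean_rate(rates, errors):
    """Inverse-variance weighted mean of per-field count rates (assumed weighting)."""
    weights = [1.0 / err ** 2 for err in errors]
    total = sum(weights)
    mean = sum(w * r for w, r in zip(weights, rates)) / total
    return mean, math.sqrt(1.0 / total)

scales = wavelet_scales()
print(len(scales), [round(s, 2) for s in scales])

# Toy example: the same source measured in two overlapping fields
mean, err = weighted_mean_rate([1.0, 2.0], [0.1, 0.2])
print(f"mean rate = {mean:.2f} +/- {err:.3f} counts/ksec")
```

With these limits the ladder contains 12 scales, the largest being about 15.8”. In the toy example the better-measured field dominates the mean.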
4.2. EBOXDETECT and EMLDETECT
Both eboxdetect and emldetect are part of the XMM-Newton SAS package and are based on programs originally developed for source detection in ROSAT images (see e.g., Voges et al. 1999). eboxdetect is a standard sliding cell detection tool, which is run on each of the 49 single observations. eboxdetect produces a list of candidate sources down to a selected low significance level. The list of source candidates is then passed to the emldetect task. emldetect performs a simultaneous maximum likelihood PSF fitting for each candidate to all the images at each position (see e.g. Cappelluti et al. 2007 for more details on eboxdetect and emldetect). eboxdetect was run setting a low significance level (DET_ML=3 or Prandom=0.05), to provide emldetect with a list of source candidates that includes all possibly significant sources.
emldetect has been adapted to run on Chandra data by replacing the XMM-Newton PSF library with the Chandra PSF library (see note 22), and to work with many different PSFs simultaneously. The counts at each position were fitted using a model obtained by convolving the PSF at that position with a β model (Cruddace, Hasinger and Schmitt 1988). The program interpolates over the calibration library of Chandra PSFs to find the most appropriate PSF at the position of each source in each observation. The more crowded the field, the more candidates are fitted simultaneously. emldetect can provide both source positions and source count rates, or only source count rates using fixed source positions. We ran it fitting for both source positions and count rates. The best fit maximum likelihood, DET_ML, is related to the Poisson probability that a source candidate is a random fluctuation of the background (Prandom):
DET_ML = −ln(Prandom)    (1)
Sources with low values of DET_ML, and correspondingly high values of Prandom, are then likely to be background fluctuations.
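Equation (1) is easy to evaluate numerically. The short sketch below converts between DET_ML and Prandom in both directions; the function names are illustrative and are not part of the SAS software.

```python
import math

def det_ml_to_prandom(det_ml):
    """Probability that a candidate is a background fluctuation, from Eq. (1)."""
    return math.exp(-det_ml)

def prandom_to_det_ml(p_random):
    """Detection likelihood DET_ML for a given chance probability, from Eq. (1)."""
    return -math.log(p_random)

for det_ml in (3.0, 12.0):
    print(f"DET_ML = {det_ml:4.1f} -> Prandom = {det_ml_to_prandom(det_ml):.2e}")
```

For DET_ML=3 this gives Prandom ≈ 0.05, the low threshold used for the eboxdetect candidate list.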
4.3. Tests on simulations
We ran both detection algorithms on the simulated data. Catalogs of candidates were produced with both eboxdetect and PWDetect. These lists were visually inspected to identify obviously spurious detections on the wings of the Chandra PSF around bright sources, and near the edges of the ACIS-I chips. For both detection algorithms, the number of these clearly spurious detections is rather small in all three bands (< 1-2%). These entries were deleted and the 'cleaned' lists used as input for the emldetect tool. The emldetect output catalog was then cut at a conservative value of DET_ML=12 (Prandom < 6 × 10^−6), to ensure that the number of spurious detections in this catalog is practically zero, so that the results are not contaminated by spurious associations.
^21 fpsf indicates a fraction of the source counts distributed in a circular area, following the PSF shape.
^22 http://cxc.harvard.edu/caldb/
TABLE 1
Comparison between eboxdetect+emldetect and PWDetect

Parameter                  eboxdetect+emldetect   PWDetect
(1)                        (2)                    (3)

Comparison on source position:
<∆ R.A.>^a                 0.17”±0.16”            0.02”±0.15”
<∆ Dec.>^a                 −0.18”±0.15”           0.003”±0.15”
∆ R.A. RMS^b               0.32”                  0.31”
∆ Dec. RMS^b               0.35”                  0.34”

Comparison on completeness of close pairs:
% of missed pairs^c        ∼ 75%

Comparison on source photometry:
<Fx(F)/FxI(F)>^d           0.97±0.11              0.86±0.12
<Fx(S)/FxI(S)>^d           1.00±0.12              0.94±0.14
<Fx(H)/FxI(H)>^d           1.05±0.16              0.88±0.17

Note. — Column (1) shows the parameters used to test the accuracy of source localization, the completeness in the recovery of close pairs, and the flux reconstruction of the two detection algorithms that we used. Columns (2) and (3) show the results for the eboxdetect+emldetect and the PWDetect algorithm, respectively.
^a The median and interquartile of the shifts between the R.A. or Dec. of the input sources and the R.A. or Dec. of the detected sources, see also Fig. 6.
^b The RMS of the R.A. or Dec. shifts between input and detected positions, see also Fig. 6.
^c Percentage of the pairs with a separation smaller than 4 arcsec that are missed in comparison to PWDetect, see also Fig. 7.
^d The median and interquartile of the ratio between the output detected and input simulated count rates in the F, S, and H band, see also Fig. 8.
Matched catalogs between the input simulated cata-
log and the emldetect and PWDetect output catalogs
were produced using two methods: (1) a conservative
approach, using a fixed matching radius of 0.5 arcsec.
This produces matched catalogs which probably miss a
fraction of real associations, but are virtually free from
spurious associations. (2) A maximum likelihood algo-
rithm, to find the most probable association between an
input source and an output detected source. We used
the catalogs produced using the first method to study
the accuracy of source localization and flux reconstruc-
tion, while we used the catalogs produced by the second
method to study the completeness and reliability of the
detection algorithms (see Sect. 5).
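The first, conservative matching method can be sketched as a nearest-neighbour search within a fixed radius. The code below is a minimal brute-force illustration, assuming positions are given as (R.A., Dec.) pairs in degrees and using a small-angle approximation for the separation; the function names are hypothetical, the match is not forced to be one-to-one, and the maximum likelihood method is not covered.

```python
import math

def separation_arcsec(ra1, dec1, ra2, dec2):
    """Small-angle separation in arcsec between two positions given in degrees."""
    mean_dec = math.radians(0.5 * (dec1 + dec2))
    dra = (ra1 - ra2) * math.cos(mean_dec)  # R.A. offset projected on the sky
    ddec = dec1 - dec2
    return 3600.0 * math.hypot(dra, ddec)

def match_fixed_radius(inputs, detections, radius_arcsec=0.5):
    """For each input source, return the closest detection within the radius.

    Returns a list of (input_index, detection_index, separation_arcsec).
    """
    matches = []
    for i, (ra_in, dec_in) in enumerate(inputs):
        best_j, best_sep = None, radius_arcsec
        for j, (ra_det, dec_det) in enumerate(detections):
            sep = separation_arcsec(ra_in, dec_in, ra_det, dec_det)
            if sep <= best_sep:
                best_j, best_sep = j, sep
        if best_j is not None:
            matches.append((i, best_j, best_sep))
    return matches

# Toy catalogs: the first detection lies 0.2" from its input source,
# the second one 1.0" away and therefore outside the 0.5" matching radius
inputs = [(150.0, 2.0), (150.1, 2.1)]
detections = [(150.0 + 0.2 / 3600.0, 2.0), (150.1, 2.1 + 1.0 / 3600.0)]
matches = match_fixed_radius(inputs, detections)
print(matches)
```

Only the first pair is returned, which illustrates how a small fixed radius trades completeness of the matched catalog for freedom from spurious associations.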
Table 1 summarizes the comparison of the results of
the application of eboxdetect+emldetect and PWDetect
on simulated data.
We first compared the best-fit source coordinates provided by emldetect and PWDetect with the input source positions (see Tab. 1). The RMS variations and the interquartile ranges of the shifts are similar for the two detection algorithms; however, we find a small systematic median shift between input and detected R.A. and Dec. (see also Fig. 6) using emldetect. We conclude that PWDetect provides positions of higher quality.
As a second step, we focussed on the ability of the detection algorithms to separate close pairs of sources in Chandra data, comparing the numbers of pairs found by emldetect and PWDetect (see Fig. 7 and Tab. 1). The two algorithms are equivalent for large (>4”) separations, but there is a deficiency in the number of pairs recovered by emldetect at small (<4”) separations. We verified that all of the pairs with a separation smaller than 4 arcsec that are missed by emldetect (∼ 75%) are in the input source list, and are not spuriously created by the splitting of a single source. Analysis of the eboxdetect candidate list and the emldetect final list shows that the majority (>70%) of these pairs are missed in the emldetect step, where the program finds a best fit including one significant source only, while the second falls below the detection threshold. We conclude that PWDetect is more efficient than emldetect at resolving close pairs with separations <4” and greater than ∼ 1.8”.
Finally we compared the emldetect and PWDetect best-
fit count rates with the input count rates in the F, S,
and H band (see Fig. 8 and Tab. 1). The PWDetect re-
constructed count rates were systematically smaller than
the input count rates by 10-20%. A similar problem was
found by Puccetti et al. (2006) using a similar detection
algorithm on XMM-Newton data. emldetect reconstructs
the count rates much better in all the bands.
The accuracy of the count rate reconstruction of the
emldetect algorithm is also good at all count rates, with-
out any large systematic shifts, both at low count rates
and at high count rates (see left panel of Fig. 9). The
right panel of Fig. 9 shows the difference between the
emldetect count rate and the input simulated count rate
divided by the emldetect error on the count rate as a
function of the emldetect count rate. We see that the
distribution is approximately centered around zero for
count rates smaller than ∼ 0.5 counts/ksec, but becomes
positive for larger count rates. This suggests that at high
count rates, there is a non-negligible systematic error in
the emldetect count rate determination, due to the un-
certainties in the PSF model becoming comparable to, or
higher than, the statistical error. For this reason we also
performed aperture photometry (see Sect. 6.4), which
should be free from this systematic error.
4.3.1. Error on the positions
The source positional error is proportional to the PSF
at the position of the source, and inversely proportional
to the square root of the source number counts.
We evaluated the errors on the positions by dividing the
PSFradius by (1) the square root of the total source plus
background counts (T, PosError = PSFradius/√T) and
(2) the square root of the net, background subtracted,
source counts (Cs, PosError = PSFradius/√Cs). We
used different PSFradius, from 50% to 90% of the fpsf at
the position of each source in the field where the source
is detected at the smallest θi (i.e., with the best PSF).
These errors were then compared with the deviations
between the X-ray positions and input positions in the
simulations. Method (2) gave the best match using a
PSFradius corresponding to the 50% fpsf at the θi of
each source, and the counts included in a circular region
with the same radius. We used the fpsf - θi calibration
provided by CXC (see note 19). Larger PSFradii provided
implausibly large position errors for bright sources.
Including background counts (method 1) produces too
small errors for faint sources, where the background is
not negligible. For ∼ 60 sources with more than ∼ 120
counts, the errors on RA and Dec are formally smaller
than 0.07 arcsec (i.e., errors on source position smaller
than 0.1 arcsec). In these cases the error on source po-
sition was conservatively set to 0.1 arcsec to account for
possible small systematic errors in the astrometric correc-
tions (see Sect. 2 and 4.3). Fig. 10 shows the distribution
of the ratio between the deviation between the PWDetect
positions and input positions and the X-ray error on the
position evaluated as in method (2). The distributions in
the three detection bands are similar and peak at a value
of ∼ 0.7-0.8. These distributions are compared with the
expectation based on Gaussian statistics which peaks at
unity. This comparison shows that the assumed errors
on the positions, although very small, are, on average,
somewhat larger than the deviation between input po-
sitions and detected positions. However, to account for
small systematic errors in the astrometric corrections,
which are not included in the input positions while they
certainly affect the observed data, we use in the follow-
ing the conservative errors on the positions computed as
described above.
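The adopted positional error (method 2, with the conservative floor) can be written as a short sketch. The function and argument names are hypothetical, and applying the 0.1 arcsec floor directly to PosError is an assumption about how the floor is implemented.

```python
import math

def positional_error(psf_radius_50_arcsec, net_counts, floor_arcsec=0.1):
    """PosError = PSFradius / sqrt(Cs), never smaller than floor_arcsec.

    psf_radius_50_arcsec : radius enclosing 50% of the PSF at the source
                           off-axis angle (arcsec)
    net_counts           : Cs, background-subtracted counts in that radius
    """
    if net_counts <= 0:
        raise ValueError("net_counts must be positive")
    err = psf_radius_50_arcsec / math.sqrt(net_counts)
    # Floor accounts for possible residual astrometric systematics.
    return max(err, floor_arcsec)
```

For example, a source with 25 net counts inside a 1 arcsec radius gets an error of 0.2 arcsec, while a bright on-axis source is clipped at the floor.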
4.4. The final C-COSMOS source detection and
characterization procedure
In summary, the comparison of the two methods
PWDetect and eboxdetect+emldetect on the simulated C-
COSMOS field shows that PWDetect is superior in sep-
arating closely spaced sources and in localizing sources,
and relatively poor at photometry. Conversely, emldetect
is poor at separating closely spaced sources, while it is
good at estimating source reliability, completeness, and
photometry. These results suggested the following source
detection and characterization procedure:
1- PWDetect is run first with a low threshold to pro-
duce a catalog of source candidates, with the best
localization;
2- this catalog of source candidates is used as input
for emldetect which performs a PSF fitting to find
the best fit maximum likelihood source count rate
and the probability that each source candidate is a
fluctuation of the background. In emldetect the co-
ordinates used to fit each source are those provided
by PWDetect for the most on-axis observation;
3- aperture photometry is used to get good photome-
try for bright sources.
This combined approach allows us to obtain both the
best possible position determination and reliable pho-
tometry for all sources.
5. COMPLETENESS AND RELIABILITY
The threshold for source detection must be set by
balancing completeness (the fraction of true sources de-
tected, i.e., ratio between the number of the detected
sources and the number of input simulated sources) ver-
sus reliability (one minus the fraction of spurious sources
Fig. 6.— Left panel: shift between the input simulated source positions and the source positions by emldetect using a matching radius of
2 arcsec (black solid dots). The solid black lines represent the zero shifts. The red circles have a radius of 0.5, 1, and 2 arcsec, respectively.
Right panel: shift between the input simulated source positions and the source positions by PWDetect using a matching radius of 2 arcsec
(black solid dots). Symbols as in the left panel.
Fig. 7.— Top panel: number of pairs in the F band detected
by PWDetect (black empty histogram) and emldetect (green solid
histogram), as a function of the separation. Bottom panel: ratio
between the difference between the number of pairs detected by
PWDetect and emldetect, and the pairs detected by PWDetect as
a function of the separation.
detected, i.e., one minus the ratio between the number
of spurious sources and the number of input simulated
sources). Our simulations allow us to choose a thresh-
old which has a known completeness and reliability. The
three panels of Fig. 11 show the completeness in the F,
S, and H band as a function of the significance level for
sources with at least 12 counts (solid lines) and 7 counts
(dashed lines). The latter value refers to the counts of a
typical source close to our flux limit, where we expect
Fig. 8.— The PWDetect (red dashed histogram) and emldetect
(solid black histogram) best fit count rates over the input count
rates in the F (left panel), S (center panel), and H (right panel)
band. The dotted vertical line corresponds to the exact match
between the evaluated count rates and the input count rates. Note
that the emldetect count rates are in good agreement with the input
count rates.
a rather large incompleteness. The former value (12
counts) ensures significantly higher completeness. Fig.
11 also shows the reliability as a function of the
significance levels for the same two cases. We chose a
significance level of 2 · 10^-5 (or DET_ML = 10.8), which
represents a reasonable compromise between high completeness
and high reliability. Higher significance levels give
higher completeness but lower reliability. At the chosen
threshold we have 87.5% and 68% (F band), 98.2% and
83% (S band), 86% and 67% (H band) completeness for
sources with at least 12 and 7 counts, respectively. At
the same significance level and the same counts limits,
the reliability is ∼ 99.7% for the three bands and both
source count limits. This implies about 5, 4, and 3 spurious
detections with ≥ 7 counts in the F, S, and H bands,
Fig. 9.— Left panel: The ratio between the best fit count rates obtained by emldetect and the input count rates as a function of the
input count rates for the simulations in the F (blue solid circles), S (green open circles), and H (red open squares) band. The solid line is
the exact match between the best fit count rates and the input count rates. Right panel: the difference between the emldetect count rate
and the input count rate divided by the emldetect error on the count rate, as a function of the count rate for the sources detected in the S
band.
Fig. 10.— The distributions of the ratio between the deviation
between detected positions by PWDetect and input positions and
the X-ray error on the position for the simulations in three energy
bands (blue: F band; green: S band; red: H band). The solid curve
is the expectation based on Gaussian statistics.
respectively, and 3, 4, and 3 spurious detections with
≥ 12 counts in the same bands.
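The two quantities balanced here follow directly from the definitions given in the text. The sketch below uses hypothetical names and, in the usage note, illustrative numbers that are not taken from the simulations.

```python
def completeness_reliability(n_input, n_detected_real, n_spurious):
    """Return (completeness, reliability) as defined in the text.

    completeness = detected input sources / input sources
    reliability  = 1 - spurious detections / input sources
    """
    if n_input <= 0:
        raise ValueError("n_input must be positive")
    completeness = n_detected_real / n_input
    reliability = 1.0 - n_spurious / n_input
    return completeness, reliability
```

With 1000 input sources, 875 of them recovered and 3 spurious detections, this gives a completeness of 87.5% and a reliability of 99.7%.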
Fig. 12 shows the completeness for a significance level
of 2 · 10^-5 as a function of the flux for the F, S, and H
bands. Table 2 gives the flux limits corresponding to 4
completeness fractions in the F, S, and H bands.
We have also evaluated the completeness of the method
in the detection of close pairs. Fig. 13 compares the num-
ber of pairs having one member with at least 7 and 12
TABLE 2
Flux limit and Completeness
Completeness   F(0.5-10 keV)      F(0.5-2 keV)       F(2-10 keV)
(%)            (erg cm^-2 s^-1)   (erg cm^-2 s^-1)   (erg cm^-2 s^-1)
90             4.1 · 10^-15       1.1 · 10^-15       7.8 · 10^-15
80             3.1 · 10^-15       9.4 · 10^-16       6.1 · 10^-15
50             1.7 · 10^-15       4.5 · 10^-16       2.9 · 10^-15
20             1.1 · 10^-15       3.3 · 10^-16       2.0 · 10^-15
counts in the simulated data with the detected number
of pairs. The number of pairs in the simulated data have
been corrected dividing them by the square of the com-
pleteness expected at their counts thresholds (87.5% for
the pairs with at least 12 counts and 68% for the pairs
with at least 7 counts). In fact to correctly compare the
number of pairs in the simulated data and the detected
number of pairs, it is necessary to take into account that
the detected number of pairs is not complete at the cho-
sen significance level, and moreover that each pair must
be corrected for the completeness of both sources in pair,
that is the square of completeness. We see that at dis-
tances smaller than 5 arcsec, we miss between 50% and
70% of the pairs with at least 12 counts and between
70% and 80% of the pairs with more than 7 counts. The
reason is that it is increasingly difficult to detect a faint
(7 or 12 counts) source near a bright source, because of
the wings of the PSF of the latter. Indeed, all pairs re-
covered have a counts ratio < 3, while about 40% of the
input pairs have a count ratio > 3, none of which are
detected in our analysis.
6. OBSERVED DATA: SOURCE DETECTION AND
COUNT RATES
Source detection and characterization were performed
on the real, observed, event files using the approach de-
Fig. 11.— Completeness (solid and dot-short dashed lines, left y axis) and reliability (long dashed and short dashed lines, right y axis)
as a function of the significance level for the simulations in the F (left panel), S (center panel) and H (right panel) band, for sources with
at least 12 counts (solid and long dashed lines, respectively) and at least 7 counts (dot-short dashed and short dashed lines, respectively).
The dotted vertical black lines indicate the chosen significance level of 2 · 10^-5.
Fig. 12.— The crosses represent the completeness as a function
of the flux at the chosen significance level of 2 · 10^-5, in F band
(blue crosses), S band (green crosses), and H band (red crosses).
The dashed lines connect the relative cross points. The solid lines
represent the sky-coverage calculated as in Sect. 7.2 and normal-
ized to the maximum sky-coverage. The horizontal black dashed
and solid lines indicate 5 completeness fractions.
scribed in Sect. 4.4. The three energy bands, F, S, and H
were used. The candidate catalogs produced by PWDetect
and used as input for emldetect were cut at a low threshold
of ∼ 10^-3, corresponding to 5 counts. The number of
PWDetect source candidates in each of the three bands
was between 2500 and 3500. These lists were visually
cleaned to identify obviously spurious detections on the
wings of the Chandra PSF around bright sources and
near the edges of the ACIS-I chips, following the same
procedure adopted for the simulated data (see Sect. 4).
As for the simulations, the number of clearly spurious
detections is small in all three bands (< 1 − 2%).
At the chosen probability (i.e., significance level 2 · 10^-5
Fig. 13.— Top panel: the number of pairs in the F band detected
by PWDetect (black solid histogram), compared to the number of
pairs in the simulations having one member with at least 7 counts
(dotted histogram) and 12 counts (dashed histogram) as a function
of the separation. Bottom panel: ratio between the pairs detected
by PWDetect and the number of pairs in the simulations with 7
or 12 counts as a function of the separation. In both panels the
numbers of pairs in the simulations are corrected dividing them by
the square of the completeness expected at their counts thresholds
(see text).
or DET_ML = 10.8), the number of spurious detections is
presumably << 12 in the total catalog (i.e., F, S, and
H band). The total catalog is obtained by the
cross-correlation of the three single band (i.e., F, S, and H)
catalogs. As a consequence, the numbers of spurious sources
in the single F, S, and H bands, evaluated by the detailed
analysis of the simulations (see Sect. 5), are no longer
independent, and the total number of spurious sources
is less than the sum of the spurious sources in
each of the three single bands.
6.1. Source position
Fig. 14 (left panel) shows the positional error, eval-
uated using the empirical technique described in Sect.
4.3.1, as a function of the off-axis angle. The notches in the
figure arise because, at a fixed off-axis angle,
the PSFradius is ∼ constant, while √Cs is a discrete
variable, since T are integer numbers and B are small. The
error is typically less than ∼ 0.5 arcsec at the smallest
off-axis angles, θi < 2 arcmin, and then increases to 1-2
arcsec for θi ≥ 2 arcmin. Most of the scatter at a given
off-axis angle in this figure is due to the range of count
rates in the sources. Fig. 14 (right panel) shows the po-
sitional error in the F band as a function of the source
count rate in 4 off-axis bins. Both figures show that the
quality of the data is good enough to provide positions
with sub-arcsec accuracy, except for ≲ 12% of F band
sources (i.e., 202 sources), and for ∼ 13.5% of the entire
source catalog (see Paper I for more details). These
small positional errors are the key to the high identification
rate of the C-COSMOS sources with optical and
infrared counterparts (Civano et al. 2009, Paper III).
6.2. Count rates
Vignetting corrected count rates for each source are
obtained by dividing the best-fit counts derived from
emldetect for each band and in each single field by the
net exposure time, reduced by the vignetting at the position
of each source, as in the exposure maps (see notes 23, 24). The
exposure maps are computed averaging over an area of 8
pixels to smooth out CCD gaps and cosmetic defects, and
are weighted with an absorbed power-law spectral model
with an energy index αE = 0.4 and the Galactic column
density of the COSMOS field, NH = 2.7 · 10^20 cm^-2.
The errors on count rates at 68% confidence level were
then computed using the equation:
$$\mathrm{Error} = \frac{\sqrt{C_s + (1 + a) \cdot B}}{0.9 \cdot T_{expo}} \qquad (2)$$
where Cs are the source net counts estimated by emldetect,
corrected to an area including 90% of the PSF (see
note 19), B are the background counts from the emldetect
background maps (counts/pixel^2) multiplied by a circular
area of radius corresponding to fpsf = 90%, and Texpo
is the vignetting corrected exposure time at the position
of the source from the exposure maps. a is a parame-
ter which accounts for the fact that the background at
the source position is not known with infinite precision.
a = 1 corresponds to the situation of a background area
equal to the source extraction area, which for Chandra is
always very small because of the very good PSF; a = 0
would correspond to assuming no uncertainty on the es-
timate of the average value of the background. Unfor-
tunately emldetect provides neither the B errors, nor the
information on the size of the region used to measure the
background counts. Because of the way emldetect esti-
mates the background counts, i.e. by a fit, using a sophis-
ticated background modeling (Cappelluti et al. 2007), we
are in an intermediate situation between the two extreme
cases a=0 and a=1. For this reason, we chose to adopt
a=0.5. This ensures that we are not under-estimating
23 http://hea-www.harvard.edu/~elvis/CCOSMOS.html
24 http://irsa.ipac.caltech.edu/data/COSMOS/
the error on the background, even for sources close to
problematic areas like the edge of the field or CCD gaps.
We chose an area corresponding to fpsf=90%, because
this is the typical size of the area where emldetect works
for relatively bright sources. We checked that the errors
computed using Eq. 2 agree well with the errors evalu-
ated from aperture photometry (Sect. 6.4).
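Eq. 2 translates directly into a few lines of code. The sketch below uses hypothetical argument names; the defaults a = 0.5 and the 0.9 PSF fraction are the values adopted in the text, and the exposure time is assumed to be in seconds.

```python
import math

def count_rate_error(net_counts, bkg_counts, exposure_s, a=0.5, psf_fraction=0.9):
    """68% error on the count rate, following Eq. 2.

    net_counts : Cs, net source counts within the 90% PSF area
    bkg_counts : B, background counts within the same area
    exposure_s : Texpo, vignetting-corrected exposure time
    a          : accounts for the uncertainty on the background estimate
                 (0 = background known exactly, 1 = background area equal
                 to the source area)
    """
    if exposure_s <= 0:
        raise ValueError("exposure_s must be positive")
    variance = net_counts + (1.0 + a) * bkg_counts
    return math.sqrt(variance) / (psf_fraction * exposure_s)
```

Setting a = 0 recovers the error obtained when the background is treated as perfectly known, so the adopted a = 0.5 always gives a slightly larger error for a non-zero background.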
Fig. 15 plots the signal-to-noise ratio (see note 25) of each source
as a function of DET_ML. Note the regular behavior of
the signal-to-noise ratio, which increases smoothly and
monotonically with increasing DET_ML, or with decreasing
Prandom (see Eq. 1), with a small dispersion around
the correlation. The six diagonal black lines show the
expectations computed for six values of the background
in the detection cell, from 0.5 counts to 8 counts. This
range is centered on ∼ 4 counts, a value typical for the
C-COSMOS survey (see Sect. 2), and accounts for two
effects: a) the differences in exposure time and b) the different
sizes of the source extraction region as a function
of the off-axis angle, due to the variation of the Chandra
PSF with the off-axis angle. This range of background
counts explains most of the observed dispersion in Fig.
15, especially for the faintest sources. For the brightest
sources in the F band, the best fit DET_ML is somewhat
smaller than expected based on the signal-to-noise
ratio, even for the case of a background of 8 counts per
detection cell. This can be explained if the fit of bright
sources is performed over an area significantly larger than
the 90% fpsf area, which therefore does not fully optimize the
signal-to-noise ratio. This shift is smaller for the S band
sources because of the smaller background in this band
with respect to the F band.
6.3. Fluxes
The emldetect count rates (R) were converted to fluxes
(Fx) using the formula Fx = R/(CF · 10^11), where CF is
the energy conversion factor, evaluated using
spectra simulated through Xspec (see note 26), including the
appropriate on-axis response matrix and the chosen spectral
models. We used energy conversion factors of 0.742
counts erg^-1 cm^2, 1.837 counts erg^-1 cm^2, and 0.381
counts erg^-1 cm^2, appropriate for a power-law spectrum
with energy index αE = 0.4 and Galactic column density
for the COSMOS field (NH = 2.7 · 10^20 cm^-2), to convert
the F count rate into the 0.5-10 keV flux, the S count
rate into the 0.5-2 keV flux, and the H count rate into
the 2-10 keV flux, respectively. We extend the F and H
bands up to 10 keV, to allow an easier comparison with
results in the literature. The conversion factors are
sensitive to the spectral shape: for αE = 1 they change by
∼ 40% in the F band, by less than 5% in the S band and
by less than 25% in the H band. For absorbed power-law
spectra with NH = 10^22 cm^-2 and αE = 0.4 or αE = 1.0,
the conversion factors change by up to ∼ 46% in the F
band, by up to ∼ 17% in the S band, and by up to
∼ 18% in the H band (see Tab. 3). The conversion factor
for the F band depends more strongly on the spectral
shape because of the wider band.
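As a worked illustration of the conversion, the sketch below applies Fx = R/(CF · 10^11) with the baseline conversion factors quoted in the text. The dictionary layout and function name are hypothetical, and the count rate is assumed to be expressed in counts per second.

```python
# Baseline energy conversion factors (counts erg^-1 cm^2) quoted in the text
# for alpha_E = 0.4 and Galactic N_H; the dictionary itself is illustrative.
CONVERSION_FACTORS = {"F": 0.742, "S": 1.837, "H": 0.381}

def count_rate_to_flux(rate_cts_per_s, band):
    """Convert a count rate to a flux in erg cm^-2 s^-1: Fx = R / (CF * 1e11)."""
    return rate_cts_per_s / (CONVERSION_FACTORS[band] * 1e11)
```

For instance, an S band count rate of 10^-4 counts/s corresponds to a 0.5-2 keV flux of about 5.4 · 10^-16 erg cm^-2 s^-1 under the baseline spectral model.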
6.4. Aperture photometry
25 Ratio between the source count rate and the error on the
source count rate at 68% confidence level.
26 http://xspec.gsfc.nasa.gov/
Fig. 14.— Left panel: The error on the source position as a function of the off-axis angle for sources detected in the F band. Right panel:
the positional error in the F band as a function of the source count rate in 4 off-axis bins: filled circles = off-axis <2′; open squares =
2′<off-axis< 4′; filled squares = 4′<off-axis<6′; filled triangles = off-axis >6′. The off-axis angle is the distance of the sources candidate
position from the aim point of the pointing where the source position is measured with the best PSF (see Sect. 4.1).
Fig. 15.— The signal-to-noise ratio of each source as a function
of DET_ML. Filled circles = F sources; open circles = S sources;
open squares = H sources. The six diagonal black lines correspond
to the expectations assuming a background of 8, 4, 2, 1, and 0.5
counts in the detection cell, from top to bottom.
In addition to PSF fitting photometry, we have also
performed standard aperture photometry on the sources
included in the final emldetect catalog. We find an overall
consistency between the two estimates, with the emldetect
count rates slightly larger (by less than 10%) than the
count rates by aperture photometry. For each source in
the catalog, aperture photometry was performed in F, S,
TABLE 3
Conversion factors for count rates to fluxes
αE    NH               CF(F)^a             CF(S)^a             CF(H)^a
      (10^22 cm^-2)    (cts erg^-1 cm^2)   (cts erg^-1 cm^2)   (cts erg^-1 cm^2)
0.4   0.027            0.742               1.837               0.381
1     0.027            1.042               1.759               0.474
0.4   1                0.508               2.12                0.361
1     1                0.712               2.151               0.447
a Energy conversion factor to convert the F count rate into the 0.5-10
keV flux (CF(F)), the S count rate into the 0.5-2 keV flux (CF(S)),
and the H count rate into the 2-10 keV flux (CF(H)) using the formula
Fx = R/(CF · 10^11), appropriate for absorbed power-law spectra with
the listed NH and αE.
and H band with the yaxx tool (see note 27). The aperture
photometry values are derived from event data for each individual
Chandra observation in which a source is located. For
sources located in multiple observations, the aperture
photometry is performed in each observation, and the
resulting values are combined to produce a single set of
values, using the appropriate method shown in Tab. 4.
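The combination of per-observation values can be illustrated for the simplest rows of Tab. 4: exposure times and counts are summed, and count rates are averaged with exposure-time weights. The data layout (a list of dictionaries) and the key names are assumptions made for this sketch; the error rows of the table are not covered.

```python
def merge_observations(observations):
    """Combine per-observation photometry of one source.

    observations: list of dicts with keys
        'texpo'  : vignetting-corrected exposure time
        'counts' : counts in the aperture
        'rate'   : count rate in that observation
    Exposure times and counts are summed; the merged rate is the
    exposure-weighted mean of the single rates.
    """
    texpo = sum(o["texpo"] for o in observations)
    if texpo <= 0:
        raise ValueError("total exposure must be positive")
    counts = sum(o["counts"] for o in observations)
    rate = sum(o["rate"] * o["texpo"] for o in observations) / texpo
    return {"texpo": texpo, "counts": counts, "rate": rate}
```

Weighting by exposure time means that a long on-axis observation dominates the merged rate over a short one, as expected when the single rates are simply counts divided by exposure.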
To extract source counts, circular regions of radii cor-
responding to 50%, 90%, and 95% fpsf, centered on each
source location, are used for each observation, where the
source is located. The radii are calculated using the
off-axis and azimuthal angles of the source in each observation,
and interpolating the circular fpsf table provided by
the CXC calibration group to the nearest angles. Mean
energies 2 keV, 1.2 keV, and 3.6 keV were chosen for
the F, S, and H band, respectively. To extract back-
ground counts, annuli with the inner edge at the 95%
fpsf radius plus 8 pixels, and with a width of 40 pixels
27 http://cxc.harvard.edu/contrib/yaxx/
Fig. 16.— Ratio between the count rates evaluated by emldetect
and count rates evaluated by the aperture photometry as a function
of the emldetect count rates, for the F sources (blue filled dots), S
sources (green open dots), and H sources (red open squares). The
solid black line is the exact match between the emldetect count
rates and the aperture photometry count rates.
are used. To limit contamination, all sources that over-
lap with the source or background regions are masked by
using circular exclusion regions with the 95% fpsf radius.
Exclusions can also come from the CCD edge, with an
8 pixel padding inward from the edge. Aperture fluxes
for which the net source extraction area was less than
75% of the available area (i.e., the original circle prior to
exclusions) are not given in the catalog.
Using the regions described above, photometry was extracted
using the CIAO tool dmextract. The source net
counts were then corrected for the fpsf fraction enclosed
by the aperture. dmextract was also run on the exposure maps
with exactly the same regions in order to compute the
vignetting corrected exposure times, which are needed to
compute the source count rates.
Fig. 16 compares the count rates evaluated by emlde-
tect with the count rates evaluated by the aperture pho-
tometry. The median and interquartile of the count rate
ratios are 1.03±0.16, 1.08±0.19, 1.07±0.18 in the S, H,
and F band, respectively.
6.5. Upper limits
If a source is not detected in one band, we give the
90% upper limits to the source count rates and fluxes
in this band. The upper limits are computed
as follows: if T is the total number of counts measured
at the position of a source not satisfying our detection
threshold, B are the expected background counts and X
are the unknown counts from the source, the 90% upper
limit on X (X(90%)) can be defined as the number of
counts X(90%) that gives 10% probability to observe T
(or less) counts. Applying the Poisson probability distri-
bution function, X(90%) is therefore obtained by itera-
tively solving for different X values the following equa-
tion:
$$0.1 = e^{-(X+B)} \sum_{i=0}^{T} \frac{(X+B)^i}{i!} \qquad (3)$$
(see e.g., Narsky 2000). We collected the counts T both
from a region of 5 arcsec radius and from the aperture
photometry discussed in Section 6.4. The results
were always statistically consistent with each other. The
X(90%) upper limits derived with Eq. 3 do not take
into account the statistical fluctuations on the expected
number of background counts. In order to take the background
fluctuations into consideration, we used the following
procedure: if σ(B) is the root mean square of
B (e.g., σ(B) = √B for large B), we estimated the 90%
lower limit on B as B(90%) = B − 1.282 · σ(B) (see note 28) and,
as a consequence, the “correct” 90% upper limit on X
becomes:
$$X_{corr}(90\%) \sim X(90\%) + 1.282 \cdot \sigma(B) \qquad (4)$$
We used Xcorr(90%) as upper limits for C-COSMOS
sources. We also evaluated the upper limits following
the method described in Kashyap et al. (2009). Comparing
the upper limits obtained using the two methods,
we found that our upper limits are generally more conservative
(i.e., higher) than those which would be derived
using the method by Kashyap et al. (2009).
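The iterative solution of Eq. 3 and the correction of Eq. 4 can be sketched as below. The bisection scheme, the tolerance, and the function names are choices made for this illustration and are not described in the text; σ(B) = √B is assumed throughout, which the text states only for large B.

```python
import math

def poisson_cdf(t, mu):
    """P(N <= t) for a Poisson variable with mean mu (t a non-negative integer)."""
    term = math.exp(-mu)  # P(N = 0)
    total = term
    for i in range(1, t + 1):
        term *= mu / i    # P(N = i) from P(N = i - 1)
        total += term
    return total

def upper_limit_90(total_counts, bkg_counts, tol=1e-6):
    """Solve Eq. 3 for X by bisection: P(N <= T | X + B) = 0.1."""
    if poisson_cdf(total_counts, bkg_counts) <= 0.1:
        return 0.0  # the background alone already makes T counts unlikely
    lo, hi = 0.0, 10.0
    while poisson_cdf(total_counts, hi + bkg_counts) > 0.1:
        hi *= 2.0   # expand the bracket until it contains the root
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if poisson_cdf(total_counts, mid + bkg_counts) > 0.1:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def upper_limit_90_corrected(total_counts, bkg_counts):
    """Eq. 4: add 1.282 * sigma(B), with sigma(B) = sqrt(B) assumed."""
    return upper_limit_90(total_counts, bkg_counts) + 1.282 * math.sqrt(bkg_counts)
```

With no counts and no background the solver returns ln 10 ≈ 2.30, the familiar 90% Poisson upper limit for zero observed events.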
7. SURVEY SENSITIVITY AND SKY-COVERAGE
7.1. Survey sensitivity
In X-ray observations the sensitivity, i.e., the flux limit,
is not uniform in the field of view (FOV), due to two main
reasons: (1) the variable size of the PSF, that determines
the background counts that limit the source detection;
and (2) the vignetting of effective area. In C-COSMOS,
where we have used multiple overlapping pointings giving
different PSFs and vignetting factors for each observation
of each source, the problem of assessing the sensitivity at
each position in the field of view is more complex than
normal. To evaluate the C-COSMOS survey sensitivity,
we have developed a dedicated procedure by adapting
the analytical method, used for the easier case of the
ELAIS-S1 mosaic (Puccetti et al. 2006 and references
therein), to the more complicated C-COSMOS mosaic.
In this procedure, the full C-COSMOS field is divided
into a grid of positions with spacing of 4 pixels, i.e., 2
arcsec. This bin size is a suitable balance between the
spatial resolution in the C-COSMOS survey, and the RAM
required for computing the sensitivity maps. At
each point of the 2 arcsec grid, we evaluated the minimum
number of counts Cmin needed to exceed the fluctuations
of the background, assuming Poisson statistics
with a significance level equal to that used for the catalog
(i.e., 2 · 10^-5, see Sect. 6.1), according to the following
formula:
$$P_{Poisson} = e^{-B} \sum_{k=C_{min}}^{\infty} \frac{B^k}{k!} = 2 \cdot 10^{-5} \qquad (5)$$
28 The value 1.282 is the value appropriate for the 90% probability
(see e.g., Bevington P.R. and K. Robinson 1992). This approximate
formula produces 90% limits which differ by ∼ 10% (4%)
from the exact estimate for values of B = 5 (10) in the extraction
region, corresponding to 0.064 (0.128) cts/arcsec^2 (see Fig. 3).
TABLE 4
Merge methods
Parameter                                    Symbol    Merge method
exposure time corrected for the vignetting   Texpo     Σ_i Texpo_i
counts                                       T         Σ_i T_i
background counts                            B         Σ_i B_i
net counts                                   Cs        Σ_i Cs_i
errors on counts                             err T     sqrt( Σ_i (err T_i)^2 )
errors on background counts                  err B     sqrt( Σ_i (err B_i)^2 )
errors on net counts                         err Cs    sqrt( Σ_i (err Cs_i)^2 )
count rates                                  R         Σ_i (R_i · Texpo_i) / Texpo
net count rates                              Rs        Σ_i (Rs_i · Texpo_i) / Texpo
count rate errors                            err R     sqrt( Σ_i (err R_i)^2 · Texpo_i / Texpo )
net count rate errors                        err Rs    sqrt( Σ_i (err Rs_i)^2 · Texpo_i / Texpo )
Note. The index i indicates each of the observations where a source is located.
where B is the total background counts computed at the
position of each point (Pj) of the grid by $B = \sum_{i=1}^{n} B_i$,
where i runs from 1 to the number n of overlapping fields at
the position of each Pj and Bi are the background counts
computed using the background map of each Chandra
pointing covering the position, in a region centered at
Pj and of radius Ri. Ri corresponds to a fixed value of
fpsf, and is evaluated from the distance between Pj and the
aim point of each single Chandra pointing covering the
position, using the CXC calibration. We solved Eq. 5
iteratively to calculate Cmin. The count rate limit, Rlim,
at each point of the grid is then computed by:
$$R_{lim} = \frac{C_{min} - B}{f_{psf} \cdot T_{expo}} \qquad (6)$$
where Texpo is the total, vignetting corrected, exposure
time at each position of the grid, read from the merged
C-COSMOS exposure map (see notes 20, 22).
Finally, the flux limits at each Pj are computed using
the same conversion factor used for the real C-COSMOS
sources. This procedure is applied to the S, H, and F
bands to produce binned sensitivity maps.
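For a single grid point, Eqs. 5 and 6 can be sketched as follows. Because counts are integers, the sketch takes Cmin as the smallest integer whose Poisson tail probability does not exceed the significance level; this rounding convention and the function names are assumptions, and the exposure time is assumed to be in seconds.

```python
import math

def min_counts(bkg_counts, significance=2e-5):
    """Smallest integer Cmin with P(N >= Cmin | B) <= significance (Eq. 5)."""
    term = math.exp(-bkg_counts)  # P(N = 0)
    cdf = 0.0                     # P(N <= k - 1)
    k = 0
    while True:
        tail = 1.0 - cdf          # P(N >= k)
        if tail <= significance:
            return k
        cdf += term               # now P(N <= k)
        k += 1
        term *= bkg_counts / k    # P(N = k) from P(N = k - 1)

def count_rate_limit(bkg_counts, exposure_s, psf_fraction=0.5, significance=2e-5):
    """Eq. 6: Rlim = (Cmin - B) / (fpsf * Texpo)."""
    c_min = min_counts(bkg_counts, significance)
    return (c_min - bkg_counts) / (psf_fraction * exposure_s)
```

For a total background of 1 count in the extraction region, 8 counts are needed to reach the 2 · 10^-5 level; with fpsf = 0.5 and a 100 ks exposure this corresponds to a limiting rate of 1.4 · 10^-4 counts/s.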
7.2. Sky-coverage
The “sky-coverage” is the integral of the survey area
covered down to a given flux limit, as a function of the
flux in the sensitivity map. The solid lines in Fig. 12 are
the normalized sky-coverages, computed using the pro-
cedure described above and adopting fpsf = 0.5. We
studied how the sensitivity maps and the sky-coverage
depend on the assumption on fpsf and found that they
change by less than a few per cent for fpsf values up to 0.90.
We also studied how the sensitivity maps change using
different fpsf values at different off-axis angles where a
single source is observed, finding again very little change.
The reason for this behaviour is the relatively low background
within each Ri, even at large off-axis angles.
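Given a binned sensitivity map, the sky-coverage at a given flux is simply the total area of the grid cells whose flux limit is at or below that flux. The sketch below assumes the map is available as a flat list of per-cell flux limits; the function name and this data layout are hypothetical.

```python
def sky_coverage(flux_limits, cell_area_deg2, flux):
    """Area (deg^2) over which a source of the given flux would be detected.

    flux_limits    : iterable with the flux limit of each grid cell
    cell_area_deg2 : area of a single grid cell
    flux           : flux at which the coverage is evaluated
    """
    n_cells = sum(1 for limit in flux_limits if limit <= flux)
    return n_cells * cell_area_deg2
```

For the 2 arcsec grid described in Sect. 7.1, each cell would cover (2/3600)^2 deg^2; evaluating the function over a range of fluxes gives the sky-coverage curve, which can then be normalized to its maximum.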
A relatively large uncertainty in the sensitivity maps
and sky coverage computation is, instead, the unknown
spectrum of the sources near the detection limit. The
magnitude of this uncertainty depends on the width
of the energy band, and therefore is largest in the F
band. To estimate the magnitude of the uncertainty,
we calculated the sky coverage for power-law spectra
with αE = 1.0, and for absorbed power-law spectra with
αE = 0.4 or αE = 1.0 and NH = 10^22 cm^-2, in addition
to the baseline case (αE = 0.4, NH = 2.7 · 10^20 cm^-2; see
Fig. 17). At the flux limits corresponding to 90%
completeness (see Tab. 2) the deviations are less than 3%,
∼ 3%, and ∼ 16% for the S, H, and F bands, respectively.
This uncertainty related to the unknown spectrum of the
sources becomes significant only at fluxes below the 50%
completeness.
7.3. The logN – logS
We used the catalogs of the sources detected in the simulations
in the S, H, and F bands, and the sky-coverage curves computed in
Sect. 7.2 to obtain the number counts (logN – logS) of the
sources detected in the simulations. We cut the catalogs in the
S, H, and F bands at a signal-to-noise ratio higher than 2, 2.5,
and 2.8, respectively. These cuts are introduced because: (1) we
do not correct for Eddington bias, which may be strong (up to
30-50%) at the lowest flux limits; (2) low signal-to-noise
implies a large statistical uncertainty in the flux, which in
turn would introduce a large uncertainty on the number counts at
the lowest fluxes; (3) at the lowest fluxes, the sky-coverage is
small, and the relative statistical and systematic errors are
therefore large, again introducing large uncertainties in the
number counts. We chose the signal-to-noise thresholds by
requiring that the deviations between the logN – logS computed
from the detections and the input logN – logS are smaller than
5%. The logN – logS are shown in Fig. 18. The flux limits implied
by the signal-to-noise thresholds are ∼ 2.3·10^-16, ∼ 1.6·10^-15,
and ∼ 9.6·10^-16 erg s^-1 cm^-2 for the 0.5-2 keV, 2-10 keV, and
0.5-10 keV bands, respectively. These flux limits are fully
consistent with the flux limits of the logN – logS derived from
the observed data (see Paper I).
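The combination of a source catalog, a signal-to-noise cut and a sky-coverage curve can be sketched as follows. This is a minimal illustration assuming the standard estimator in which each source contributes the inverse of the sky-coverage at its flux to the cumulative counts; the text does not spell out the exact estimator used. The function and argument names are hypothetical; only the signal-to-noise thresholds are taken from the text.

```python
import numpy as np

# Signal-to-noise thresholds quoted in the text for the S, H and F bands
SNR_MIN = {"S": 2.0, "H": 2.5, "F": 2.8}

def cumulative_counts(fluxes, snr, snr_min, flux_grid, coverage_deg2):
    """Return fluxes sorted from bright to faint and N(>S) in deg^-2.

    Assumption: N(>S) = sum over sources with flux >= S of 1 / Omega(flux),
    where Omega is the sky-coverage interpolated at the source flux.
    `flux_grid` must be increasing, as required by np.interp.
    """
    fluxes = np.asarray(fluxes, dtype=float)
    snr = np.asarray(snr, dtype=float)
    keep = snr > snr_min                     # signal-to-noise cut
    f = fluxes[keep]
    omega = np.interp(f, flux_grid, coverage_deg2)
    valid = omega > 0                        # skip fluxes with no coverage
    f, omega = f[valid], omega[valid]
    order = np.argsort(f)[::-1]              # brightest first
    return f[order], np.cumsum(1.0 / omega[order])
```

For the S band, for example, the call would be `cumulative_counts(fluxes, snr, SNR_MIN["S"], flux_grid, coverage_deg2)`. Because the weight 1/Omega grows quickly where the sky-coverage is small, a few faint sources can dominate the faint end of the curve, which is reason (3) given above for the cuts.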
8. COMPARISON WITH AEGIS-X
Chandra was used to perform a survey somewhat similar to
C-COSMOS in the Extended Groth Strip (AEGIS-X, Laird et al.
2009). The 1.6 Ms AEGIS-X survey is made of 8 ACIS-I pointings,
each of a nominal 200 ksec exposure, with very little overlap,
covering ∼ 0.67 deg^2. While the effective exposure time and
area coverage are similar to C-COSMOS (see Fig. 19), the tiling
is completely different. In C-COSMOS each source in the central
area is observed at four to six different off-axis angles, while
in AEGIS-X each source is observed at only one off-axis angle.

Fig. 17.— The sky-coverage calculated as in Sect. 7.2 for the
0.5-10 keV (top panel), 0.5-2 keV (middle panel), and 2-10 keV
(bottom panel) band. The black solid lines represent the
sky-coverages evaluated with the baseline model (i.e., power-law
spectra with αE = 0.4 absorbed by Galactic NH = 2.7·10^20 cm^-2).
The cyan long-dashed lines represent the sky-coverages for
power-law spectra with αE = 1 absorbed by Galactic
NH = 2.7·10^20 cm^-2. The blue short-dashed lines represent the
sky-coverages for power-law spectra with αE = 0.4 absorbed by
NH = 10^22 cm^-2. The red dotted lines represent the
sky-coverages for power-law spectra with αE = 1 absorbed by
NH = 10^22 cm^-2. The black dot-long dashed vertical lines
represent the fluxes corresponding to the 90% and 50%
completeness, respectively.
To compare the two surveys quantitatively, we cut the C-COSMOS
catalog at the same significance level used for AEGIS-X (i.e.,
4·10^-6 or DET_ML = 12.4, Laird et al. 2009). We also recomputed
the C-COSMOS sky-coverage using the same significance level.
Fig. 19 compares the C-COSMOS sky-coverage to the AEGIS-X one
computed without the Bayesian correction for the Eddington bias.
The C-COSMOS sky-coverage has a significantly sharper drop toward
lower fluxes than the AEGIS-X sky-coverage. This means that the
sensitivity in C-COSMOS is more uniform over the field than in
AEGIS-X, while the AEGIS-X tiling reaches fainter limiting fluxes
than C-COSMOS. The estimated AEGIS-X flux limit in the S band is
50% deeper than C-COSMOS, while the flux limits in the H and F
bands are about twice as deep as C-COSMOS, albeit in small areas.
The deeper AEGIS-X flux limit in the H and F bands with respect
to the S band depends on the higher internal background in these
bands and on the smaller typical source extraction regions in the
areas of best sensitivity of AEGIS-X with respect to C-COSMOS. In
fact, AEGIS-X has a PSF better than ∼ 1 arcsec over an area of
∼ 0.15 deg^2, while the complex C-COSMOS tiling implies effective
source extraction regions of radii of ∼ 3 arcsec over most of the
area.
The more characteristic flux limits corresponding to
90% completeness in the F, S, and H bands are similar
in C-COSMOS and AEGIS-X, while the AEGIS-X flux
limits corresponding to 50% completeness in the F, S,
and H bands are lower than those of C-COSMOS by a factor of 2-3
(see Tab. 5).
The more uniform sensitivity of C-COSMOS over the field yields a
higher source density (see Tab. 5). In C-COSMOS we estimate a
slightly lower number of spurious sources at a higher
significance level (i.e., 2·10^-5 vs. 4·10^-6, see Tab. 5) with
respect to the AEGIS-X survey. The number of spurious sources is
roughly given by the product of the significance level times the
number of independent detection cells in the field. The
combination of different PSFs at each C-COSMOS position produces
an effective source extraction region of ∼ 3 arcsec radius, i.e.,
significantly wider than the Chandra PSF at off-axis angles
smaller than 5-6 arcmin. This means that the number of
independent cells per unit area is smaller in C-COSMOS than in
AEGIS-X. In conclusion, the lower number of spurious detections
in C-COSMOS with respect to AEGIS-X at a given significance level
is due to the fact that each field is observed more than once at
different off-axis angles and therefore with different PSFs.
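The scaling argument above can be sketched numerically. The snippet below is only an order-of-magnitude illustration: it assumes that an independent detection cell is a circle with the effective extraction radius, and that the detection likelihood is related to the significance level P by DET_ML = -ln(P), which is consistent with the quoted pair 4·10^-6 and DET_ML = 12.4. The function names are hypothetical.

```python
import math

def det_ml_from_probability(p):
    """Detection likelihood for a false-detection probability p (DET_ML = -ln p)."""
    return -math.log(p)

def expected_spurious(p, area_deg2, cell_radius_arcsec):
    """Rough number of spurious sources: p times the number of independent cells.

    Assumption: each independent cell is a circle of radius
    `cell_radius_arcsec`, so the cell count scales as 1 / radius^2.
    """
    area_arcsec2 = area_deg2 * 3600.0 ** 2
    n_cells = area_arcsec2 / (math.pi * cell_radius_arcsec ** 2)
    return p * n_cells
```

At fixed significance level and area, tripling the effective extraction radius (for example from ∼ 1 to ∼ 3 arcsec) lowers the number of independent cells, and hence the expected number of spurious sources, by a factor of 9 in this simple picture.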
9. CONCLUSION
The complex tiling of the C-COSMOS survey required the
development of a tailored multistep procedure to fully exploit
the data. Detailed simulations were used to test different
detection (sliding cell and wavelet) and photometry (PSF fitting
and aperture photometry) algorithms. In particular, we compared
the results obtained using the SAS eboxdetect and emldetect
tasks, used for the XMM-COSMOS survey (Cappelluti et al. 2007,
2009), with those obtained using the PWDetect code (Damiani et
al. 1997). Through these tests we selected a procedure consisting
of first identifying source candidates using PWDetect, and then
performing accurate PSF fitting photometry and evaluating
aperture photometry for each source candidate. In this way we
obtained subarcsec source localizations and accurate photometry
even for partly blended sources.
We set a threshold for source detection to P = 2·10^-5, which
implies a completeness of 87.5% and 68% for sources with at least
12 and 7 F band counts, respectively, and 3 to 5 spurious
detections in the F band at the same count limits, respectively.
We evaluated the survey sensitivity and the sky-coverage through
an analytical method, tuned using simulations. We then evaluated
the logN – logS of the detected sources in the simulations down
to S, H, and F band flux limits of Fx ∼ 2.3·10^-16, ∼ 1.6·10^-15,
and ∼ 9.6·10^-16 erg s^-1 cm^-2, respectively.
Finally, we compared the C-COSMOS survey to the AEGIS-X survey,
a Chandra survey with similar sky-coverage and total exposure
time, but using non-overlapping ACIS-I pointings. We found that
the complex tiling
of C-COSMOS helps in obtaining a contiguous area with