The XMM Cluster Survey: X-ray analysis methodology
E. J. Lloyd-Davies, A. Kathy Romer, Nicola Mehrtens, Mark Hosmer, Michael Davidson, Kivanc Sabirli, Robert G. Mann, Matt Hilton, Andrew R Liddle, Pedro T. P. Viana, Heather C. Campbell, Chris A. Collins, E. Naomi Dubois, Peter Freeman, Craig D. Harrison, Ben Hoyle, Scott T. Kay, Emma Kuwertz, Christopher J. Miller, Robert C. Nichol, Martin Sahlen, S. A. Stanford, John P. Stott
ABSTRACT The XMM Cluster Survey (XCS) is a serendipitous search for galaxy clusters
using all publicly available data in the XMM-Newton Science Archive. Its main
aims are to measure cosmological parameters and trace the evolution of X-ray
scaling relations. In this paper we describe the data processing methodology
applied to the 5,776 XMM observations used to construct the current XCS source
catalogue. A total of 3,675 > 4-sigma cluster candidates with > 50
background-subtracted X-ray counts are extracted from a total non-overlapping
area suitable for cluster searching of 410 deg^2. Of these, 993 candidates are
detected with > 300 background-subtracted X-ray photon counts, and we
demonstrate that robust temperature measurements can be obtained down to this
count limit. We describe in detail the automated pipelines used to perform the
spectral and surface brightness fitting for these candidates, as well as to
estimate redshifts from the X-ray data alone. A total of 587 (122) X-ray
temperatures to a typical accuracy of < 40 (< 10) per cent have been measured
to date. We also present the methodology adopted for determining the selection
function of the survey, and show that the extended source detection algorithm
is robust to a range of cluster morphologies by inserting mock clusters derived
from hydrodynamical simulations into real XMM images. These tests show that the
simple isothermal beta-profile is sufficient to capture the essential details
of the cluster population detected in the archival XMM observations. The
redshift follow-up of the XCS cluster sample is presented in a companion paper,
together with a first data release of 503 optically-confirmed clusters.
Mon. Not. R. Astron. Soc. 000, 1–?? (2010) Printed 6 October 2010 (MN LATEX style file v2.2)
The XMM Cluster Survey: X-ray analysis methodology
E. J. Lloyd-Davies,1⋆ A. Kathy Romer,1 Mark Hosmer,1 Nicola Mehrtens,1
Michael Davidson,2 Kivanc Sabirli,3 Robert G. Mann,2
Matt Hilton,4,5 Andrew R. Liddle,1 Pedro T. P. Viana,6,7
Heather C. Campbell,1,8 Chris A. Collins,9 E. Naomi Dubois,1 Peter Freeman,10
Ben Hoyle,8,11 Scott T. Kay,12 Emma Kuwertz,1 Christopher J. Miller,13
Robert C. Nichol,8 Martin Sahlén,14 S. Adam Stanford,15,16 John P. Stott9
1 Astronomy Centre, University of Sussex, Falmer, Brighton, BN1 9QH, UK
2 SUPA, Institute for Astronomy, University of Edinburgh, Royal Observatory, Edinburgh, EH9 3HJ, UK
3 Department of Physics, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213, USA
4 School of Mathematical Sciences, University of KwaZulu-Natal, Private Bag X54001, Durban 4000, South Africa
5 University of Nottingham, School of Physics & Astronomy, Nottingham, NG7 2RD, UK
6 Centro de Astrofísica da Universidade do Porto, Rua das Estrelas, 4150-762, Porto, Portugal
7 Departamento de Física e Astronomia, Faculdade de Ciências, Universidade do Porto, 4169-007 Porto, Portugal
8 Institute of Cosmology and Gravitation, Dennis Sciama Building, Burnaby Road, Portsmouth, PO1 3FX, UK
9 Astrophysics Research Institute, Liverpool John Moores University, Twelve Quays House, Egerton Wharf, Birkenhead, CH41 1LD, UK
10 Department of Statistics, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213, USA
11 Institut de Ciències del Cosmos (ICCUB), Departmento de Física, Martí i Franqués 1, 08034 Barcelona, Spain
12 Jodrell Bank Centre for Astrophysics, School of Physics and Astronomy, The University of Manchester, Manchester M13 9PL
13 Astronomy Department, University of Michigan, Ann Arbor, MI 48109, USA
14 The Oskar Klein Centre for Cosmoparticle Physics, Department of Physics, Stockholm University, SE-106 91 Stockholm, Sweden
15 Physics Department, University of California, Davis, CA 95616, USA
16 Institute of Geophysics and Planetary Physics, Lawrence Livermore National Laboratory, Livermore, CA 94551, USA
Accepted 2010 ??. Received 2010 ??; in original form 2010 October 4
ABSTRACT
The XMM Cluster Survey (XCS) is a serendipitous search for galaxy clusters using
all publicly available data in the XMM-Newton Science Archive. Its main aims are to
measure cosmological parameters and trace the evolution of X-ray scaling relations.
In this paper we describe the data processing methodology applied to the 5776 XMM
observations used to construct the current XCS source catalogue. A total of 3669 > 4-σ
cluster candidates with >50 background-subtracted X-ray counts are extracted from
a total non-overlapping area suitable for cluster searching of 410 deg². Of these, 1022
candidates are detected with >300 X-ray counts, and we demonstrate that robust
temperature measurements can be obtained down to this count limit. We describe in
detail the automated pipelines used to perform the spectral and surface brightness
fitting for these sources, as well as to estimate redshifts from the X-ray data alone.
A total of 517 (126) X-ray temperatures to a typical accuracy of <40 (<10) per
cent have been measured for XCS cluster candidates with redshifts, the largest such
sample assembled to date. We also present the methodology adopted for determining
the selection function of the survey and show, by inserting mock clusters derived
from hydrodynamical simulations into real XMM images, that the extended source
detection algorithm is robust to a range of cluster morphologies. These tests show that
the simple isothermal β-model surface brightness profile is sufficient to capture the
essential details of the cluster population detected in the archival XMM observations.
The redshift follow-up of the XCS cluster sample is presented in a companion paper,
together with a first data release of optically-confirmed clusters with redshift and
temperature measurements.
Key words: X-rays: galaxies: clusters — galaxies: clusters: intracluster medium —
surveys — cosmology: observations
© 2010 RAS
arXiv:1010.0677v1 [astro-ph.CO] 4 Oct 2010
1 INTRODUCTION
Clusters of galaxies are massive objects (10^(13.5−15) M⊙) composed
of galaxies, hot ionised gas and dark matter. The
gravitational potential is dominated by dark matter, with
the mass ratio of the three components being roughly 5:15:80
respectively. Clusters provide us with the opportunity to ob-
tain information about the underlying cosmological model
and important insights into the processes that govern struc-
ture formation (see Voit 2005 for a review).
While detailed studies of individual clusters are ex-
tremely important, especially for obtaining insight into the
small-scale processes that influence the evolution of their
baryonic components, a full understanding of the complex
nature of cluster formation and evolution requires the study
of the galaxy cluster population as a whole. This is best
achieved, in practice, by undertaking cluster surveys. The
first large cluster surveys were carried out via eye-ball
searches for galaxy over-densities on optical photographic
plates (Abell 1958; Zwicky et al. 1968), but, nowadays, clus-
ter finding uses sophisticated automated techniques.
In this paper we describe automated cluster finding
at X-ray wavelengths; the hot ionised gas (or intracluster
medium/ICM) emits soft X-ray radiation in proportion to
the square of the electron density. However, this is not the
only way new clusters are being discovered. For example, the
effect of cluster sized gravitational potentials can be seen in
the optical/infra-red, via strong or weak gravitational lens-
ing (e.g. Wittman et al. 2003). Increasing numbers of clus-
ters are also being discovered at millimetre wavelengths (e.g.
Staniszewski et al. 2009; Vanderlinde et al. 2010; Menanteau
et al. 2010) using the Sunyaev-Zel’dovich (SZ) effect (Sun-
yaev & Zeldovich 1972): the inverse Compton scattering of
photons from the cosmic microwave background (CMB) by
the hot ICM. At longer wavelengths still, one can discover
clusters out to high redshift using radio telescopes, via the
unusual signature of head-tail galaxies (Blanton et al. 2003).
Due to the advent of large format CCD detectors, cluster
finding using galaxy over-densities is also currently under-
going a renaissance (e.g. Gladders & Yee 2000; Miller et al.
2005; Koester et al. 2007; Wilson et al. 2009).
Cluster surveys have already revolutionised our under-
standing of the physics of the ICM (e.g. Ponman et al.
1999; Arnaud et al. 2010) and delivered cosmological con-
straints independent of, and competitive with, those de-
rived from observations of the CMB (e.g. Larson et al. 2010;
Dunkley et al. 2010) and Type Ia supernovae (e.g. Kessler
et al. 2009). When combined with these other cosmological
probes, clusters are playing an important role in the quest to
understand the nature of dark energy (e.g. Vikhlinin et al.
2009; Mantz et al. 2010; Rozo et al. 2010; see Sahlén et al.
2009 for a review of earlier cluster cosmology studies dating
back to Frenk et al. 1990 and Oukbir & Blanchard 1992).
Clusters are also being used to test general relativity on
large scales (e.g. Rapetti et al. 2010), constrain the proper-
ties of neutrinos (e.g. Mantz et al. 2010), and search for ev-
idence of non-Gaussian primordial density fluctuations (e.g.
Hoyle et al. 2010).
Future cluster surveys will be wider, more sensitive and
better calibrated than ever before, and so are sure to deliver
significantly improved constraints compared to these existing
works (e.g. Predehl et al. 2006; Majumdar & Mohr 2004;
Cunha et al. 2009; Wu et al. 2010).
In this paper we present the XMM Cluster Survey
(XCS), a serendipitous search for galaxy clusters in archival
XMM-Newton observations. The original XCS concept and
motivation are described in Romer et al. (2001). The main
goals of the survey are (i) to measure cosmological parameters,
(ii) to measure the evolution of the X-ray luminosity–temperature
scaling relation (LX−TX relation, hereafter),
(iii) to study galaxy properties in clusters to high redshift,
and (iv) to provide the community with a high quality,
homogeneously selected X-ray cluster sample. The XCS follows
a rich tradition of X-ray cluster surveys dating back almost
30 years using earlier satellites: Piccinotti et al. (1982,
HEAO I), Gioia et al. (1990, Einstein), and several derived
from the ROSAT All Sky Survey (RASS; Ebeling et al. 1998;
Böhringer et al. 2000; Ebeling et al. 2000; Ebeling et al. 2001,
2002; Cruddace et al. 2002; Gioia et al. 2003; Böhringer et al.
2004; Henry et al. 2006), and the ROSAT pointed observations
archive (Rosati et al. 1998; Romer et al. 2000; Perlman
et al. 2002; Mullis et al. 2003; Burke et al. 2003; Burenin
et al. 2007; Horner et al. 2008).
The XCS is not the only project currently exploiting
the XMM–Newton (XMM hereafter) archive for new detec-
tions of clusters. Other projects include: XDCP (Mullis et al.
2005; Fassbender et al. 2008; Santos et al. 2009; Schwope
et al. 2010; Fassbender et al. 2010); XMM-LSS (Pierre et al.
2006; Bremer et al. 2006; Pacaud et al. 2007); SEXCLAS
(Kolokotronis et al. 2006); COSMOS (Finoguenov et al.
2007); XMM-BSC (Šuhada et al. 2010); SXDS (Finoguenov
et al. 2010); and one being carried out by members of the
XMM Survey Science Center (Schwope et al. 2004; Lamer
et al. 2008a). This intense international interest stems from
the fact that XMM has several advantages for cluster search-
ing: in essence it combines sensitivity, and a large field of
view, with spectral imaging capabilities. Serendipitous clus-
ter surveys have also been conducted using the Chandra
archive (e.g. Barkhouse et al. 2006), although the available
area for cluster searching is significantly smaller in compar-
ison to the XMM archive.
The XMM image quality does not match that of Chan-
dra, but it is still good enough to allow one to differenti-
ate between point-like and extended sources over the whole
field of view: given that clusters dominate the extended X-
ray source population, this then allows us to identify clus-
ter candidates efficiently, despite the fact that clusters only
comprise ∼10% of the total X-ray source population. Moreover,
the spectral capabilities of XMM allow the measurement
of the temperature of the hot ICM directly from the
discovery data. These TX measurements allow us to then
estimate cluster masses, something of vital importance to
cosmological studies. Finally, the mission has been in oper-
ation for over 10 years, and has built up a large archive of
observations distributed across the sky. By now there are
several hundred square degrees available that are suitable
for a serendipitous cluster survey, already exceeding that of
the largest deep ROSAT survey (Burenin et al. 2007).
As predicted in Romer et al. (2001), and now demon-
strated below, XCS will deliver the largest number of clus-
ter temperature measurements to date. Importantly, these
clusters will form a homogeneous sample (both in terms of
selection and analysis) and have a well-understood selec-
tion function. XCS could ultimately (given sufficient redshift
follow-up) measure TX values for 1000 or more clusters. By
comparison, the largest compilations of TX values from homogeneous
samples contain fewer than 100 clusters each, e.g.
Ikebe et al. (2002, 61 clusters), Henry (2004, 25 clusters),
Pratt et al. (2009, 31 clusters), and Pacaud et al. (2007,
29 clusters). Larger compilations of clusters with heteroge-
neous selection do exist, and some have significantly better
per cluster TX precision than XCS, but even so the largest
published collection is still only 115 strong (Maughan 2007)
(a larger sample, of 273 low-redshift clusters, was put to-
gether by Horner 2001, but was not made public).
XCS highlights to date include the detection and sub-
sequent multi-wavelength follow-up of a z = 1.46 cluster
(XMMXCS J2215.9-1738; Stanford et al. 2006; Hilton et al.
2007, 2009, 2010), which for several years held the record
for the highest redshift spectroscopically confirmed cluster
(a z = 1.62 system was recently discovered by Tanaka et al.
2010; Papovich et al. 2010). XCS clusters have also been
used in compilation studies of galaxy evolution in high red-
shift clusters (Collins et al. 2009; Stott et al. 2010). Conser-
vative forecasts of the performance of XCS for cosmological
parameter estimation and cluster scaling relations can be
found in Sahlén et al. (2009): we expect to measure (at 1-σ
and from clusters alone, i.e. not in combination with CMB
or supernovae observations) Ωm to ±0.03 (and ΩΛ to the
same accuracy assuming flatness), and σ8 to ±0.05, whilst
also constraining the normalisation and slope of the LX−TX
relation to ±6 and ±13 per cent, respectively.
In this paper, we present an overview of the XCS data
analysis strategy, from acquiring the data to producing a
catalogue. A schematic of our approach is shown in Figure 1,
although note that components indicated with dashed out-
lines are discussed elsewhere (Mehrtens et al. 2010). The
paper is broken up into 3 main sections. In Section 2 we
describe data acquisition, reduction and image generation.
In Section 3 we describe source detection and the compilation
of source lists and candidate lists. In Section 4 we
describe how we return to the XMM-Newton observations
of our cluster candidates to measure redshifts, temperatures
and luminosities. Broadly speaking these 3 sections cover data
reduction, source detection and post processing respectively.
These sections also reflect the collaborative nature of the
project: Sections 2 and 3 were a result of the doctoral thesis
work of K. Sabirli and M. Davidson respectively (although
their respective roles have subsequently been taken over by
E. Lloyd-Davies and M. Hosmer), whereas Section 4 was the
responsibility of E. Lloyd-Davies.
2 XMM DATA REDUCTION
The XMM archive contains thousands of public observations
suitable for conducting the XCS survey. Such a large volume
of data means we have to carry out most of the XCS in a
fully automated manner (the only parts that are not automated
are the mask making, § 2.4.1, and the optical follow-up and
quality control described in Mehrtens et al. 2010). While this automation
presents a number of challenges, in terms of handling
the variety and complexity of the archival data, it also has
a number of benefits: not only has the entire data set been
treated in a consistent and systematic way, but we are also
able to run realistic simulations of our selection function.
In this section we describe how the raw XMM archive
is manipulated into science grade image files. First the data
are downloaded from the remote storage facility at the Euro-
pean Space Astronomy Centre (ESAC) near Madrid to the
University of Sussex (§ 2.2). Then the data are calibrated
and cleaned of periods of high background contamination
(§ 2.3). Next, images are produced (§ 2.4) and flux conver-
sion factors calculated (§ 2.5). We begin this section with an
overview of some of the salient features of the XMM mission.
2.1 The XMM–Newton Mission
The XMM mission (Jansen & Laine 1997) consists of three
co-aligned Wolter Type I (Wolter 1952b,a) X-ray telescopes
mounted on the same spacecraft. The mission was under-
taken by the European Space Agency (ESA) and the space-
craft was launched on 10th December, 1999. The mission
configuration, with three separate telescopes simultaneously
illuminating three cameras, means that most exposures generate
data with potential for serendipitous source finding (by
comparison Chandra, Weisskopf 1999, has a single telescope
that illuminates only one of several instruments at any given
time, and not all those instruments are suitable for cluster
finding).
The European Photon Imaging Camera (EPIC: Villa
et al. 1996) consists of three separate cameras, each in the
focal plane of a separate X-ray telescope. Each camera con-
sists of an array of charge-coupled devices (CCDs: Boyle &
Smith 1970) in different configurations. Two cameras, the
EPIC-mos1 and 2, consist of arrays of 7 metal oxide semi-
conductor CCDs illuminated by 44% of the light from their
respective telescopes (the rest is redirected to the Reflection
Grating Spectrometers). The EPIC-pn camera consists of 12
back-illuminated CCDs. These CCDs are not only more sen-
sitive than those in the mos cameras, but the pn receives all
the light from its respective telescope. The EPIC-pn
camera therefore has more than twice the sensitivity of the
EPIC-mos cameras.
One disappointing aspect of both XMM and Chandra
has been the unexpectedly high background in their CCD
cameras. Both these missions are in similar, highly elliptical
orbits, and it was only after their launch that it was re-
alised that these orbits intersect a population of low-energy
protons trapped in the Earth’s magnetosphere. These low-energy
protons can be funnelled by the grazing incidence
mirrors onto the detectors, which has resulted in a significantly
higher background than was expected before launch. Consequently,
certain aspects of XCS have proved to be more
challenging than was anticipated in our pre-launch predic-
tions (Romer et al. 2001). In addition to the enhanced back-
ground, there have been a number of incidents of damage to
the EPIC cameras while in orbit, but in only one case has
this resulted in a significant loss of detector area (Abbey
et al. 2006).
[Figure 1 flowchart. Box labels, in the order extracted: XMM Archive; Download and clean event lists; Produce Images; Source Detection; Compile Master Detection List; Select Cluster Candidates; Select Best Redshift; Produce Spectra; X-ray Redshifts; NED Redshifts; Redshift Follow-up (New); Redshift Follow-up (Archive); Produce Spectra; Spectral Fitting (Temperatures); Produce Images; Spatial Fitting (Luminosities); Quality Control; Compile Cluster Catalogue. Loop labels: All Clusters; All Observations; All Candidates.]
Figure 1. Flowchart showing an overview of the XCS analysis methodology. This illustrates the sequence by which data from the XMM
archive is used to create a catalogue of galaxy clusters.
2.1.1 XMM–Newton Point Spread Function
A crucial issue for the detection of extended sources by XCS
is the treatment of the XMM Point Spread Function (PSF).
The PSF is a strong function of off-axis angle and photon
energy (where off-axis angle is the angle between the posi-
tion in the image and the centre of the field of view). As
off-axis angle increases, the PSF shape morphs from be-
ing circularly symmetric to ellipsoidal and finally bow-tie
shaped. There have been a number of attempts to charac-
terise the XMM PSF including: simulations based on mea-
surements of the shape of the mirrors (Gondoin et al. 1996);
measurements taken on the ground by passing X-ray beams
from synchrotron sources through XMM mirror modules
(Stockman et al. 1997; Gondoin et al. 1998); and fitting 1-
dimensional profiles to observations of bright X-ray sources
(Gondoin et al. 2000; Ghizzardi 2001, 2002; Read 2004).
Unfortunately, thus far, this has not resulted in a complete
and reliable characterisation of the XMM PSF. Currently
four PSF models are available: the Low, Medium, High and
Extended Accuracy Models (Altieri et al. 2004). Of these,
only the Medium Accuracy Model (MAM) is 2-dimensional,
but as it is based on simulations that relied on pre-launch
measurements of the mirrors, it suffers from a number
of deficiencies. The Extended Accuracy Model (EAM)
is a 1-dimensional model based on in-orbit measurements of
real sources, and is considered the most accurate but obvi-
ously does not encapsulate the complex 2-dimensional struc-
ture observed in the PSF at large off-axis angles. Currently
in XCS, we use the EAM when measuring source extents
(§ 3.2.3), and when carrying out spatial fits to cluster surface
brightness profiles (§ 4.3.2), and we use the MAM when
estimating the survey selection function (§ 3.4.2). However,
in the near future we hope to use a more appropriate 2-d
model that correctly encodes the off-axis and azimuthal vari-
ation of the PSF. This is currently under development and
is based on 2-dimensional fits to bright (but not piled-up)
point sources discovered by XCS.
2.2 Data Acquisition
In Figure 2 we illustrate how the non-overlapping area in
the public XMM–Newton archive has grown over the past
ten years, both in terms of total area and in terms of area
suitable for the discovery of clusters, i.e. outside the Galaxy
(|b| > 20°) and Magellanic Clouds (> 6° [3°] from the Large
[Small] Magellanic Cloud). We note that these calculations
take into account other, smaller, regions deemed by XCS to
be unsuitable for serendipitous source detection (see § 2.4.1).
By now there are over 600 deg² of the sky covered by XMM,
but of that, only ∼50 deg², 280 deg² and 410 deg², at > 40
ks, > 10 ks and > 0 ks depths respectively (exposure times
after flare cleaning, § 2.3.3), are in regions suitable for cluster
searching. Note that this area is distributed across the sky
(Figure 3) rather than as a contiguous region.
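The Galactic latitude and Magellanic Cloud cuts quoted above can be illustrated with a short sketch. This is not the XCS code: the function names are hypothetical, the Cloud centres are assumed approximate Galactic coordinates that do not come from this paper, and the additional small regions excluded in § 2.4.1 are not modelled.

```python
import math

# Assumed approximate Galactic coordinates (l, b), in degrees, of the Cloud
# centres. These values are illustrative and are not taken from the paper.
LMC_CENTRE = (280.47, -32.89)
SMC_CENTRE = (302.81, -44.33)


def angular_separation(l1, b1, l2, b2):
    """Great-circle separation in degrees between two (l, b) positions."""
    l1, b1, l2, b2 = map(math.radians, (l1, b1, l2, b2))
    cos_sep = (math.sin(b1) * math.sin(b2)
               + math.cos(b1) * math.cos(b2) * math.cos(l1 - l2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_sep))))


def suitable_for_cluster_search(l, b):
    """Apply the cuts quoted in the text: |b| > 20 deg, more than 6 deg
    from the LMC and more than 3 deg from the SMC."""
    if abs(b) <= 20.0:
        return False
    if angular_separation(l, b, *LMC_CENTRE) <= 6.0:
        return False
    if angular_separation(l, b, *SMC_CENTRE) <= 3.0:
        return False
    return True
```

A pointing would be counted towards the cluster-search area only if this test passes for its position.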
As shown in Figure 2, new data enter the archive almost
every day, but due to practical constraints we have only
processed the data in a small number of large batches, corre-
sponding to all the public EPIC data available at that par-
ticular time. The downloads take advantage of the Archive
InterOperability System (AIO: Arviset et al. 2004); this pro-
tocol allows the XMM Science Archive (XSA: Clavel 1998;
Arviset et al. 2002) to be searched in an automated fash-
ion. At the time of writing, the most recent download was
completed on 21st July 2010, corresponding to 5776 sepa-
rate XMM observations. Their locations are shown in Fig-
ure 3. Each of these observations (including those broken
down into multiple exposures) has a unique ObsID that can
be used for identification. For each observation there is a
set of Observation Data Files (ODF) that contains all the
[Figure 2 plot: cumulative area (deg²) against year, 2000 to 2010, with curves for exposure cuts of > 0, > 10, > 20, > 30, > 40 and > 50 ks.]
Figure 2. Cumulative sky area covered by public data in the
XMM–Newton archive as a function of time for the whole sky
(solid) and excluding the Galactic plane and Magellanic Clouds
(dashed) and for a variety of different exposure time cuts. The
flattening of the curves mid-way through 2009 reflects the fact
that proprietary observations only become public a year after
they are completed.
observation-specific data necessary to analyse the observa-
tion. We note that, even with appropriate compression etc.,
the XCS archive, of raw and processed data products, takes
up on the order of 4 terabytes.
2.3 Data Reduction
The data reduction was carried out in a fairly standard man-
ner (see for instance section 3 of Read & Ponman 2003).
Only events with patterns (characterisations of how many
CCD pixels are involved in an event) 0-4 were used for the
EPIC-pn and 0-12 for the EPIC-mos. A schematic of the
data reduction procedure is shown in Figure B1.
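As a minimal illustration of the pattern selection described above, the following sketch builds a boolean event mask from an array of pattern values. The function and dictionary names are hypothetical, and the sketch operates on plain arrays; it does not call SAS.

```python
import numpy as np

# Maximum pattern value retained for each camera, as quoted in the text.
MAX_PATTERN = {"pn": 4, "mos1": 12, "mos2": 12}


def select_by_pattern(patterns, camera):
    """Return a boolean mask that is True for events with an allowed pattern."""
    patterns = np.asarray(patterns)
    return (patterns >= 0) & (patterns <= MAX_PATTERN[camera])
```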
2.3.1 Calibration
The reduction and analysis of XMM data requires calibra-
tion information detailing how the telescopes and instru-
ments behave, e.g. the effective area of the XMM telescopes
and the detection efficiency of the instruments (both being
functions of photon energy and detector position), plus the
instrumental uncertainty associated with measuring photon
energies. The most up-to-date version of the XMM Current
Calibration Files (CCF), as of 21st July 2010, was used for
the analysis presented herein.
2.3.2 Software Versions
Several different software packages are deployed for XCS
analysis: version 10.0.0 of the Science Analysis Software
(SAS: Gabriel et al. 2004); version 6.9 of HEASOFT (Black-
burn 1995); version 4.2 of CIAO (Doe et al. 2001; Deponte
Evans et al. 2008); and version 12.6.0i of XSPEC (Arnaud
1996). In order for these packages to be used in the au-
tomated batch manner needed for XCS, several different
wrapper programmes were written in scripting languages.
For the work described in Sections 2 and 4, version 2.6.4 of
Python (docs.python.org) was used to write these wrapper
programmes, whereas version 7.1 of IDL (www.ittvis.com)
was used for the work presented in Section 3.
2.3.3 Flare Cleaning
One important aspect of our pipeline reduction was the
treatment of background flares. It is well documented (Lumb
et al. 2002; Read & Ponman 2003; Pradas & Kerp 2005) that
XMM observations often suffer from periods of enhanced
particle background, caused mostly by variations in solar
activity in conjunction with the position of the spacecraft in
its orbit. To increase the signal-to-noise of the data, we have
designed an automated procedure to remove periods of high
background. This was achieved by creating a lightcurve, di-
vided into 50-second bins. The bin size was chosen to balance
a reasonable time resolution with minimising shot noise.
This lightcurve was first generated, and cleaned, using the
high-energy events (12-15 keV for the EPIC-pn and 10-12
keV for the EPIC-mos cameras), because these events are
more likely to be from the particle background than from
astronomical sources. The cleaning process is then repeated,
using a soft-energy lightcurve (0.2-1.0 keV), to account for
periods of elevated background coming from soft protons.
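The lightcurve construction described above can be sketched as follows, assuming event arrival times in seconds and event energies in keV are available as arrays. The function name is hypothetical, and the treatment of band edges (lower edge inclusive, upper edge exclusive) is an assumption, since the paper does not specify it.

```python
import numpy as np

# Energy bands in keV and bin width in seconds, as quoted in the text.
HARD_BAND = {"pn": (12.0, 15.0), "mos": (10.0, 12.0)}
SOFT_BAND = (0.2, 1.0)
BIN_SECONDS = 50.0


def make_lightcurve(times, energies_kev, band, bin_seconds=BIN_SECONDS):
    """Histogram the arrival times of events that fall inside an energy band.

    The time bins span the full range of `times`, so the hard and soft
    lightcurves of one exposure share the same bin edges.
    Returns (bin_edges, counts_per_bin).
    """
    times = np.asarray(times, dtype=float)
    energies_kev = np.asarray(energies_kev, dtype=float)
    low, high = band
    in_band = (energies_kev >= low) & (energies_kev < high)
    t0, t1 = times.min(), times.max()
    n_bins = max(1, int(np.ceil((t1 - t0) / bin_seconds)))
    edges = t0 + bin_seconds * np.arange(n_bins + 1)
    counts, _ = np.histogram(times[in_band], bins=edges)
    return edges, counts
```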
The cleaning process for each energy band involved
an iterative 3-σ clipping procedure that selected which 50s
bins to exclude. The mean and standard deviation of the
lightcurve were calculated and bins more than ±3-σ from the
mean were removed. The 3-σ limits were then re-calculated
and the process repeated up to 50 times or until a stable
state was reached, whereby the set of excluded bins no longer
changed (note that previously excluded bins can be
re-instated in subsequent iterations if the 3-σ limits become
larger). The maximum of 50 iterations was set to terminate cases
where the solution oscillates between two or more similar
states instead of converging.
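A minimal sketch of this iterative clipping is given below. It is not the XCS implementation: the function name is hypothetical, and the use of the population standard deviation (the numpy default) is an assumption, since the paper does not say which estimator was used.

```python
import numpy as np


def sigma_clip_lightcurve(counts, n_sigma=3.0, max_iter=50):
    """Iteratively flag lightcurve bins more than n_sigma from the mean.

    The mean and standard deviation come from the currently kept bins, but
    every bin is re-tested on each pass, so a previously excluded bin can be
    re-instated if the limits widen. Returns a boolean array, True = kept.
    """
    counts = np.asarray(counts, dtype=float)
    keep = np.ones(counts.shape, dtype=bool)
    for _ in range(max_iter):
        if not keep.any():
            break
        mean = counts[keep].mean()
        std = counts[keep].std()
        new_keep = np.abs(counts - mean) <= n_sigma * std
        if np.array_equal(new_keep, keep):
            break  # stable state: the excluded set no longer changes
        keep = new_keep
    return keep
```

The `max_iter` cap plays the role of the 50-iteration limit: if the kept set alternates between two similar states, the loop still terminates.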
We note that before the first 3-σ clipping took place,
an initial maximum rate threshold was used to “clip” the
lightcurve. This threshold is the greater of either 50 counts
per bin for the EPIC-pn (and half this for the EPIC-mos
cameras) or 125 per cent of the highest value in the lowest
5 per cent of the bins. This initial filtering was found to improve
the flare cleaning results when flares accounted for a
large fraction of the total exposure time. A flowchart illustrating
the flare cleaning steps is shown in Figure B2. Figure
4 shows an example hard-band lightcurve before and after
cleaning.
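The initial threshold can be sketched as below, under one reading of the description: "the lowest 5 per cent of the bins" is taken to mean the 5 per cent of bins with the fewest counts. That reading, and the function names, are assumptions rather than details confirmed by the paper.

```python
import numpy as np

# Fixed floor in counts per bin: 50 for EPIC-pn and half that for EPIC-mos.
FLOOR_COUNTS = {"pn": 50.0, "mos": 25.0}


def initial_rate_threshold(counts, camera):
    """Greater of the fixed floor and 125 per cent of the highest value
    among the lowest 5 per cent of bins (ranked by counts)."""
    ordered = np.sort(np.asarray(counts, dtype=float))
    n_low = max(1, int(np.ceil(ordered.size / 20)))
    highest_of_lowest = ordered[n_low - 1]
    return max(FLOOR_COUNTS[camera], 1.25 * highest_of_lowest)


def apply_initial_clip(counts, camera):
    """Return a boolean mask, True for bins at or below the threshold."""
    counts = np.asarray(counts, dtype=float)
    return counts <= initial_rate_threshold(counts, camera)
```

Because the second term is tied to the quietest bins, the threshold stays meaningful even when flares fill most of the exposure and would otherwise inflate the mean and standard deviation.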
The combination of the excluded bins for the hard and
soft background lightcurves is then used to define the good
time intervals (GTI) used to filter the raw event files. Fig-
ure 5 shows the distribution of observation exposure times
before and after the process of flare cleaning. The filtered
event files are used several times during XCS analysis. They
are used to produce the images (§ 2.4) used for the initial
XCS source detection (§ 3.1) and then again to determine
spectroscopic (§ 4.2, 4.4) and spatial parameters (§ 4.3) for
the cluster candidates.
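Combining the two cleaned lightcurves into good time intervals might look like the following sketch. It assumes both lightcurves share the same bin edges and that a bin is good only if it was kept in both bands; the function names are hypothetical.

```python
import numpy as np


def good_time_intervals(edges, keep_hard, keep_soft):
    """Merge runs of consecutive bins kept in both bands into (start, stop)."""
    edges = np.asarray(edges, dtype=float)
    good = np.asarray(keep_hard, dtype=bool) & np.asarray(keep_soft, dtype=bool)
    intervals = []
    start = None
    for i, ok in enumerate(good):
        if ok and start is None:
            start = float(edges[i])          # a good run begins
        elif not ok and start is not None:
            intervals.append((start, float(edges[i])))  # the run ends
            start = None
    if start is not None:
        intervals.append((start, float(edges[len(good)])))
    return intervals


def filter_events(times, intervals):
    """Return a boolean mask, True for events inside any good interval."""
    times = np.asarray(times, dtype=float)
    mask = np.zeros(times.shape, dtype=bool)
    for start, stop in intervals:
        mask |= (times >= start) & (times < stop)
    return mask
```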
Figure 3. The distribution on the sky of the 5776 observations in the XMM archive as of 21st July 2010. Locations in green [blue]
are inside [outside] the proposed footprint of the Dark Energy Survey (darkenergysurvey.org). The Galactic plane and locations of the
Magellanic Clouds are highlighted by the red dashed line (we do not carry out cluster searches within those regions).
[Figure 4 plots: hard-band counts against time (s), 0 to 120000 s, in two panels.]
Figure 4. EPIC-pn example hard band lightcurve with 50s bins. Left panel: Raw events before cleaning. Right panel: Cleaned events
with periods of high background removed.
2.4 Image Production
Starting with cleaned event lists described above (§ 2.3.3),
the individual camera exposures were spatially binned, with
a pixel size of 4.35 arcsec, to generate images. This pixel size was
chosen because it is smaller than the PSF, at all detector
locations and photon energies, and close to an integer mul-
tiple of the raw EPIC pixel size. Images were produced in
two bands, soft (0.5-2.0 keV) and hard (2-10 keV). Exposure
maps were also created for each image. The exposure maps
encode the impact of vignetting on the image sensitivity and
also record the locations of chip gaps, bad rows etc.
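The spatial binning step can be sketched with a two-dimensional histogram. This is an illustration only: it assumes event positions are already expressed in arcseconds from the image corner, the function name is hypothetical, and the handling of band edges is an assumption.

```python
import numpy as np

PIXEL_ARCSEC = 4.35        # image pixel size quoted in the text
SOFT_KEV = (0.5, 2.0)      # soft band quoted in the text
HARD_KEV = (2.0, 10.0)     # hard band quoted in the text


def bin_image(x_arcsec, y_arcsec, energies_kev, band, shape, pixel=PIXEL_ARCSEC):
    """Bin the events inside an energy band into a counts image.

    `shape` is (ny, nx); the returned array is indexed as image[row, column].
    """
    x = np.asarray(x_arcsec, dtype=float)
    y = np.asarray(y_arcsec, dtype=float)
    energy = np.asarray(energies_kev, dtype=float)
    selected = (energy >= band[0]) & (energy < band[1])
    ny, nx = shape
    image, _, _ = np.histogram2d(
        y[selected], x[selected],
        bins=(ny, nx),
        range=((0.0, ny * pixel), (0.0, nx * pixel)),
    )
    return image
```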
The EPIC cameras do not have shutters, so events received
while an observation is reading out, the so-called out-of-time
events, will be assigned incorrect positions and energies.
For XCS, only EPIC-pn images were corrected for out-of-time
events, because the EPIC-mos cameras have a much
lower readout rate and negligible out-of-time events. The
EPIC-pn corrections were done in the standard way, i.e. the
event file is recreated assuming all the events are out-of-time
and assigning them new positions along the CCD column at
random. These are then used to create out-of-time images
that can be subtracted off the true images (with the appropriate
correction for the fraction of out-of-time events).
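The final subtraction step can be written as a one-line operation on the two images. In this sketch the out-of-time fraction is left as a parameter, because its value depends on the EPIC-pn readout mode and is not given in this paper; the function name is hypothetical.

```python
import numpy as np


def subtract_out_of_time(image, oot_image, oot_fraction):
    """Subtract the out-of-time image, scaled by the out-of-time fraction.

    `oot_image` is built from the event list in which every event is treated
    as out-of-time; `oot_fraction` is the fraction of events that really are.
    """
    image = np.asarray(image, dtype=float)
    oot_image = np.asarray(oot_image, dtype=float)
    if not 0.0 <= oot_fraction < 1.0:
        raise ValueError("oot_fraction must lie in the range [0, 1)")
    return image - oot_fraction * oot_image
```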
The images and exposure maps for the individual cam-
eras were merged to create a single image and exposure map
per observation. For this, the EPIC-mos maps were scaled
to that of the EPIC-pn camera using the previously calcu-
lated ECFs (§ 2.5). Examples of XCS generated exposure
maps and images can be seen in Figures A1 and 6. A total
of 5642 image files have been generated from the 5776 XMM
observations that make up the current XCS dataset (a small
number of observations in the archive are not suitable for automated
image generation for a variety of technical reasons,
such as telemetry and calibration issues).
[Figure 5: two panels showing number of Observations versus Exposure (s).]
Figure 5. Distribution of observation exposure times. Left panel: Before (green) and after (blue) the process of flare cleaning. Right
panel: Observations in which extended XCS sources with 300 or more counts were detected (red), compared with all observations (after
flare cleaning).
Figure 6. Examples of reduced and merged XMM images with a variety of different target types.
Figure 7. Examples of masks created for the lower two images in Figure 6.
2.4.1 Image Masking
The production of images is an automated process; however,
the images do need to be checked by eye before being passed to the
source detection pipeline (§ 3.1). This is because we download
all public data, regardless of the intended (by the PI)
target. As a result, the XCS image archive includes ObsIDs
with very extended targets (such as low-redshift clusters or
Galactic supernova remnants) and ObsIDs with very bright
targets (such as luminous AGN). The very extended targets
will enhance the background level over the majority of the
XMM field of view, and thus reduce our ability to make
serendipitous detections of sources. The very bright sources
will generate artefacts in the images, such as radial spikes
and out-of-time bleed trails; those artefacts could then be
falsely identified as additional sources. The eye-balling pro-
cess identifies ObsIDs that should be completely excluded
from the other stages of the XCS pipelines. It also allows
us to mask out regions of ObsIDs that are only partially af-
flicted by bright/extended targets. Approximately one-third
of observations require some degree of masking, with the median
area lost being around 4 percent (though this can be as
high as 80 percent in extreme cases). The mask files are of
the same dimensions as the image files and are used during
the source detection and also when creating backgrounds for
the spectral and spatial fitting. We show some examples of
XCS images that require full or partial masking in Figure 7.
2.5 Energy Conversion Factors
In order to be able to convert image source counts into en-
ergy fluxes, energy conversion factors (ECFs) need to be cal-
culated. These are necessarily model dependent and are af-
fected not only by the source and instrument properties but
also by the HI column, nH hereafter, along the line of sight.
In our survey, the source properties are not known in ad-
vance, so a generic model has to be assumed. Since the vast
majority of the sources detected by XCS are point sources,
and point sources are likely to have power law spectra, the
model used to calculate the conversion is an absorbed power
law with a canonical AGN index of 1.7 (Mushotzky et al.
1993). The photoelectric absorption is set to the appropriate
nH for the field (§ 2.5.1). The ECFs were calculated, using
the XSPEC spectral fitting package and the on-axis spectral
responses, for each camera exposure in each observation. For
the specified model, the ratio of the resulting flux and count-
rate is stored as the ECF for that exposure. ECFs are not
exposure time dependent, but due to variations in nH, the
choice of optical blocking filter and the effective area of the
instrument, ECFs in XCS still vary from exposure to expo-
sure and from ObsID to ObsID. They generally range from
4.4 to 6.6 for the EPIC-pn and 1.6 to 2.0 for the EPIC-mos
cameras (in units of 10^-11 cts cm^-2 erg^-1). Even though
the ECFs are calculated for the on-axis aim point, they can
still be used for sources detected anywhere in the field of
view, by correcting them using the exposure map.
We also calculate, for each ObsID, a further set of con-
versions using the MEKAL model (Mewe et al. 1986). The
MEKAL model is the standard model used to describe thermal
and line emission from clusters of galaxies. The MEKAL
conversions are done over a grid of nH, temperature and redshift;
however, the metal abundance is kept fixed at Z = 0.3 times
the Solar values in Anders & Grevesse (1989). (This choice
of metallicity is standard in the field because previous work,
such as by Maughan et al. (2008), has shown that abun-
dances vary little from this value over a wide range of red-
shifts.) The gridded MEKAL conversions can be used to
convert count-rates to bolometric luminosities and vice versa
(and we refer to these conversions as LCFs hereafter). The
LCFs are used to calculate synthetic cluster count rates for
the survey selection function (§ 3.4.2) and to estimate lu-
minosities for XCS candidates during the literature redshift
search (§ 4.1). The LCFs, like the ECFs, are calculated for
the on-axis aim point, but can be adjusted to another loca-
tion using the exposure map.
2.5.1 Galactic HI Column
X-ray photons are efficiently absorbed by neutral hydrogen.
Therefore, it is important for us to know the nH value to-
wards all the sources detected by XCS. As mentioned above,
we use nH for the calculation of ECFs, but also at other
points in the XCS pipeline, e.g. when analysing X-ray spec-
tra (§ 4.2). We estimate the nH values using the compilation
of Dickey & Lockman (1990), which combines the Bell Labs
HI Survey (Stark et al. 1992) data with other surveys for all
sky coverage. We note that self-shielding of molecular hy-
drogen, from ambient ultra-violet radiation, can occur when
nH > 5 × 10^20 cm^-2 (Arabadjis & Bregman 1999). This
molecular gas absorbs X-rays and thus distorts flux conversions
that are based only on nH values. For this reason, XCS
fluxes derived when nH > 5 × 10^20 cm^-2 should be regarded
as lower estimates.
3 GENERATION OF THE XCS SOURCE
CATALOGUE
In this section, we provide details of our source detection al-
gorithm, known as the XCS Automated Pipeline Algorithm
or XAPA. In section 3.1, we explain how XAPA applies
wavelets to the pipeline generated images (§ 2.4) to gen-
erate a source list per ObsID. In section 3.2, we describe the
parameters that are measured by XAPA for each detected
source. In Sections 3.3 and 3.4 we demonstrate the quality of
the XAPA data products for point and extended sources, respectively.
In Appendix B we provide additional technical
information, including flow charts.
3.1 Source Detection
XAPA source detection is based upon the mission-
independent source detection package WAVDETECT (Free-
man et al. 2002, F02 hereafter), which is available as part of
the CIAO software package. F02 have shown that WAVDE-
TECT’s wavelet-based algorithm is more sensitive than
standard sliding-cell algorithms (e.g. CELLDETECT from
CIAO, Fruscione et al. 2006) and is considerably faster
than event-list-based algorithms such as CIAO’s VTPde-
tect. Before deciding to use WAVDETECT as the basis
for the XAPA algorithm, we also examined the XMM SAS
EWAVELET program and the SExtractor package (Bertin
& Arnouts 1996), finding them both to be inadequate for
our purposes (see Davidson 2006, for a discussion).
The F02 version of WAVDETECT consists of two com-
ponents, wtransform and wrecon. The former convolves
binned images with Mexican Hat (Slezak et al. 1990) wavelet
functions with various user-specified scale sizes and then
identifies pixels that are significantly above the background.
In XAPA, we use the F02 version of wtransform as part
of an automated pipeline known as md detect,1 as illustrated
by the flowchart of Figure B3. We use a set of nine
wavelet scales, numbered according to increasing size, and
corresponding to √2, 2, 2√2, 4, 4√2, 8, 8√2, 16 and 32
image pixels. At each scale, the convolved image is compared
with a threshold image. Convolved image pixels with
values greater than their corresponding threshold image pixels
are assumed to be associated with astronomical sources
(“significant pixels” hereafter). For those pixels, we reject
the null hypothesis that they are consistent with the measured
background. We then generate a set of support images,
which record the significant pixels at each wavelet scale.
1 Where the md prefix acknowledges the architect of the routine,
Michael Davidson.
In order to enhance the detectability of faint extended
emission, md detect performs the wavelet analysis in two
stages (or “Runs”). In Run 1 (scales 1-3), bright compact
sources are located first. These are then masked out before
performing Run 2 (scales 3-9). The masking step was found
to be necessary because bright point sources can pollute the
wavelet signal on large scales, and hence mimic extended
sources. Unfortunately, this masking can occasionally result
in genuine extended sources being excluded from the candi-
date list, so an extra step was added to XAPA to mitigate
this effect (§ 3.1.1).
The second component of the F02 version of WAVDE-
TECT is wrecon. This generates a source list for each im-
age, by grouping collections of significant pixels together
into source regions, or “cells”. A drawback of the F02 ver-
sion of wrecon is that it uses the instrument PSF to define
the size of the cells. This means that extended sources can
be broken up into multiple contiguous “sources” (because
a single PSF-sized cell is not big enough to enclose all the
flux). To overcome this problem, we wrote a modified ver-
sion of wrecon, called md recon, for XAPA. Unlike wrecon,
md recon does not assume a priori the size of the detected
sources, and is consequently considerably better at fitting
ellipses to extended sources. The operation of md recon is
illustrated by the flowchart in Figure B4. At each wavelet
scale, md recon first combines lists of significant source pix-
els into source cells. Multi-scale objects, i.e. those detected
by md detect on multiple scales, are then filtered using a “vi-
sion model” (§ 3.1.2). The vision model is a set of rules for
combining the support images derived for different wavelet
scales. The vision model is able to recognise when a point
source is embedded in an extended source. It also fits ellipti-
cal regions to the recovered sources (the region enclosed by
a source ellipse is referred to as ?f in the following descrip-
tions).
3.1.1 Extended Sources with Central Cusps
The two step (Run 1, Run 2) procedure adopted by
md detect for source detection works well, in that it pre-
vents bright point sources from contaminating the extended
source list. However, it has the disadvantage that when a
genuine extended source is detected in Run 1, it will be
excluded from Run 2. This means that its size will be un-
derestimated by the vision model, and it will not appear in
an extended source list. Extended sources with cuspy bright-
ness profiles will be particularly affected by this, e.g. clusters
with cool cores. We have therefore devised a “cuspiness test”
that is carried out between Run 1 and Run 2. This involves
generating a grid of 5 by 5 pixels, Q, centred on the position
of each source detected in Run 1. A quantity, C, represent-
ing the cuspiness of the central region is then calculated, as
follows:
\[ C = \frac{Q_{\rm max} - Q_{\rm min}}{Q_{\rm max}} \qquad (1) \]
Tests showed that real point sources have C ≥ 0.85, so if
a Run 1 source is found to have C < 0.85 — i.e. it pos-
sesses a flatter central profile than a real point source — it
is removed from the list of Run 1 detections, resulting in it
being available to be detected again in Run 2. This situation
is illustrated by Figure 8.
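The cuspiness test of equation (1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the two example 5 by 5 cut-outs are hypothetical, and the way the actual pipeline extracts the grid Q from the image is not reproduced here.

```python
def cuspiness(grid):
    """Return C = (Qmax - Qmin) / Qmax for a square grid of pixel values."""
    values = [value for row in grid for value in row]
    q_max, q_min = max(values), min(values)
    if q_max <= 0:
        raise ValueError("grid needs a positive maximum pixel value")
    return (q_max - q_min) / q_max


def stays_in_run1(grid, threshold=0.85):
    """True if the source looks point-like (C >= threshold) and stays in Run 1."""
    return cuspiness(grid) >= threshold


# Hypothetical 5x5 cut-outs: a sharply peaked source and a flatter one.
peaked = [[1] * 5 for _ in range(5)]
peaked[2][2] = 100
flat = [[6] * 5 for _ in range(5)]
flat[2][2] = 10
```

In this sketch the flatter cut-out fails the threshold, so the corresponding source would be dropped from the Run 1 list and left available for detection in Run 2.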
Figure 8. Illustration of the effect of extended source cuspiness. Left: The original (before the cuspiness test was introduced) Run1
(blue) and Run2 (green) detections. Middle: The final source list if the cuspiness test is not performed. Right: The final source list (after
the cuspiness test was introduced). Extended and point sources have green and red outlines respectively.
3.1.2 The XAPA Vision Model
Here we give more details about the vision model used to
filter sources detected at multiple scales by md recon. To
describe our vision model we introduce the following two
terms: a “structure” is a connected set of pixels in the sup-
port image for a particular scale; and an “object” is a set of
connected structures from different scales. The steps are:
(i) For each structure, comprising a set of pixels {(x,y)} in
S_i, which is the support image for scale i, determine whether
the structure defines the “root” of an object, i.e. whether
S_j({(x,y)}) = 0 for all j < i.
(ii) For each such root, check to see if there is a structure
in the scale above at this position, i.e. if ∃ (x′,y′) ∈ {(x,y)}
such that S_{i+1}(x′,y′) ≠ 0.
(iii) If such a structure exists, and its maximum pixel
value lies within {(x,y)}, then these two structures are
linked, such that the image pixels belonging to the object
comprise the union of the pixels in the linked structures from
scales i and i + 1.
(iv) The process of upward linking continues until the
condition in step (ii) is not satisfied, at which time the ob-
ject is terminated. When each scale has been scanned for
root structures and they have been propagated in the ‘tree-
like’ fashion, then for each object created there exists a set
of image pixels belonging to it. An ellipse can then be fitted
to these regions and a source list created.
This vision model can handle both point and extended
sources. Crucially, it can also cope with point sources em-
bedded in extended sources, and with close pairs of points
which should be separated rather than blended. A schematic
to illustrate how the vision model works when a point source
is embedded in an extended source can be seen in Figure 9.
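The linking steps above can be illustrated with a simplified sketch along a 1-d cut, in the spirit of the schematic in Figure 9. Several things here are assumptions rather than the md recon implementation: the data layout (one row of 0/1 support flags and one row of wavelet coefficients per scale), the function names, and the choice, following the Figure 9 caption, to start a new object from any structure that was not linked from the scale below (a looser reading of step (i)).

```python
def structures(support_row):
    """Split a 1-d support row into structures (runs of adjacent significant pixels)."""
    runs, current = [], []
    for x, flag in enumerate(support_row):
        if flag:
            current.append(x)
        elif current:
            runs.append(set(current))
            current = []
    if current:
        runs.append(set(current))
    return runs


def build_objects(support, coeff):
    """Link structures across scales into objects, smallest scale first."""
    per_scale = [structures(row) for row in support]
    claimed = set()  # (scale, index) of structures already linked from below
    objects = []
    for i, runs in enumerate(per_scale):
        for k, root in enumerate(runs):
            if (i, k) in claimed:
                continue
            pixels, current, scale = set(root), root, i
            while scale + 1 < len(per_scale):
                link = None
                for m, cand in enumerate(per_scale[scale + 1]):
                    if not (cand & current):
                        continue  # step (ii): no overlap with the scale above
                    peak = max(cand, key=lambda x: coeff[scale + 1][x])
                    if peak in current:  # step (iii): maximum lies inside
                        link = (m, cand)
                        break
                if link is None:
                    break  # step (iv): the object terminates here
                scale += 1
                claimed.add((scale, link[0]))
                pixels |= link[1]
                current = link[1]
            objects.append({"scales": (i, scale), "pixels": pixels})
    return objects


# Hypothetical cut: a point source at pixel 3 inside a broad source peaking at 7.
support = [
    [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
]
coeff = [
    [0, 0, 0, 9, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 4, 8, 4, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 2, 3, 3, 4, 5, 6, 5, 4, 2, 0],
]
objects = build_objects(support, coeff)
```

With these made-up inputs the point source is linked across the first two scales only, and the broad structure at the third scale becomes a separate object because its maximum lies outside the smaller structure.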
3.2 Source Properties
Once md recon has been run on a given image, the source
list is passed on to the next part of the XAPA pipeline,
find srcprop. The two-stage operation of find srcprop
is illustrated by the flowchart in Figure B5. In the
first stage, find srcprop determines the significance of
each detected source. In the second, a sub-routine known
as find srcprop final, computes other source properties
(such as the count-rate and probability of extent); it is the
results from find srcprop final that appear in the
XCS data tables (§ 3.2.7).

Table 1. Mask and aperture configurations for source and background flux determination used in find srcprop and find srcprop final.

Type       Configuration (find srcprop)
Run 1      Mask: Run 1 sources masked at 2?f
           Flux: 1?f+Uniq(3?f)
           Background: Inner radius at 2?f, min. area = 400 pix
Run 2      Mask: All sources masked at 3?f
           Flux: 1?f+Uniq(3?f)
           Background: Inner radius at 3?f, min. area = 2000 pix

Type       Configuration (find srcprop final)
Point      Mask: Point sources masked at 2?f
           Flux: 1?f
           Background: Inner radius at 2?f, min. area = 400 pix
Extended   Mask: All sources masked at 3?f
           Flux: 1?f+Uniq(3?f), with internal point sources masked at 1?f
           Background: Inner radius at 3?f, min. area = 2000 pix
3.2.1 Measuring Source and Background Counts
Here we describe how background corrected source counts
were calculated in XAPA by find srcprop and by its sub-
routine find srcprop final. Tests during the development
of XAPA showed that the best results were obtained using
different aperture sets for each stage. The aperture set com-
prises the region for source flux determination, the region for
background flux determination and a masked region (which
is not used for either). In Table 1 we note the configuration
for both aperture sets. In specifying these, we denote by ?f
the ellipse as fitted to the object region, so that 3?f is the
ellipse with major and minor axes three times those fitted to
the source by the vision model. We use Uniq(X) to denote,
for a particular source, those pixels which lie only within re-
gion X defined relative to that source: e.g. Uniq(3?f) defines,
for a particular source, the set of pixels which lie within the
3?f region for that source and for no other (as illustrated in
Figure 10).
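The Uniq(X) construction can be made concrete with sets of pixel coordinates. The sketch below is an assumption-laden illustration, not the XAPA code: each source is represented by a hypothetical tuple (centre x, centre y, semi-major axis, semi-minor axis, orientation in radians), and a pixel belongs to an ellipse if its integer coordinates fall inside it.

```python
import math


def ellipse_pixels(src, scale, shape):
    """Pixels (x, y) inside the source ellipse with both axes multiplied by scale."""
    cx, cy, a, b, theta = src
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    pixels = set()
    for y in range(shape[0]):
        for x in range(shape[1]):
            dx, dy = x - cx, y - cy
            u = dx * cos_t + dy * sin_t
            v = -dx * sin_t + dy * cos_t
            if (u / (scale * a)) ** 2 + (v / (scale * b)) ** 2 <= 1.0:
                pixels.add((x, y))
    return pixels


def uniq(index, sources, scale, shape):
    """Pixels in the scaled ellipse of one source and of no other source."""
    own = ellipse_pixels(sources[index], scale, shape)
    for k, other in enumerate(sources):
        if k != index:
            own -= ellipse_pixels(other, scale, shape)
    return own


def flux_aperture(index, sources, shape):
    """Flux region of the form 1x ellipse plus Uniq(3x ellipse)."""
    return ellipse_pixels(sources[index], 1, shape) | uniq(index, sources, 3, shape)


# Two hypothetical circular sources, 8 pixels apart, in a 20 x 30 pixel image.
sources = [(10, 10, 2, 2, 0.0), (18, 10, 2, 2, 0.0)]
aperture_a = flux_aperture(0, sources, (20, 30))
```

Pixels shared between the two enlarged ellipses drop out of the aperture of the first source unless they also lie within its unscaled ellipse.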
Figure 9. Illustration of the ‘tree’ vision model. Left: The source configuration showing a point source embedded in a larger source. The
dashed line indicates a 1-d cut through the sources. Right: A schematic of the significant pixels at each scale showing how the structures
are connected to form objects. The vertical bars denote the position of the maximum coefficient at each scale. The maximum of scale 3
lies outside of the structure of scale 1 hence a new object is started.
The expected background contribution is computed locally.
An elliptical annulus is placed around the source position:
the inner edge varies but is usually at 3?f and the
outer edge is increased until there are at least 2000 background
pixels, or no more area is available. The background
count-rate per pixel, bpix, is then calculated as bpix = B / (Ē′ × a′),
where B is the total number of counts in the annulus, Ē′
the mean exposure in the annulus and a′ the number of
pixels in the annulus. The expected number, Ba, of background
counts within the source aperture is then computed
as Ba = bpix × Ē × a, where Ē is the mean exposure in the
source aperture and a the number of pixels in the aperture.
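As a worked illustration of the two expressions above, the following sketch evaluates the background rate per pixel and the expected background counts in the source aperture. The function names and the numerical values are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def background_rate_per_pixel(counts_annulus, mean_exposure_annulus, n_pixels_annulus):
    """bpix = B / (mean annulus exposure x number of annulus pixels)."""
    return counts_annulus / (mean_exposure_annulus * n_pixels_annulus)


def expected_background_counts(b_pix, mean_exposure_aperture, n_pixels_aperture):
    """Ba = bpix x mean aperture exposure x number of aperture pixels."""
    return b_pix * mean_exposure_aperture * n_pixels_aperture


# Hypothetical numbers: 400 counts in a 2000-pixel annulus with 20 ks mean exposure,
# and a 100-pixel source aperture with 25 ks mean exposure.
b_pix = background_rate_per_pixel(400, 20000.0, 2000)
b_a = expected_background_counts(b_pix, 25000.0, 100)
```

Here bpix is 1e-5 counts per second per pixel, so about 25 background counts are expected in the aperture.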
3.2.2 Removing Low-Significance Sources
The first task is to remove any sources which are statistically
of low significance, because they will not yield accurate prop-
erties. The source and background apertures used to deter-
mine this significance must be chosen carefully (§ 3.2.1), but
once the expected number of background counts, Ba, within
the source aperture, ?f, is known, it is possible to assess the
significance of the detected source. This is done by comput-
ing the probability that the background could, by chance,
produce the detected number of counts in the source aper-
ture, assuming a Poisson distribution for the background
counts, with mean Ba. Those sources with a probability
higher than 0.000032 are removed from the source list: this
probability is equivalent to a 4-σ threshold for a Gaussian
distribution. In addition, detections comprising only a
single significant pixel are excised from the source list, re-
gardless of their significance. These are likely to be hot pixels
or sources that are too faint to be accurately parameterised.
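The significance cut can be sketched as a Poisson tail probability compared with the quoted threshold. This is a minimal illustration: the function names are hypothetical, and the convention that the tail includes the observed number of counts (N ≥ n) is an assumption, since the text does not state it.

```python
import math

THRESHOLD = 0.000032  # quoted probability, close to the one-sided Gaussian 4-sigma tail


def poisson_tail(n_counts, mean):
    """P(N >= n_counts) for a Poisson distribution with the given mean."""
    if n_counts <= 0:
        return 1.0
    if mean <= 0:
        return 0.0
    k = n_counts
    term = math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))
    total = 0.0
    while True:
        total += term
        k += 1
        term *= mean / k  # next Poisson term from the previous one
        if k > mean and term <= 1e-17 * total:
            break
    return min(total, 1.0)


def is_significant(n_counts, expected_background):
    """Keep a source only if the background is unlikely to produce its counts."""
    return poisson_tail(n_counts, expected_background) <= THRESHOLD


# One-sided Gaussian tail beyond 4 sigma, for comparison with THRESHOLD.
gaussian_four_sigma = 0.5 * math.erfc(4.0 / math.sqrt(2.0))
```

For example, with 25 expected background counts, 55 detected counts would pass this cut while 30 would not.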
3.2.3 Measuring Source Extents
After low-significance sources have been removed, the
find srcprop routine is run again on the ≥ 4-σ source list,
in order to classify the sources as point-like or extended.
For this, we need to compare the sources to the instrument
PSF. Unfortunately, no satisfactory 2-d PSF model
for XMM exists (§ 2.1.1), so for XCS we adopted the best
publicly-available 1-d (radially-averaged) model, the Extended
Accuracy Model (EAM). This, in turn, necessitated
the development of a source classification criterion based on
a 1-d source property. For XCS, we used the Encircled Energy
Fraction (EEF). The EEF records the fraction of the
total energy of a source as a function of increasing (circular)
aperture size. We note that even though the shape of the
PSF changes considerably towards large off-axis angles, its
radial average, the EEF profile, is only a weak function of
off-axis angle (Davidson 2006), making it a good basis for a
classification criterion to be applied across the full field of
view.
Figure 10. A diagram showing how the aperture used to measure
source flux is created. The source to be measured is Source A and
there are also two other objects nearby (Source B and Source C).
Both the 1 ∗ ?f and 3 ∗ ?f ellipses are shown for each source (red
and green respectively for Source A, and dark blue and light blue
for B and C). Hence, the area used to calculate the flux for Source
A is the red plus the green region.
Our extent classification is based on testing the null hy-
pothesis that the measured EEF for a source is consistent
with the PSF EEF, at the appropriate off-axis angle. This
is implemented using a Kolmogorov-Smirnov (K-S) test, us-
ing the EEF profile of the source and a model-merged PSF
EEF. The PSF EEF is derived from EAM EEFs produced by
the SAS task CALVIEW from the Current Calibration Files
(CCF) for each camera. This is weighted by the Energy Conversion
Factor (ECF, § 2.5) appropriate for that ObsID. We
adopt for P(point), the probability that the source is point-
like, the maximum value of the probability returned by the
K-S test run on a 3×3 pixel grid (with spacing ±0.5 pixels in
x and y) around the source position (in section 3.3.1 we show
that XCS source centroids are typically accurate to
better than 1 pixel).
The reliability of the P(point) values is a function of
several factors, including the position on the field of view,
the background level, the number of source counts, and the
proximity of neighbouring sources. For that reason, choos-
ing a fixed threshold in P(point) for our classification would
be inappropriate. Instead we are forced to conduct a series
of Monte Carlo (MC) simulations for every source: this is
computationally expensive, but it is vital to prevent misclas-
sification. This simulation process involves generating 200
realisations of the appropriate PSF EEF model and popu-
lating them with the same number of counts as measured
in the data. Each of the 200 realisations is compared to
the model and an empirical distribution of the K-S d values
is established. If none of the simulated distributions returns
a d value as great as the measured value, we classify the
source as being extended. With this procedure, the statis-
tical probability of misclassifying an isolated point source
as extended is 0.005 or less. However, we note that this
does not take into account systematics, such as when two
or more point sources have been blended by XAPA into a
single source profile. These can only be removed a posteriori,
by eye-balling the extended sources that make it through to
the cluster candidate list. This eye-balling, or quality con-
trol (see Figure 1), process is described in more detail in
Mehrtens et al. (2010).
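The Monte Carlo classification can be sketched as follows. This is a simplified illustration under several assumptions: the PSF EEF is supplied as a hypothetical table of radii and enclosed fractions (starting at 0 and ending at 1), the radial positions of individual counts are used as the K-S sample, and the background, the 3×3 grid of trial centres and the camera weighting described above are all ignored.

```python
import bisect
import random


def model_cdf(r, radii, eef):
    """Tabulated EEF evaluated at radius r by linear interpolation."""
    if r <= radii[0]:
        return eef[0]
    if r >= radii[-1]:
        return eef[-1]
    k = bisect.bisect_right(radii, r)
    frac = (r - radii[k - 1]) / (radii[k] - radii[k - 1])
    return eef[k - 1] + frac * (eef[k] - eef[k - 1])


def inverse_eef(u, radii, eef):
    """Radius enclosing a fraction u (0 <= u < 1) of the model energy."""
    k = bisect.bisect_right(eef, u)
    frac = (u - eef[k - 1]) / (eef[k] - eef[k - 1])
    return radii[k - 1] + frac * (radii[k] - radii[k - 1])


def ks_distance(sample, radii, eef):
    """K-S statistic d between a sample of radii and the model EEF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, r in enumerate(xs):
        c = model_cdf(r, radii, eef)
        d = max(d, abs((i + 1) / n - c), abs(i / n - c))
    return d


def is_extended(source_radii, radii, eef, n_sims=200, seed=0):
    """Extended if no simulated point source gives d as large as the measured d."""
    rng = random.Random(seed)
    d_obs = ks_distance(source_radii, radii, eef)
    for _ in range(n_sims):
        sim = [inverse_eef(rng.random(), radii, eef) for _ in source_radii]
        if ks_distance(sim, radii, eef) >= d_obs:
            return False
    return True


# Hypothetical model EEF table (radii in pixels) and two 300-count test sources.
radii = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
eef = [0.0, 0.5, 0.75, 0.9, 0.97, 1.0]
broad = [5.0 * ((i + 0.5) / 300) ** 0.5 for i in range(300)]
pointlike = [inverse_eef((i + 0.5) / 300, radii, eef) for i in range(300)]
```

With 200 realisations, the smallest false-positive rate that this kind of test can resolve is 1/200, which matches the 0.005 figure quoted above.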
3.2.4 Correcting Artefacts
After the second pass of find srcprop has been completed,
we have a preliminary list of sources (classified as extended
or point-like) for a given ObsID. Initial tests showed that
these preliminary lists include a number of artefacts. These
must be corrected for before inclusion in an XCS source
catalogue (see below). The corrections are not foolproof, as
not all genuine clusters make it through to the candidate list
and not all contaminating sources are excluded, but because
the corrections are folded into the survey selection function
(§ 3.4.2), they should not impact our ability to use XCS
cluster catalogues for statistical studies.
XAPA’s md recon algorithm successfully detects
sources within sources (see Figure 9). However, one unintended
consequence is the occasional multiple detection of
a single source that has become split into two or more overlapping
sources. This more often happens with extended
sources, but can also occur with point sources at the edge
of the field of view. Therefore, where there are incidences of
two sources with overlapping cells, the sources are merged
and source properties recalculated by find srcprop (see
Figure 11). This refinement ensures that in most cases the
source flux and morphology are recovered well.
Figure 11. An example of how several detections
of an extended source are merged by XAPA to improve the
derived properties.
Figure 12. Source ellipses defined by XAPA for a bright, off-axis,
point source. Left: before the lobe removal step was included.
Right: after the lobe removal step was included: note that the two
point sources have still been recovered, but there is no erroneous
large (extended) ellipse enclosing both of them.
When a bright compact source lies in the outskirts of
the field of view, it can produce a significant number of
counts in the asymmetric outer regions of the PSF. We term
these objects “point-sources-with-lobes”. The cores of such
sources are detected in Run 1 of md detect, and hence the
core counts will be masked from Run 2 (§ 3.1), but the remaining
outer counts might still yield a Run 2 detection (see
Figure 12). Removing these point-sources-with-lobes, without
also removing clusters with cuspy cores (§ 3.1.1), proved
to be one of the most difficult problems to overcome with
XAPA. After extensive tests, we arrived at the following
compromise: an extended source is excised from the source
list, as a suspected point-source-with-lobe, if it is both located
within the 3?f region of a Run 1 source, and has fewer
than one fifth of the counts of that source. This removes the
majority of the lobe artefacts, but can unfortunately also result
in some genuine faint extended sources being excluded
from the XCS cluster candidate list.
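The lobe-removal compromise amounts to a two-part test, sketched below. The representation of the Run 1 ellipse as a hypothetical tuple (centre x, centre y, semi-major axis, semi-minor axis, orientation in radians) and the use of the extended-source centroid to decide whether it is "located within" the enlarged region are assumptions made for illustration.

```python
import math


def inside_scaled_ellipse(x, y, ellipse, scale):
    """True if (x, y) lies inside the ellipse with both axes multiplied by scale."""
    cx, cy, a, b, theta = ellipse
    dx, dy = x - cx, y - cy
    u = dx * math.cos(theta) + dy * math.sin(theta)
    v = -dx * math.sin(theta) + dy * math.cos(theta)
    return (u / (scale * a)) ** 2 + (v / (scale * b)) ** 2 <= 1.0


def is_lobe_artefact(ext_xy, ext_counts, run1_ellipse, run1_counts):
    """Inside the 3x region of a Run 1 source and with fewer than 1/5 of its counts."""
    inside = inside_scaled_ellipse(ext_xy[0], ext_xy[1], run1_ellipse, 3.0)
    return inside and ext_counts < run1_counts / 5.0


# Hypothetical bright Run 1 source with 1000 counts.
run1 = (100.0, 100.0, 4.0, 3.0, 0.0)
```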
3.2.5 Extended Source Flags
When developing XAPA, we had to find a compromise be-
tween contamination and completeness, i.e. between effec-
tive and over cleaning of the extended source list. Therefore,
rather than removing from the extended source list every ob-
ject that could be in there erroneously, we have flagged cer-
tain sources that, conservatively, we view as suspicious. Our
aim is to use the survey selection function (§ 3.4.2) to help
us understand whether flagged sources should be included
in statistical studies or not, but to date we have taken a con-
servative approach and not included them in cluster candi-
date lists, or as targets for optical follow-up (Mehrtens et al.
2010). The source flags are as follows:
(i) Extended Sources that are PSF-sized. At large off-axis
angles it is not infrequent for the flaws in the PSF model
to cause an obvious, bright, point source to be classified as
extended. Therefore, any source that is only just extended
(i.e. that has a size very close to the PSF at the respective
off-axis angle) is flagged as being “PSF-sized” by XAPA.
(ii) Extended Sources with Internal Point Sources. Even
with the inclusion of the point-source-with-lobe test
(§ 3.2.4), the XAPA vision model (§ 3.1.2) will occasion-
ally misclassify flux from the outskirts of a point source (or
flux from a collection of neighbouring point sources) as an
erroneous extended source. We can mitigate this by
flagging up likely incidences. Therefore, any extended source
region that encloses one or more point sources that con-
tribute ≥ 1.3 times the extended source flux is flagged as
being “point contaminated” by XAPA.
(iii) Extended Sources with Internal Run 1 Sources. The
final flag is similar to the “point contaminated” case, but
covers the incidences of genuine point sources, detected in
Run 1 by md detect, being erroneously passed on to Run
2 by the cuspiness test (§ 3.1.1). Therefore, any extended
source region that encloses one or more Run 1 detection
regions that contribute at least half the extended source flux
is flagged as being “Run 1 contaminated” by XAPA.
3.2.6 Source Parameters
Once the source list per ObsID has been cleaned of artefacts,
a file is generated that saves all the relevant data. This file
is then interrogated when the survey-wide database is being
generated (§ 3.2.7). The following attributes are saved per
source:
(i) The centroid location in image coordinates;
(ii) The centroid location in sky coordinates (J2000);
(iii) The centroid location in radial coordinates, i.e. the
off-axis angle (arcminutes) and the azimuthal angle (de-
grees);
(iv) The major axis, minor axis and orientation of the
source ellipse;
(v) The average exposure time at the source location (sec-
onds);
(vi) The 0.5-2.0 and 2-10 keV background subtracted
source counts (in the merged image and in the individual
camera exposures);
(vii) The 0.5-2.0 and 2-10 keV background subtracted
count-rates and 1-σ count-rate uncertainties (in the merged
image and in the individual camera exposures);
(viii) The source significance and extent probability;
(ix) The value of the source flags (see § 3.2.5).
3.2.7 Master Detection List
XAPA produces a source list for each of the input ObsIDs,
then these lists are concatenated to form a Master Detec-
tion List (MDL). Present in the XMM archive are many
areas that have been observed multiple times. As a result,
some sources will have been detected by XAPA multiple
times. When duplicates are found, only the detection with
the most soft-band counts is passed to the MDL. To remove
duplicates, it is necessary to set an appropriate matching
radius. The positional accuracy of the survey is higher for
point sources than for extended sources, so it makes sense
to use a different radius for each type. The accuracy for
point sources varies as a function of off-axis and azimuthal
angles (amongst other parameters). However, for simplicity
we use a single value for the radius of 5″. The case for extended
sources is less straightforward because of the variety
of source types and morphologies. The positional accuracy
for large diffuse objects, such as low-redshift clusters, can
be very poor, making it hard to pick an appropriate radius.
Fortunately, the largest diffuse sources should have already
been masked from their host observation. So, for XCS, we
use a fixed matching radius of 30″ for extended sources. This
radius is large enough to allow reliable source matching, but
small enough to minimise removal of genuine cluster candi-
dates.
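The duplicate-removal step can be sketched as a greedy pass over the detections, sorted by soft-band counts. The dictionary keys, the function names, and the decision to match only detections of the same type (point with point, extended with extended) are assumptions for illustration; the text does not specify how mixed-type pairs are handled.

```python
import math

POINT_RADIUS_ARCSEC = 5.0
EXTENDED_RADIUS_ARCSEC = 30.0


def separation_arcsec(ra1, dec1, ra2, dec2):
    """Angular separation (haversine formula); inputs in degrees, output in arcsec."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0


def build_master_list(detections):
    """Keep, for each group of duplicates, the detection with most soft-band counts."""
    master = []
    for det in sorted(detections, key=lambda d: d["soft_counts"], reverse=True):
        radius = EXTENDED_RADIUS_ARCSEC if det["extended"] else POINT_RADIUS_ARCSEC
        duplicate = any(
            kept["extended"] == det["extended"]
            and separation_arcsec(det["ra"], det["dec"], kept["ra"], kept["dec"]) <= radius
            for kept in master
        )
        if not duplicate:
            master.append(det)
    return master


# Hypothetical detections from overlapping observations.
detections = [
    {"id": "a", "ra": 10.0, "dec": 0.0, "soft_counts": 100, "extended": False},
    {"id": "b", "ra": 10.0 + 2.0 / 3600.0, "dec": 0.0, "soft_counts": 300, "extended": False},
    {"id": "c", "ra": 10.0 + 10.0 / 3600.0, "dec": 0.0, "soft_counts": 50, "extended": False},
    {"id": "d", "ra": 50.0, "dec": 20.0, "soft_counts": 500, "extended": True},
    {"id": "e", "ra": 50.0, "dec": 20.0 + 20.0 / 3600.0, "soft_counts": 400, "extended": True},
]
master = build_master_list(detections)
```

Processing the detections in order of decreasing soft-band counts means that the first member of any duplicate group to reach the list is the one that should be retained.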
As of October 1st 2010, XAPA had run on 4,029 Ob-
sIDs, resulting in 114,711 point sources and 12,582 extended
sources being included in the MDL. Of the 12,582 extended
sources, roughly half were flagged (§ 3.2.5) and these were
removed from the list of potential cluster candidates (leav-
ing 6,983 sources). Additional cuts to this list included the
removal of sources within 20° of the Galactic plane and 6°
[3°] of the Large [Small] Magellanic Cloud. Those cuts were
made because in regions of high projected stellar density, it
can be hard to carry out effective optical follow-up. More-
over, the closer one gets to the Galactic plane, the higher
the hydrogen column (large nH values impact our ability to
recover accurate source fluxes). A further cut, on off-axis angles
< 2′, is then imposed to ensure that the vast majority
of XCS cluster candidates are genuinely serendipitous detec-
tions, rather than the intended target of the ObsID. After
the cuts have been imposed, 4,251 (of the 6,983) un-flagged
extended sources remain; these come from 1,692 different
ObsIDs. A final cut, on minimum source count (> 50) is
then applied, leaving 3,669 sources; these 3,669 are referred
to as cluster candidates hereafter. The candidates have a
range of counts, from only a few to several thousand. Of
particular interest to the cosmology and evolution studies
we plan with XCS are the 1,022 with more than 300 counts,
because these should deliver, once redshift information is
available, reliable temperature estimates (Figure 17).
We acknowledge that some ObsID targets are deliber-
ately offset from the centre of the FOV, so a small number
of XMM targets will filter through into our “serendipitous-only”
cluster candidate list. However, these clusters can be
removed at the quality control stage by cross-checking with
the ObsID header information (Mehrtens et al. 2010). We
further acknowledge that some of the 3,669 cluster candi-
dates will have been discovered in ObsIDs with cluster tar-
gets. These particular objects, if confirmed as clusters, are
still serendipitous detections, but can only be used for cer-
tain applications (e.g. cosmological studies) if they have a
sufficiently large metric separation from the target.
3.3 XAPA Verification: Point Sources
As mentioned above (§ 3.2.6), XAPA has catalogued to date
in excess of 100,000 unique point sources. In this section we
test XAPA astrometry and flux measurements using these
point sources, finding both measures to be robust.
3.3.1 Positions
To determine the positional accuracy of the XCS point
sources, it is desirable to use a catalogue that has high spa-
tial resolution and astrometric precision. It would also need
to have significant overlap with the XMM archive. A natural
choice for this is the Sloan Digital Sky Survey (SDSS, http://www.sdss.org;
Abazajian et al. 2009); the data is of high quality and con-
tains many objects that would be expected to have X-ray
counterparts, e.g. quasars and AGN. A cross match of XCS
point sources against the SDSS Quasar Catalog IV (Schneider
et al. 2007) using a radius of 10″ produces 1131 matches.
This was extended further using the catalogue of Véron-Cetty
& Véron (2006, VeronCat hereafter). VeronCat is a
compilation of all known AGNs and QSOs (including those
in the SDSS). A 10″ matching radius returns 2807 matches,
the distribution of which can be seen in Figure 13. We have
determined the chance of false association between the XCS
and VeronCat with a 10″ matching radius to be 1%. The
mean matching distance is 2.6″, and 95% of the matches
fall within 6.6″. This level of precision is consistent with
previous determinations (Watson et al. 2009).
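The statistics quoted above (number of matches, mean separation and 95 per cent radius) follow from a nearest-neighbour cross-match. A minimal sketch is given below, assuming small catalogues held in memory as NumPy arrays of right ascension and declination in degrees. The function name is hypothetical, and the brute-force pairwise search is chosen for clarity; it is not a description of the matching code actually used for XCS.

```python
import numpy as np

def match_catalogues(ra1, dec1, ra2, dec2, radius_arcsec=10.0):
    """Nearest neighbour in catalogue 2 for each source in catalogue 1.

    Returns (index in cat 1, index in cat 2, separation in arcsec) for the
    pairs whose separation is within radius_arcsec.
    """
    ra1, dec1, ra2, dec2 = (np.radians(np.asarray(v, float))
                            for v in (ra1, dec1, ra2, dec2))
    # Haversine separation for every pair; array shape is (n1, n2).
    dra = ra1[:, None] - ra2[None, :]
    ddec = dec1[:, None] - dec2[None, :]
    a = (np.sin(ddec / 2) ** 2
         + np.cos(dec1)[:, None] * np.cos(dec2)[None, :] * np.sin(dra / 2) ** 2)
    sep = np.degrees(2 * np.arcsin(np.sqrt(np.clip(a, 0, 1)))) * 3600.0
    nearest = sep.argmin(axis=1)
    nearest_sep = sep[np.arange(sep.shape[0]), nearest]
    ok = nearest_sep <= radius_arcsec
    return np.flatnonzero(ok), nearest[ok], nearest_sep[ok]
```

Given the returned separations `sep`, the mean matching distance is `sep.mean()` and the 95 per cent radius is `np.percentile(sep, 95)`. The pairwise array grows as the product of the catalogue sizes, so catalogues of order 10^5 sources would need a tree-based or gridded search instead.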
3.3.2 Fluxes
To assess the accuracy of the point source fluxes measured
by XAPA we have compared the XCS point source list to
the XMM-Newton Serendipitous Survey 2XMM catalogue
(Watson et al. 2009). This catalogue is the ideal counterpart
to XCS because it is also based on automated pipeline analysis
of the entire XMM archive. A 10″ matching radius has
been used to compare the samples. Figure A2 shows the flux
comparison for the individual cameras aboard XMM, using
a 0.5−2.0 keV band. There is clear consistency between
the two surveys, with no systematic offsets. It is important
to note that the default XAPA fluxes for extended sources
are not similarly reliable. This is for two reasons: first, the
ECFs used to generate the fluxes relate to power-law spectra
(whereas extended sources are more likely to have thermal
spectra), and second, the fluxes have not been properly corrected
for any source flux lying outside the XAPA-defined
ellipse. In § 4.3, we describe how aperture-corrected energy
fluxes are determined for XCS cluster candidates.
3.4 XAPA Verification: Extended Sources
As mentioned above (§ 3.2.6), XAPA has catalogued to date
in excess of 10,000 unique sources that have been statistically
classified as extended. XAPA is not infallible, however,
and some of the objects in the XCS cluster candidates list
will be erroneous – because they are blends of point sources
or other artefacts of the data reduction – and a small frac-
tion will be other types of genuinely extended X-ray sources
(such as nearby galaxies or supernova remnants). Neverthe-
less most of them will be clusters. In this section we first
compare the XAPA extended source list to the cluster sample
of the XMM-LSS survey in the same ObsIDs (§ 3.4.1).
We then describe how we quantify the completeness level us-
ing simulations of our selection function (§ 3.4.2). We note
that it is harder to quantify the contamination (by blends
and artefacts) level than the completeness level. In XCS we
do not use simulations for this, but rather examine each
source (and its optical counterpart) by eye (Mehrtens et al.
2010).
3.4.1 Comparison with the XMM-LSS
The XMM Large Scale Structure (XMM-LSS) Survey is re-
ported in Pierre et al. (2006) and Pacaud et al. (2007). It
covers a single contiguous region of roughly 6 deg^2, comprising
51 ObsIDs, in which the authors have undertaken
a dedicated cluster survey, accompanied by a detailed se-
lection function. In this region they detected 33 “Class 1”
extended objects. This class is designed to be uncontaminated
by mis-classified point sources. A more detailed examination
of these objects (including optical overlays, photometry
and spectroscopy) has confirmed 28 of these to be
genuine clusters; the remaining 5 were shown to be nearby
X-ray emitting galaxies. Twenty-nine of the 33 Class 1 ob-
jects have counterparts in XCS that were classified as ex-
tended by XAPA. This includes 2 of the non-cluster objects.
Three of the remaining 4 Class 1 objects were detected by
XAPA, but classified as point-like. The final object (XLSS
J022210.7-024048) was detected by XAPA, but subsequently
removed from the source list because it did not meet our 4-σ
significance requirement.
The radius used in the matching of XAPA sources to the
XMM-LSS was typically 10″. However, for XLSS J022433.8-041405
a radius of 24″ was required to get a match; this
source is large and elliptical, hence there is some uncertainty
in the source centre, though the extents of the XCS source
and its XMM-LSS counterpart overlap.
The XMM-LSS also have a C2 class of clusters with
slightly less conservative selection criteria. This sample is
yet to be published, but the authors report this class to
contain ∼ 60 sources. Within the XMM-LSS ObsIDs, XCS
detects 82 extended objects without flags (§ 3.2.5), so the
overlap is likely to be substantial.
3.4.2 Selection Function: Method
Pioneering work by Adami et al. (2000), and later by Bu-
renin et al. (2007), demonstrated the impact of complex se-
lection effects on cluster samples derived from X-ray sur-
veys. Pacaud et al. (2007) have shown, using the XMM-
LSS Class 1 sample described above, that the measured
evolution in the normalisation of the L_X − T_X relation is
significantly affected by selection biases. In another X-ray
study, Mantz et al. (2010) provide an in-depth discussion of
Malmquist and Eddington biases and their effect on measurements
of scaling relations. Optical and SZ cluster surveys
are also increasingly supported by selection function
simulations (Melin et al. 2005; Koester et al. 2007).
Figure 13. The relative position of the matches between XCS and VeronCat (Véron-Cetty & Véron 2006) source positions. The dashed
line represents the 95% matching radius.
The ability to measure selection functions for XCS was
embedded at the outset in XAPA. Indeed, one of the driv-
ing reasons behind us designing our own source detection
pipeline, rather than using the (otherwise excellent) data
products from the XMM Survey Science Centre (Watson
et al. 2009), was the requirement that we needed to be able
to quantify the extended source selection function using synthetic
clusters. In the following, we describe how the selection
function tests are carried out and present some results.
Our approach follows a general method in which syn-
thetic cluster profiles are added to EPIC merged images,
which are then run through XAPA. The angular size of the
synthetic cluster profile is determined from the angular di-
ameter distance at the chosen input redshift. The profile is
then randomly positioned into a blank XMM “image”, with
a uniform probability across the field of view, and then con-
volved with the appropriate PSF model. For this purpose we
use the two-dimensional Medium Accuracy Model (MAM,
§ 2.1.1). This is a natural choice of PSF model for the se-
lection functions because it accounts for the azimuthal vari-
ation in the PSF, and also because the alternative (EAM,
§ 2.1.1) model is implemented in XAPA for source classifi-
cation (§ 3.2.3): to keep the simulations fair, we cannot use
the same model for blurring as we do for extent classifica-
tion. The convolution with the PSF creates a probability
density function (PDF) for the synthetic cluster profile. We
note that the shape of the synthetic cluster profile depends
on the user’s specific requirements, and we will discuss some
examples in § 3.4.3 & 3.4.4.
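The construction of the synthetic cluster PDF described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the XCS implementation: a flat ΛCDM cosmology with H0 = 70 km/s/Mpc and Ωm = 0.3 is assumed for the angular diameter distance, the profile is an isothermal beta-model with illustrative parameters, the position is drawn uniformly over a square array rather than the true field of view, and a circular Gaussian stands in for the two-dimensional MAM PSF. All function names are hypothetical.

```python
import numpy as np

C_KM_S = 299792.458  # speed of light in km/s

def angular_diameter_distance_mpc(z, h0=70.0, omega_m=0.3, n=2048):
    """D_A in Mpc for a flat LambdaCDM cosmology (assumed parameters)."""
    zz = np.linspace(0.0, z, n)
    inv_e = 1.0 / np.sqrt(omega_m * (1.0 + zz) ** 3 + (1.0 - omega_m))
    integral = np.sum(0.5 * (inv_e[1:] + inv_e[:-1]) * np.diff(zz))  # trapezium rule
    return (C_KM_S / h0) * integral / (1.0 + z)

def beta_profile_image(shape, centre_xy, rc_pix, beta=2.0 / 3.0):
    """Surface brightness S(r) proportional to [1 + (r/rc)^2]^(-3*beta + 1/2)."""
    y, x = np.indices(shape)
    r2 = (x - centre_xy[0]) ** 2 + (y - centre_xy[1]) ** 2
    return (1.0 + r2 / rc_pix ** 2) ** (-3.0 * beta + 0.5)

def gaussian_psf(shape, fwhm_pix):
    """Circular Gaussian PSF centred on the array (stand-in for the MAM)."""
    sigma = fwhm_pix / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    y, x = np.indices(shape)
    cy, cx = shape[0] // 2, shape[1] // 2
    psf = np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
    return psf / psf.sum()

def cluster_pdf(z, rc_kpc, shape, pix_arcsec, fwhm_arcsec, rng):
    """Return (PDF image summing to 1, core radius in arcsec)."""
    d_a_kpc = angular_diameter_distance_mpc(z) * 1000.0
    rc_arcsec = np.degrees(rc_kpc / d_a_kpc) * 3600.0   # small-angle size
    centre = (rng.uniform(0, shape[1]), rng.uniform(0, shape[0]))
    profile = beta_profile_image(shape, centre, rc_arcsec / pix_arcsec)
    psf = gaussian_psf(shape, fwhm_arcsec / pix_arcsec)
    # FFT convolution; ifftshift moves the PSF peak to pixel (0, 0).
    blurred = np.fft.irfft2(
        np.fft.rfft2(profile) * np.fft.rfft2(np.fft.ifftshift(psf)), s=shape)
    blurred = np.clip(blurred, 0.0, None)               # remove round-off negatives
    return blurred / blurred.sum(), rc_arcsec
```

The FFT convolution is periodic, so flux near one edge of the array wraps to the opposite edge; a production version would pad the image or use a position-dependent PSF, as the azimuthal variation mentioned in the text requires.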
Next, an ObsID is chosen for the synthetic cluster to be
placed in. The choice of ObsID will depend on the particular
test being undertaken. For example, one might want to know
the detection sensitivity in a particular ObsID, or one might
want to know the detection sensitivity for a set of ObsIDs,
e.g. those with similar nH or exposure times. The synthetic
cluster is added to the chosen ObsID as follows:
(i) The absorbed count-rate of the cluster profile is de-
termined from the gridded LCFs (§ 2.5) for that ObsID, so
that it matches the synthetic cluster’s luminosity, tempera-
ture and redshift.
(ii) The cluster PDF is normalised to the LCF predicted
count-rate, thus creating a count-rate-image.
(iii) The count-rate-image is converted into a count-image
by multiplying by the appropriate exposure map.
(iv) The synthetic count-images for the individual cam-
eras are then added to the respective real images (§ 2.4).
(v) The individual images are then added to make a
merged image.
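Steps (ii) to (v) above amount to a few array operations per camera. The sketch below assumes that the LCF look-up of step (i) has already produced a count-rate for each camera, and that the PDF, exposure maps and real images are NumPy arrays on a common pixel grid. The function name and dictionary layout are hypothetical. The text does not state whether the synthetic counts are Poisson sampled, so the sketch adds the expected counts directly.

```python
import numpy as np

def inject_cluster(pdf, count_rate, exposure_maps, real_images):
    """Add a synthetic cluster to each camera image and build the merged image.

    pdf           : 2D array summing to 1 (PSF-convolved cluster profile)
    count_rate    : dict camera -> absorbed count-rate in counts/s, step (i)
    exposure_maps : dict camera -> 2D exposure map in seconds
    real_images   : dict camera -> 2D real image in counts
    """
    injected = {}
    for cam, image in real_images.items():
        rate_image = pdf * count_rate[cam]               # step (ii)
        count_image = rate_image * exposure_maps[cam]    # step (iii)
        injected[cam] = image + count_image              # step (iv)
    merged = sum(injected.values())                      # step (v)
    return injected, merged
```

The merged image returned here is what would then be passed to the source detection pipeline in place of the original merged image.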
The resulting merged image, containing the synthetic
cluster, is then processed by XAPA in the standard way.
There are two criteria that must be met in order for an in-
put synthetic cluster to be deemed successfully “recovered”
by XAPA: the detection software must identify a source at
the synthetic cluster location, and that source must be clas-
sified as extended. This has to be a new source; if the syn-
thetic cluster happens to have been placed at random close
to a previously detected real extended source, then the synthetic
cluster is not classed as having been recovered (even
if its “counts” dominate those from the real source). De-
pending on the application, we might further require that
the new detection not be flagged (see § 3.2.5). It is not sufficient
to perform the synthetic cluster recovery test only
once; rather, one must perform it multiple times to ensure
an accurate measurement of the recovery probability for a
given set of input parameters. There is so much parameter
space to be tested (see below) that the number of selection
function tests can run into the millions for certain applica-
tions. Determining the survey selection function is by far the
most computationally demanding part of XCS.
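The recovery criteria and the repeated-trial estimate of the recovery probability can be sketched as below. The names are hypothetical, the matching radius is left as a free parameter because its value is not given here, and `recover_once` is a stand-in for one full cycle of injecting a synthetic cluster and running XAPA. The binomial standard error is one simple choice of uncertainty and is an assumption of this sketch.

```python
import numpy as np

def is_recovered(new_sources, input_xy, preexisting_extended_xy, match_radius_pix):
    """Apply the recovery criteria to one selection function trial.

    new_sources             : iterable of (x, y, is_extended) for sources
                              detected in the image with the synthetic cluster
    input_xy                : (x, y) where the synthetic cluster was placed
    preexisting_extended_xy : iterable of (x, y) for real extended sources
                              already detected in the unmodified image
    """
    def near(x, y):
        return np.hypot(x - input_xy[0], y - input_xy[1]) <= match_radius_pix

    # A synthetic cluster placed on a known real extended source never counts.
    if any(near(x, y) for x, y in preexisting_extended_xy):
        return False
    # Otherwise require a detection at the input position classed as extended.
    return any(bool(ext) and near(x, y) for x, y, ext in new_sources)

def recovery_fraction(n_recovered, n_trials):
    """Recovered fraction and its binomial standard error."""
    p = n_recovered / n_trials
    return p, np.sqrt(p * (1.0 - p) / n_trials)

def run_trials(recover_once, n_trials, rng):
    """Repeat a trial n_trials times; recover_once(rng) must return a bool."""
    hits = sum(bool(recover_once(rng)) for _ in range(n_trials))
    return recovery_fraction(hits, n_trials)
```

Because the standard error falls only as the inverse square root of the number of trials, many repeats are needed at each point in parameter space, which is consistent with the large trial counts noted above.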