Comment: 12 pages, 11 figures, accepted for publication in Astronomy and Astrophysics
arXiv:0811.3345v3 [astro-ph] 23 Aug 2009
Astronomy & Astrophysics manuscript no. 11203
August 23, 2009
© ESO 2009
On the detection of Lorentzian profiles in a power spectrum:
A Bayesian approach using ignorance priors
M. Gruberbauer1,2, T. Kallinger1, W.W. Weiss1, and D.B. Guenther2
1Institute for Astronomy (IfA), University of Vienna, Türkenschanzstrasse 17, A-1180 Vienna, Austria
e-mail: last name@astro.univie.ac.at
2Department of Astronomy and Physics, Saint Mary's University, Halifax, NS B3H 3C3, Canada
e-mail: guenther@ap.stmarys.ca
Received / Accepted
ABSTRACT
Aims. Deriving accurate frequencies, amplitudes, and mode lifetimes from stochastically driven pulsation is challenging, more so
if one demands that realistic error estimates be given for all model fitting parameters. As has been shown by other authors, the
traditional method of fitting Lorentzian profiles to the power spectrum of time-resolved photometric or spectroscopic data via the
Maximum Likelihood Estimation (MLE) procedure delivers good approximations for these quantities. We, however, show that a
conservative Bayesian approach allows one to treat the detection of modes with minimal assumptions (i.e., about the existence and
identity of the modes).
Methods. We derive a conservative Bayesian treatment for the probability of Lorentzian profiles being present in a power spectrum
and describe an efficient implementation that evaluates the probability density distribution of parameters by using a Markov-Chain
Monte Carlo (MCMC) technique.
Results. Potentially superior to “best-fit” procedures like MLE, which only provide formal uncertainties, our method samples and
approximates the actual probability distributions for all parameters involved. Moreover, it avoids shortcomings that make the MLE
treatment susceptible to the built-in assumptions of a model that is fitted to the data. This is especially relevant when analyzing solar-
type pulsation in stars other than the Sun where the observations are of lower quality and can be over-interpreted. As an example, we
apply our technique to CoRoT⋆ observations of the solar-type pulsator HD 49933.
Key words. stars: oscillations – methods: statistical – stars: individual: HD 49933
1. Introduction
Our understanding of the Sun's structure has been revolutionized
over the last three decades by the application of helioseismology,
where the surface manifestations of acoustic modes (p-modes)
are used to probe the interior. Seismology is now being applied
to stars using precise rapid photometry from space and high-precision
radial velocity measurements from the ground. Low-degree
p-modes have been identified in several main-sequence and
sub-giant stars (see e.g., Bedding & Kjeldsen 2006).
Due to the lower signal-to-noise ratio of the data obtained
from stars, and the more poorly constrained stellar models, there
is a greater risk of drawing incorrect conclusions. A meaningful
analysis therefore requires a measure of the reliability and accuracy
of the observed stellar frequency spectrum. Even a few incorrectly
identified frequencies, i.e., frequencies that are not intrinsic
to the star but an artifact of the data processing or of instrumental
origin, will significantly complicate the analysis and may
even lead to a misinterpretation of the observations. Indeed, this is
a case where quality overrides quantity: a few well-observed
mode frequencies are more useful in model fitting than
a large number of poorly determined or unreliable frequencies.
Here we describe a Bayesian approach to extracting frequencies
from a noisy stellar oscillation spectrum. The Bayesian
method treats the problem of identifying frequency peaks in
terms of probabilities. The probability that a peak in the spectrum
is an oscillation mode is calculated using basic physical
assumptions about the shape of the mode profile and the nature
of the background noise. If the data warrant it, additional
prior knowledge can be included to increase the complexity of
the mode shape, aiding in mode identification and the analysis
of rotational effects (i.e., how the frequency is split by rotation). The
actual application of the Bayesian approach to compute the posterior
probability distributions of models evaluated using simulated
or real data is computationally complex and can quickly
exceed our computational resources. Therefore, we use an implementation
of the Markov-Chain Monte Carlo (MCMC) technique,
which significantly reduces these demands. The application
of Bayesian treatments to solar-type oscillations is not new
(e.g., Brewer et al. 2007). The implementation of Markov-Chain
Monte Carlo techniques for a Bayesian analysis of Lorentzian
profiles has also already been described (Benomar 2008).
⋆ The CoRoT (COnvection, ROtation and planetary Transits) space
mission, launched on 2006 December 27, was developed and is operated
by the CNES, with participation of the Science Programs of ESA, ESA's
RSSD, Austria, Belgium, Brazil, Germany and Spain.
However, because we are very concerned about the reliability of the
data (i.e., whether individual peaks in the spectrum are intrinsic
or spurious) and specifically want to avoid interpreting noise,
we advocate a more conservative approach. In particular,
we use the mode height parameter of our models so that we
can quantify the probability that a frequency peak is real or simply
due to noise. This eventually allows us to arrive at a model
that is just complex enough to comply with the data and its noise
properties.
In the next section we describe the application of Bayes' theorem
to the problem of identifying stellar modes in an oscillation
spectrum. In section 3 we describe our application of the MCMC
technique to the computations. In section 4 we test our method
on simulated data and in section 5 we apply our methodology
to the recently obtained observations from CoRoT for the star
HD 49933.
2. The probability of a p-mode power spectrum model
In the following subsections we introduce the Bayesian formalism,
including the likelihood function and the definition and
role of priors. But first we review the basic properties of solar-type
oscillations (see e.g., Appourchaux et al. 1998).
From the equation of a damped harmonic oscillator forced by
a random function, the average power of the Fourier transform
of the displacement can be approximated as
$$\langle P(\omega)\rangle \simeq \frac{1}{4\omega_0^2}\,\frac{\langle P_f(\omega)\rangle}{(\omega-\omega_0)^2+\gamma^2}, \tag{1}$$
where P_f is the power of the Fourier transform of the forcing
function, and ω_0 and γ are the angular mode frequency and the
damping term, respectively. If the spectrum of the forcing function
varies slowly with frequency, the resulting average p-mode
profile can be approximated by a Lorentzian profile as
$$P(f,\nu,h,\tau) = \frac{h}{1 + 4(f-\nu)^2/\eta(\tau)^2}, \tag{2}$$
where ν is the mode frequency, f is some frequency value in the
spectrum, and h and η are the height and width (η(τ) = (πτ)⁻¹,
with τ being the mode lifetime) of the profile in the power density
spectrum, respectively. For non-radial modes (ℓ > 0) stellar
rotation will split the mode into a 2ℓ + 1 multiplet. This can be
modeled by adding 2ℓ Lorentzian profiles, spaced with integer
multiples of the rotation frequency around the central mode. In
this case, Equ. 2 is replaced by
$$P(f,\nu,\nu_{\rm rot},h,\tau) = \sum_{m=-\ell}^{\ell} \frac{h_m}{1 + 4(f-\nu+m\nu_{\rm rot})^2/\eta(\tau)^2}, \tag{3}$$
where ν_rot is the rotational splitting. The mode heights h_m of
the additional profiles with m ≠ 0 are determined by the observed
geometry, specifically, the inclination angle. In general,
the mode height is related to the mode amplitude a according to
a² = πηh. Usually a Maximum Likelihood Estimation (MLE)
algorithm is used to find a best fit between the observed power
spectrum and a model like
$$P_m(f) = b + \sum_i P(f,\nu_i,h_i,\tau_i),$$
with b being the background noise (which is assumed to be locally
white) and P(f,ν_i,h_i,τ_i) being the power of the individual
profiles at some frequency (bin) f.
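The Lorentzian profile of Equ. 2 and the model spectrum built from it can be sketched in a few lines of Python. This is only an illustration: the function names `lorentzian` and `model_spectrum` are hypothetical, and the numerical values (two modes resembling the simulated pulsator discussed later, plus an arbitrary background level) are assumptions chosen for the example.

```python
import numpy as np

def lorentzian(f, nu, h, tau):
    """Lorentzian profile of Equ. 2 with width eta = 1 / (pi * tau)."""
    eta = 1.0 / (np.pi * tau)
    return h / (1.0 + 4.0 * (f - nu) ** 2 / eta ** 2)

def model_spectrum(f, b, modes):
    """White background b plus a sum of Lorentzians.

    modes is a list of (nu, h, tau) tuples, one per profile.
    """
    power = np.full_like(f, b, dtype=float)
    for nu, h, tau in modes:
        power += lorentzian(f, nu, h, tau)
    return power

# Illustrative values only (assumed for this sketch).
f = np.linspace(130.0, 140.0, 2001)            # frequency grid [1/d]
modes = [(132.3, 0.033, 0.5), (136.6, 0.033, 0.5)]
p_model = model_spectrum(f, 0.005, modes)
```

At f = ν the profile equals h, and it drops to h/2 at a distance of η/2 from the centre, so η is the full width at half maximum.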
2.1. Application of Bayes' Theorem
Bayes' theorem can be stated as follows:
$$p(A|D,I) = \frac{p(A|I)\,p(D|A,I)}{p(D|I)}, \tag{4}$$
where p(A|D,I) is the probability for some hypothesis A given
the data D and the prior information I; it is also called
the posterior probability of A. p(A|I) is the prior probability of A,
p(D|A,I) is the likelihood function, and p(D|I) is the global
likelihood. The latter serves as a normalizing constant and ensures
that the total probability summed over all hypotheses equals one. An
extensive introduction to this field and the relevant methods can
be found in Gregory (2005).
In order to apply the Bayesian formulation to any problem,
one has to identify and assign probabilities for the individual
terms in Equ.4. In the following subsections we discuss our
choices.
2.2. The Likelihood Function
The statistics of a power spectrum are described by a χ² distribution
with 2 degrees of freedom. The probability density that a model
spectrum m matches an observed power spectrum o corresponds
to
$$p(m) = \prod_f \frac{1}{P_m(f)}\, e^{-P_o(f)/P_m(f)}, \tag{5}$$
where P_o(f) is the observed power at frequency f, and P_m(f) denotes
the corresponding expectation value, i.e., the power given
by the model (e.g., Toutain & Appourchaux 1994). Equ. 5 is
used in the MLE method as a likelihood function and has already
been suggested for a Bayesian treatment of solar-type oscillations
(Appourchaux 2008; Benomar 2008). The Bayesian
formalism, though, can contain additional prior probabilities for
each of the parameters from which the model spectra are constructed.
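In practice the product in Equ. 5 is usually evaluated as a sum of logarithms to avoid numerical underflow. The following Python sketch shows one way to do this; the function name `log_likelihood` and the synthetic flat spectrum are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def log_likelihood(p_obs, p_model):
    """Logarithm of Equ. 5: sum over bins of -(ln P_m + P_o / P_m)."""
    p_obs = np.asarray(p_obs, dtype=float)
    p_model = np.asarray(p_model, dtype=float)
    return -np.sum(np.log(p_model) + p_obs / p_model)

# Synthetic check (assumed setup): a flat model of level 2.0, multiplied
# by unit-mean exponential noise, which is the chi^2 distribution with
# 2 degrees of freedom scaled to a mean of one.
rng = np.random.default_rng(1)
p_true = np.full(5000, 2.0)
p_obs = p_true * rng.exponential(1.0, size=p_true.size)

ll_true = log_likelihood(p_obs, p_true)
ll_low = log_likelihood(p_obs, 0.25 * p_true)
ll_high = log_likelihood(p_obs, 4.0 * p_true)
```

For this synthetic spectrum the true model level yields a higher log-likelihood than levels that are a factor of four too low or too high.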
2.3. The role of the priors - improving the MLE approach
The simplest model of a p-mode profile consists of four parameters:
mode frequency, mode height, mode lifetime, and background
offset. Physically, the mode height and mode lifetime
parameters are correlated. However, here we refrain from using
this information and assume them to be independent. This enables
us to use the mode height parameter as a device to distinguish
between a detection and a non-detection of a p-mode signal.
We define the following two priors, called ignorance priors because
they make no assumptions about the physical properties of
the object being modeled:
1. For a mode with frequency ν, mode lifetime τ, and background
offset b we assume a uniform prior of the form
$$p(x|I) = \frac{1}{x_{\max}-x_{\min}}, \tag{6}$$
where x stands for the respective parameters ν, τ, and b. That
is, we assume that the probability is equal across the allowed range
of each parameter, which explains the name "ignorance
prior". For the frequency parameter this is an
appropriate choice, since we cannot a priori exclude, for an
observed spectrum, any value within the range defined by
the upper and lower p-mode frequencies. The mode lifetime
and the background offset, as long as they do not vary across
the fitted part of the spectrum by more than an order of magnitude,
are mostly determined by the global properties of the power
spectrum and therefore also warrant a uniform prior.
2. For a mode with height h we adopt a modified Jeffreys prior
of the form
$$p(h|I) = \frac{1}{k_h\,(h + h'_{\min})}, \tag{7}$$
where
$$k_h = \log\left(\frac{h'_{\min} + h_{\max}}{h'_{\min}}\right), \tag{8}$$
and h′_min (> 0) expresses the “strength” of the prior. Lower
values of h′_min decrease the probability associated with h_max.
In contrast to the uniform prior, the Jeffreys prior is normally
employed to ensure a uniform probability density per
decade range (e.g., the prior probability between 0.001 and
0.01 is the same as between 0.01 and 0.1). The modified prior
behaves like a Jeffreys prior for h > h′_min, yet allows for values
smaller than h′_min. In this regime it acts like a standard
uniform prior and avoids a logarithm of 0.
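The two ignorance priors of Equ. 6 to 8 are simple enough to write down directly. The Python sketch below is illustrative only; the function names are hypothetical, the logarithm in Equ. 8 is taken to be the natural logarithm (which is what normalises Equ. 7 on the interval from 0 to h_max), and the numerical values are assumptions for the check.

```python
import numpy as np

def uniform_prior(x, x_min, x_max):
    """Equ. 6: constant density inside [x_min, x_max], zero outside."""
    x = np.asarray(x, dtype=float)
    inside = (x >= x_min) & (x <= x_max)
    return np.where(inside, 1.0 / (x_max - x_min), 0.0)

def modified_jeffreys_prior(h, h_min_prime, h_max):
    """Equ. 7 and 8: 1 / (k_h * (h + h'_min)) on [0, h_max], zero outside."""
    h = np.asarray(h, dtype=float)
    k_h = np.log((h_min_prime + h_max) / h_min_prime)  # natural logarithm
    inside = (h >= 0.0) & (h <= h_max)
    return np.where(inside, 1.0 / (k_h * (h + h_min_prime)), 0.0)

# Numerical normalisation check on a logarithmically spaced grid
# (values assumed for illustration).
h_min_prime, h_max = 1e-6, 0.15
grid = np.clip(np.geomspace(h_min_prime, h_max + h_min_prime, 20001)
               - h_min_prime, 0.0, h_max)
density = modified_jeffreys_prior(grid, h_min_prime, h_max)
integral = np.sum(0.5 * (density[1:] + density[:-1]) * np.diff(grid))
```

The trapezoidal integral of the height prior over [0, h_max] comes out close to one, which confirms that k_h acts as the normalisation constant.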
With the Bayesian framework restricted to ignorance priors, any
evidence for significant power at some frequency f can be handled
by this combination of priors. The modified Jeffreys prior
of the mode height provides a solution for the problem that in
real observations power due to noise can be mistaken for actual
intrinsic signal. At the same time, because the likelihood function can
deliver significantly higher probabilities for mode heights well
above the noise level, the data can overcome the mode height prior's
preference for a non-detection. The choice of h′_min in this context
is therefore nothing more than the expression of an “odds ratio
condition”. Consider the posterior probability of the mode height
at some value h, without considering the additional parameters,
which is
$$p(h|D,I) \propto p(h|I)\,p(D|h,I). \tag{9}$$
The ratio of the probabilities
$$O_{h,h'_{\min}} = \frac{p(h|I)}{p(h'_{\min}|I)}\,\frac{p(D|h,I)}{p(D|h'_{\min},I)} = O_{\rm prior} \times O_{\rm likelihood} \tag{10}$$
is an odds ratio for the models “mode height = h” and “mode
height = h′_min”. It can also be seen as a Bayesian weighting of
a likelihood ratio, or as a strength-of-evidence indicator similar
to an SNR. O_prior is the prior odds ratio, while O_likelihood is often
called the “Bayes factor”. The following equation is then an
“odds ratio condition” for the definite detection of the Lorentzian
profile:
$$O_{\rm cond} = \frac{1}{O_{\rm prior}} < O_{\rm likelihood}. \tag{11}$$
If O_cond is larger than O_likelihood, the model “mode height = h′_min”
is favoured, and there is not enough evidence for the detection of
a Lorentzian profile with a height greater than this threshold. For
example, O_cond = 10⁵ requires a likelihood function value of a
given h that is at least 10⁵ times that of h′_min in order for the profile
to be considered as detected. That this condition arises for the
mode height parameter is quite intuitive, since the mode height is
the parameter that determines whether a Lorentzian profile rises
above the noise level. In order to ensure a constant odds ratio
condition for a data set, independent of the number of modes to
be detected, the geometric mean of the mode height priors can
be used when more than one Lorentzian is evaluated.
To summarize, the mode height clearly is a scale parameter,
rather than a location parameter (see Gregory 2005). Choosing
h′_min allows one to set a lower limit for mode peak detection. The
MLE procedure does not implicitly allow for such a restriction
and is thus prone to overfitting.
2.4. Treatment of rotational effects using ignorance priors
The effects of stellar rotation complicate the picture tremendously.
If rotational splitting is suspected, one can replace single
Lorentzian profiles with a sum of profiles. The central
Lorentzian of the rotationally split mode can be treated just like
a single profile, i.e., it has the same parameters and corresponding
priors. The number of suspected split components, however,
depends on the spherical degree of the mode. The components
themselves are characterized by the rotational splitting and their
mode heights, which relative to each other and the central peak
depend on the star's inclination angle (see, e.g., Gizon & Solanki
2003). If the SNR conditions are very poor and/or a preliminary
mode identification is not possible, we choose not to use
the inclination angle as a parameter but to set the heights of
the split components as individual free parameters (see Equ. 3).
Each Lorentzian profile in the model can a priori be “equipped”
with a number of rotationally split components conforming to
the highest value of ℓ expected to be present in the data. The
corresponding height priors, then, only allow significant mode
heights of those components in the rotationally split multiplets
for which there is enough evidence in the data. This way, no
preliminary mode identification is necessary. Alas, using this approach
the number of parameters needed for the model increases
tremendously. For high SNR data, where the degree of the modes
becomes even visually apparent through the rotational splitting,
a preliminary mode identification is certainly a more sensible
approach. In such a case, though, our method is not needed anyway.
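A rotationally split multiplet with individually free component heights, as described above, could be evaluated as in the following Python sketch. The function name `split_multiplet`, the dictionary interface for the heights h_m, and all numerical values are assumptions made for illustration; the sign convention follows Equ. 3 as written.

```python
import numpy as np

def split_multiplet(f, nu, nu_rot, heights, tau):
    """Equ. 3 with free component heights.

    heights maps the azimuthal order m (from -l to l) to its height h_m.
    With the sign convention of Equ. 3, component m peaks at nu - m * nu_rot.
    """
    eta = 1.0 / (np.pi * tau)
    power = np.zeros_like(f, dtype=float)
    for m, h_m in heights.items():
        power += h_m / (1.0 + 4.0 * (f - nu + m * nu_rot) ** 2 / eta ** 2)
    return power

# Hypothetical l = 1 triplet (all numbers assumed for this example).
f = np.linspace(130.0, 140.0, 10001)
triplet = split_multiplet(f, 135.0, 0.5, {-1: 0.01, 0: 0.03, 1: 0.01}, 0.5)
```

Setting the heights of the m ≠ 0 components to zero recovers the single Lorentzian of Equ. 2, which is how the height priors can effectively switch off components that the data do not support.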
3. Application using Markov-Chain Monte Carlo
The analytic evaluation of complex models in a large parameter
space in terms of Bayesian probability soon reaches the limit
of computer resources. Stochastic methods can provide a sufficient
sampling of the parameter regions of interest at a much
smaller cost. In particular, applications of the MCMC technique
have therefore gained increasing popularity, and the (Bayesian)
problems to which MCMC is nowadays applied range from the
detection of planets (Gregory 2007), to the analysis of solar-type
pulsators (Brewer et al. 2007), to spot modelling of active stellar
atmospheres (Croll 2006).
Basically, MCMC performs a sequential walk through the
parameter space of a specific problem. The MCMC technique
probes solutions to the Bayesian equations by generating a random
set of model parameters at some point and then progressing
through parameter space via a biased random walk. The bias
directing the chain of steps through parameter space is provided
by the Bayesian probabilities themselves. Specifically, each parameter
of the model is incremented or decremented by some
random fraction of a predefined step width and the procedure
then accepts or declines this combination of steps. The condition
of acceptance is provided by the Metropolis-Hastings algorithm
(Hastings 1970), which is based on the ratio of probabilities of
successive parameter configurations before and after the step.
In order to comply with this algorithm, the random steps do not
necessarily have to be sampled from a symmetric proposal distribution.
However, using such a distribution (uniform, Gaussian,
etc.) is more intuitive and facilitates the necessary calibration
of the algorithm (see below). MCMC scales well with an increasing
number of parameters and is an ideal tool for solving our problem.
We note that the Metropolis-Hastings algorithm evaluates
only the relative probability between different parameter configurations.
The normalisation factor of Equ. 4, which is often very
difficult to evaluate, does not need to be calculated and will be
dropped hereafter.
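The acceptance rule described above can be illustrated with a minimal random-walk Metropolis sampler in Python. This is a generic sketch with a symmetric Gaussian proposal, not the implementation used for the results in this paper; the function name `metropolis` and the toy two-parameter target are assumptions made for the example.

```python
import numpy as np

def metropolis(log_post, x0, step, n_iter, rng):
    """Random-walk Metropolis sampler with symmetric Gaussian proposals."""
    x = np.array(x0, dtype=float)
    lp = log_post(x)
    chain = np.empty((n_iter, x.size))
    n_accepted = 0
    for i in range(n_iter):
        proposal = x + step * rng.normal(size=x.size)
        lp_new = log_post(proposal)
        # Accept with probability min(1, p_new / p_old). Only the ratio
        # enters, so log_post may omit the normalisation of Equ. 4.
        if np.log(rng.uniform()) < lp_new - lp:
            x, lp = proposal, lp_new
            n_accepted += 1
        chain[i] = x
    return chain, n_accepted / n_iter

# Toy target (assumed): two independent Gaussians with means (1, -2)
# and standard deviations (0.5, 2).
MEAN = np.array([1.0, -2.0])
SIGMA = np.array([0.5, 2.0])

def log_post(x):
    return -0.5 * np.sum(((x - MEAN) / SIGMA) ** 2)

rng = np.random.default_rng(0)
chain, acc_rate = metropolis(log_post, [0.0, 0.0], SIGMA, 20000, rng)
samples = chain[5000:]  # discard the first part of the chain as burn-in
```

In a real application `log_post` would be the sum of the logarithm of Equ. 5 and the logarithms of the priors.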
3.1. Semi-automated MCMC calibration
In order for the MCMC implementation to operate efficiently
the acceptance rate must be properly calibrated. When there are
more than two parameters it can be shown that the acceptance
rate should be about 0.25 in order to minimize correlations between
the different parameters (Roberts et al. 1997). Because the
acceptance rate depends mainly on the step width of each parameter,
the step widths themselves have to be calibrated. The
calibration process becomes more and more cumbersome as the
number of parameters increases. We therefore implemented the
following automated MCMC calibration scheme, similar to that
described in Gregory (2005).
Given an observed power spectrum, several non-overlapping
frequency windows are defined that envelop the range of the individual
mode frequency parameters. The number of Lorentzian
profiles to be investigated per window is specified, and the lower
and upper constraints for the mode height(s), the mode lifetime(s),
etc. are defined. The model spectrum is initialized with
appropriate starting parameters, such as equidistant frequencies
within the frequency windows and mean values for the mode
height, mode lifetime, and so on. The MCMC algorithm is
started with a step width of
$$\sigma(x) = 0.1\,(x_{\max} - x_{\min}) \tag{12}$$
for each model parameter x, where x_max and x_min are the upper
and lower limits already used to define the prior probabilities.
During this “burn-in” phase our MCMC algorithm evaluates,
for every single step and for every single parameter, the
relative probability of the models according to Equ. 4, using
the described likelihood function and priors. It will take several
hundreds to thousands of iterations to approach the parameter
regions of maximum likelihood. When the “burn-in” phase is
completed the model should be close to the global maximum of
probabilities. For the next few thousand iterations, the individual
acceptance rate for each parameter is evaluated every hundred
or so steps. Simultaneously, the MCMC step width of the
respective parameter is slowly adjusted to approach a desired
acceptance rate, which depends on the total number of parameters
in the model. For example, we found that in order to achieve a
combined acceptance rate of about 0.25 for a model consisting
of ∼70 parameters, the individual acceptance rates had to be set
to roughly 0.94.
Once the desired individual acceptance rates remain fairly
stable, our algorithm switches to the standard MCMC routine,
where the probability of a new configuration, and therefore the
probability of its acceptance, is evaluated after all parameters
have been slightly changed according to a random walk. At this
point the MCMC routine is set to automatically test the parame-
ter space.
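The per-parameter step width calibration described in this section can be sketched as follows. This Python example is an assumption-laden illustration rather than the scheme actually implemented: the multiplicative update rule, the function name `calibrate_steps`, the block length, and the toy target are all hypothetical, and only the starting step width follows Equ. 12.

```python
import numpy as np

def calibrate_steps(log_post, x0, x_min, x_max, target, n_rounds, rng, block=200):
    """Tune per-parameter step widths towards a target acceptance rate."""
    x = np.array(x0, dtype=float)
    lp = log_post(x)
    # Starting step width of Equ. 12.
    step = 0.1 * (np.asarray(x_max, dtype=float) - np.asarray(x_min, dtype=float))
    for _ in range(n_rounds):
        accepted = np.zeros(x.size)
        for _ in range(block):
            for j in range(x.size):  # vary one parameter at a time
                trial = x.copy()
                trial[j] += step[j] * rng.normal()
                lp_new = log_post(trial)
                if np.log(rng.uniform()) < lp_new - lp:
                    x, lp = trial, lp_new
                    accepted[j] += 1
        rate = accepted / block
        # Heuristic update (an assumption): too many acceptances mean the
        # steps are too small, so widen them; too few, so shrink them.
        step *= np.clip(rate / target, 0.5, 2.0)
    return x, step

# Toy target (assumed): independent Gaussians with sigma = (0.5, 2).
SIGMA = np.array([0.5, 2.0])

def log_post(x):
    return -0.5 * np.sum((x / SIGMA) ** 2)

rng = np.random.default_rng(3)
x_end, step = calibrate_steps(log_post, [0.0, 0.0], [-10.0, -10.0],
                              [10.0, 10.0], 0.5, 40, rng)
```

After calibration the step widths scale with the width of the target distribution along each parameter, so the narrower parameter ends up with the smaller step. Once the rates are stable, a sampler would switch to proposing changes in all parameters at once.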
3.2. Evaluating the MCMC results
Since the acceptance probability for each step is calculated using
the Bayesian posterior probability, the MCMC routine will
compute distributions (marginal distributions) for each involved
parameter. After a sufficient number of iterations the marginal
distributions for all model frequencies, mode heights, mode lifetimes,
and background offsets can be estimated from histograms
that count how frequently respective values of each parameter
were visited. An estimate for the validity of the tested model
can be obtained by examining these distributions. If any strong
asymmetries, multiple local maxima, or other obviously non-Gaussian
features appear they can be used to guide one toward
an improved model. Regardless, the uncertainties for all parameters
can be obtained from the cumulative distribution function
of the normalized histograms.
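As an illustration of how a marginal distribution and its uncertainties might be extracted from a chain, the Python sketch below builds a normalised histogram and reads the median and a central 68.2% interval off the empirical cumulative distribution. The function names and the synthetic Gaussian "chain" are assumptions for this example; a real chain would come from the MCMC run.

```python
import numpy as np

def marginal_histogram(samples, bins=50):
    """Normalised histogram approximating the marginal distribution."""
    counts, edges = np.histogram(samples, bins=bins)
    centers = 0.5 * (edges[1:] + edges[:-1])
    return counts / counts.sum(), centers

def summarize(samples, level=0.682):
    """Median and central interval from the empirical cumulative distribution.

    Returns (median, lower uncertainty, upper uncertainty).
    """
    lo = 50.0 * (1.0 - level)
    hi = 100.0 - lo
    q_lo, med, q_hi = np.percentile(samples, [lo, 50.0, hi])
    return med, med - q_lo, q_hi - med

# Synthetic stand-in for the frequency samples of a chain (assumed values).
rng = np.random.default_rng(2)
nu_samples = rng.normal(132.3, 0.05, size=100000)
prob, centers = marginal_histogram(nu_samples)
median, err_minus, err_plus = summarize(nu_samples)
```

For asymmetric marginals, such as those expected for mode heights and lifetimes, the lower and upper uncertainties returned by `summarize` will differ.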
4. Simulations
[Figure 1. Panel (a): power density versus frequency [d⁻¹]; panel (b): p(ν) versus frequency [d⁻¹]; middle panel: p(h) versus mode height; bottom panel: p(τ) versus mode lifetime [d].]
Fig. 1. Results of the Bayesian analysis for one of the 100 simulated
data sets. Upper panel: (a) the power density spectrum calculated from
the simulated data (dashed line) and the most probable model spectrum
derived from the analysis (solid line); (b) the marginal distributions
of the mode frequency parameters for both Lorentzian profiles of the
model. Middle panel: the corresponding marginal distributions of the
mode height parameters. Dashed and solid lines refer to the respective
distributions presented in the top panel (b). Bottom panel: the marginal
distribution of the mode lifetime parameter.
Table 1. Input parameters for the simulated solar-type pulsator.
                          f1       f2
frequency [d⁻¹]           132.3    136.6
rms amplitude             1        1
estimated mode height     0.033    0.033
mode lifetime             0.5 d    0.5 d
white noise model with SNR of 0.1 in the time domain

Table 2. Parameter limits for the analysis of the simulated solar-type
pulsator. According to Equ. 7, h′_min was set to 10⁻⁶.
            min      max
ν [d⁻¹]     130      140
h           0        0.15
τ [d]       0.1      1.5
b           0.001    0.01

In order to test our method with parameters similar to a typical
CoRoT run we generated 100 time series of 60 days duration
each, following the procedure in Chaplin et al. (1997). Each simulated
data run represents a different realisation of the signal described
in Tab. 1. We calculated the power spectra of each time
series and applied our Bayesian analysis. The frequency range
was between 130 and 140 d⁻¹. Two Lorentzian profiles were fitted,
each with adjustable mode height but equal mode lifetime.
The upper and lower limits for all parameters, which also enter
the Bayesian formalism via the prior probabilities, are presented
in Tab. 2. h′_min was set to 10⁻⁶, and h_max to 0.15, and the geometric
mean of the mode height priors was used. This corresponds to
an odds ratio condition of ∼10⁵ for the maximum value h_max and
to a condition of ∼10⁴ for the actual input mode heights. The
results are based on 300000 iterations of the MCMC routine.
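The quoted odds ratio conditions can be checked with a short calculation. For the prior of Equ. 7 the normalisation k_h cancels in the prior odds ratio, leaving O_prior = 2h′_min / (h + h′_min). The Python sketch below evaluates this for a single mode; the function names are hypothetical, and the geometric mean over several modes mentioned in the text is not modelled here.

```python
def prior_odds(h, h_min_prime):
    """O_prior = p(h|I) / p(h'_min|I) for the prior of Equ. 7 (k_h cancels)."""
    return (2.0 * h_min_prime) / (h + h_min_prime)

def odds_condition(h, h_min_prime):
    """O_cond = 1 / O_prior, as in Equ. 11."""
    return 1.0 / prior_odds(h, h_min_prime)

h_min_prime = 1e-6
o_cond_hmax = odds_condition(0.15, h_min_prime)     # about 7.5e4, of order 10^5
o_cond_input = odds_condition(0.033, h_min_prime)   # about 1.65e4, of order 10^4
```

Under these assumptions the two values are consistent with the orders of magnitude stated above.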
4.1. Evaluation of the simulations
Fig. 1 shows a typical power spectrum calculated from one of
the 100 test data sets, along with the most probable fit to the
data. All input parameter values fall within the borders of the
derived marginal distributions. We present the median values for
all the parameters (but see discussion in Gregory (2005)) and,
importantly, the uncertainties. A closer inspection of Fig. 1 reveals
that not all the derived parameter values fall within the
1σ uncertainties of the parameter values of the simulated data.
Indeed, all the parameters for this example are underestimated:
the median of the f1 distribution deviates from the input value
by 2.99σ. The corresponding mode height is underestimated by
0.37σ. f2 is underestimated by 0.63σ, with the corresponding
mode height deviating by 1.2σ. The mode lifetime also deviates
by 0.84σ. Such deviations are expected and are a direct consequence
of the stochastic nature of the signal.
Fig. 2 shows the results for all 100 simulations. The median
values derived from our analysis are compared to the simulation
input values and are plotted against the corresponding
normalized probabilities of these input values, evaluated via the
marginal distributions. As expected, the input values are not always
reproduced to within 1σ (p > 0.682), but are consistently
found within 3σ. The scatter of the derived median values of
the parameters around the input values behaves as expected. The
median of the frequency parameter is shown to be distributed
symmetrically, while the mode lifetime and mode heights follow
a log-normal distribution. The figure also illustrates the overall
accuracy with which each parameter can be reproduced. The frequency
parameter shows a lower accuracy limit of about 0.1%.
The corresponding lower limits for the mode height and mode
lifetime parameters are much larger at about 40%.
In addition, Fig. 2 shows cumulative histograms for the parameter
value probabilities of the evaluated simulations. These
histograms indicate that the scatter of the probabilities of the
input values, for individual realisations, can be roughly approximated
by a normal distribution. As the derived marginal distributions
for all parameters provide a consistent picture, we argue
that the probability densities obtained from a single data set can
be trusted.
[Figure 2. Left panels: p(ν_sim), p(h_sim), and p(τ_sim) versus n(p)/n_tot. Right panels: the same probabilities versus Δν/ν_sim [%], Δh/h_sim [%], and Δτ/τ_sim [%].]
Fig. 2. Right panels: distribution of the derived median mode parameter
values (ν, h, τ) compared to the input values. Positive/negative values
indicate under-/overestimation of the parameters. The open squares and
plus signs in the top and middle panels identify the results for the two
Lorentzian profiles in the model. Left panels: cumulative histograms for
the probabilities of the simulation input values determined from all 100
simulations.
4.2. The importance of the mode height prior
As was argued in Sec. 2.3, the mode height prior in its given form
allows one to adjust the fitting to better account for noise. The
simulations described in the previous Section provide us with an
excellent example for this rationale. In Fig. 3 we show one of
the simulations that includes by chance an additional power excess
in the frequency range of consideration due to noise and/or
the stochastic nature of the input signal. Using these data we
tested how looking for a (non-existent) third Lorentzian profile
would be influenced by the h′_min parameter of the mode
height prior. We evaluated the probabilities for the existence of
3 Lorentzian profiles in the spectrum and performed the analysis
with two different values of h′_min: h′_min = 10⁻⁸ and h′_min = 0.01.
The weaker prior with h′_min = 0.01 failed to correctly identify
the noise peak as noise. This is similar to the behavior of the
MLE method, or any Bayesian application that does not implicitly
treat the problem of over-fitting due to, e.g., noise. The
strong prior with h′_min = 10⁻⁸ correctly distinguished between
the noise peak and the two real profiles (see Fig. 3) and, in fact,
determined a maximum in probability for a mode height of zero
for the third.
[Figure 3. Top panel: power density versus frequency [d⁻¹] for Simulation 45. Middle panels: probability density versus frequency [d⁻¹] for profiles 1 to 3. Lower panels: probability density versus mode height for profiles 1 to 3.]
Fig. 3. Three Lorentzian profiles are fitted to simulated data that contain 2 frequencies. Top panel: power density spectrum of one of the 100 simulations
mentioned in Sec. 4. The 2 input frequencies are marked by thick dashed lines. Additional power at ∼131.7 d⁻¹ is either due to noise or, more likely,
due to the stochastic nature of the signal. Middle panels: marginal distributions of the frequency parameters calculated after 300000 iterations.
Each panel contains the results for one of the three fitted profiles. The thick lines show the results for h′_min = 10⁻⁸, the thin dashed lines indicate
h′_min = 0.01. Lower panels: corresponding marginal distributions of the mode height parameters for each of the Lorentzian profiles and for both
values of h′_min (10⁻⁸ and 0.01).
In the case of the strong prior, the marginal distributions of
the frequency parameters for profile 2 and profile 3 are good approximations
to a normal distribution and match the two input
frequencies. Indeed, profile 2 is not influenced by the additional
power excess in its vicinity. Although the marginal distribution
for profile 1 shows a maximum probability for the frequency parameter
at the location of the additional power excess, it
also assigns significant probability densities to the whole parameter
range. More to the point, the marginal distributions of
the mode height, which again are well-shaped for profile 2 and
profile 3, show a most probable mode height of zero for profile
1. That is, profile 1 is unnecessary for fitting this range of
frequencies, and only two profiles are clearly detected.
In the case of the weak prior, the marginal distributions of the frequency and mode height parameters of profile 1 and profile 2 intersect and mix. Hence, there seems to be evidence for a third Lorentzian profile which is comparably strong to the evidence for the actual Lorentzian profile at 132.3 d$^{-1}$. Moreover, profile 1 and profile 2 cannot be easily separated. Thus, their mode parameters cannot be unambiguously determined. In summary, with a weak mode-height prior (mimicking MLE methods) a third mode is detected where none exists and the information on the nearby mode is distorted.
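To make the role of the threshold more tangible, the following Python sketch evaluates how much prior probability a modified Jeffreys prior places on very small mode heights for the two thresholds used above. This is only an illustration: the functional form is the generic modified Jeffreys prior described by Gregory (2005) and is assumed here rather than copied from Sec.2.3, and the upper limit `H_MAX` is a hypothetical value chosen for the example.

```python
import math

def modified_jeffreys_pdf(h, h_min, h_max):
    """Modified Jeffreys prior density on [0, h_max] (assumed generic form)."""
    norm = math.log((h_min + h_max) / h_min)
    return 1.0 / ((h + h_min) * norm)

def modified_jeffreys_cdf(h, h_min, h_max):
    """Prior probability that the mode height lies in [0, h]."""
    return math.log((h + h_min) / h_min) / math.log((h_min + h_max) / h_min)

H_MAX = 6.0  # hypothetical upper limit for the mode height, illustrative only

for h_min in (1e-8, 0.01):
    mass = modified_jeffreys_cdf(0.01, h_min, H_MAX)
    print(f"h'_min = {h_min:g}: prior mass below a height of 0.01 = {mass:.3f}")
```

Under these assumptions the smaller threshold concentrates far more prior probability near a mode height of zero, which is one way to read why the choice $h'_{\min} = 10^{-8}$ acts as the stronger, more conservative prior.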
5. Application to the CoRoT target HD 49933
The question of detecting Lorentzian profiles is only relevant for data sets where the SNR is high enough to see evidence for solar-type pulsation, but low enough to be questionably tractable with methods used in helioseismology. The ground-breaking CoRoT data have all the qualities necessary for the technique to work:
1. good data quality devoid of strong instrumental signal or
aliasing
2. clear indication of Lorentzian profiles at least in the region
of maximum pulsation power
3. increasingly poor SNR further away from the region of maximum pulsation power.
With the application to such data, we will demonstrate the practical utility of our Bayesian method. We applied our technique to the 60 days of CoRoT N2 data of HD49933 obtained during the Initial Run from February to March 2007. A detailed description of the data extraction and reduction can be found in the CoRoT book (see Baglin et al. 2006, and references therein). The data set consists of more than 163000 measurements sampled every 32 s. A detailed summary of the stellar properties and the CoRoT observations is given by Appourchaux et al. (2008).
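As a quick consistency check on these numbers, the short Python sketch below derives the time span and the Nyquist frequency implied by the quoted cadence. It assumes strictly regular sampling without gaps, which is a simplification, and it uses the lower bound of 163000 measurements.

```python
N_SAMPLES = 163_000   # "more than 163000 measurements" (lower bound)
CADENCE_S = 32.0      # sampling interval in seconds

# Time span covered by N regularly spaced samples, in days.
duration_d = N_SAMPLES * CADENCE_S / 86400.0

# Nyquist frequency of a regularly sampled series, converted from Hz to microhertz.
nyquist_uhz = 1.0 / (2.0 * CADENCE_S) * 1e6

print(f"time span: {duration_d:.1f} d, Nyquist frequency: {nyquist_uhz:.0f} microHz")
```

Under this assumption the frequency ranges analysed in this paper lie far below the Nyquist frequency.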
Intrinsic stellar noise-like signal, mainly due to turbulent convection, generates significant power in the frequency range of pulsation and affects the accurate determination of mode parameters. For the Sun, Harvey (1985) modeled the background signal with a sum of power laws. Aigrain et al. (2004), and more recently Michel et al. (2008), use $P(\nu) = \sum_i P_i$, with $P_i = a_i \zeta_i^2 \tau_i / (1 + (2\pi\tau_i\nu)^{C_i})$, or hereafter $P_i = A_i / (1 + (B_i\nu)^{C_i})$, to model the solar granulation signal, where $\nu$ is the frequency, $\tau_i$ is the characteristic time scale of the ith component and $C_i$ is the slope of the power law. The normalisation factor $a_i$ is chosen such that $\zeta_i^2 = \int P_i(\nu)\,d\nu$ corresponds to the variance of the stochastic signal in the time series. Whereas the slope of the power laws was arbitrarily fixed to 2 in Harvey's original model, Appourchaux et al. (2002) were the first to suggest a different value. Recently, Aigrain et al. (2004) and Michel et al. (2008) have shown that, at least for the Sun, the slope is closer to 4. Each power law represents a different physical process with time scales for the Sun ranging from minutes in the case of granulation to days in the case of stellar activity.

Fig.4. Top panel: power density spectrum of HD49933 (power density [ppm2/µHz] versus frequency [µHz]). Light grey: original power density spectrum; dark grey: the original spectrum smoothed with a 20 µHz boxcar average; thick solid line: global fit according to Equ.13; dotted line: global fit without the Gaussian term; dashed lines: white noise and the 3 power law components of the global fit. Middle panel: enlargement of the top panel. Dotted line: background fit plus the Lorentzian profiles fitted to the residual power density spectrum. Bottom panel: original power density spectrum with the granulation signal removed. Solid line: Gaussian term of the global fit.
Appourchaux et al. (2008) used a single power law for the background signal in the frequency range of pulsation and fitted the corresponding parameters simultaneously with the p-mode parameters to the observed power density spectrum. We have separated the analysis of the background signal from the analysis of the pulsation signal, in which case one has to model the pulsation signal in the power density spectrum with a dedicated term in the global background fit. Kallinger et al. (2008) have shown that the power excess hump due to pulsation can be approximated by a Gaussian function. Hence, our global model to fit the heavily smoothed power density spectrum of HD49933 consists of a superposition of white noise, three power law components, and a power excess hump approximated by a Gaussian function:

$$P(\nu) = P_n + \sum_{i=1}^{3} \frac{A_i}{1 + (B_i\nu)^4} + P_g \, e^{-(\nu - \nu_{\max})^2/(2\sigma^2)}, \qquad (13)$$

where $P_n$ is the white noise component, $A_i$ and $B_i$ are the amplitudes and characteristic time scales of the power laws, and $P_g$, $\nu_{\max}$, and $\sigma$ are the height, central frequency, and width, respectively, of the Gaussian term. The resulting global fit (and its components) is shown in the top and middle panel of Fig.4 along with the original and heavily smoothed power density spectrum. The corresponding parameters of the global fit are given in Tab.3. The fit without the Gaussian component (dotted line in top and middle panel of Fig.4) enables us to estimate the local noise and is used to separate the background signal from the pulsation signal. The residual power density spectrum rescaled to the white noise level, shown in the bottom panel of Fig.4, is used for the subsequent analysis.

Table 3. Global fit parameters. The amplitudes of the power laws (Ai), the height of the Gaussian part (Pg), and the white noise component (Pn) are given in ppm2/µHz. The power law time scales (Bi) are given in seconds and the center (νmax) and width (σ) of the Gaussian part are in µHz. One-sigma error estimates are given in brackets.

Component       Parameter   Value
Power laws      A1          25743 (6200)
                B1          252311 (25000)
                A2          22 (6)
                B2          16477 (1950)
                A3          2.2 (0.11)
                B3          1637 (62)
Gaussian term   Pg          0.500 (0.015)
                νmax        1657 (28)
                σ           538 (26)
White noise     Pn          0.33 (0.005)
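For readers who want to experiment with Equ.13, a minimal NumPy sketch of the global model follows. The numerical values are our reading of Tab.3; the pairing of the values with $A_i$ and $B_i$ and the unit handling (time scales in seconds multiplied by the frequency converted from µHz to Hz) are assumptions made for this illustration, and the function name `global_model` is hypothetical.

```python
import numpy as np

# Values as read from Tab.3 (amplitudes in ppm^2/microHz, time scales in s,
# Gaussian centre and width in microHz). The pairing is an assumption.
P_N = 0.33
A = np.array([25743.0, 22.0, 2.2])
B = np.array([252311.0, 16477.0, 1637.0])
P_G, NU_MAX, SIGMA = 0.500, 1657.0, 538.0

def global_model(nu_uhz, with_gaussian=True):
    """Equ.13: white noise + three power laws (slope 4) + optional Gaussian hump."""
    nu_uhz = np.atleast_1d(np.asarray(nu_uhz, dtype=float))
    # Assumption: B_i (seconds) multiplies the frequency expressed in Hz.
    x = B[None, :] * nu_uhz[:, None] * 1e-6
    power = P_N + np.sum(A[None, :] / (1.0 + x**4), axis=1)
    if with_gaussian:
        power = power + P_G * np.exp(-((nu_uhz - NU_MAX) ** 2) / (2.0 * SIGMA**2))
    return power

nu = np.array([500.0, 1657.0, 4650.0])
print(global_model(nu))                        # full global fit
print(global_model(nu, with_gaussian=False))   # background only (local noise estimate)
```

Calling the function with `with_gaussian=False` corresponds to the dotted line described above, i.e. the background used to estimate the local noise.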
5.1. MCMC analysis
The frequency range between the lower and upper limits beyond which we see the pulsation signal disappearing in the noise was subdivided into 16 windows. To each, a chosen number of Lorentzian profiles was fitted, based on our Bayesian algorithm. The model parameters are presented in Tab.4. This subdivision into windows was only chosen to accelerate the burn-in phase. It has no influence on the results, as long as the resulting marginal distributions for the mode frequency parameters are not intersected by the window borders. Yet, it still enables us to perform a global fit which includes the influence of Lorentzian profiles also from adjacent windows.
Table 4. Parameter limits for the analysis of HD49933. The height prior parameter was set to $h'_{\min} = 10^{-8}$.

Parameter       min      max
ν [µHz]         1219.9   2618.1
h [ppm2/µHz]    0        6
τ [d]           0        1.5
b [ppm2/µHz]    0.001    0.5
For the first analysis we decided to consider only 2
Lorentzian profiles in each window. Moreover, we also kept the
mode lifetime uniform for all Lorentzian profiles, expecting that
the marginal distribution of this parameter would tell us whether
this assumption is valid.
Fig.5 shows the power density spectrum, on which our anal-
ysis is based, together with the fitting windows we defined,
and the most probable model derived after ∼ 1.3 million iter-
ations. The model manages to reproduce the observations quite
well, and there remain only very few frequency ranges where
a visual inspection seems to indicate additional power. Beyond
ν > 2450µHz several Lorentzian profiles are fitted with fairly
well defined frequencies, but with a probability maximum for a
mode height of zero. Two profiles, listed in Tab.5 as P25 and
P26, indicated in the figure by dashed lines, have ambiguous
frequency distributions ranging over the whole fitting window.
What is shown is a biased choice to fit the general picture.
[Figure 6: three panels showing probability density versus frequency [µHz], mode height [ppm2/µHz], and mode lifetime [d].]
Fig.6. Top panel: marginal distribution of the mode frequency param-
eter of one of the profiles fitted to the CoRoT data. The median value,
and the lower and upper 1σ-values are indicated by the dashed lines.
Middle panel: same as top panel but for the corresponding mode height
parameter. Bottom panel: same as top panel but for the mode lifetime
parameter for a simultaneous fit to all Lorentzian profiles.
Fig.6 presents an example for the marginal frequency distribution and mode height for one of the Lorentzian profiles. As expected, the former can be very well approximated by a Gaussian, while the latter follows a log-normal distribution. The bottom panel shows a quite narrow and smooth mode lifetime distribution for a simultaneous fit to all Lorentzian profiles, indicating a constant mode lifetime. The data therefore do not seem to warrant different mode lifetimes. It is thus safe to assume that any variations among the various profiles for this parameter are accounted for within the uncertainty given by the mode lifetime distribution. Surprisingly, we do not find well-defined multi-modal distributions for any of the frequency parameters involved. Some ambiguities in the distributions are only apparent at very high frequencies and with the poorest SNR. Additional profiles, either due to modes of higher degree or rotational splitting, still could be present, if the frequencies are very well separated.
[Figure 7: panel (a) shows power density [ppm2/µHz] versus frequency [µHz] between 1650 and 1900 µHz; panel (b) shows probability density versus frequency; the bottom panel shows probability density versus mode height [ppm2/µHz].]
Fig.7. Top panel: (a) the fitted region in the power density spectrum,
where the p modes have the highest SNR. The most probable model
(thick solid line) is presented in comparison to the observations (dot-
ted line). The borders of the fitting windows are indicated by the thick
dashed lines. (b) The marginal mode frequency distributions for 3 pro-
files per window. Six profiles are identified (black), while the additional
profiles are ambiguous across the whole frequency range (gray lines).
Frequencies identified by Appourchaux et al. (2008) are shown as thin
dashed lines. Bottom panel: The corresponding mode height distribu-
tions.
5.2. On the evidence for ℓ = 2 modes
We repeated our analysis for the region with the highest SNR, but included for each fitting window a third Lorentzian profile in our model. The results are presented in Fig.7. The probability distributions for the additional profiles consistently favour a mode height of zero. Moreover, the corresponding mode frequency distributions only vaguely contain regions of higher probability. We therefore conclude that, based on our conservative approach, any additional power in the power density spectrum is due to noise or to the stochastic nature of the clearly detected modes. From our perspective, focussing on the "detection" of profiles, the data do not present convincing evidence for ℓ = 2 modes.
5.3. On the evidence for rotational splitting
The failed detection of additional profiles, and the single-mode nature of the frequency distributions of detected profiles, suggest that the effects expected from rotational splitting can most likely be neglected in the analysis of the CoRoT data of HD49933. Nonetheless, in order to perform a more rigorous test, we analysed the same frequency region shown in Fig.7 including these effects. Our model used 2 Lorentzian profiles per fitting window, each of which contains 2 additional features corresponding to rotational splitting of ℓ = 1 modes (see Equ.3). Accordingly, a new global parameter, the rotation frequency, was introduced. Appourchaux et al. (2008) report the detection of a particular spectral feature at 3.4 µHz which apparently corresponds to the rotation frequency of HD49933. We therefore used this value as a reference and allowed a scatter of about 0.2 µHz in either direction. This range corresponds to twice the classical frequency resolution $(\Delta T)^{-1}$ of the CoRoT time series, where $\Delta T$ is the length of the data set, and was used to define the borders of a uniform prior. The heights of the rotation profiles were implemented as usual, using modified Jeffreys priors for these parameters. We expected a well defined distribution for the rotation frequency to emerge from the analysis. However, we find no preferred value for this parameter, as is shown in Fig.8. The distribution is mostly consistent with sampling noise.

Fig.5. Observed power density spectrum (gray) and fitted Lorentzian profiles (power density [ppm2/µHz] versus frequency [µHz]). Vertical lines indicate the borders of the fitting windows (see text). The dashed lines in the bottom panel show two profiles for which our Bayesian technique results in ambiguous frequency distributions ranging over the whole fitting window.
Concerning the mode heights, alternating detections and non-detections of the rotation profiles would indicate a difference in their central profiles between radial and non-radial modes. We cannot find any such differences, since the mode height distributions for the rotation profiles are consistent with a null result. While the star certainly does rotate, it seems that the signal is too close to the noise level to support any need for these parameters. In other words, rotational effects, if present, seem to be ill-defined and do not need to be considered for the frequency analysis of this data set. As an example of a near ideal case, the bottom panel of Fig.8 shows corresponding results from a single mode using 60 days of data of the Sun observed as a star by the VIRGO (green band) instrument on board the SOHO spacecraft (see Fröhlich et al. 1997, and references therein). Fitting only the small range between 3090 and 3110 µHz and using the same profile model as for HD49933, the rotational splitting becomes apparent.
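The width of the uniform prior on the rotation frequency used in this section can be reproduced with a few lines of Python. The sketch assumes a data set length of exactly 60 days, so the numbers are approximate; the variable names are illustrative.

```python
DELTA_T_DAYS = 60.0   # assumed length of the time series
NU_ROT_REF = 3.4      # reference rotation frequency in microHz

# Classical frequency resolution 1/DeltaT, converted from Hz to microHz.
resolution_uhz = 1.0 / (DELTA_T_DAYS * 86400.0) * 1e6

# The prior spans twice the resolution, i.e. one resolution element on either side.
prior_lo = NU_ROT_REF - resolution_uhz
prior_hi = NU_ROT_REF + resolution_uhz

print(f"1/DeltaT = {resolution_uhz:.3f} microHz")
print(f"uniform prior on the rotation frequency: [{prior_lo:.2f}, {prior_hi:.2f}] microHz")
```

The resolution of roughly 0.19 µHz is consistent with the quoted scatter of about 0.2 µHz in either direction.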
[Figure 8: two panels showing probability density versus νrot [µHz]; the top panel covers about 3.25 to 3.55 µHz, the bottom panel about 0.25 to 0.6 µHz.]
Fig.8. Top panel: the marginal distribution of the rotational splitting
parameter derived as explained in Sec.5.3. The thick black line is a
boxcar average using 10 data points. Bottom panel: the same but for the
analysis of VIRGO data of the Sun.
[Figure 9: echelle diagram; horizontal axis: frequency mod 86.3 µHz, vertical axis: frequency [µHz].]
Fig.9. Echelle diagram showing the identified frequencies (solid error bars). Values belonging to profiles 25 and 26 (see Tab.5) are indicated by the dashed error bars. Frequencies found by Appourchaux et al. (2008) are displayed in grey.
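The construction of an echelle diagram such as Fig.9 amounts to folding the frequencies with a fixed spacing. The NumPy sketch below does this for the first six frequencies of Tab.5, using the folding value of 86.3 µHz taken from the axis label of Fig.9; the variable names are illustrative.

```python
import numpy as np

FOLDING_UHZ = 86.3  # folding frequency from the axis label of Fig.9

# First six frequencies listed in Tab.5, in microHz.
freqs = np.array([1244.53, 1289.02, 1329.60, 1371.94, 1413.85, 1457.63])

folded = np.mod(freqs, FOLDING_UHZ)                    # horizontal axis of the diagram
order = np.floor(freqs / FOLDING_UHZ).astype(int)      # index of the horizontal strip

for f, x, n in zip(freqs, folded, order):
    print(f"{f:8.2f} microHz -> strip {n}, {x:5.2f} microHz")
```

Consecutive profiles alternate between two ranges of folded frequency, which is the ridge structure visible in the diagram.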
5.4. Results
As a consequence of the results from the previous two sections, we use only 2 modes per window without any rotational splitting for our global fit. For 6 of the 32 fitted profiles, the identification is ambiguous (or the most probable mode heights are close to zero) due to the low SNR. All other profiles are clearly detected and their parameters have smooth, single-mode distributions. The results of our analysis are presented in Tab.5. The 1σ-errors have been derived from the cumulative distribution functions, which are evaluated using the marginal distributions for all parameters. We find that the data are consistent with a mode lifetime of roughly 0.5 days which does not appear to vary significantly across the spectrum. The confidence for this overall result can be calculated via the mode height parameter. As we have used $h'_{\min} = 10^{-8}$, the result gives an odds ratio condition $O_{\rm cond} = 4.07 \times 10^{6}$. Therefore, the obtained parameter distributions are at least $4.07 \times 10^{6}$ times more probable than a model spectrum that only consists of the background offset. The profiles 25 and 26 are only listed for the sake of completeness. Due to the ambiguity in their marginal frequency distribution, apparent in their given uncertainties, we do not assign credibility to these values. The final Echelle diagram is displayed in Fig.9. The frequencies estimated as ℓ = 1 by Appourchaux et al. (2008) agree very well with our corresponding values, and the 1σ-uncertainties are comparable in both studies. The remaining frequencies, however, differ due to the cited authors' explicit assumption of ℓ = 2 modes, which we do not detect.

Although we deliberately did not include fixed mode height ratios for modes of different degree in our model, the resulting mode height ratios are mostly consistent within the uncertainties with a fixed ratio. The obvious outlier in Fig.10 at ∼ 2280 µHz is due to profiles 25 and 26, which we already marked as highly suspicious.
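The derivation of asymmetric 1σ limits from a cumulative distribution can be sketched as follows. The example works on a synthetic Gaussian chain instead of the actual MCMC output, and the quantile levels (15.865%, 50%, 84.135%) are our assumption for what "1σ" means here; the function name `summarize` is hypothetical.

```python
import numpy as np

def summarize(samples):
    """Median and lower/upper 1-sigma offsets from the empirical CDF of the samples."""
    lo, med, hi = np.percentile(samples, [15.865, 50.0, 84.135])
    return med, med - lo, hi - med

# Synthetic stand-in for the marginal distribution of one frequency parameter.
rng = np.random.default_rng(0)
fake_chain = rng.normal(loc=1714.7, scale=0.9, size=200_000)

med, minus, plus = summarize(fake_chain)
print(f"{med:.2f} (-{minus:.2f} / +{plus:.2f})")
```

For a skewed or ambiguous marginal distribution, such as those of profiles 25 and 26, the same procedure yields strongly asymmetric limits.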
Table 5. Results of the CoRoT data for HD 49933. Profiles P 25 and P 26 are ambiguous detections. Profiles P 29 to P 32, which are not listed, have a most probable mode height of zero.

P    ν [µHz]    σν                 h [ppm2/µHz]   σh
1    1244.53    (-1.21 / +1.21)    0.60           (-0.15 / +0.17)
2    1289.02    (-1.36 / +1.36)    0.81           (-0.19 / +0.19)
3    1329.60    (-1.50 / +1.64)    0.45           (-0.14 / +0.14)
4    1371.94    (-1.80 / +1.53)    0.47           (-0.14 / +0.15)
5    1413.85    (-1.34 / +1.03)    0.67           (-0.15 / +0.17)
6    1457.63    (-1.17 / +1.02)    0.74           (-0.17 / +0.17)
7    1501.40    (-1.11 / +0.95)    0.76           (-0.17 / +0.20)
8    1544.11    (-1.63 / +1.22)    0.83           (-0.19 / +0.20)
9    1585.86    (-0.90 / +0.78)    1.08           (-0.20 / +0.22)
10   1629.99    (-0.81 / +0.70)    1.35           (-0.23 / +0.26)
11   1670.18    (-0.96 / +0.89)    1.08           (-0.22 / +0.23)
12   1714.71    (-0.94 / +0.88)    1.53           (-0.27 / +0.29)
13   1756.42    (-0.82 / +0.76)    2.01           (-0.34 / +0.37)
14   1799.91    (-1.02 / +0.89)    1.61           (-0.28 / +0.32)
15   1840.10    (-1.00 / +0.94)    1.04           (-0.19 / +0.23)
16   1884.81    (-0.89 / +0.89)    1.38           (-0.25 / +0.27)
17   1927.41    (-1.50 / +1.27)    0.93           (-0.20 / +0.23)
18   1972.70    (-1.10 / +0.94)    0.97           (-0.19 / +0.22)
19   2013.88    (-0.89 / +0.89)    0.94           (-0.19 / +0.21)
20   2059.04    (-1.49 / +1.49)    0.97           (-0.21 / +0.22)
21   2101.88    (-1.66 / +1.66)    0.79           (-0.19 / +0.19)
22   2146.16    (-1.32 / +1.22)    0.74           (-0.16 / +0.18)
23   2188.69    (-2.23 / +2.39)    0.55           (-0.15 / +0.15)
24   2233.96    (-2.03 / +1.87)    0.56           (-0.15 / +0.15)
25   2279.03    (-1.96 / +41.67)   0.84           (-0.41 / +0.24)
26   2322.60    (-44.02 / +2.06)   0.38           (-0.16 / +0.39)
27   2362.40    (-2.74 / +2.06)    0.50           (-0.15 / +0.15)
28   2403.61    (-1.58 / +1.43)    0.46           (-0.14 / +0.14)

     τ [d]      στ                 η [µHz]        ση
all  0.47       (-0.04 / +0.06)    7.8            (-0.8 / +0.8)

     b [ppm2/µHz]   σb
all  0.42           (-0.02 / +0.02)
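The mode lifetime τ and the linewidth η listed in Tab.5 can be related with a short calculation. The sketch below assumes the standard relation η = 1/(πτ) for a damped oscillator; this relation is not stated in this section and is adopted here because it reproduces the tabulated values. The function name `linewidth_uhz` is hypothetical.

```python
import math

def linewidth_uhz(lifetime_days):
    """Linewidth in microHz for a mode lifetime in days, assuming eta = 1 / (pi * tau)."""
    lifetime_s = lifetime_days * 86400.0
    return 1e6 / (math.pi * lifetime_s)

tau, tau_minus, tau_plus = 0.47, 0.04, 0.06   # mode lifetime and 1-sigma limits from Tab.5

eta = linewidth_uhz(tau)
eta_hi = linewidth_uhz(tau - tau_minus)   # shorter lifetime -> broader profile
eta_lo = linewidth_uhz(tau + tau_plus)    # longer lifetime -> narrower profile

print(f"eta = {eta:.2f} microHz (range {eta_lo:.2f} to {eta_hi:.2f})")
```

With this assumption the lifetime of 0.47 d corresponds to a linewidth close to the tabulated 7.8 µHz, and the lifetime limits map onto the quoted linewidth uncertainty of about 0.8 µHz.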
6. Discussion
With high quality data and evidence for additional and/or closely spaced modes, such as might appear for nonradial modes split by rotation, we believe that it might be advantageous to change the peak Lorentzian model by combining the corresponding Lorentzian profiles and fitting them as a group. Eventually, once the quality of the data becomes even better, we think that a more classical Bayesian treatment replacing the ignorance priors with specific information from theory or previous observations is more appropriate, because the potential gain in information outweighs the danger of over-fitting. Benomar (2008) has performed such an analysis for the data set also presented in this paper, and obviously obtained different results. It thus remains questionable whether the Initial Run CoRoT data of HD 49933 are already good enough for such an approach.
6.1. On the global likelihood of p-mode profiles
In this paper we have used Bayes' theorem to solve a parameter estimation problem. One of the biggest advantages of Bayesian analysis, however, is the ability to perform model comparison. This is achieved by first calculating the global likelihood via integration over all model parameters. The global likelihoods are subsequently used to compare the different models themselves. It would be interesting to use parallel tempering (e.g., Benomar 2008), an algorithm which allows the Markov chain to more easily sample the complete parameter space, to perform such model selection by calculating the global likelihood of a model via integration over the tempering parameter (see, e.g., Gregory 2007). However, we think that the validity of applying this kind of model selection to p-mode analysis needs to be carefully tested first.

Fig.10. The mode height ratios (heven / hodd versus νodd [µHz]) for subsequent modes listed in Tab.5 (e.g., hP2/hP1) are shown as filled circles. The dotted lines correspond to the 68.2% confidence limits. The dashed line indicates a height ratio of unity. The outlier at ∼ 2280 µHz is due to profiles 25 and 26, which are ambiguous detections.
As a first trial run, we have investigated how the global likelihood of three simple models behaves in a region of pure noise. We studied the region between 4600 and 4700 µHz in the power spectrum of HD 49933 (see Fig.11) which is far beyond the region of pulsation. At such high frequencies, the power spectrum should be dominated by the noise properties of the instrument. Following the description given in Gregory (2005) on how to obtain the global likelihood, we evaluated three models
– M1: constant noise level,
– M2: constant noise level + 1 Lorentzian profile, and
– M3: constant noise level + 2 Lorentzian profiles
The noise level was allowed to vary between 0.001 and 0.5 ppm2/µHz. All Lorentzian profiles were allowed a mode lifetime between 0 and 1.5 days, mode heights between 0 and 6 ppm2/µHz, and frequencies between 4600 and 4700 µHz. We used 10 chains with decreasing but equidistant tempering parameter. In order to study the effects of the likelihood function on the global likelihood we neglected all priors in the calculations. The results are best expressed using the odds ratios

$$\frac{p(M_1|D,I)}{p(M_2|D,I)} \approx \exp(-240) \quad \mathrm{and} \quad \frac{p(M_2|D,I)}{p(M_3|D,I)} \approx \exp(-27).$$
Depending on the choice of the tempering parameters used in
the parallel tempering process, as well as on differences in the
numerical integration scheme required to calculate these num-
bers, the end result varies, but the overall magnitudes of dif-
ference in probability remain the same. The likelihood function
alone seems to prefer models with (more) Lorentzian profiles,
even if there are none (related to solar-type oscillations) present
in the data.
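The dependence on the tempering ladder and on the numerical integration scheme can be illustrated with a deliberately simple toy problem. The Python sketch below computes a log global likelihood by integrating the mean log-likelihood over the tempering parameter β (thermodynamic integration), once with 10 equidistant values and once with a fine grid, and compares both with direct integration. Everything in it is hypothetical: a one-parameter Gaussian-shaped likelihood, a uniform prior on [-5, 5], and grid quadrature in place of MCMC chains.

```python
import numpy as np

def trapezoid(y, x):
    """Plain trapezoidal rule, written out to avoid version-specific NumPy helpers."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

theta = np.linspace(-5.0, 5.0, 20001)     # parameter grid
log_like = -0.5 * theta**2                # toy log-likelihood (hypothetical)
prior = np.full_like(theta, 1.0 / 10.0)   # normalised uniform prior on [-5, 5]

def mean_log_like(beta):
    """<ln L>_beta under the tempered posterior proportional to prior * L**beta."""
    w = prior * np.exp(beta * log_like)
    return trapezoid(w * log_like, theta) / trapezoid(w, theta)

def log_evidence(betas):
    """ln of the global likelihood as the integral of <ln L>_beta over beta in [0, 1]."""
    return trapezoid(np.array([mean_log_like(b) for b in betas]), betas)

direct = np.log(trapezoid(prior * np.exp(log_like), theta))
coarse = log_evidence(np.linspace(0.0, 1.0, 10))     # 10 equidistant tempering values
fine = log_evidence(np.linspace(0.0, 1.0, 2001))     # dense tempering ladder

print(f"direct: {direct:.4f}, 10 values: {coarse:.4f}, 2001 values: {fine:.4f}")
```

In this toy case the coarse ladder is slightly biased because the mean log-likelihood changes rapidly near β = 0, while the dense ladder recovers the direct result. The bias here is small compared with the differences of exp(−240) and exp(−27) reported above, in line with the statement that the overall magnitudes are unaffected.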
[Figure 11: power density [ppm2/µHz] versus frequency [µHz] between 4600 and 4700 µHz.]
Fig.11. The high-frequency region devoid of p-mode signal used for
testing the global likelihood comparison. Note that the scale is similar
to Fig.5.
This is an issue demanding clarification. It will be required to test the influence of the allowed parameter ranges in order to see where this preference arises in the integration over the parameter space. Moreover, the influence from different kinds of prior probabilities on the global likelihoods also needs to be studied. To conclude, our initial test suggests that we cannot yet trust the global likelihood completely when comparing models of power spectra (to real data at least) until we are able to clarify this issue.
7. Conclusion
We have presented a conservative approach (using only igno-
rance priors) for a Bayesian analysis of solar-type p modes with
a minimum of parameters. The method uses a mode-height prior
that allows us to more easily and reliably distinguish real peaks
from those that arise stochastically from noise, in reference to a
threshold set by the user. We show how the marginal distribu-
tions of all the parameters can be obtained. Our tests with sim-
ulated data succeeded by reproducing the simulation input val-
ues and not being confused by randomly occurring noise peaks.
Furthermore, we have applied our method to the Initial Run
CoRoT data of HD 49933, where we can easily and unambigu-
ously identify at least 26 p modes. We fail to detect ℓ = 2 modes
or effects produced by rotation. We also approached the possi-
bility of performing Bayesian model comparison in the analysis
of solar-type p modes. First results suggest, however, that more
tests need to be performed in order to understand how model
comparison works in a power spectrum. Finally, we would like
to point out a companion paper (Kallinger et al. 2009). Therein,
we show that frequencies extracted in this work are in excellent
agreement with model frequencies of a solar-calibrated model
that coincides with the spectroscopically determined position of
HD49933 in the H-R diagram.
Acknowledgements. We would like to especially thank Daniel Huber (Sydney Institute for Astronomy, Sydney, Australia) for interesting discussions. We also thank Peter Reegen (Institute for Astronomy, Vienna, Austria) and Tim Bedding (Sydney Institute for Astronomy, Sydney, Australia) for their suggestions. MG, TK and WW have received financial support by the Austrian Fonds zur Förderung der wissenschaftlichen Forschung (P17890-N02). DBG acknowledges funding from the Natural Sciences & Engineering Research Council (NSERC) Canada. Last but not least, we are very thankful for the constructive comments provided by the anonymous referee who helped to greatly improve this article. We are grateful for the VIRGO data being publicly available.
References
Aigrain, S., Favata, F., & Gilmore, G. 2004, A&A, 414, 1139
Appourchaux, T. 2008, Astronomische Nachrichten, 329, 485
Appourchaux, T., Andersen, B., & Sekii, T. 2002, in ESA Special Publication,
Vol. 508, From Solar Min to Max: Half a Solar Cycle with SOHO, ed.
A. Wilson, 47–50
Appourchaux, T., Gizon, L., & Rabello-Soares, M.-C. 1998, A&AS, 132, 107
Appourchaux, T., Michel, E., Auvergne, M., et al. 2008, A&A, 488, 705
Baglin, A., Michel, E., Auvergne, M., et al. 2006, in ESA Special Publication,
Vol. 1306, ESA Special Publication, 39–50
Bedding, T. & Kjeldsen, H. 2006, in ESA Special Publication, Vol. 624,
Proceedings of SOHO 18/GONG 2006/HELAS I, Beyond the spherical Sun
Benomar, O. 2008, Communications in Asteroseismology, 157, 98
Brewer, B. J., Bedding, T. R., Kjeldsen, H., & Stello, D. 2007, ApJ, 654, 551
Chaplin, W. J., Elsworth, Y., Howe, R., et al. 1997, MNRAS, 287, 51
Croll, B. 2006, PASP, 118, 1351
Fröhlich, C., Crommelynck, D. A., Wehrli, C., et al. 1997, Sol. Phys., 175, 267
Gizon, L. & Solanki, S. K. 2003, ApJ, 589, 1009
Gregory, P. C. 2005, Bayesian Logical Data Analysis for the Physical Sciences: A Comparative Approach with 'Mathematica' Support (Cambridge University Press, Cambridge, UK)
Gregory, P. C. 2007, MNRAS, 374, 1321
Harvey, J. 1985, in ESA Special Publication, Vol. 235, Future Missions in Solar,
Heliospheric & Space Plasma Physics, ed. E. Rolfe & B. Battrick, 199–+
Hastings, W. K. 1970, Biometrika, 57, 97
Kallinger, T., Gruberbauer, M., Guenther, D. B., Fossati, L., & Weiss, W. W.
2009, ArXiv e-prints, 0811.4686v1
Kallinger, T., Weiss, W. W., Barban, C., et al. 2008, A&A, submitted
Michel, E., Samadi, R., Baudin, F., et al. 2008, ArXiv e-prints (0809.1078)
Roberts, G. O., Gelman, A., & Gilks, W. R. 1997, Ann. Appl. Prob., 7, 110
Toutain, T. & Appourchaux, T. 1994, A&A, 289, 649