arXiv:hep-th/0505147v1 16 May 2005
HUTP-05/A0020
HD-THEP-05-09
UCB-PTH-05/14
LBNL-57558
Ghosts in Massive Gravity
Paolo Creminelli$^a$, Alberto Nicolis$^a$, Michele Papucci$^b$, and Enrico Trincherini$^c$
$^a$ Jefferson Physical Laboratory,
Harvard University, Cambridge, MA 02138, USA
$^b$ Department of Physics, University of California, Berkeley and Theoretical Physics Group, Ernest Orlando Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
$^c$ Institute for Theoretical Physics,
Heidelberg University, D-69120 Heidelberg, Germany
Abstract
In the context of Lorentz-invariant massive gravity we show that classical solutions around heavy sources
are plagued by ghost instabilities. The ghost shows up in the effective field theory at huge distances
from the source, much bigger than the Vainshtein radius. Its presence is independent of the choice of the
non-linear terms added to the Fierz-Pauli Lagrangian. At the Vainshtein radius the mass of the ghost
is of order of the inverse radius, so that the theory cannot be trusted inside this region, not even at the
classical level.
1 Introduction
In recent years there has been renewed interest in the possibility of giving a mass to the graviton. This idea belongs to a broader class of proposals for modifying gravity at large distances. Besides their theoretical interest, these models could be phenomenologically relevant as possible alternatives to dark matter and dark energy. In this paper we reconsider the issue of the range of validity of massive gravity; in particular we concentrate on the stability of classical solutions around massive sources.
The problem we want to address has a long history. Already in the first paper [1] Fierz and Pauli observed that the mass term must be of the form $m_g^2 (h^2 - h_{\mu\nu}h^{\mu\nu})$, otherwise a ghost appears in the spectrum, besides the 5 degrees of freedom of the massive spin-2 graviton. A different structure would result in an instability at an energy scale $\sim m_g$. Unfortunately Boulware and Deser showed that this additional degree of freedom propagates when nonlinearities in the action are taken into account [2]. However, from an effective field theory point of view this is not necessarily a problem, until one specifies the scale at which the ghost shows up, i.e. its mass. If this scale is above the UV cutoff $\Lambda$ of the effective theory this instability can be consistently disregarded.
On the other hand, non-linearities of the classical theory are also the solution to the problem raised by van Dam, Veltman, and Zakharov (vDVZ) [3, 4]: in the linearized theory predictions are not continuous in the limit $m_g \to 0$, because the helicity-0 component of the graviton does not decouple from matter. However, Vainshtein [5] observed that the vDVZ discontinuity might not be relevant for macroscopic sources because the linearized approximation around a source of mass $M_*$ breaks down at a distance $R_V = (M_* M_P^{-2} m_g^{-4})^{1/5}$, which diverges for $m_g \to 0$. Classical nonlinearities become important, and the full non-linear solution could be in perfect agreement with experiments. This is still an open issue in massive gravity, but the Vainshtein effect has been shown to work in a closely related model, the DGP model [6, 7, 8].
More recently massive gravity has been reconsidered in the effective field theory language, which provides a systematic framework for dealing with quantum effects [9]. For this purpose it is useful to restore the broken diffeomorphism invariance by introducing a set of Goldstone bosons. With this method it is easy to see that the scalar longitudinal component of the graviton becomes strongly coupled at a very low energy scale, much lower than what is naively expected by analogy with the spin-1 case. In the Fierz-Pauli theory the strong interaction scale is $\Lambda_5 \sim (m_g^4 M_P)^{1/5}$, which for $m_g$ of order of the present Hubble parameter is $(10^{11}\,\mathrm{km})^{-1}$. By adding to the Fierz-Pauli Lagrangian a set of properly tuned interactions of the form $h_{\mu\nu}^n$ the cutoff can be raised up to $\Lambda_3 \sim (m_g^2 M_P)^{1/3} \sim (1000\,\mathrm{km})^{-1}$ [9, 10].
In both cases the theory seems to lose predictivity at very large distances. For instance one can wonder how this strong coupling affects the gravitational potential generated by an astrophysical source. Apparently the potential is uncalculable at distances smaller than 1000 km; but in principle this need not be the case. After all the strong coupling takes place in the Goldstone sector, and inside the Vainshtein radius, if the Vainshtein effect applies, one expects the Goldstone to give a negligible correction to the Newtonian potential. Whatever quantum effects take place at the cutoff distance, they could be sufficiently screened from experiments. Nevertheless, without further assumptions, from an effective theory point of view one should include in the Lagrangian all the possible operators allowed by the symmetries and weighted by the cutoff. In this case the effective theory loses predictivity at a much larger length scale: these higher dimension operators all become important at a huge distance from the source when evaluated on the classical solution. In the improved theory with cutoff $\Lambda_3$ this happens at the corresponding Vainshtein radius $R_V \sim (M_* M_P^{-2} m_g^{-2})^{1/3}$. For the sun $R_V$ is $\sim 10^{16}$ km. This means that we are unable to compute the gravitational potential at distances shorter than $R_V$ without a UV completion. As a consequence there is no range of distances where nonlinear effects can be reliably computed in the effective field theory. If we restrict to the original theory with cutoff $\Lambda_5$ the situation is even worse. In this case the infinite tower of higher dimension operators becomes important at a distance which is parametrically larger than the corresponding Vainshtein radius [9].
The picture looks very similar to the DGP model [11], where the same problems have been pointed out [12]. In the DGP model our world is the 4D boundary of an infinite 5D spacetime. Gravity is described by a standard 5D Einstein-Hilbert action with Planck mass $M_5$ in the bulk and by an additional 4D Einstein-Hilbert action localized on the boundary, with a much larger Planck mass $M_4$. The resulting Newton’s law is 4-dimensional below the critical length scale $L_{\rm DGP} = M_4^2/M_5^3$ and 5-dimensional at larger distances. From the 4D viewpoint there is a scalar degree of freedom, the brane bending mode $\pi$, whose dynamics is closely related to that of the longitudinal Goldstone boson $\phi$ of massive gravity. In particular strong interactions show up in the $\pi$ sector at a tiny energy scale $\Lambda_{\rm DGP} \sim (M_P/L_{\rm DGP}^2)^{1/3} \sim (1000\,\mathrm{km})^{-1}$ (taking $L_{\rm DGP}$ of order of the present Hubble horizon $H_0^{-1}$). Also if one includes in the Lagrangian all possible operators allowed by the symmetries and suppressed by $\Lambda_{\rm DGP}$, around a heavy source they all become important at the Vainshtein radius $R_V \sim (M_* M_P^{-2} L_{\rm DGP}^2)^{1/3}$ when evaluated on the classical solution.
However all these difficulties depend on assumptions about the UV completion. In the DGP model
one can consistently assume a UV completion such that the effective theory is predictive down to distances
significantly shorter than $1/\Lambda_{\rm DGP}$. For instance on the surface of the earth the cutoff can be pushed up
to $\sim \mathrm{cm}^{-1}$, not far from the smallest length scale $\sim 100\,\mu\mathrm{m}$ at which gravity has been experimentally
tested. Of course a necessary requirement for this to be possible is the consistency of the classical theory:
in particular in DGP no classical instability develops in the π sector for all relevant astrophysical sources
and for a large class of cosmological solutions [13]. In this paper we want to study whether the same
stability properties hold for massive gravity. This is a basic consistency requirement one has to satisfy
before analyzing the theory at the quantum level and looking for a mechanism, analogous to that working
in DGP, that can make the theory predictive in a phenomenologically interesting range of scales.
The most convenient way to study the dynamics of the theory is to use the Goldstone formalism, that
we review in section 3; for our purposes the main interesting features of the model are encoded in the
Lagrangian of the longitudinal component φ of the Goldstone vector. If we start with a Fierz-Pauli mass
term, the dominant interactions for this scalar degree of freedom are cubic self-couplings with 6 derivatives
of the form $(\partial^2\phi)^3$. In the presence of a macroscopic source the Goldstone gets a non-trivial configuration
Φ(x); in order to study the stability of such a solution it is necessary to expand the action at quadratic
order in the fluctuations around Φ(x). It is evident that in general, because of the cubic self-coupling, the
fluctuations will get a higher-derivative kinetic term. As we discuss in section 2, this signals the presence
of a ghost-like instability already at the classical level. In DGP this does not happen: although the π
cubic self-coupling has 4 derivatives, its tensorial structure is such that fluctuations around a background
get only a 2-derivative kinetic term [12, 13]. Unfortunately, this does not work for the Fierz-Pauli theory.
Still, we have a large freedom in choosing the non-linear extension of the Fierz-Pauli (FP) theory, and one can wonder if it is
possible to cancel all higher-derivative terms and end up with a ghost-free theory. Sections 4 and 5
contain the answer: despite the freedom we have, ghost-like instabilities are unavoidable.
In the Goldstone language it is also easy to compute the scale at which this instability appears. Even in
the most favorable setup in which the cutoff is $\Lambda_3$, the ghost enters in the effective field theory at distances
from the source parametrically larger than the (already huge) Vainshtein radius $R_V$. Furthermore, when
approaching the source we reach $r = R_V$, the mass of the ghost has dropped to $1/R_V$! This means
that the theory can in no way be extrapolated inside the Vainshtein radius.
The last part of the paper is devoted to discussing how the sickness of the theory is interpreted in the
unitary gauge. Clearly we expect the ghost we found to be the troublesome sixth degree of freedom.
With the Fierz-Pauli mass term, at quadratic level this degree of freedom does not appear because the
trace of the Einstein equations gives a constraint instead of a propagating equation. In section 6 we show
that this equation becomes dynamical in the presence of a curved background; we qualitatively estimate
in this simple case the mass of this new excitation and the result agrees with the mass we find for the
ghost in the Goldstone computation. Then, in section 7, we show in the Hamiltonian formalism that
there exists no non-linear extension of the Fierz-Pauli theory that can forbid the propagation of the sixth
mode in the presence of a slightly curved background. The analysis in the unitary gauge is powerful for
counting the number of degrees of freedom, but, unlike the Goldstone analysis, it says nothing about the
typical scales of these modes. Also in order to address stability issues one should study the positivity
of the Hamiltonian. This in general is difficult, and the analysis has been carried out by Boulware and
Deser for non-linear extensions of the form $f(h_{\mu\nu}h^{\mu\nu} - h^2)$ [2]. On the contrary the Goldstone analysis
concentrates from the very beginning on the strongly interacting degree of freedom: all the interesting and
troublesome features of the theory are encoded in the dynamics of a single scalar field. This enormously
simplifies the analysis.
Recently it has been realized that massive gravity models with Lorentz-violating mass terms can be significantly ‘healthier’ than the traditional Lorentz-invariant theory; in particular they can avoid the vDVZ discontinuity and the strong coupling problem, and they can be free of ghosts [14, 15, 16]. In this paper we stick to the Lorentz-invariant massive gravity theory.
2 Ghosts from higher derivative kinetic terms
Let us first be very specific about why higher derivative kinetic terms give rise to ghost-like instabilities. Take for instance a massless scalar field $\phi$ with Lagrangian density (note that we are using the $(-,+,+,+)$ signature!)
$$\mathcal{L} = -\frac{1}{2}(\partial\phi)^2 + \frac{a}{2\Lambda^2}(\Box\phi)^2 - V_{\rm int}(\phi) , \qquad (1)$$
where $\Lambda$ is some energy scale, $a = \pm 1$, and $V_{\rm int}$ is a self-interaction term. We show that, independently of the sign of the second term, the system is plagued by ghosts. To do so we want to reduce to a purely two-derivative kinetic Lagrangian, from which we know how to read the stability properties of the system. We therefore introduce an auxiliary scalar field $\chi$ and a new Lagrangian
$$\mathcal{L}' = -\frac{1}{2}(\partial\phi)^2 - a\,\partial_\mu\chi\,\partial^\mu\phi - \frac{1}{2}a\,\Lambda^2\chi^2 - V_{\rm int}(\phi) , \qquad (2)$$
which reduces exactly to $\mathcal{L}$ once $\chi$ is integrated out. $\mathcal{L}'$ is diagonalized by the substitution $\phi = \phi' - a\chi$. We get
$$\mathcal{L}' = -\frac{1}{2}(\partial\phi')^2 + \frac{1}{2}(\partial\chi)^2 - \frac{1}{2}a\,\Lambda^2\chi^2 - V_{\rm int}(\phi',\chi) , \qquad (3)$$
which clearly signals the presence of a ghost: $\chi$ has a wrong-sign kinetic term. Notice in passing that $\chi$ can also be a tachyon, for $a = -1$: in this case $\chi$ has exponentially growing modes. But let us neglect this possibility and concentrate on the ghost instability, which is unavoidable. A ghost, unlike a tachyon, is not unstable by itself: its equation of motion is perfectly healthy at the linear level, and does not admit any exponentially growing solution. The problem is that its Hamiltonian is negative, so that when couplings to ordinary ‘healthy’ matter are taken into account (the potential term in our example above) the system is unstable: with zero net energy one can indiscriminately excite both sectors, and this exchange of energy happens spontaneously already at classical level. In a quantum system with ghosts in the physical spectrum this translates into an instability of the vacuum. The decay rate is UV divergent due to an infinite degeneracy of the final state phase space. It is not clear how to cut off this divergence in a Lorentz invariant way [17].
However the situation is not as bad as it seems: our ghost $\chi$ in eq. (3) has a (normal or tachyonic) mass $\Lambda$, so that it will show up only at energies above $\Lambda$, i.e. when the four derivative kinetic term in eq. (1) starts dominating over the usual two derivative one. We can consistently use our scalar field theory eq. (1) at energies below $\Lambda$, and postulate that some new degree of freedom enters at $\Lambda$ and takes care of the ghost instability. For example, we can add a term $-(\partial\chi)^2$ to eq. (2) (for simplicity we stick to the non-tachyonic case $a = +1$ and set $V_{\rm int} = 0$),
$$\mathcal{L}_{\rm UV} = -\frac{1}{2}(\partial\phi)^2 - \partial_\mu\chi\,\partial^\mu\phi - (\partial\chi)^2 - \frac{1}{2}\Lambda^2\chi^2 . \qquad (4)$$
This drastically changes the high-energy picture, since the resulting Lagrangian obtained by demixing now describes two perfectly healthy scalars, one massless and the other with mass $\Lambda$. At the same time, at energies below $\Lambda$ the heavy field $\chi$ can be integrated out from $\mathcal{L}_{\rm UV}$, thus giving the starting Lagrangian eq. (1) up to terms suppressed by additional powers of $(\partial/\Lambda)^2$. This example shows that in principle the ghost instability can be cured by proper new physics at the scale $\Lambda$. In other words, eq. (1) makes perfect sense as an effective field theory with UV cutoff $\Lambda$.
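The demixing step of this section can be checked mechanically. As a minimal sketch (the helper names below are ours and purely illustrative), write the kinetic terms of eqs. (2) and (4) as $-\frac{1}{2}\,\partial\Psi^T K\,\partial\Psi$ with $\Psi = (\phi, \chi)$ and apply the field redefinition $\phi = \phi' - a\chi$ as a congruence transformation of $K$. A negative diagonal entry in the result corresponds to a wrong-sign kinetic term.

```python
# Kinetic terms written as L_kin = -1/2 * dPsi^T K dPsi with Psi = (phi, chi).
# In the (-,+,+,+) signature a healthy field has a positive diagonal entry in K.

def congruence(K, S):
    """Return S^T K S for square matrices given as nested lists."""
    n = len(K)
    KS = [[sum(K[i][k] * S[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    return [[sum(S[k][i] * KS[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def demixed_kinetic_matrix(K, a):
    """Apply phi = phi' - a*chi (chi unchanged) to the kinetic matrix K."""
    S = [[1, -a], [0, 1]]  # maps (phi', chi) to (phi, chi)
    return congruence(K, S)

def kinetic_matrix_eq2(a):
    # eq. (2): -1/2 (dphi)^2 - a dchi.dphi  ->  K = [[1, a], [a, 0]]
    return [[1, a], [a, 0]]

def kinetic_matrix_eq4():
    # eq. (4), a = +1: -1/2 (dphi)^2 - dchi.dphi - (dchi)^2  ->  K = [[1, 1], [1, 2]]
    return [[1, 1], [1, 2]]

for a in (+1, -1):
    print("eq. (2), a =", a, "->", demixed_kinetic_matrix(kinetic_matrix_eq2(a), a))
print("eq. (4)         ->", demixed_kinetic_matrix(kinetic_matrix_eq4(), 1))
```

For eq. (2) the transformed matrix is diag(1, -1) for both signs of $a$, i.e. $\chi$ is a ghost as in eq. (3); for eq. (4) it is diag(1, 1), the two healthy scalars described in the text.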
3 The Goldstone action
In this section we briefly re-derive the Lagrangian of massive gravity along the lines of [9], i.e. keeping
explicit the Goldstone bosons of broken diffeomorphism invariance.
To write down a mass term for gravity, in addition to the full dynamical metric $g_{\mu\nu}$, we have to take a reference fixed metric: for our purposes we will take the Minkowski metric $\eta_{\mu\nu}$. A mass term breaks invariance under general coordinate transformations. However, as has been shown in Ref. [9], one can always restore local coordinate invariance by the Stückelberg trick: in analogy with massive gauge theories one introduces a set of Goldstone fields and requires that they transform non-linearly under a local coordinate transformation. The fundamental object to be used for this purpose is a symmetric tensor $H_{\alpha\beta}$, built in terms of the reference metric, the field describing metric fluctuations $h_{\mu\nu} = g_{\mu\nu} - \eta_{\mu\nu}$ and the four Goldstone fields $\pi_\mu$,
$$H_{\alpha\beta} = h_{\alpha\beta} + \partial_\alpha\pi_\beta + \partial_\beta\pi_\alpha + \partial_\alpha\pi_\gamma\,\partial_\beta\pi^\gamma . \qquad (5)$$
$H_{\mu\nu}$ transforms as a covariant tensor under local diffeomorphisms $x^\alpha \to x^\alpha + \xi^\alpha$ provided that $\pi^\alpha$ shifts, $\pi^\alpha \to \pi^\alpha - \xi^\alpha$. As in non-abelian massive gauge theories, since now local coordinate invariance is non-linearly realized on the $\pi$ field, a Lagrangian built using $H$ will be valid as an effective theory and its breakdown will appear as the Goldstone sector becoming strongly coupled at some scale $\Lambda$. This indeed has been shown in Ref. [9]. It is useful to further split the 4 $\pi_\alpha$ fields into a vector and a scalar as
$$\pi_\mu = A_\mu + \partial_\mu\phi \qquad (6)$$
together with an additional hidden U(1) gauge invariance for $A_\mu$ under which $\phi$ shifts. Note that since $\phi$ is a Goldstone boson under this U(1) gauge symmetry and $\pi_\mu$ is a Goldstone boson of broken diffeomorphism invariance, $\phi$ will appear with two derivatives in the Lagrangian, as evident from eq. (5). A mass term for $h_{\mu\nu}$ can be written down in terms of $H_{\mu\nu}$ as
$$\sqrt{-g}\, g^{\mu\nu} g^{\alpha\beta}\left(a\, H_{\mu\alpha}H_{\nu\beta} + b\, H_{\mu\nu}H_{\alpha\beta}\right) . \qquad (7)$$
Expanding $H$ using (5) one easily realizes that for a generic choice of $a, b$ there is a quadratic term in $\phi$ containing 4 derivatives. This term signals the presence of a ghost as we have seen in Section 2. Only for the Fierz-Pauli choice $a = -b \equiv m_g^2 M_P^2$ does this four derivative term exactly cancel. In this case $\phi$ does not have a kinetic term on its own, but only a kinetic mixing with $h_{\mu\nu}$: $m_g^2 M_P^2\,(\partial_\mu\partial_\nu\phi\, h^{\mu\nu} - \Box\phi\, h)$. A conformal rescaling of the metric $h_{\mu\nu} = \hat h_{\mu\nu} + m_g^2\,\eta_{\mu\nu}\phi$ diagonalizes this mixing and generates a small (i.e. proportional to $m_g^2$) kinetic term for $\phi$ besides interactions of the form $\phi(\partial^2\phi)^n$. The smallness of the kinetic term is the origin of the low strong coupling scale as it enhances the $\phi$ interactions once the fields are canonically normalized.
One can easily see that the most relevant interactions are of the form
$$m_g^2 M_P^2\,(\partial^2\phi)^3 = \frac{(\partial^2\phi_c)^3}{M_P\, m_g^4} \qquad (8)$$
where $\phi_c$ is the canonically normalized field. These interactions saturate perturbation theory at the tiny energy scale $E \sim \Lambda_5 \equiv (m_g^4 M_P)^{1/5}$.
One can slightly improve the situation canceling these cubic interactions by adding $H^3$ terms. Now the most relevant interactions will be of the form $(\partial^2\phi)^4$ and this procedure can be repeated at any order. The dominant interactions will be $(\partial^2\phi)^n$ and once all these are canceled the theory has the cutoff $\Lambda_3 \equiv (m_g^2 M_P)^{1/3}$. In fact after this procedure the most relevant interactions are
$$m_g^2 M_P^2\,(\partial A)^2 (\partial^2\phi)^n \quad \text{and} \quad m_g^2 M_P^2\,(\hat h_{\mu\nu} + m_g^2\,\eta_{\mu\nu}\phi)(\partial^2\phi)^n , \qquad (9)$$
which are weighted by $\Lambda_3$ when expressed in terms of canonically normalized fields.
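To get a feeling for the numbers, the following Python sketch evaluates $\Lambda_5$, $\Lambda_3$ and the corresponding Vainshtein radii for a solar-mass source. The numerical inputs (reduced Planck mass, present Hubble rate, solar mass, all expressed in eV) are rounded values assumed here for illustration and are not taken from the paper; only orders of magnitude are meaningful.

```python
# Order-of-magnitude check of the scales quoted in the text (natural units, energies in eV).
# All numerical inputs are assumed round values; O(1) factors are ignored.
HBAR_C_EV_M = 1.973e-7    # hbar*c in eV*m, converts an inverse energy to a length
M_PLANCK_EV = 2.4e27      # reduced Planck mass (assumed convention)
M_GRAVITON_EV = 1.5e-33   # graviton mass set to the present Hubble rate
M_SUN_EV = 1.1e66         # solar mass times c^2

def strong_coupling_scale(n, m_g=M_GRAVITON_EV, m_p=M_PLANCK_EV):
    """Lambda_n = (m_g^(n-1) * M_P)^(1/n): n=5 for Fierz-Pauli, n=3 for the tuned theory."""
    return (m_g ** (n - 1) * m_p) ** (1.0 / n)

def vainshtein_radius(n, m_source, m_p=M_PLANCK_EV):
    """R_V^(n) = (M_*/M_P)^(1/n) / Lambda_n, in inverse eV."""
    return (m_source / m_p) ** (1.0 / n) / strong_coupling_scale(n)

def inverse_ev_to_km(length_inv_ev):
    """Convert a length expressed in 1/eV to kilometres."""
    return length_inv_ev * HBAR_C_EV_M / 1.0e3

for n in (5, 3):
    cutoff_km = inverse_ev_to_km(1.0 / strong_coupling_scale(n))
    r_v_km = inverse_ev_to_km(vainshtein_radius(n, M_SUN_EV))
    print(f"1/Lambda_{n} ~ {cutoff_km:.1e} km,  R_V^({n})(Sun) ~ {r_v_km:.1e} km")
```

With these inputs $1/\Lambda_5$ comes out of order $10^{11}$ km, $1/\Lambda_3$ of order $10^3$ km, and the Vainshtein radius of the Sun in the $\Lambda_3$ theory of order $10^{16}$ km, in line with the estimates quoted in the Introduction.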
The interaction between matter and gravity is as usual described by the term $\frac{1}{2}h_{\alpha\beta}T^{\alpha\beta}$. The Weyl transformation that demixes $h_{\mu\nu}$ from $\phi$ thus generates a direct coupling of $\phi$ to the trace $T$ of the stress-energy tensor. This implies that we will have a non-trivial $\phi$ background around any astrophysical source. For a classical solution we will have two relevant scales: the first is the Schwarzschild radius $R_S$, the distance from the source at which linearized gravity breaks down, the second is the Vainshtein radius $R_V$ [5] where nonlinearities for the scalar field $\phi$ become important. For a source of mass $M_*$ this distance is equal to $\frac{1}{\Lambda_5}(M_*/M_P)^{1/5}$ (which is much larger than $R_S$). At this scale the term $(\partial^2\phi)^3$ becomes as relevant as the kinetic term whereas all the other nonlinear terms are important only when we reach the Schwarzschild radius so that they can be safely neglected. Therefore the action of $\phi$ in the presence of sources is given by
$$S = \int d^4x \left[ 3\,\phi_c\Box\phi_c + \frac{1}{\Lambda_5^5}\left( (\Box\phi_c)^3 - (\Box\phi_c)(\partial_\mu\partial_\nu\phi_c)^2 \right) + \frac{1}{2M_P}\,\phi_c T \right] ; \qquad (10)$$
the structure of the trilinear terms can be changed by adding non-linear interactions to the Fierz-Pauli Lagrangian.
The situation remains qualitatively unchanged even if the first $N$ $(\partial^2\phi)^n$ interactions are tuned to
zero. The first non-vanishing term, $(\partial^2\phi)^{N+1}$, will set the Vainshtein radius, while higher order terms
will become relevant again at the Schwarzschild radius.
4 Massive gravity in the presence of a source
Let us consider the Lagrangian eq. (10) in the presence of a macroscopic source, like the Sun. This induces a classical background $\Phi(x)$, solution of the $\phi$ equation of motion. To study the stability of this solution we expand the Lagrangian at the quadratic order in the fluctuation $\varphi \equiv \phi - \Phi$. The result is schematically of the form
$$\mathcal{L}_\varphi = -(\partial\varphi_c)^2 + \frac{(\partial^2\Phi_c)}{\Lambda_5^5}\,(\partial^2\varphi_c)^2 , \qquad (11)$$
i.e. the background gives a four-derivative contribution to the $\varphi$ kinetic term. As discussed in sect. 2, this results in the appearance of a ghost with an $x$-dependent mass
$$m^2_{\rm ghost}(x) \sim \frac{\Lambda_5^5}{\partial^2\Phi_c(x)} . \qquad (12)$$
Remember that we are dealing with an effective theory with a tiny UV cutoff $\Lambda_5$, therefore we should not worry until the mass of the ghost drops below $\Lambda_5$. In approaching the source from far away, this happens at a distance $R_{\rm ghost}$ from the source such that $\partial^2\Phi_c \sim \Lambda_5^3$. Unfortunately this is a huge distance, parametrically larger than the (already huge) Vainshtein radius $R_V$. In fact for a source of mass $M_*$ at distances $r \gg R_V$ the background field goes as $\Phi_c(r) \sim (M_*/M_P) \cdot 1/r$, so that
$$R_{\rm ghost} \sim \frac{1}{\Lambda_5}\left(\frac{M_*}{M_P}\right)^{1/3} \gg R_V^{(5)} \sim \frac{1}{\Lambda_5}\left(\frac{M_*}{M_P}\right)^{1/5} . \qquad (13)$$
Therefore the ghost is going to show up in an extremely weak background field, when the latter is still in its linear regime.
Inside $R_{\rm ghost}$, in the spirit of sect. 2, one is forced to postulate that additional physics lighter than the local ghost mass cures the instability, that is the cutoff must be lowered from $\Lambda_5$ to $m_{\rm ghost}(x)$. A byproduct of this in general would be that interactions strengthen, being weighted by the new cutoff scale rather than by $\Lambda_5$. But let us optimistically assume that, instead, the only effect of this new physics is to cure the ghost instability. However, when the local ghost mass is of order of the inverse distance from the source there is no way of proceeding further without specifying the UV completion of the theory, since the background itself has a typical length scale of order of the UV cutoff. One can easily check that this happens at the Vainshtein radius $R_V$. There is no sense in which one can trust the classical solution below $R_V$. Since one can hope to recover General Relativity only in the region inside $R_V$, where non-linear effects can hide the scalar (Vainshtein effect), this also means that General Relativity is nowhere a good approximation.
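The estimates of eqs. (12) and (13) can be reproduced with a few lines of arithmetic. The sketch below works in units where $\Lambda_5 = 1$, keeps only the parametric dependence on $\mu \equiv M_*/M_P$ (a shorthand introduced here for convenience), and extrapolates the linear profile $\Phi_c \sim \mu/r$ down to $R_V$, where it is only marginally valid. The value of $\mu$ is an illustrative assumption.

```python
# Scaling estimate for the Lambda_5 theory, in units where Lambda_5 = 1.
# mu = M_*/M_P is the only free parameter; O(1) factors are dropped throughout.

def background_second_derivative(r, mu):
    """d^2 Phi_c ~ mu / r^3 for the linear profile Phi_c ~ mu / r (r >> R_V)."""
    return mu / r**3

def ghost_mass(r, mu):
    """Eq. (12): m_ghost^2 ~ Lambda_5^5 / d^2 Phi_c."""
    return (1.0 / background_second_derivative(r, mu)) ** 0.5

def ghost_radius(mu):
    """Eq. (13): radius at which m_ghost drops to the cutoff Lambda_5."""
    return mu ** (1.0 / 3.0)

def vainshtein_radius(mu):
    """R_V^(5) ~ mu^(1/5) / Lambda_5."""
    return mu ** (1.0 / 5.0)

mu = 1.0e38  # illustrative value, roughly a solar-mass source (assumption)
r_ghost, r_v = ghost_radius(mu), vainshtein_radius(mu)
print("R_ghost / R_V      =", r_ghost / r_v)
print("m_ghost(R_ghost)   =", ghost_mass(r_ghost, mu))    # the cutoff, in these units
print("m_ghost(R_V) * R_V =", ghost_mass(r_v, mu) * r_v)  # order one: m_ghost ~ 1/R_V
```

The last product is independent of $\mu$: at the Vainshtein radius the ghost mass is the inverse radius, which is the statement made in the text.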
Notice that in the DGP model the dominant interaction of the Goldstone has the form $\Box\pi(\partial\pi)^2$, which suggests the same problem we are facing, as there are two derivatives acting on one of the $\pi$’s. Nevertheless, in the equation of motion terms with more than two derivatives acting on a single field cancel out and one is left with a (non-linear) second order differential equation [13]$^1$. The same cannot happen in our case since there are too many derivatives: the contribution of the trilinear term to the equation of motion is a sum of terms with 2 $\phi$’s and 6 derivatives. In any term there is at least one $\phi$ carrying more than two derivatives.
One is thus led to consider the possibility of eliminating the unwanted trilinear interaction of the Goldstone by adding appropriate cubic terms in $H_{\mu\nu}$ to the Fierz-Pauli Lagrangian eq. (7). The three independent contractions are $H^3$, $H(H_{\mu\nu})^2$, and $(H_{\mu\nu})^3$, where the last stands for the cyclic contraction of the indices. These contain interaction terms for the Goldstone of the form $(\partial^2\phi)^3$ which, for the proper choice of coefficients, cancel the trilinear interaction of eq. (10). However, in this way one introduces further quartic interactions $(\partial^2\phi)^4$ on top of those already present in the Fierz-Pauli mass term, because of the non-linear relation between $H_{\mu\nu}$ and $\phi$ of eq. (5). These are problematic for exactly the same reason as before, and the same problem shows up at any order: an interaction term of the form $(\partial^2\phi)^n$ evaluated around a background gives a contribution to the equation of motion for the fluctuations with too many derivatives. This signals the presence of a ghost instability$^2$. Again one can check that the cutoff (i.e. the ghost mass) becomes of order of the inverse radius at the new Vainshtein scale (always defined as the distance from the source at which non-linearities become relevant). Hence one never recovers General Relativity.
The only possibility is therefore to concentrate on theories in which all the interactions of the form $(\partial^2\phi)^n$ are set to zero by properly choosing infinitely many coefficients. We can look at this procedure as an extension at non-linear order of the Fierz-Pauli choice for the mass term which, as discussed in sect. 3, leads to the cancellation of the $(\partial^2\phi)^2$ terms. In the next section we prove that also in this case we cannot avoid higher derivative kinetic terms for the fluctuations of the field $\phi$ around a non trivial background: ghost-like instabilities are unavoidable.
$^1$ Equivalently, working at the level of the Lagrangian, one can expand the interaction term $\Box\pi(\partial\pi)^2$ to second order in the fluctuation $\varphi$ around a background $\pi_b$. The worrisome term is $\Box\varphi\,\partial_\mu\varphi\,\partial^\mu\pi_b$, since it has 3 derivatives acting on the $\varphi$’s. But by integration by parts one can shift one derivative from the fluctuations to the background, thus obtaining an ordinary 2-derivative kinetic term for the fluctuations (whose positivity must however be checked) [12, 13].
$^2$ The reader could wonder if there exists a choice of coefficients such that terms with 4 derivatives on a single field cancel in the equation of motion. In this case one would be left only with 3-derivative terms and our conclusions should be modified. But this is not the case: setting to zero all 4 derivative terms leads also to the cancellation of those with 3 derivatives. To see this, consider the most general interaction Lagrangian of $n$-th order, $\mathcal{L}^{(n)} = \Gamma^{\alpha_1\beta_1\cdots\alpha_n\beta_n}\,\partial_{\alpha_1}\partial_{\beta_1}\phi\cdots\partial_{\alpha_n}\partial_{\beta_n}\phi$, where $\Gamma$ is a tensor constructed with the metric $\eta_{\mu\nu}$. Given the structure of contractions, without loss of generality we can choose $\Gamma^{\alpha_1\beta_1\cdots\alpha_n\beta_n}$ to be symmetric under $\alpha_i \leftrightarrow \beta_i$ and $(\alpha_i,\beta_i) \leftrightarrow (\alpha_j,\beta_j)$. Then, for symmetry reasons, the contribution of $\mathcal{L}^{(n)}$ to the $\phi$ equation of motion is
$$\Gamma^{\alpha_1\beta_1\cdots\alpha_n\beta_n}\left[A_n\,(\partial_{\alpha_1}\partial_{\beta_1}\partial_{\alpha_2}\partial_{\beta_2}\phi)(\partial_{\alpha_3}\partial_{\beta_3}\phi) + B_n\,(\partial_{\beta_1}\partial_{\alpha_2}\partial_{\beta_2}\phi)(\partial_{\alpha_1}\partial_{\alpha_3}\partial_{\beta_3}\phi)\right]\partial_{\alpha_4}\partial_{\beta_4}\phi\cdots\partial_{\alpha_n}\partial_{\beta_n}\phi , \qquad (14)$$
where $A_n, B_n$ are combinatoric factors. The four-derivative term (the first in brackets) identically vanishes only if the totally symmetric part of $\Gamma^{\alpha_1\beta_1\cdots\alpha_n\beta_n}$ in the first four indices does, $\Gamma^{(\alpha_1\beta_1\alpha_2\beta_2)\cdots\alpha_n\beta_n} = 0$. In this case, given the symmetries of $\Gamma$, it is straightforward to check that $\Gamma^{\alpha_1(\beta_1\alpha_2\beta_2)\cdots\alpha_n\beta_n} = \Gamma^{(\alpha_1\beta_1\alpha_2\beta_2)\cdots\alpha_n\beta_n} = 0$. This eliminates the 3 derivative term (the second in brackets) as well, i.e. eliminates the contribution of $\mathcal{L}^{(n)}$ to the $\phi$ equation of motion altogether.
5 Ghost instabilities are unavoidable
The cancellation of the $(\partial^2\phi)^n$ interactions has also been considered as a way to raise the strong interaction scale to $\Lambda_3 = (m_g^2 M_P)^{1/3} \gg \Lambda_5$ [9, 10]. In fact, after the cancellation, the leading interactions are of the form
$$m_g^2 M_P^2\,(\partial A)^2 (\partial^2\phi)^n \quad \text{and} \quad m_g^2 M_P^2\,(\hat h_{\mu\nu} + m_g^2\,\eta_{\mu\nu}\phi)(\partial^2\phi)^n ; \qquad (15)$$
when the fields are canonically normalized all these terms are suppressed by the scale $\Lambda_3$, while additional interactions are weighted by higher scales. Correspondingly the Vainshtein radius now shrinks to $R_V^{(3)} = \frac{1}{\Lambda_3}(M_*/M_P)^{1/3} \ll R_V^{(5)}$ as the original leading interactions have been canceled. At this radius all non-linear terms of the form $(\hat h_{\mu\nu} + m_g^2\,\eta_{\mu\nu}\phi)(\partial^2\phi)^n$ become relevant; on the other hand there are no terms linear in $A$ in the Lagrangian, so that $A$ is sourced neither by matter nor by the other fields. Interactions involving this field are therefore irrelevant for our purposes, and can be consistently neglected.
The theory we are describing is not unique: there are different possible choices of coefficients that cancel all the interactions $(\partial^2\phi)^n$. We can easily see why. Let us start with the canonical Fierz-Pauli mass term$^3$ $\mathcal{L}_2 = \sqrt{-g}\left([H^2] - [H]^2\right)$; since $H_{\mu\nu} = h_{\mu\nu} + 2\,\partial_\mu\partial_\nu\phi + \partial_\mu\partial_\alpha\phi\,\partial_\nu\partial^\alpha\phi$ (setting $A_\mu = 0$), $\mathcal{L}_2$ contains $(\partial^2\phi)^3$ interactions. We can cancel them adding an appropriate combination of terms cubic in $H$, $\mathcal{L}_3 = \sqrt{-g}\left(\frac{1}{2}[H][H^2] - \frac{1}{2}[H^3]\right)$. But at this point we can still add, with an arbitrary overall coefficient $\alpha_3$, the expression
$$\mathcal{L}^{\rm TD}_3 = \sqrt{-g}\left(3[H][H^2] - [H]^3 - 2[H^3]\right) , \qquad (16)$$
because it gives $(\partial^2\phi)^3$ terms in the combination
$$(\Box\phi)^3 - 3\,\Box\phi\,(\partial_\mu\partial_\nu\phi)^2 + 2\,(\partial_\mu\partial_\nu\phi)^3 , \qquad (17)$$
which is a total derivative (hence the superscript ‘TD’) and thus does not contribute to the equation of motion. Now $\mathcal{L}_2 + \mathcal{L}_3 + \alpha_3\mathcal{L}^{\rm TD}_3$ contains $(\partial^2\phi)^4$ interactions and we can repeat the procedure. Fourth order terms are canceled by
$$\mathcal{L}_4 = \sqrt{-g}\,\frac{1}{16}\left( (5 + 24\alpha_3)[H^4] - (1 + 12\alpha_3)[H^2]^2 - (4 + 24\alpha_3)[H][H^3] + 12\alpha_3[H^2][H]^2 \right) . \qquad (18)$$
$^3$ We use the notation $[H] = g^{\mu\nu}H_{\mu\nu}$, $[H^2] = g^{\mu\nu}g^{\alpha\beta}H_{\mu\alpha}H_{\nu\beta}$, and its straightforward generalization to higher orders.
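The cancellations behind eqs. (16) and (18) can be spot-checked by brute force. The sketch below sets $h_{\mu\nu} = 0$ and $A_\mu = 0$, so that $H = 2\epsilon\,\Pi + \epsilon^2\Pi^2$ with $\Pi = \partial\partial\phi$ (one index raised), and takes $\Pi$ diagonal, which is enough here because the traces $[H^k]$ depend only on the eigenvalues. The bookkeeping parameter $\epsilon$, the polynomial helpers and the sample eigenvalues are our own choices, not part of the original derivation; this is a numerical check on examples, not a proof.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials in eps stored as coefficient lists (index = power)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def poly_sum(terms):
    """Add a list of (coefficient, polynomial) pairs."""
    out = [Fraction(0)] * max(len(p) for _, p in terms)
    for c, p in terms:
        for i, pi in enumerate(p):
            out[i] += Fraction(c) * pi
    return out

def trace_power(eigenvalues, k):
    """[H^k] for H = 2*eps*Pi + eps^2*Pi^2 with Pi = diag(eigenvalues)."""
    total = [Fraction(0)]
    for x in eigenvalues:
        h = [Fraction(0), Fraction(2 * x), Fraction(x * x)]
        hk = [Fraction(1)]
        for _ in range(k):
            hk = poly_mul(hk, h)
        total = poly_sum([(1, total), (1, hk)])
    return total

def goldstone_self_couplings(eigenvalues, alpha3):
    """eps-expansion of L2 + L3 + alpha3*L3_TD + L4 with sqrt(-g) = 1."""
    H1, H2, H3, H4 = (trace_power(eigenvalues, k) for k in (1, 2, 3, 4))
    a3, m = Fraction(alpha3), poly_mul
    L2 = [(1, H2), (-1, m(H1, H1))]
    L3 = [(Fraction(1, 2), m(H1, H2)), (Fraction(-1, 2), H3)]
    L3_TD = [(3 * a3, m(H1, H2)), (-a3, m(H1, m(H1, H1))), (-2 * a3, H3)]  # eq. (16)
    L4 = [((5 + 24 * a3) / 16, H4), (-(1 + 12 * a3) / 16, m(H2, H2)),      # eq. (18)
          (-(4 + 24 * a3) / 16, m(H1, H3)), (12 * a3 / 16, m(H2, m(H1, H1)))]
    return poly_sum(L2 + L3 + L3_TD + L4)

def total_derivative_cubic(eigenvalues):
    """The combination of eq. (17) evaluated on a diagonal Pi."""
    p1, p2, p3 = (sum(x ** k for x in eigenvalues) for k in (1, 2, 3))
    return p1 ** 3 - 3 * p1 * p2 + 2 * p3

eigs, alpha3 = [1, -2, 3, 5], Fraction(7, 3)
coeffs = goldstone_self_couplings(eigs, alpha3)
print("eps^3:", coeffs[3], " expected:", -8 * alpha3 * total_derivative_cubic(eigs))
print("eps^4:", coeffs[4])
```

At order $\epsilon^4$ the sum vanishes identically, while at order $\epsilon^3$ the only survivor is $-8\alpha_3$ times the combination (17), i.e. the total derivative.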
Page 9
Again we have the possibility to introduce a second arbitrary coefficient, α4, in front of the ‘total deriva-
tive’ term
LTD
and we can go on at higher orders until all self-couplings are removed. Notice that these total derivative
terms LTD
interactions that reduce to a total derivative when expressed in terms of ∂µ∂νφ and therefore do not
contribute to high-derivative terms in φ. One can check that there is one of such combinations per order,
but for our purposes we will need them only up to fourth order4.
We now want to see if it is possible to get a kinetic structure without ghosts for the fluctuations
around a background. Note that this theory has potentially dangerous terms of the form hµν(∂2φ)n. We
could hope that the large freedom we have in the choice of higher-order terms, parameterized by the
coefficients α3,α4,..., helps us to obtain a ghost-free theory. Unfortunately, this is not the case and the
main reason is that the number of possible contractions of Hµνgrows very fast and soon we cannot cancel
all the dangerous kinetic terms. Let us see how this works explicitly. We call hb
fields5, and we study the quadratic Lagrangian for the scalar fluctuation ϕ. Actually, instead of working
at the level of the Lagrangian, the most direct approach is to look at the equations of motion linear in
the fluctuations, because in this case there is no ambiguity coming from integration by parts. We have
to check whether all the terms with more than 2 derivatives on ϕ can cancel in the equations of motion.
We start with the Lagrangian terms cubic in the fields: they can come only from $L_2 + L_3 + \alpha_3 L^{TD}_3$.
Only terms schematically of the form $h(\partial^2\phi)^2$ can give a ghost; terms with a higher number of hµν have
no more than 2 derivatives. Using the explicit expression of the Lagrangian, it is immediate to verify
that i) the terms in the e.o.m. with more than 2 derivatives on ϕ cancel already with α3 = 0, and ii)
they cancel also when originating from $L^{TD}_3$ alone. We conclude that at the cubic level in Hµν there are
no higher derivative kinetic terms for ϕ and the parameter α3 can still be varied arbitrarily.
What happens with the quartic Lagrangian? Let us consider the interactions $h_{\mu\nu}(\partial^2\phi)^3$; now we must
include also $L_4$ (eq. (18)), and for simplicity we start with α4 = 0. Again it is straightforward to write
down the equations of motion linear in ϕ. The terms with more than two derivatives on the fluctuation
can be divided into two classes: those containing $(h_b)^\mu{}_\mu$ and those in which $(h_b)_{\mu\nu}$ is contracted with
derivatives of φ. Each class must cancel independently of the other, since, given the different tensor
structure, there is no possibility of cross-cancellation between the two. In the e.o.m. the terms belonging
to the first class automatically cancel. They come from a piece of the Lagrangian that is precisely 4α3[h]
times the ‘total derivative’ combination eq. (17). The remaining dangerous kinetic terms, which belong
to the second class, come from the Lagrangian⁶
$$ (8 + 72\alpha_3)\left([h\phi^3] - [h\phi^2][\phi]\right) - 36\alpha_3\left([h\phi][\phi^2] - [h\phi][\phi]^2\right) . \tag{21} $$
Can we add now $\alpha_4 L^{TD}_4$ (eq. (19))? Yes, since this does not reintroduce terms of the first class (i.e.,
containing $(h)^\mu{}_\mu$) in the equations of motion. Then, can we choose α4 to get rid of the contributions
coming from eq. (21)? Unfortunately the answer is negative. In fact, expanding $\alpha_4 L^{TD}_4$,
$$ \alpha_4 L^{TD}_4 \supset -192\alpha_4\left([h\phi^3] - [h\phi^2][\phi]\right) + 96\alpha_4\left([h\phi][\phi^2] - [h\phi][\phi]^2\right) , \tag{22} $$
one immediately sees that either the first or the second pair of terms in eq. (21) (but not both) can be
canceled by properly choosing α3 and α4. The other pair gives a non-zero, four derivative contribution
to the equation of motion for the fluctuation ϕ. The number of possible tensor structures is bigger than
the freedom we have in the Lagrangian. This completes the proof that massive gravity around a generic
background cannot have a purely two-derivative kinetic term for ϕ.
$$ L^{TD}_4 = \sqrt{-g}\left([H]^4 - 6[H^2][H]^2 + 8[H^3][H] + 3[H^2]^2 - 6[H^4]\right) , \tag{19} $$
n are the higher order analogue of the Fierz-Pauli mass term: they are combinations of hµν
µν and φb the background
⁴ It is easy to check that at n-th order a total derivative term is given by
$$ \sum_\pi (-1)^\pi\, \eta^{\alpha_1\pi(\beta_1)} \cdots \eta^{\alpha_n\pi(\beta_n)}\, \partial_{\alpha_1}\partial_{\beta_1}\phi \cdots \partial_{\alpha_n}\partial_{\beta_n}\phi , \tag{20} $$
where the sum runs over all permutations π of the β indices, and $(-1)^\pi$ is the parity of the permutation. To prove
that this combination is the only total derivative term at a given order assume that there are two of them. One
could then construct a total derivative term that does not contain, say, $(\Box\phi)^n$. Imposing that the contribution to
the field equations is zero, it is straightforward to show that also all the other terms vanish.
⁵ Remember that the field hµν is the graviton before the Weyl rescaling that demixes it from φ:
$h_{\mu\nu} = \hat h_{\mu\nu} + m_g^2\,\eta_{\mu\nu}\phi$.
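The counting argument built on eqs. (21) and (22) can be checked with a few lines of exact arithmetic. The sketch below is only an illustration: the function name `pair_coefficients` and the sample values of α3 and α4 are hypothetical, and the numerical coefficients are simply those read off from eqs. (21) and (22). It shows that the 2x2 linear system for (α3, α4) is singular, and that the sum of the first coefficient and twice the second one equals 8 for any choice of the parameters, so the two pairs of terms can never vanish together.

```python
from fractions import Fraction


def pair_coefficients(alpha3, alpha4):
    """Total coefficients of the two tensor structures after adding eq. (22) to eq. (21).

    first  multiplies [h phi^3] - [h phi^2][phi]
    second multiplies [h phi][phi^2] - [h phi][phi]^2
    """
    first = 8 + 72 * alpha3 - 192 * alpha4
    second = -36 * alpha3 + 96 * alpha4
    return first, second


# Determinant of the linear system in (alpha3, alpha4): it vanishes,
# so the two conditions are not independent.
det = 72 * 96 - (-192) * (-36)

# The combination first + 2*second does not depend on alpha3, alpha4.
samples = [(Fraction(p, 7), Fraction(q, 5)) for p in range(-3, 4) for q in range(-3, 4)]
invariant = {f + 2 * s for f, s in (pair_coefficients(a3, a4) for a3, a4 in samples)}

# Example: cancel the second pair (alpha4 = 3*alpha3/8); the first pair survives.
first_left_over, second_left_over = pair_coefficients(Fraction(1), Fraction(3, 8))

print(det, invariant, first_left_over, second_left_over)
```

Because the left-over combination is a pure number, no tuning of α3 and α4 removes the four-derivative terms, in line with the conclusion stated in the text.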
In the Λ3 theory the local mass of the ghost goes as
$$ m^2_{\rm ghost}(x) \sim \frac{\Lambda_3^6}{\Phi_c(x)\,\partial^2\Phi_c(x)} , \tag{23} $$
so that the ghost enters in the effective theory at a distance from the source much bigger than the
Vainshtein radius,
$$ R_{\rm ghost} \sim \frac{1}{\Lambda_3}\left(\frac{M_*}{M_P}\right)^{1/2} \gg R_V^{(3)} \sim \frac{1}{\Lambda_3}\left(\frac{M_*}{M_P}\right)^{1/3} . \tag{24} $$
These results should be compared with their analogues in the Λ5 theory, eqs. (12) and (13). Again at
the Vainshtein scale the mass of the ghost is of order of the inverse radius and we cannot proceed further
towards the source.
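As a consistency check, the two radii in eq. (24) can be recovered from eq. (23) by simple power counting. The sketch below assumes the large-distance profile $\Phi_c(r) \sim M_*/(M_P\, r)$ (equivalent to the statement, made in the unitary-gauge discussion, that $\Phi_c/M_P$ scales as $R_S/r$ outside the Vainshtein radius) and drops all numerical factors.

```latex
\begin{align*}
\Phi_c \sim \frac{M_*}{M_P}\frac{1}{r}, \quad
\partial^2\Phi_c \sim \frac{M_*}{M_P}\frac{1}{r^3}
\quad &\Rightarrow \quad
m^2_{\rm ghost}(r) \sim \Lambda_3^6\, r^4 \left(\frac{M_P}{M_*}\right)^{2}, \\
m_{\rm ghost}(R_{\rm ghost}) \sim \Lambda_3
\quad &\Rightarrow \quad
R_{\rm ghost} \sim \frac{1}{\Lambda_3}\left(\frac{M_*}{M_P}\right)^{1/2}, \\
r = R_V^{(3)} \sim \frac{1}{\Lambda_3}\left(\frac{M_*}{M_P}\right)^{1/3}
\quad &\Rightarrow \quad
m^2_{\rm ghost}\big(R_V^{(3)}\big) \sim \Lambda_3^2 \left(\frac{M_*}{M_P}\right)^{-2/3}
\sim \frac{1}{\big(R_V^{(3)}\big)^2} .
\end{align*}
```

The last line reproduces the statement that at the Vainshtein scale the ghost mass is of order of the inverse radius.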
There is a final subtlety that needs to be addressed. Is the presence of a 4-derivative kinetic term
enough to claim that there is a ghost? After all, the argument of sect. 2 strictly applies only to a Lorentz-
invariant background. It is clear that if the derivatives acting on the fluctuation ϕ are contracted with
a background tensor field different from ηµν the situation can be very different. For instance, if the
quadratic Lagrangian for ϕ involves 4 space derivatives but only 2 time derivatives there is no room
for an independent propagating extra scalar (χ, in the language of sect. 2), and thus there is no ghost,
provided that the $\dot\varphi^2$ term has the healthy sign. Indeed, in most astrophysical situations macroscopic
sources have non-relativistic velocities v ≪ 1; the background φb field they generate is therefore essentially
constant in time, its time derivatives being suppressed by positive powers of v with respect to its spatial
gradients. This, in a term like $\partial^\mu\partial^\nu\phi_b\,\partial_\mu\partial_\rho\varphi\,\partial^\rho\partial_\nu\varphi$ for instance, can suppress the magnitude of terms
with 4 time derivatives. Although this is a parametric increase of the ghost mass, for typical astrophysical
velocities $v \sim 10^{-4} - 10^{-3}$ the regime of validity of the theory is not significantly widened: inside the
Vainshtein radius the theory breaks down at $r \sim R_V^{(5)}\, v^{4/5} \sim (10^{-3} - 10^{-2})\, R_V^{(5)}$, where the ghost mass
becomes of order of the inverse radius.
⁶ Inside brackets with φ we indicate the matrix ∂µ∂νφ. For example the term $[h\phi][\phi]^2$ should be read as
$\eta^{\mu\alpha}\eta^{\nu\beta}h_{\mu\nu}\,\partial_\alpha\partial_\beta\phi\,(\Box\phi)^2$.
6 Unitary gauge description
In the previous section we argued in the Goldstone language that Fierz-Pauli massive gravity (together
with all its infinitely many higher order extensions) is unavoidably plagued by ghosts around a tinily
curved background. We now want to see how the sickness of the theory is interpreted in the unitary
gauge. In order to take into account the presence of a (slightly) curved background we need to consider
the field equations at second order in hµν,
$$ G_{\mu\nu} + m_g^2\left[a\,h_{\mu\nu} + b\,h\,\eta_{\mu\nu} + O(h^2_{\mu\nu})\right] = 0 , \tag{25} $$
where the quadratic terms come both from the mass term (that we take generic for the moment) and
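from additional higher order interactions. As is well known, the invariance of the Einstein-Hilbert
action under diffeomorphisms $g_{\mu\nu} \to g_{\mu\nu} + \nabla_\mu\epsilon_\nu + \nabla_\nu\epsilon_\mu$,
$$ \delta S_{EH} = 2\int d^4x\, \frac{\delta S_{EH}}{\delta g_{\mu\nu}}\,\nabla_\mu\epsilon_\nu = -2\int d^4x\,\sqrt{-g}\;\epsilon_\nu\,\nabla_\mu\!\left(\frac{1}{\sqrt{-g}}\frac{\delta S_{EH}}{\delta g_{\mu\nu}}\right) = 0 , \tag{26} $$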
implies the contracted Bianchi identities $\nabla^\mu G_{\mu\nu} = 0$. As a consequence, from eq. (25) we get the four
constraints
$$ \nabla^\mu\left[a\,h_{\mu\nu} + b\,h\,\eta_{\mu\nu} + O(h^2_{\mu\nu})\right] = 0 , \tag{27} $$
which reduce to six the number of propagating components of hµν. The presence of these constraints
is ensured by the gauge-invariance of the ‘kinetic’ action. This is in complete analogy to what happens
in the theory of a massive vector particle, where the gauge invariance of the kinetic term implies the
constraint $\partial_\mu A^\mu = 0$, thus reducing to three the number of propagating degrees of freedom.
If we now restrict to a linear analysis, for the particular Fierz-Pauli choice of the mass term (b = −a)
we have an additional constraint equation. In fact the linearized Einstein tensor satisfies
$$ G^{\ell\,\mu}{}_{\mu} = \partial^\mu\partial^\nu\left(h_{\mu\nu} - h\,\eta_{\mu\nu}\right) , \tag{28} $$
so that eq. (27) forces $G^\mu{}_\mu$ to vanish at linear order, and thus the trace of the Einstein equations eq. (25)
becomes a constraint for the trace mode,
$$ h = 0 . \tag{29} $$
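The massive-vector analogy and the role of the Fierz-Pauli tuning b = −a can be made explicit at linear order. The lines below are a sketch under stated assumptions: the Proca equation is written in the form $\partial_\mu F^{\mu\nu} - m^2 A^\nu = 0$ (a sign convention chosen here for illustration), the trace identity eq. (28) is used as given, and the symbol $G^{\ell}_{\mu\nu}$ for the linearized Einstein tensor is introduced only for this check.

```latex
\begin{align*}
&\text{Massive vector:} &&
\partial_\mu F^{\mu\nu} - m^2 A^\nu = 0
\;\Rightarrow\;
0 = \partial_\nu\partial_\mu F^{\mu\nu} = m^2\,\partial_\nu A^\nu . \\
&\text{Massive graviton:} &&
G^{\ell}_{\mu\nu} + m_g^2\left(a\,h_{\mu\nu} + b\,h\,\eta_{\mu\nu}\right) = 0
\;\Rightarrow\;
\partial^\mu\left(a\,h_{\mu\nu} + b\,h\,\eta_{\mu\nu}\right) = 0 . \\
&\text{Second divergence:} &&
a\,\partial^\mu\partial^\nu h_{\mu\nu} = -\,b\,\Box h
\;\Rightarrow\;
G^{\ell\,\mu}{}_{\mu} = \partial^\mu\partial^\nu h_{\mu\nu} - \Box h = -\frac{a+b}{a}\,\Box h . \\
&\text{Trace of the equations:} &&
-\frac{a+b}{a}\,\Box h + m_g^2\,(a + 4b)\,h = 0 .
\end{align*}
```

For b = −a the derivative term drops out and the trace equation reduces to $-3a\,m_g^2\,h = 0$, i.e. the constraint eq. (29). For any other choice the trace obeys a wave equation with a mass of order $m_g$, so that a sixth mode propagates already at linear order.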
In the end one is left with five propagating degrees of freedom, the correct number for a massive spin-2
particle. However, it is clear that this last constraint is fundamentally different from the previous four,
since, unlike them, it is not ensured by any symmetry, but rests on a precise tuning in the structure
of the mass term and on the identity eq. (28), valid at linear order. It is then natural to expect that it
does not survive in a curved background. The ghost we found in the Goldstone language is nothing but
this hidden sixth mode that, although constrained in flat space, starts propagating around a non-zero
background. Let us check that the energy scale at which the additional mode appears is indeed the same
in the two descriptions.
At quadratic order $G^\mu{}_\mu$ will not vanish anymore on the equations of motion, instead it will be of the
form $G^\mu{}_\mu \sim \partial^2 h^2_{\mu\nu}$. Therefore at quadratic order the constraint eq. (29) becomes a dynamical equation,
$$ O(\partial^2 h^2_{\mu\nu}) + m_g^2\left[h + O(h^2_{\mu\nu})\right] = 0 . \tag{30} $$
If we now write hµν as the sum of the background field and fluctuations around it we see that the equation
above describes the propagation of a mode with mass
$$ m^2_{\rm 6th\text{-}mode}(x) \sim \frac{m_g^2}{H_{\mu\nu}(x)} , \tag{31} $$
where Hµν is the background metric. Notice that in the limit of zero background the mass goes to infinity
and the mode decouples, as expected from the linear analysis⁷.
In order to compare this with our results in the Goldstone language we must relate the background
Hµν(x) in the unitary gauge to the Goldstone background Φc(x). This is easily done by getting rid of
the Goldstone, i.e. by performing a gauge transformation with parameter $\epsilon_\mu = \frac{1}{m_g^2 M_{Pl}}\partial_\mu\Phi_c$. Considering
for definiteness the case of a spherical source, the background in unitary gauge is therefore the sum
of the usual Schwarzschild solution of GR $H^S_{\mu\nu}(r)$ (corrected by the kinetic mixing with Φ, hence the
vDVZ discontinuity) and of the pure-gauge longitudinal contribution $\frac{1}{m_g^2 M_{Pl}}\partial_\mu\partial_\nu\Phi_c(r)$. Both $H^S_{\mu\nu}$ and
$\frac{1}{M_{Pl}}\Phi_c$ scale as $R_S/r$ outside the Vainshtein radius, so that the pure-gauge contribution coming from the
Goldstone is the dominant one in unitary gauge,
$$ H_{\mu\nu}(r) \sim \frac{1}{m_g^2 M_{Pl}}\,\partial_\mu\partial_\nu\Phi_c(r) \sim \frac{1}{(m_g r)^2}\,\frac{R_S}{r} \gg H^S_{\mu\nu}(r) . \tag{32} $$
This means that the mass of the ‘sixth mode’ eq. (31) is dominated, as expected, by the Φ background,
$$ m^2_{\rm 6th\text{-}mode}(x) \sim \frac{m_g^4 M_{Pl}}{\partial_\mu\partial_\nu\Phi_c} \sim \frac{\Lambda_5^5}{\partial^2\Phi_c} , \tag{33} $$
which is exactly the mass of the ghost eq. (12) we found in the Goldstone language!
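The chain of estimates leading from eq. (31) to eq. (33) can be spelled out step by step. This is only a power-counting sketch: numerical factors and the index structure are ignored, the profile $\Phi_c/M_{Pl} \sim R_S/r$ is the one quoted in the text, and the relation $\Lambda_5^5 = m_g^4 M_{Pl}$ is inferred here from eq. (33) rather than restated from the earlier sections.

```latex
\begin{align*}
\frac{\Phi_c}{M_{Pl}} \sim \frac{R_S}{r}
\quad &\Rightarrow \quad
\frac{1}{m_g^2 M_{Pl}}\,\partial_\mu\partial_\nu\Phi_c \sim \frac{1}{m_g^2}\,\frac{R_S}{r^3}
= \frac{1}{(m_g r)^2}\,\frac{R_S}{r} , \\
\frac{H_{\mu\nu}}{H^S_{\mu\nu}} &\sim \frac{1}{(m_g r)^2} \gg 1
\qquad \text{for } r \ll m_g^{-1} , \\
m^2_{\rm 6th\text{-}mode} \sim \frac{m_g^2}{H_{\mu\nu}}
&\sim \frac{m_g^4 M_{Pl}}{\partial^2\Phi_c}
= \frac{\Lambda_5^5}{\partial^2\Phi_c} .
\end{align*}
```

The second line makes explicit that the Goldstone contribution dominates at all distances shorter than the graviton Compton wavelength.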
7 The sixth mode in the ADM formalism
As we did in the Goldstone language we now show directly in unitary gauge that it is not possible to
forbid the propagation of the sixth mode by adding properly tuned higher order terms to the action.
We do this in the Hamiltonian formalism, where the counting of degrees of freedom is explicit. Let us
introduce the ADM variables {N, Nj, ĝij} [19],
$$ N \equiv 1/\sqrt{-g^{00}} , \qquad N_j \equiv g_{0j} , \tag{34} $$
and ĝij is the 3D metric induced on spatial hypersurfaces of constant t. It is well known that in GR N
and Nj are not dynamical fields: the Einstein-Hilbert Lagrangian does not contain their time derivatives,
so that their conjugate momenta vanish identically. Moreover in the Hamiltonian they appear linearly,
as Lagrange multipliers. Therefore their equations of motion are really constraint equations for the other
degrees of freedom ĝij and their conjugate momenta $\pi^{ij}$, rather than equations for N and Nj. These are
the so-called momentum and Hamiltonian constraints. The Hamiltonian system is thus reduced to two
independent (q,p) pairs, which describe the two graviton modes.
If we now perturb GR by adding a mass term for hµν, or generic interactions involving only hµν and
not its derivatives, the Lagrangian still does not contain time derivatives of N and Nj. However in general
N and Nj now do not appear linearly in the Hamiltonian. In this case their equations of motion
determine their values rather than constraining other degrees of freedom. This raises to six the number
of d.o.f. of massive gravity. The Fierz-Pauli tuning of the mass term precisely corresponds to setting to
zero the $N^2$ term in the action, so that (at quadratic order) the variation with respect to N still gives a
constraint equation, eliminating the unwanted ‘sixth mode’. We want to see if similar tunings can work
at all orders.
⁷ A similar result has been obtained in ref. [18], where it has been interpreted as an instability of arbitrarily
short time-scale of Minkowski space. This conclusion is not justified from our effective theory point of view.
Expressed in terms of the ADM variables the metric fluctuation $h_{\mu\nu} = g_{\mu\nu} - \eta_{\mu\nu}$ takes the form
$$ h_{\mu\nu} = \begin{pmatrix} 1 - N^2 + \hat g^{kl}N_kN_l & N_j \\ N_i & h_{ij} \end{pmatrix} , \tag{35} $$
where $h_{ij} \equiv \hat g_{ij} - \delta_{ij}$, and $\hat g^{ij}$ is the inverse of $\hat g_{ij}$. We are going to work perturbatively in δN ≡ N − 1,
Nj and hij, so from now on spatial indices are contracted with the Euclidean 3D metric δij. Notice that,
given the non-linear relation between hµν and the ADM variables, a generic n-th order expression in hµν
also contributes to orders higher than n when expressed in ADM variables.
Quadratic Terms. At quadratic order in hµν the most general Lagrangian is⁸
$$ L_2 = a_2[h^2] + b_2[h]^2 . \tag{36} $$
By plugging eq. (35) in this expression we find the term proportional to $\delta N^2$,
$$ L_2 \supset 4(a_2 + b_2)\,\delta N^2 , \tag{37} $$
hence the Fierz-Pauli tuning $b_2 = -a_2$ to set it to zero. The coefficient $a_2$ fixes the mass of the graviton,
and for our purposes we can take it to be 1. At quadratic level (in ADM variables) the problem is solved,
but $L_2$ also contributes to third and fourth order terms; in particular it contains a term $-2h_{ii}\,\delta N^2$, which
upon variation with respect to δN gives rise to an equation for δN itself rather than to a constraint. We
are therefore forced to introduce cubic terms in hµν.
Cubic Terms. The cubic Lagrangian is
$$ L_3 = a_3[h^3] + b_3[h][h^2] + c_3[h]^3 . \tag{38} $$
Cubic terms in ADM variables come both from this and from $L_2$. In particular, those involving more
than one δN are
$$ L_2 + L_3 \supset (12c_3 + 4b_3 - 2)\,h_{ii}\,\delta N^2 + 8(a_3 + b_3 + c_3)\,\delta N^3 . \tag{39} $$
We can set both of them to zero by choosing
$$ a_3 = 2c_3 - \tfrac{1}{2} , \qquad b_3 = \tfrac{1}{2} - 3c_3 . \tag{40} $$
This agrees with what we found in the Goldstone language, and again the coefficient $c_3$ is still undetermined.
Now we are forced to introduce quartic terms in hµν to cancel undesired quartic terms containing
$\delta N^n$ coming both from $L_2$ and $L_3$.
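The tunings of eqs. (37), (39) and (40) can be checked with exact polynomial arithmetic. The sketch below is an illustration under simplifying assumptions that are not the general case treated in the text: the shift Nj is set to zero and the spatial fluctuation has a single non-zero entry h11 = s, so that $-h_{00} = N^2 - 1 = 2\delta N + \delta N^2$, $[h] = s - h_{00}$, $[h^2] = h_{00}^2 + s^2$ and $[h^3] = -h_{00}^3 + s^3$. All helper names (`add`, `mul`, `scale`, `power`, `dangerous_terms`) are hypothetical.

```python
from fractions import Fraction

# Polynomials in (dN, s) are stored as {(power_of_dN, power_of_s): coefficient}.


def add(*polys):
    out = {}
    for p in polys:
        for key, coeff in p.items():
            out[key] = out.get(key, 0) + coeff
    return out


def mul(p, q):
    out = {}
    for (i1, j1), c1 in p.items():
        for (i2, j2), c2 in q.items():
            key = (i1 + i2, j1 + j2)
            out[key] = out.get(key, 0) + c1 * c2
    return out


def scale(c, p):
    return {key: c * coeff for key, coeff in p.items()}


def power(p, n):
    out = {(0, 0): Fraction(1)}
    for _ in range(n):
        out = mul(out, p)
    return out


def dangerous_terms(c3, include_cubic=True):
    """Coefficients of dN^2, s*dN^2 and dN^3 in L2 (+ L3) with a2 = 1, b2 = -1."""
    a2, b2 = Fraction(1), Fraction(-1)                 # Fierz-Pauli tuning, eq. (37)
    a3 = 2 * c3 - Fraction(1, 2)                       # eq. (40)
    b3 = Fraction(1, 2) - 3 * c3                       # eq. (40)
    x = {(1, 0): Fraction(2), (2, 0): Fraction(1)}     # -h_00 = 2 dN + dN^2 (N_j = 0)
    s = {(0, 1): Fraction(1)}                          # single spatial entry h_11 = s
    tr1 = add(x, s)                                    # [h]
    tr2 = add(power(x, 2), power(s, 2))                # [h^2]
    tr3 = add(power(x, 3), power(s, 3))                # [h^3]
    total = add(scale(a2, tr2), scale(b2, power(tr1, 2)))
    if include_cubic:
        total = add(total,
                    scale(a3, tr3),
                    scale(b3, mul(tr1, tr2)),
                    scale(c3, power(tr1, 3)))
    return total.get((2, 0), 0), total.get((2, 1), 0), total.get((3, 0), 0)


for c3 in (Fraction(0), Fraction(1, 3), Fraction(-5, 2)):
    print(c3, dangerous_terms(c3))
print("L2 alone:", dangerous_terms(Fraction(0), include_cubic=False))
```

With L2 alone the coefficient of $s\,\delta N^2$ is −2, which is the $-2h_{ii}\,\delta N^2$ term mentioned in the text; once L3 is added with the choice of eq. (40), all three coefficients vanish for every value of c3 that was tried.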
Quartic Terms. The quartic Lagrangian is
$$ L_4 = a_4[h^4] + b_4[h][h^3] + c_4[h^2]^2 + d_4[h]^2[h^2] + e_4[h]^4 . \tag{41} $$
Again, we are only interested in terms involving powers of δN larger than 1. For symmetry reasons these
must be of the form
$$ L_2 + L_3 + L_4 \supset \left(A\,h_{ii}^2 + B\,h_{ij}h_{ij} + C\,N_jN_j\right)\delta N^2 + D\,h_{ii}\,\delta N^3 + E\,\delta N^4 . \tag{42} $$
⁸ In this section all contractions are done with the flat metric ηµν and there is no $\sqrt{-g}$ in the action. Different
conventions are equivalent in unitary gauge: they just reshuffle the coefficients in the expansion in hµν.
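After a straightforward but lengthy computation we find the relationship between the coefficients (A,...,E)
and (c3,a4,...,e4),
$$ \begin{pmatrix} A \\ B \\ C \\ D \\ E \end{pmatrix} =
\begin{pmatrix}
3 & 0 & 0 & 0 & 4 & 24 \\
-3 & 0 & 0 & 8 & 4 & 0 \\
0 & -16 & -12 & -16 & -8 & 0 \\
0 & 0 & 8 & 0 & 16 & 32 \\
0 & 16 & 16 & 16 & 16 & 16
\end{pmatrix} \cdot
\begin{pmatrix} c_3 \\ a_4 \\ b_4 \\ c_4 \\ d_4 \\ e_4 \end{pmatrix} +
\begin{pmatrix} 0 \\ 1/2 \\ 1/2 \\ 2 \\ 0 \end{pmatrix} . \tag{43} $$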
We would like to set the vector (A,...,E) to zero. Since we have 6 free coefficients (c3,a4,...,e4)
to choose and only 5 conditions to satisfy, one naively expects this to be possible and one of the free
coefficients to remain undetermined. On the contrary, it is impossible. This is because the matrix above
has rank 4 and the space spanned by it does not contain the inhomogeneous term. There is no way of
choosing (c3,a4,...,e4) to make the unwanted expression eq. (42) vanish.
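The rank statement can be verified with exact Gaussian elimination. The sketch below is not the authors' computation: it takes the entries of eq. (43) as read here (rows A to E, columns c3, a4, b4, c4, d4, e4), and the helper `rank` is a hypothetical name. Setting (A,...,E) to zero is possible only if the coefficient matrix and the matrix augmented with minus the inhomogeneous term have the same rank.

```python
from fractions import Fraction

# Rows: A, B, C, D, E. Columns: c3, a4, b4, c4, d4, e4 (entries as in eq. (43)).
M = [
    [3, 0, 0, 0, 4, 24],
    [-3, 0, 0, 8, 4, 0],
    [0, -16, -12, -16, -8, 0],
    [0, 0, 8, 0, 16, 32],
    [0, 16, 16, 16, 16, 16],
]
V = [0, Fraction(1, 2), Fraction(1, 2), 2, 0]   # inhomogeneous term of eq. (43)


def rank(rows):
    """Rank of a matrix via Gauss-Jordan elimination over the rationals."""
    rows = [[Fraction(x) for x in r] for r in rows]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


rank_M = rank(M)
rank_aug = rank([row + [-v] for row, v in zip(M, V)])   # system M.c = -V

# Left null vector of M: the combination C - D/2 + E has no free coefficient.
null = [0, 0, 1, Fraction(-1, 2), 1]
null_times_M = [sum(n * row[j] for n, row in zip(null, M)) for j in range(6)]
null_times_V = sum(n * v for n, v in zip(null, V))

print(rank_M, rank_aug, null_times_M, null_times_V)
```

The ranks differ, so the linear system has no solution. Equivalently, the combination C − D/2 + E does not depend on any of the six coefficients and is fixed to a non-zero number by the inhomogeneous term alone.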
In summary, we tried to tune all interactions $h^n_{\mu\nu}$ in order to keep the Hamiltonian linear in N (or
equivalently in δN), so as to ensure the presence of a constraint equation that eliminates the troublesome
sixth degree of freedom. We found that when fourth order terms are taken into account this tuning is
impossible. This agrees with our result in the Goldstone language, namely that it is impossible to tune
fourth order interactions to avoid the propagation of a ghost.
Notice however that the ADM analysis we sketched in this section is useful to explicitly count the
number of degrees of freedom but, unlike the Goldstone approach, gives us no clue on the typical mass of
these modes. From the effective field theory point of view this additional information is crucial: a massive
mode with a mass above the cutoff can be consistently discarded, even if it is a ghost or a tachyon. Also
we have not checked in this language that the sixth mode is a ghost. To do so one should compute the
Hamiltonian and see that it is not positive definite. This approach is rather cumbersome and it has been
carried out in ref. [2] only for a limited set of non-linear terms, namely functions of the Fierz-Pauli mass
term: $f(h^2_{\mu\nu} - h^2)$.
8 Concluding remarks
In this paper we have shown that in massive gravity the classical solutions around a source are plagued
by ghost instabilities. This holds for any choice of the non-linear terms one can add to the Fierz-Pauli
action. It is known that massive gravity is an effective field theory whose UV cut-off can be pushed at
most up to $\Lambda_3 = (m_g^2 M_P)^{1/3}$. In this optimal case the ghost instability enters in the effective field theory
at a distance from the source $R_{\rm ghost} \sim 1/\Lambda_3 \cdot (M_*/M_P)^{1/2}$. This distance is huge, much bigger than the
Vainshtein radius $R_V \sim 1/\Lambda_3 \cdot (M_*/M_P)^{1/3}$. For instance taking $m_g \sim H_0$, for an astrophysical source
like the Sun $R_{\rm ghost} \sim 10^{22}$ km, of order of the cosmological horizon! One could optimistically postulate
that new physics enters at energies of order of the (local) ghost mass and cures the instability; even under
this hypothesis, when the mass of the ghost becomes of order of the inverse distance from the source
there is no way of proceeding further without specifying the UV completion of the theory. This happens
at the Vainshtein radius (of order $10^{16}$ km for the Sun); as the vDVZ discontinuity can be cured only
inside the Vainshtein radius, General Relativity is never a good approximation.
One could argue that the very low cut-off of massive gravity is already bad enough to disregard the
theory. For instance, if one assumes that the theory has a generic series of higher dimension operators
suppressed by the cut-off, these will become important when evaluated on a classical solution at huge
length scales, parametrically bigger than the inverse cut-off [9]. This implies that it is impossible to
calculate the classical solution around a source without specifying the UV completion. However this
problem depends on the high energy completion of the theory, and its precise formulation is subtler than
what one can argue at first sight. For example in the DGP model, which in this respect is very similar
to massive gravity, one can make consistent assumptions on the higher dimension terms which make
predictions independent of the UV completion [13]. Of course a prerequisite for this to work is that
the classical theory is free from pathologies and instabilities. As we have shown this is not the case in
massive gravity: ghost instabilities are unavoidable. The theory is inconsistent already at the classical
level, before taking into account quantum effects.
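The orders of magnitude quoted above for the Sun can be reproduced from the scaling formulas for the Vainshtein radius and for the ghost radius. The sketch below is a rough estimate only: the numerical inputs (a reduced Planck mass of 2.435e18 GeV, a Hubble rate of about 1.5e-33 eV, the solar mass and the conversion factors) are assumptions chosen for illustration, the function name `radii_km` is hypothetical, and all order-one factors in the formulas are dropped.

```python
# Assumed inputs (not taken from the paper): natural units with energies in eV.
HBAR_C_EV_M = 1.973269804e-7          # hbar * c in eV * m
M_PLANCK_EV = 2.435e27                # reduced Planck mass in eV (assumed convention)
H0_EV = 1.5e-33                       # Hubble rate today in eV, roughly 70 km/s/Mpc
M_SUN_EV = 1.989e30 * 5.609589e35     # solar mass: kg times (eV per kg)


def radii_km(m_g_ev, m_source_ev):
    """Return (R_V, R_ghost) in km from R ~ (1/Lambda_3) * (M_*/M_P)^(1/3 or 1/2)."""
    lambda3_ev = (m_g_ev ** 2 * M_PLANCK_EV) ** (1.0 / 3.0)
    inverse_lambda3_km = HBAR_C_EV_M / lambda3_ev / 1.0e3
    mass_ratio = m_source_ev / M_PLANCK_EV
    r_vainshtein = inverse_lambda3_km * mass_ratio ** (1.0 / 3.0)
    r_ghost = inverse_lambda3_km * mass_ratio ** 0.5
    return r_vainshtein, r_ghost


r_v, r_ghost = radii_km(H0_EV, M_SUN_EV)
horizon_km = HBAR_C_EV_M / H0_EV / 1.0e3   # Hubble length c/H0 in km

print(f"R_V ~ {r_v:.1e} km, R_ghost ~ {r_ghost:.1e} km, c/H0 ~ {horizon_km:.1e} km")
```

With these inputs both radii come out within an order of magnitude of the figures quoted in the text, and the ghost radius is a sizeable fraction of the Hubble length; a different convention for the Planck mass shifts the numbers by factors of a few.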
Acknowledgments
We warmly thank N. Arkani-Hamed, S. Dubovsky, M. Luty, F. Piazza, L. Pilo, R. Rattazzi, M. D. Schwartz,
R. Sundrum, T. Wiseman, and A. Zaffaroni for useful discussions and comments. We also would like to
thank the CERN Theoretical Physics Division for hospitality during this project.
References
[1] M. Fierz and W. Pauli, “On Relativistic Wave Equations For Particles Of Arbitrary Spin In An Electromagnetic
Field,” Proc. Roy. Soc. Lond. A 173 (1939) 211.
[2] D. G. Boulware and S. Deser, “Can Gravitation Have A Finite Range?,” Phys. Rev. D 6, 3368 (1972).
[3] H. van Dam and M. J. G. Veltman, “Massive And Massless Yang-Mills And Gravitational Fields,” Nucl. Phys.
B 22 (1970) 397.
[4] V. I. Zakharov, “Linearized gravitation theory and the graviton mass,” Sov. Phys. JETP Lett. 12 (1970) 312.
[5] A. I. Vainshtein, “To The Problem Of Nonvanishing Gravitation Mass,” Phys. Lett. B 39 (1972) 393.
[6] C. Deffayet, G. R. Dvali, G. Gabadadze and A. I. Vainshtein, “Nonperturbative continuity in graviton mass
versus perturbative discontinuity,” Phys. Rev. D 65, 044026 (2002) [hep-th/0106001].
[7] A. Gruzinov, “On the graviton mass,” astro-ph/0112246.
[8] M. Porrati, “Fully covariant van Dam-Veltman-Zakharov discontinuity, and absence thereof,” Phys. Lett. B
534, 209 (2002) [hep-th/0203014].
[9] N. Arkani-Hamed, H. Georgi and M. D. Schwartz, “Effective field theory for massive gravitons and gravity in
theory space,” Annals Phys. 305 (2003) 96 [hep-th/0210184].
[10] M. D. Schwartz, “Constructing gravitational dimensions,” Phys. Rev. D 68, 024029 (2003) [hep-th/0303114].
[11] G. R. Dvali, G. Gabadadze and M. Porrati, “4D gravity on a brane in 5D Minkowski space,” Phys. Lett. B
485 (2000) 208 [hep-th/0005016].
[12] M. A. Luty, M. Porrati and R. Rattazzi, “Strong interactions and stability in the DGP model,” JHEP 0309
(2003) 029 [hep-th/0303116].
[13] A. Nicolis and R. Rattazzi, “Classical and quantum consistency of the DGP model,” JHEP 0406 (2004) 059
[hep-th/0404159].
[14] N. Arkani-Hamed, H. C. Cheng, M. A. Luty and S. Mukohyama, “Ghost condensation and a consistent
infrared modification of gravity,” JHEP 0405, 074 (2004) [hep-th/0312099].
[15] V. A. Rubakov, “Lorentz-violating graviton masses: Getting around ghosts, low strong coupling scale and
VDVZ discontinuity,” hep-th/0407104.
[16] S. L. Dubovsky, “Phases of massive gravity,” JHEP 0410, 076 (2004) [hep-th/0409124].
[17] R. Rattazzi, “A new dimension at ultra large scales and its price,” talk at SUSY2K, unpublished,
http://wwwth.cern.ch/susy2k/susy2kfinalprog.html.
[18] G. Gabadadze and A. Gruzinov, “Graviton mass or cosmological constant?,” hep-th/0312074.
[19] R. Arnowitt, S. Deser and C. W. Misner, “Canonical Variables for General Relativity,” Phys. Rev. 117, 1595
(1960); R. Arnowitt, S. Deser and C. W. Misner, “The Dynamics Of General Relativity,” gr-qc/0405109.