
arXiv:hep-th/0505147v1 16 May 2005

HUTP-05/A0020

HD-THEP-05-09

UCB-PTH-05/14

LBNL-57558

Ghosts in Massive Gravity

Paolo Creminelli^a, Alberto Nicolis^a, Michele Papucci^b, and Enrico Trincherini^c

^a Jefferson Physical Laboratory, Harvard University, Cambridge, MA 02138, USA

^b Department of Physics, University of California, Berkeley and Theoretical Physics Group, Ernest Orlando Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA

^c Institute for Theoretical Physics, Heidelberg University, D-69120 Heidelberg, Germany

Abstract

In the context of Lorentz-invariant massive gravity we show that classical solutions around heavy sources

are plagued by ghost instabilities. The ghost shows up in the effective field theory at huge distances

from the source, much bigger than the Vainshtein radius. Its presence is independent of the choice of the

non-linear terms added to the Fierz-Pauli Lagrangian. At the Vainshtein radius the mass of the ghost

is of order of the inverse radius, so that the theory cannot be trusted inside this region, not even at the

classical level.

1 Introduction

In recent years there has been renewed interest in the possibility of giving a mass to the graviton. This

idea belongs to a broader class of proposals for modifying gravity at large distances. Besides their theoretical interest, these models could be phenomenologically relevant as possible alternatives to dark matter and dark energy. In this paper we reconsider the issue of the range of validity of massive gravity; in particular we concentrate on the stability of classical solutions around massive sources.

The problem we want to address has a long history. Already in the first paper [1] Fierz and Pauli observed that the mass term must be of the form $m_g^2 (h^2 - h_{\mu\nu}h^{\mu\nu})$, otherwise a ghost appears in the spectrum, besides the 5 degrees of freedom of the massive spin-2 graviton. A different structure would result in an instability at an energy scale $\sim m_g$. Unfortunately Boulware and Deser showed that this additional degree of freedom propagates when nonlinearities in the action are taken into account [2]. However, from an effective field theory point of view this is not necessarily a problem, until one specifies the scale at which the ghost shows up, i.e. its mass. If this scale is above the UV cutoff Λ of the effective theory this instability can be consistently disregarded.

On the other hand, non-linearities of the classical theory are also the solution to the problem raised by

van Dam, Veltman, and Zakharov (vDVZ) [3, 4]: in the linearized theory predictions are not continuous

in the limit $m_g \to 0$, because the helicity-0 component of the graviton does not decouple from matter. However, Vainshtein [5] observed that the vDVZ discontinuity might not be relevant for macroscopic sources because the linearized approximation around a source of mass $M_*$ breaks down at a distance $R_V = (M_* M_P^{-2} m_g^{-4})^{1/5}$, which diverges for $m_g \to 0$. Classical nonlinearities become important, and the full non-linear solution could be in perfect agreement with experiments. This is still an open issue in massive gravity, but the Vainshtein effect has been shown to work in a closely related model, the DGP model [6, 7, 8].

More recently massive gravity has been reconsidered in the effective field theory language, which provides a systematic framework for dealing with quantum effects [9]. For this purpose it is useful to restore the broken diffeomorphism invariance by introducing a set of Goldstone bosons. With this method it is easy to see that the scalar longitudinal component of the graviton becomes strongly coupled at a very low energy scale, much lower than naively expected by analogy with the spin-1 case. In the Fierz-Pauli theory the strong interaction scale is $\Lambda_5 \sim (m_g^4 M_P)^{1/5}$, which for $m_g$ of order of the present Hubble parameter is $(10^{11}\,\mathrm{km})^{-1}$. By adding to the Fierz-Pauli Lagrangian a set of properly tuned interactions of the form $h^n_{\mu\nu}$ the cutoff can be raised up to $\Lambda_3 \sim (m_g^2 M_P)^{1/3} \sim (1000\,\mathrm{km})^{-1}$ [9, 10].

In both cases the theory seems to lose predictivity at very large distances. For instance one can wonder how this strong coupling affects the gravitational potential generated by an astrophysical source. Apparently the potential is incalculable at distances smaller than 1000 km; but in principle this need not be the case. After all the strong coupling takes place in the Goldstone sector, and inside the Vainshtein radius, if the Vainshtein effect applies, one expects the Goldstone to give a negligible correction to the Newtonian potential. Whatever quantum effects take place at the cutoff distance, they could be sufficiently screened from experiments. Nevertheless, without further assumptions, from an effective theory point of view one should include in the Lagrangian all the possible operators allowed by the symmetries and weighted by the cutoff. In this case the effective theory loses predictivity at a much larger length scale: these higher dimension operators all become important at a huge distance from the source when evaluated on the classical solution. In the improved theory with cutoff $\Lambda_3$ this happens at the corresponding Vainshtein radius $R_V \sim (M_* M_P^{-2} m_g^{-2})^{1/3}$. For the sun $R_V$ is $\sim 10^{16}$ km. This means that we are unable to compute the gravitational potential at distances shorter than $R_V$ without a UV completion. As a consequence there is no range of distances where nonlinear effects can be reliably computed in the effective field theory. If we restrict to the original theory with cutoff $\Lambda_5$ the situation is even worse. In this case the infinite tower of higher dimension operators becomes important at a distance which is parametrically larger than the corresponding Vainshtein radius [9].

The picture looks very similar to the DGP model [11], where the same problems have been pointed out [12]. In the DGP model our world is the 4D boundary of an infinite 5D spacetime. Gravity is described by a standard 5D Einstein-Hilbert action with Planck mass $M_5$ in the bulk and by an additional 4D Einstein-Hilbert action localized on the boundary, with a much larger Planck mass $M_4$. The resulting Newton’s law is 4-dimensional below the critical length scale $L_{DGP} = M_4^2/M_5^3$ and 5-dimensional at larger distances. From the 4D viewpoint there is a scalar degree of freedom, the brane bending mode π, whose dynamics is closely related to that of the longitudinal Goldstone boson φ of massive gravity. In particular strong interactions show up in the π sector at a tiny energy scale $\Lambda_{DGP} \sim (M_P/L_{DGP}^2)^{1/3} \sim (1000\,\mathrm{km})^{-1}$ (taking $L_{DGP}$ of order of the present Hubble horizon $H_0^{-1}$). Also if one includes in the Lagrangian all possible operators allowed by the symmetries and suppressed by $\Lambda_{DGP}$, around a heavy source they all become important at the Vainshtein radius $R_V \sim (M_* M_P^{-2} L_{DGP}^2)^{1/3}$ when evaluated on the classical solution.
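As a rough numerical cross-check of the massive-gravity scales quoted in this Introduction, one can put in numbers. The inputs below are assumptions chosen for illustration, not values stated in the text: $m_g \sim H_0 \approx 10^{-33}$ eV, a reduced Planck mass $M_P \approx 2.4\times10^{27}$ eV, a solar mass $M_\odot \approx 1.1\times10^{66}$ eV, and $\hbar c \approx 1.97\times10^{-7}$ eV m to convert energies into lengths. Only orders of magnitude are meaningful.

```latex
\begin{align*}
\Lambda_3 &\sim (m_g^2 M_P)^{1/3} \approx (2.4\times10^{-39}\,\mathrm{eV}^3)^{1/3}
   \approx 1.3\times10^{-13}\,\mathrm{eV}
   &&\Rightarrow\quad \Lambda_3^{-1} \approx 1.5\times10^{3}\,\mathrm{km}, \\
\Lambda_5 &\sim (m_g^4 M_P)^{1/5} \approx (2.4\times10^{-105}\,\mathrm{eV}^5)^{1/5}
   \approx 1.2\times10^{-21}\,\mathrm{eV}
   &&\Rightarrow\quad \Lambda_5^{-1} \approx 1.7\times10^{11}\,\mathrm{km}, \\
R_V^{(3)} &\sim \frac{1}{\Lambda_3}\left(\frac{M_\odot}{M_P}\right)^{1/3}
   \approx 1.5\times10^{3}\,\mathrm{km}\times(4.6\times10^{38})^{1/3}
   &&\approx 1\times10^{16}\,\mathrm{km}.
\end{align*}
```

With these inputs the three estimates land on the quoted $(1000\,\mathrm{km})^{-1}$, $(10^{11}\,\mathrm{km})^{-1}$ and $10^{16}$ km; a different convention for $M_P$ shifts them only by factors of order one.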

However all these difficulties depend on assumptions about the UV completion. In the DGP model

one can consistently assume a UV completion such that the effective theory is predictive down to distances

significantly shorter than $1/\Lambda_{DGP}$. For instance on the surface of the earth the cutoff can be pushed up to $\sim \mathrm{cm}^{-1}$, not far from the smallest length scale ∼ 100 µm at which gravity has been experimentally

tested. Of course a necessary requirement for this to be possible is the consistency of the classical theory:

in particular in DGP no classical instability develops in the π sector for all relevant astrophysical sources

and for a large class of cosmological solutions [13]. In this paper we want to study whether the same

stability properties hold for massive gravity. This is a basic consistency requirement one has to satisfy

before analyzing the theory at the quantum level and looking for a mechanism, analogous to that working

in DGP, that can make the theory predictive in a phenomenologically interesting range of scales.

The most convenient way to study the dynamics of the theory is to use the Goldstone formalism, that

we review in section 3; for our purposes the main interesting features of the model are encoded in the

Lagrangian of the longitudinal component φ of the Goldstone vector. If we start with a Fierz-Pauli mass

term, the dominant interactions for this scalar degree of freedom are cubic self-couplings with 6 derivatives

of the form $(\partial^2\phi)^3$. In the presence of a macroscopic source the Goldstone gets a non-trivial configuration

Φ(x); in order to study the stability of such a solution it is necessary to expand the action at quadratic

order in the fluctuations around Φ(x). It is evident that in general, because of the cubic self-coupling, the

fluctuations will get a higher-derivative kinetic term. As we discuss in section 2, this signals the presence

of a ghost-like instability already at the classical level. In DGP this does not happen: although the π

cubic self-coupling has 4 derivatives, its tensorial structure is such that fluctuations around a background

get only a 2-derivative kinetic term [12, 13]. Unfortunately, this does not work for the Fierz-Pauli theory.

Still, we have a large freedom in choosing the non-linear extension of FP, and one can wonder if it is

possible to cancel all higher-derivative terms and end up with a ghost-free theory. Sections 4 and 5

contain the answer: despite the freedom we have, ghost-like instabilities are unavoidable.

In the Goldstone language it is also easy to compute the scale at which this instability appears. Even in

the most favorable setup in which the cutoff is Λ3, the ghost enters in the effective field theory at distances

from the source parametrically larger than the (already huge) Vainshtein radius $R_V$. Furthermore, when, approaching the source, we reach $r = R_V$, the mass of the ghost has dropped to $1/R_V$! This means that the theory can in no way be extrapolated inside the Vainshtein radius.

The last part of the paper is devoted to discussing how the sickness of the theory is interpreted in the

unitary gauge. Clearly we expect the ghost we found to be the troublesome sixth degree of freedom.

With the Fierz-Pauli mass term, at quadratic level this degree of freedom does not appear because the

trace of the Einstein equations gives a constraint instead of a propagating equation. In section 6 we show

that this equation becomes dynamical in the presence of a curved background; we qualitatively estimate

in this simple case the mass of this new excitation and the result agrees with the mass we find for the

ghost in the Goldstone computation. Then, in section 7, we show in the Hamiltonian formalism that

there exists no non-linear extension of the Fierz-Pauli theory that can forbid the propagation of the sixth

mode in the presence of a slightly curved background. The analysis in the unitary gauge is powerful for

counting the number of degrees of freedom, but, unlike the Goldstone analysis, it says nothing about the

typical scales of these modes. Also in order to address stability issues one should study the positivity

of the Hamiltonian. This in general is difficult, and the analysis has been carried out by Boulware and

Deser for non-linear extensions of the form $f(h_{\mu\nu}h^{\mu\nu} - h^2)$ [2]. On the contrary the Goldstone analysis

concentrates from the very beginning on the strongly interacting degree of freedom: all the interesting and

troublesome features of the theory are encoded in the dynamics of a single scalar field. This enormously

simplifies the analysis.

Recently it has been realized that massive gravity models with Lorentz violating mass terms can be


significantly ‘healthier’ than the traditional Lorentz-invariant theory; in particular they can avoid the

vDVZ discontinuity and the strong coupling problem, and they can be free of ghosts [14, 15, 16]. In this

paper we stick to the Lorentz-invariant massive gravity theory.

2 Ghosts from higher derivative kinetic terms

Let us first be very specific about why higher derivative kinetic terms give rise to ghost-like instabilities.

Take for instance a massless scalar field φ with Lagrangian density (note that we are using the (−,+,+,+)

signature!)

$$ \mathcal{L} = -\frac{1}{2}(\partial\phi)^2 + \frac{a}{2\Lambda^2}(\Box\phi)^2 - V_{\rm int}(\phi) , \tag{1} $$

where Λ is some energy scale, a = ±1, and $V_{\rm int}$ is a self-interaction term. We show that, independently of the sign of the second term, the system is plagued by ghosts. To do so we want to reduce to a purely two-derivative kinetic Lagrangian, from which we know how to read the stability properties of the system. We therefore introduce an auxiliary scalar field χ and a new Lagrangian

$$ \mathcal{L}' = -\frac{1}{2}(\partial\phi)^2 - a\,\partial_\mu\chi\,\partial^\mu\phi - \frac{1}{2}a\Lambda^2\chi^2 - V_{\rm int}(\phi) , \tag{2} $$

which reduces exactly to $\mathcal{L}$ once χ is integrated out. $\mathcal{L}'$ is diagonalized by the substitution φ = φ′ − aχ. We get

$$ \mathcal{L}' = -\frac{1}{2}(\partial\phi')^2 + \frac{1}{2}(\partial\chi)^2 - \frac{1}{2}a\Lambda^2\chi^2 - V_{\rm int}(\phi',\chi) , \tag{3} $$

which clearly signals the presence of a ghost: χ has a wrong-sign kinetic term. Notice in passing that χ can also be a tachyon, for a = −1: in this case χ has exponentially growing modes. But let us neglect this possibility and concentrate on the ghost instability, which is unavoidable. A ghost, unlike a tachyon, is not unstable by itself: its equation of motion is perfectly healthy at the linear level, and does not admit any exponentially growing solution. The problem is that its Hamiltonian is negative, so that when couplings to ordinary ‘healthy’ matter are taken into account (the potential term in our example above) the system is unstable: with zero net energy one can indiscriminately excite both sectors, and this exchange of energy happens spontaneously already at classical level. In a quantum system with ghosts in the physical spectrum this translates into an instability of the vacuum. The decay rate is UV divergent due to an infinite degeneracy of the final state phase space. It is not clear how to cut off this divergence in a Lorentz invariant way [17].

However the situation is not as bad as it seems: our ghost χ in eq. (3) has a (normal or tachyonic) mass Λ, so that it will show up only at energies above Λ, i.e. when the four derivative kinetic term in eq. (1) starts dominating over the usual two derivative one. We can consistently use our scalar field theory eq. (1) at energies below Λ, and postulate that some new degree of freedom enters at Λ and takes care of the ghost instability. For example, we can add a term $-(\partial\chi)^2$ to eq. (2) (for simplicity we stick to the non-tachyonic case a = +1 and set $V_{\rm int} = 0$),

$$ \mathcal{L}_{UV} = -\frac{1}{2}(\partial\phi)^2 - \partial_\mu\chi\,\partial^\mu\phi - (\partial\chi)^2 - \frac{1}{2}\Lambda^2\chi^2 . \tag{4} $$

This drastically changes the high-energy picture, since the resulting Lagrangian obtained by demixing

now describes two perfectly healthy scalars, one massless and the other with mass Λ. At the same time,

at energies below Λ the heavy field χ can be integrated out from LUV, thus giving the starting Lagrangian

eq. (1) up to terms suppressed by additional powers of (∂/Λ)2. This example shows that in principle the

ghost instability can be cured by proper new physics at the scale Λ. In other words, eq. (1) makes perfect

sense as an effective field theory with UV cutoff Λ.
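The two manipulations used in this section, integrating out χ and demixing the kinetic terms, can be spelled out explicitly. The following is a sketch that drops total derivatives, uses $a^2 = 1$, and sets $V_{\rm int} = 0$ throughout; it assumes nothing beyond eqs. (1)-(4).

```latex
% (i) Integrating out chi from eq. (2).
% After integration by parts, -a d_mu(chi) d^mu(phi) = +a chi Box(phi).
\begin{align*}
\frac{\delta\mathcal{L}'}{\delta\chi} = a\,\Box\phi - a\Lambda^2\chi = 0
   \quad&\Rightarrow\quad \chi = \frac{\Box\phi}{\Lambda^2}, \\
\Big[a\,\chi\,\Box\phi - \tfrac12 a\Lambda^2\chi^2\Big]_{\chi=\Box\phi/\Lambda^2}
   &= \frac{a}{2\Lambda^2}(\Box\phi)^2 ,
   \qquad\text{which reproduces eq. (1).}
\end{align*}

% (ii) Demixing eq. (2) with phi = phi' - a chi.
\begin{align*}
-\tfrac12(\partial\phi' - a\,\partial\chi)^2 - a\,\partial\chi\cdot(\partial\phi' - a\,\partial\chi)
   &= -\tfrac12(\partial\phi')^2 + \big(a^2 - \tfrac12 a^2\big)(\partial\chi)^2 \\
   &= -\tfrac12(\partial\phi')^2 + \tfrac12(\partial\chi)^2 ,
   \qquad\text{the wrong-sign term of eq. (3).}
\end{align*}

% (iii) The same substitution (a = +1) applied to eq. (4).
\begin{align*}
\mathcal{L}_{UV}\Big|_{\phi=\phi'-\chi}
   = -\tfrac12(\partial\phi')^2 - \tfrac12(\partial\chi)^2 - \tfrac12\Lambda^2\chi^2 ,
\end{align*}
% i.e. two healthy scalars, one massless and one of mass Lambda.
```

In step (iii) the extra $-(\partial\chi)^2$ flips the sign of the χ kinetic term, which is the sense in which new physics at Λ can cure the ghost.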


3 The Goldstone action

In this section we briefly re-derive the Lagrangian of massive gravity along the lines of [9], i.e. keeping

explicit the Goldstone bosons of broken diffeomorphism invariance.

To write down a mass term for gravity, in addition to the full dynamical metric gµν, we have to take

a reference fixed metric: for our purposes we will take the Minkowski metric ηµν. A mass term breaks

invariance under general coordinate transformations. However, as shown in Ref. [9], one

can always restore local coordinate invariance by the Stückelberg trick: in analogy with massive gauge

theories one introduces a set of Goldstone fields and requires that they transform non-linearly under a

local coordinate transformation. The fundamental object to be used for this purpose is a symmetric

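tensor $H_{\alpha\beta}$, built in terms of the reference metric, the field describing metric fluctuations $h_{\mu\nu} = g_{\mu\nu} - \eta_{\mu\nu}$ and the four Goldstone fields $\pi_\mu$,

$$ H_{\alpha\beta} = h_{\alpha\beta} + \partial_\alpha\pi_\beta + \partial_\beta\pi_\alpha + \partial_\alpha\pi_\gamma\,\partial_\beta\pi^\gamma . \tag{5} $$

$H_{\mu\nu}$ transforms as a covariant tensor under local diffeomorphisms $x^\alpha \to x^\alpha + \xi^\alpha$ provided that $\pi^\alpha$ shifts, $\pi^\alpha \to \pi^\alpha - \xi^\alpha$. As in non-abelian massive gauge theories, since now local coordinate invariance is non-linearly realized on the π field, a Lagrangian built using H will be valid as an effective theory and its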

breakdown will appear as the Goldstone sector becoming strongly coupled at some scale Λ. This indeed

has been shown in Ref. [9]. It is useful to further split the 4 $\pi_\alpha$ fields into a vector and a scalar as

$$ \pi_\mu = A_\mu + \partial_\mu\phi \tag{6} $$

together with an additional hidden U(1) gauge invariance for $A_\mu$ under which φ shifts. Note that since φ is a Goldstone boson under this U(1) gauge symmetry and $\pi_\mu$ is a Goldstone boson of broken diffeomorphism invariance, φ will appear with two derivatives in the Lagrangian, as evident from eq. (5). A mass term for $h_{\mu\nu}$ can be written down in terms of $H_{\mu\nu}$ as

$$ \sqrt{-g}\, g^{\mu\nu} g^{\alpha\beta} \left( a\,H_{\mu\alpha}H_{\nu\beta} + b\,H_{\mu\nu}H_{\alpha\beta} \right) . \tag{7} $$

Expanding H using (5) one easily realizes that for a generic choice of a,b there is a quadratic term

in φ containing 4 derivatives. This term signals the presence of a ghost as we have seen in Section 2.

Only for the Fierz-Pauli choice $a = -b \equiv m_g^2 M_P^2$ this four derivative term exactly cancels. In this case φ does not have a kinetic term on its own, but only a kinetic mixing with $h_{\mu\nu}$: $m_g^2 M_P^2 (\partial_\mu\partial_\nu\phi\, h^{\mu\nu} - \Box\phi\, h)$. A conformal rescaling of the metric $h_{\mu\nu} = \hat h_{\mu\nu} + m_g^2\eta_{\mu\nu}\phi$ diagonalizes this mixing and generates a small (i.e. proportional to $m_g^2$) kinetic term for φ besides interactions of the form $\phi(\partial^2\phi)^n$. The smallness of the kinetic term is the origin of the low strong coupling scale as it enhances the φ interactions once the fields are canonically normalized.

One can easily see that the most relevant interactions are of the form

$$ m_g^2 M_P^2 (\partial^2\phi)^3 = \frac{(\partial^2\phi_c)^3}{M_P\, m_g^4} \tag{8} $$

where $\phi_c$ is the canonically normalized field. These interactions saturate perturbation theory at the tiny energy scale $E \sim \Lambda_5 \equiv (m_g^4 M_P)^{1/5}$.

One can slightly improve the situation canceling these cubic interactions by adding $H^3$ terms. Now the most relevant interactions will be of the form $(\partial^2\phi)^4$ and this procedure can be repeated at any order. The dominant interactions will be $(\partial^2\phi)^n$ and once all these are canceled the theory has the cutoff $\Lambda_3 \equiv (m_g^2 M_P)^{1/3}$. In fact after this procedure the most relevant interactions are

$$ m_g^2 M_P^2\, (\partial A)^2 (\partial^2\phi)^n \quad\text{and}\quad m_g^2 M_P^2\, (\hat h_{\mu\nu} + m_g^2\eta_{\mu\nu}\phi)(\partial^2\phi)^n , \tag{9} $$

which are weighted by Λ3when expressed in terms of canonically normalized fields.
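Two statements of this section can be made explicit. First, the four-derivative quadratic term generated by eq. (7) cancels for the Fierz-Pauli choice because it reduces to a total derivative. Second, the scales $\Lambda_5$ and $\Lambda_3$ follow from canonical normalization. The sketch below works on flat space with $h_{\mu\nu} = A_\mu = 0$ in the first step. The normalization $\phi_c = m_g^2 M_P\,\phi$ is the one implied by eq. (8); the normalizations $\hat h^c_{\mu\nu} = M_P\,\hat h_{\mu\nu}$ and $A^c_\mu = m_g M_P\,A_\mu$ are assumptions made here for illustration.

```latex
% (i) Quadratic four-derivative term from eq. (7), using H_{mu nu} -> 2 d_mu d_nu phi
\begin{align*}
a\,H_{\mu\nu}H^{\mu\nu} + b\,H^2
   &\;\supset\; 4\left[a\,(\partial_\mu\partial_\nu\phi)^2 + b\,(\Box\phi)^2\right], \\
a = -b:\qquad
(\Box\phi)^2 - (\partial_\mu\partial_\nu\phi)^2
   &= \partial_\mu\!\left(\partial^\mu\phi\,\Box\phi - \partial_\nu\phi\,\partial^\mu\partial^\nu\phi\right).
\end{align*}

% (ii) Canonical normalization of the cubic term, eq. (8)
\begin{align*}
m_g^2 M_P^2\,(\partial^2\phi)^3
   = \frac{m_g^2 M_P^2}{(m_g^2 M_P)^3}\,(\partial^2\phi_c)^3
   = \frac{(\partial^2\phi_c)^3}{m_g^4 M_P}
   \equiv \frac{(\partial^2\phi_c)^3}{\Lambda_5^5}.
\end{align*}

% (iii) Canonical normalization of the terms in eq. (9)
\begin{align*}
m_g^2 M_P^2\,\hat h\,(\partial^2\phi)^n
   &= \frac{\hat h^c\,(\partial^2\phi_c)^n}{(m_g^2 M_P)^{\,n-1}}
   = \frac{\hat h^c\,(\partial^2\phi_c)^n}{\Lambda_3^{3(n-1)}}, \\
m_g^4 M_P^2\,\phi\,(\partial^2\phi)^n
   &= \frac{\phi_c\,(\partial^2\phi_c)^n}{\Lambda_3^{3(n-1)}}, \qquad
m_g^2 M_P^2\,(\partial A)^2(\partial^2\phi)^n
   = \frac{(\partial A^c)^2(\partial^2\phi_c)^n}{\Lambda_3^{3n}}.
\end{align*}
```

Every operator in step (iii) carries only powers of $\Lambda_3^3 = m_g^2 M_P$, which is the sense in which these interactions are weighted by $\Lambda_3$.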

The interaction between matter and gravity is as usual described by the term $\frac{1}{2}h_{\alpha\beta}T^{\alpha\beta}$. The Weyl transformation that demixes $h_{\mu\nu}$ from φ thus generates a direct coupling of φ to the trace T of the stress-energy tensor. This implies that we will have a non-trivial φ background around any astrophysical source. For a classical solution we will have two relevant scales: the first is the Schwarzschild radius $R_S$, the distance from the source at which linearized gravity breaks down, the second is the Vainshtein radius $R_V$ [5] where nonlinearities for the scalar field φ become important. For a source of mass $M_*$ this distance is equal to $1/\Lambda_5\,(M_*/M_P)^{1/5}$ (which is much larger than $R_S$). At this scale the term $(\partial^2\phi)^3$ becomes as relevant as the kinetic term whereas all the other nonlinear terms are important only when we reach the Schwarzschild radius so that they can be safely neglected. Therefore the action of φ in the presence of sources is given by

$$ S = \int d^4x \left\{ 3\,\phi_c\Box\phi_c + \frac{1}{\Lambda_5^5}\left[ (\Box\phi_c)^3 - (\Box\phi_c)(\partial_\mu\partial_\nu\phi_c)^2 \right] + \frac{1}{2M_P}\,\phi_c T \right\} ; \tag{10} $$

the structure of the trilinear terms can be changed by adding non-linear interactions to the Fierz-Pauli

Lagrangian.

The situation remains qualitatively unchanged even if the first N $(\partial^2\phi)^n$ interactions are tuned to zero. The first non-vanishing term, $(\partial^2\phi)^{N+1}$, will set the Vainshtein radius, while higher order terms will become relevant again at the Schwarzschild radius.

4 Massive gravity in the presence of a source

Let us consider the Lagrangian eq. (10) in the presence of a macroscopic source, like the Sun. This

induces a classical background Φ(x), solution of the φ equation of motion. To study the stability of this

solution we expand the Lagrangian at the quadratic order in the fluctuation ϕ ≡ φ − Φ. The result is

schematically of the form

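$$ \mathcal{L}_\varphi = -(\partial\varphi_c)^2 + \frac{(\partial^2\Phi_c)}{\Lambda_5^5}\,(\partial^2\varphi_c)^2 , \tag{11} $$

i.e. the background gives a four-derivative contribution to the ϕ kinetic term. As discussed in sect. 2, this results in the appearance of a ghost with an x-dependent mass

$$ m^2_{\rm ghost}(x) \sim \frac{\Lambda_5^5}{\partial^2\Phi_c(x)} . \tag{12} $$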

Remember that we are dealing with an effective theory with a tiny UV cutoff Λ5, therefore we should

not worry until the mass of the ghost drops below $\Lambda_5$. In approaching the source from far away, this happens at a distance $R_{\rm ghost}$ from the source such that $\partial^2\Phi_c \sim \Lambda_5^3$. Unfortunately this is a huge distance, parametrically larger than the (already huge) Vainshtein radius $R_V$. In fact for a source of mass $M_*$ at distances $r \gg R_V$ the background field goes as $\Phi_c(r) \sim (M_*/M_P)\cdot 1/r$, so that

$$ R_{\rm ghost} \sim \frac{1}{\Lambda_5}\left(\frac{M_*}{M_P}\right)^{1/3} \gg R_V^{(5)} \sim \frac{1}{\Lambda_5}\left(\frac{M_*}{M_P}\right)^{1/5} . \tag{13} $$


Therefore the ghost is going to show up in an extremely weak background field, when the latter is still

in its linear regime.

Inside Rghost, in the spirit of sect. 2, one is forced to postulate that additional physics lighter than

the local ghost mass cures the instability, that is the cutoff must be lowered from $\Lambda_5$ to $m_{\rm ghost}(x)$. A

byproduct of this in general would be that interactions strengthen, being weighted by the new cutoff scale

rather than by Λ5. But let us optimistically assume that, instead, the only effect of this new physics is to

cure the ghost instability. However, when the local ghost mass is of order of the inverse distance from the

source there is no way of proceeding further without specifying the UV completion of the theory, since

the background itself has a typical length scale of order of the UV cutoff. One can easily check that this

happens at the Vainshtein radius RV. There is no sense in which one can trust the classical solution below

RV. Since one can hope to recover General Relativity only in the region inside RV, where non-linear

effects can hide the scalar (Vainshtein effect), this also means that General Relativity is nowhere a good

approximation.
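The two scales discussed in this section follow from eq. (12) in a few lines. The estimate below is an order-of-magnitude sketch: it assumes the linear-regime profile $\Phi_c(r) \sim (M_*/M_P)/r$ all the way down to $r \sim R_V$, where it is only marginally valid, and it drops all numerical factors.

```latex
\begin{align*}
\partial^2\Phi_c(r) \sim \frac{M_*}{M_P}\,\frac{1}{r^3}
   \quad&\Rightarrow\quad
   m^2_{\rm ghost}(r) \sim \Lambda_5^5\,\frac{M_P}{M_*}\,r^3 . \\[4pt]
% The ghost enters the effective theory when its mass reaches the cutoff:
m_{\rm ghost}(R_{\rm ghost}) \sim \Lambda_5
   \quad&\Rightarrow\quad
   R_{\rm ghost} \sim \frac{1}{\Lambda_5}\left(\frac{M_*}{M_P}\right)^{1/3} . \\[4pt]
% The ghost mass equals the inverse distance from the source when:
m_{\rm ghost}(r) \sim \frac{1}{r}
   \quad&\Rightarrow\quad
   \Lambda_5^5\, r^5 \sim \frac{M_*}{M_P}
   \quad\Rightarrow\quad
   r \sim \frac{1}{\Lambda_5}\left(\frac{M_*}{M_P}\right)^{1/5} = R_V^{(5)} .
\end{align*}
```

The last line is the check referred to in the text: the local cutoff $m_{\rm ghost}$ becomes comparable to the inverse length scale of the background exactly at the Vainshtein radius.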

Notice that in the DGP model the dominant interaction of the Goldstone has the form $\Box\pi(\partial\pi)^2$, which suggests the same problem we are facing, as there are two derivatives acting on one of the π’s. Nevertheless, in the equation of motion terms with more than two derivatives acting on a single field cancel out and one is left with a (non-linear) second order differential equation [13]¹. The same cannot

happen in our case since there are too many derivatives: the contribution of the trilinear term to the

equation of motion is a sum of terms with 2 φ’s and 6 derivatives. In any term there is at least one φ

carrying more than two derivatives.

One is thus led to consider the possibility of eliminating the unwanted trilinear interaction of the

Goldstone by adding appropriate cubic terms in $H_{\mu\nu}$ to the Fierz-Pauli Lagrangian eq. (7). The three independent contractions are $H^3$, $H(H_{\mu\nu})^2$, and $(H_{\mu\nu})^3$, where the last stands for the cyclic contraction of the indices. These contain interaction terms for the Goldstone of the form $(\partial^2\phi)^3$ which, for the proper choice of coefficients, cancel the trilinear interaction of eq. (10). However, in this way one introduces further quartic interactions $(\partial^2\phi)^4$ on top of those already present in the Fierz-Pauli mass term, because of the non-linear relation between $H_{\mu\nu}$ and φ of eq. (5). These are problematic for exactly the same reason as before, and the same problem shows up at any order: an interaction term of the form $(\partial^2\phi)^n$ evaluated

around a background gives a contribution to the equation of motion for the fluctuations with too many

derivatives. This signals the presence of a ghost instability². Again one can check that the cutoff (i.e. the ghost mass) becomes of order of the inverse radius at the new Vainshtein scale (always defined as the distance from the source at which non-linearities become relevant). Hence one never recovers General Relativity.

¹ Equivalently, working at the level of the Lagrangian, one can expand the interaction term $\Box\pi(\partial\pi)^2$ to second order in the fluctuation ϕ around a background $\pi_b$. The worrisome term is $\Box\varphi\,\partial_\mu\varphi\,\partial^\mu\pi_b$, since it has 3 derivatives acting on the ϕ’s. But by integration by parts one can shift one derivative from the fluctuations to the background, thus obtaining an ordinary 2-derivative kinetic term for the fluctuations (whose positivity must however be checked) [12, 13].

² The reader could wonder if there exists a choice of coefficients such that terms with 4 derivatives on a single field cancel in the equation of motion. In this case one would be left only with 3-derivative terms and our conclusions should be modified. But this is not the case: setting to zero all 4 derivative terms leads also to the cancellation of those with 3 derivatives. To see this, consider the most general interaction Lagrangian of n-th order, $\mathcal{L}^{(n)} = \Gamma^{\alpha_1\beta_1\cdots\alpha_n\beta_n}\,\partial_{\alpha_1}\partial_{\beta_1}\phi\cdots\partial_{\alpha_n}\partial_{\beta_n}\phi$, where Γ is a tensor constructed with the metric $\eta_{\mu\nu}$. Given the structure of contractions, without loss of generality we can choose $\Gamma^{\alpha_1\beta_1\cdots\alpha_n\beta_n}$ to be symmetric under $\alpha_i \leftrightarrow \beta_i$ and $(\alpha_i,\beta_i) \leftrightarrow (\alpha_j,\beta_j)$. Then, for symmetry reasons, the contribution of $\mathcal{L}^{(n)}$ to the φ equation of motion is

$$ \Gamma^{\alpha_1\beta_1\cdots\alpha_n\beta_n} \left[ A_n\,(\partial_{\alpha_1}\partial_{\beta_1}\partial_{\alpha_2}\partial_{\beta_2}\phi)(\partial_{\alpha_3}\partial_{\beta_3}\phi) + B_n\,(\partial_{\beta_1}\partial_{\alpha_2}\partial_{\beta_2}\phi)(\partial_{\alpha_1}\partial_{\alpha_3}\partial_{\beta_3}\phi) \right] \partial_{\alpha_4}\partial_{\beta_4}\phi\cdots\partial_{\alpha_n}\partial_{\beta_n}\phi , \tag{14} $$

where $A_n$, $B_n$ are combinatoric factors. The four-derivative term (the first in brackets) identically vanishes only if the totally symmetric part of $\Gamma^{\alpha_1\beta_1\cdots\alpha_n\beta_n}$ in the first four indices does, $\Gamma^{(\alpha_1\beta_1\alpha_2\beta_2)\cdots\alpha_n\beta_n} = 0$. In this case, given the symmetries of Γ, it is straightforward to check that $\Gamma^{\alpha_1(\beta_1\alpha_2\beta_2)\cdots\alpha_n\beta_n} = \Gamma^{(\alpha_1\beta_1\alpha_2\beta_2)\cdots\alpha_n\beta_n} = 0$. This eliminates the 3 derivative term (the second in brackets) as well, i.e. eliminates the contribution of $\mathcal{L}^{(n)}$ to the φ equation of motion altogether.

The only possibility is therefore to concentrate on theories in which all the interactions of the form $(\partial^2\phi)^n$ are set to zero by properly choosing infinitely many coefficients. We can look at this procedure as an extension at non-linear order of the Fierz-Pauli choice for the mass term which, as discussed in sect. 3, leads to the cancellation of the $(\partial^2\phi)^2$ terms. In the next section we prove that also in this case we cannot avoid higher derivative kinetic terms for the fluctuations of the field φ around a non-trivial background: ghost-like instabilities are unavoidable.

5 Ghost instabilities are unavoidable

The cancellation of the $(\partial^2\phi)^n$ interactions has also been considered as a way to raise the strong interaction scale to $\Lambda_3 = (m_g^2 M_P)^{1/3} \gg \Lambda_5$ [9, 10]. In fact, after the cancellation, the leading interactions are of the form

$$ m_g^2 M_P^2\, (\partial A)^2 (\partial^2\phi)^n \quad\text{and}\quad m_g^2 M_P^2\, (\hat h_{\mu\nu} + m_g^2\eta_{\mu\nu}\phi)(\partial^2\phi)^n ; \tag{15} $$

when the fields are canonically normalized all these terms are suppressed by the scale $\Lambda_3$, while additional interactions are weighted by higher scales. Correspondingly the Vainshtein radius now shrinks to $R_V^{(3)} = 1/\Lambda_3\,(M_*/M_P)^{1/3} \ll R_V^{(5)}$ as the original leading interactions have been canceled. At this radius all non-linear terms of the form $(\hat h_{\mu\nu} + m_g^2\eta_{\mu\nu}\phi)(\partial^2\phi)^n$ become relevant; on the other hand there are no terms linear in A in the Lagrangian, so that A is sourced neither by matter nor by the other fields. Interactions involving this field are therefore irrelevant for our purposes, and can be consistently neglected.

The theory we are describing is not unique: there are different possible choices of coefficients that cancel all the interactions $(\partial^2\phi)^n$. We can easily see why. Let us start with the canonical Fierz-Pauli mass term³ $\mathcal{L}_2 = \sqrt{-g}\,\left([H^2] - [H]^2\right)$; since $H_{\mu\nu} = h_{\mu\nu} + 2\partial_\mu\partial_\nu\phi + \partial_\mu\partial_\alpha\phi\,\partial_\nu\partial^\alpha\phi$ (setting $A_\mu = 0$), $\mathcal{L}_2$ contains $(\partial^2\phi)^3$ interactions. We can cancel them adding an appropriate combination of terms cubic in H, $\mathcal{L}_3 = \sqrt{-g}\,\left(\frac{1}{2}[H][H^2] - \frac{1}{2}[H^3]\right)$. But at this point we can still add, with an arbitrary overall coefficient $\alpha_3$, the expression

$$ \mathcal{L}_3^{TD} = \sqrt{-g}\,\left( 3[H][H^2] - [H]^3 - 2[H^3] \right) , \tag{16} $$

because it gives $(\partial^2\phi)^3$ terms in the combination

$$ (\Box\phi)^3 - 3\,\Box\phi\,(\partial_\mu\partial_\nu\phi)^2 + 2\,(\partial_\mu\partial_\nu\phi)^3 , \tag{17} $$

which is a total derivative (hence the superscript ‘TD’) and thus does not contribute to the equation of motion. Now $\mathcal{L}_2 + \mathcal{L}_3 + \alpha_3\mathcal{L}_3^{TD}$ contains $(\partial^2\phi)^4$ interactions and we can repeat the procedure. Fourth order terms are canceled by

$$ \mathcal{L}_4 = \sqrt{-g}\,\frac{1}{16}\left[ (5 + 24\alpha_3)[H^4] - (1 + 12\alpha_3)[H^2]^2 - (4 + 24\alpha_3)[H][H^3] + 12\alpha_3[H^2][H]^2 \right] . \tag{18} $$

³ We use the notation $[H] = g^{\mu\nu}H_{\mu\nu}$, $[H^2] = g^{\mu\nu}g^{\alpha\beta}H_{\mu\alpha}H_{\nu\beta}$, and its straightforward generalization to higher orders.
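The cancellations claimed for $\mathcal{L}_3$ and $\mathcal{L}_4$ can be checked by hand in the pure-scalar sector. The sketch below sets $h_{\mu\nu} = 0$ and $A_\mu = 0$ (so that $\sqrt{-g} \to 1$ and indices are contracted with $\eta_{\mu\nu}$) and introduces the shorthand $\Pi_{\mu\nu} \equiv \partial_\mu\partial_\nu\phi$, a notation assumed here and not used in the text, so that $H = 2\Pi + \Pi^2$ as a matrix.

```latex
% Traces of powers of H = 2 Pi + Pi^2, kept up to fourth order in Pi
\begin{align*}
[H] &= 2[\Pi] + [\Pi^2], &
[H^2] &= 4[\Pi^2] + 4[\Pi^3] + [\Pi^4], \\
[H^3] &= 8[\Pi^3] + 12[\Pi^4] + \dots, &
[H^4] &= 16[\Pi^4] + \dots
\end{align*}

% Cubic order
\begin{align*}
\mathcal{L}_2\big|_3 &= 4[\Pi^3] - 4[\Pi][\Pi^2], \qquad
\mathcal{L}_3\big|_3 = 4[\Pi][\Pi^2] - 4[\Pi^3]
   \quad\Rightarrow\quad (\mathcal{L}_2 + \mathcal{L}_3)\big|_3 = 0, \\
\mathcal{L}_3^{TD}\big|_3 &= -8\left( [\Pi]^3 - 3[\Pi][\Pi^2] + 2[\Pi^3] \right)
   \qquad\text{(the combination of eq. (17)).}
\end{align*}

% Quartic order
\begin{align*}
\mathcal{L}_2\big|_4 &= [\Pi^4] - [\Pi^2]^2, \\
\mathcal{L}_3\big|_4 &= 4[\Pi][\Pi^3] + 2[\Pi^2]^2 - 6[\Pi^4], \\
\alpha_3\mathcal{L}_3^{TD}\big|_4 &= \alpha_3\left( 24[\Pi][\Pi^3] + 12[\Pi^2]^2 - 12[\Pi]^2[\Pi^2] - 24[\Pi^4] \right), \\
\text{sum} &= -(5 + 24\alpha_3)[\Pi^4] + (1 + 12\alpha_3)[\Pi^2]^2
             + (4 + 24\alpha_3)[\Pi][\Pi^3] - 12\alpha_3[\Pi]^2[\Pi^2], \\
\mathcal{L}_4\big|_4 &= +(5 + 24\alpha_3)[\Pi^4] - (1 + 12\alpha_3)[\Pi^2]^2
             - (4 + 24\alpha_3)[\Pi][\Pi^3] + 12\alpha_3[\Pi]^2[\Pi^2].
\end{align*}
```

The last two lines are equal and opposite, so the coefficients in eq. (18) are exactly those needed to remove the quartic self-interactions for any value of $\alpha_3$.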


Again we have the possibility to introduce a second arbitrary coefficient, $\alpha_4$, in front of the ‘total derivative’ term

$$ \mathcal{L}_4^{TD} = \sqrt{-g}\,\left( [H]^4 - 6[H^2][H]^2 + 8[H^3][H] + 3[H^2]^2 - 6[H^4] \right) , \tag{19} $$

and we can go on at higher orders until all self-couplings are removed. Notice that these total derivative terms $\mathcal{L}_n^{TD}$ are the higher order analogue of the Fierz-Pauli mass term: they are combinations of $h_{\mu\nu}$ interactions that reduce to a total derivative when expressed in terms of $\partial_\mu\partial_\nu\phi$ and therefore do not contribute to high-derivative terms in φ. One can check that there is one of such combinations per order, but for our purposes we will need them only up to fourth order⁴.

We now want to see if it is possible to get a kinetic structure without ghosts for the fluctuations around a background. Note that this theory has potentially dangerous terms of the form $h_{\mu\nu}(\partial^2\phi)^n$. We could hope that the large freedom we have in the choice of higher-order terms, parameterized by the coefficients $\alpha_3, \alpha_4, \dots$, helps us to obtain a ghost-free theory. Unfortunately, this is not the case and the main reason is that the number of possible contractions of $H_{\mu\nu}$ grows very fast and soon we cannot cancel all the dangerous kinetic terms. Let us see how this works explicitly. We call $h^b_{\mu\nu}$ and $\phi^b$ the background fields⁵, and we study the quadratic Lagrangian for the scalar fluctuation ϕ. Actually, instead of working at the level of the Lagrangian, the most direct approach is to look at the equations of motion linear in the fluctuations, because in this case there is no ambiguity coming from integration by parts. We have to check whether all the terms with more than 2 derivatives on ϕ can cancel in the equations of motion. We start with the Lagrangian terms cubic in the fields: they can come only from $\mathcal{L}_2 + \mathcal{L}_3 + \alpha_3\mathcal{L}_3^{TD}$. Only terms schematically of the form $h(\partial^2\phi)^2$ can give a ghost; terms with a higher number of $h_{\mu\nu}$ have no more than 2 derivatives. Using the explicit expression of the Lagrangian, it is immediate to verify that i) the terms in the e.o.m. with more than 2 derivatives on ϕ cancel already with $\alpha_3 = 0$, and ii) they cancel also when originating from $\mathcal{L}_3^{TD}$ alone. We conclude that at the cubic level in $H_{\mu\nu}$ there are no higher derivative kinetic terms for ϕ and the parameter $\alpha_3$ can still be varied arbitrarily.

⁴ It is easy to check that at n-th order a total derivative term is given by

$$ \sum_\pi (-1)^\pi\, \eta^{\alpha_1\pi(\beta_1)}\cdots\eta^{\alpha_n\pi(\beta_n)}\, \partial_{\alpha_1}\partial_{\beta_1}\phi\cdots\partial_{\alpha_n}\partial_{\beta_n}\phi , \tag{20} $$

where the sum runs over all permutations π of the β indices, and $(-1)^\pi$ is the parity of the permutation. To prove that this combination is the only total derivative term at a given order assume that there are two of them. One then could construct a total derivative term that does not contain, say, $(\Box\phi)^n$. Imposing that the contribution to the field equations is zero, it is straightforward to show that also all the other terms vanish.

⁵ Remember that the field $h_{\mu\nu}$ is the graviton before the Weyl rescaling that demixes it from φ: $h_{\mu\nu} = \hat h_{\mu\nu} + m_g^2\eta_{\mu\nu}\phi$.

What happens with the quartic Lagrangian? Let us consider the interactions $h_{\mu\nu}(\partial^2\phi)^3$; now we must include also $\mathcal{L}_4$ (eq. (18)), and for simplicity we start with $\alpha_4 = 0$. Again it is straightforward to write down the equations of motion linear in ϕ. The terms with more than two derivatives on the fluctuation can be divided into two classes: those containing $(h^b)^\mu{}_\mu$ and those in which $(h^b)_{\mu\nu}$ is contracted with derivatives of φ. Either class must cancel independently of the other, since, given the different tensor structure, there is no possibility of cross-cancellation between the two. In the e.o.m. the terms belonging to the first class automatically cancel. They come from a piece of the Lagrangian that is precisely $4\alpha_3[h]$ times the ‘total derivative’ combination eq. (17). The remaining dangerous kinetic terms, which belong

to the second class, come from the Lagrangian6

(8 + 72α3) ([hφ3] − [hφ2][φ]) − 36α3([hφ][φ2] − [hφ][φ]2) .

Can we add now α4LTD

containing (h)µµ) in the equations of motion. Then, can we choose α4to get rid of the contributions

coming from eq. (21)? Unfortunately the answer is negative. In fact, expanding α4LTD

α4LTD

(21)

4

(eq. (19))? Yes, since this does not reintroduce terms of the first class (i.e.,

4,

4

⊃ −192α4([hφ3] − [hφ2][φ]) + 96α4([hφ][φ2] − [hφ][φ]2) , (22)

one immediately sees that either the first or the second pair of terms in eq. (21) (but not both) can be canceled by properly choosing $\alpha_3$ and $\alpha_4$. The other pair gives a non-zero, four derivative contribution to the equation of motion for the fluctuation $\varphi$. The number of possible tensor structures is bigger than the freedom we have in the Lagrangian. This completes the proof that massive gravity around a generic background cannot have a purely two-derivative kinetic term for $\varphi$.
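The obstruction can be checked with a few lines of exact arithmetic. The sketch below is not part of the original analysis: it only takes the coefficients of the two tensor structures from eqs. (21) and (22) and confirms that no choice of α3, α4 cancels both. The function names `first_pair`, `second_pair` and `residual_first_pair` are illustrative.

```python
from fractions import Fraction

# Coefficients of the two independent tensor structures once eq. (22) is
# added to eq. (21), as functions of alpha3 (a3) and alpha4 (a4):
#   first pair : [h phi^3] - [h phi^2][phi]       -> 8 + 72 a3 - 192 a4
#   second pair: [h phi][phi^2] - [h phi][phi]^2  -> -36 a3 + 96 a4
def first_pair(a3, a4):
    return 8 + 72 * a3 - 192 * a4

def second_pair(a3, a4):
    return -36 * a3 + 96 * a4

# The homogeneous parts are proportional: this 2x2 determinant vanishes,
# so the two conditions are not independent in (a3, a4).
det = 72 * 96 - (-192) * (-36)

def residual_first_pair(a4):
    """Cancel the second pair (a3 = 8/3 a4) and return what is left of the first."""
    a3 = Fraction(8, 3) * a4
    assert second_pair(a3, a4) == 0
    return first_pair(a3, a4)

# Scan a range of rational values of alpha4 and collect the leftovers.
residuals = {residual_first_pair(Fraction(k, 7)) for k in range(-20, 21)}
print(det, residuals)
```

Equivalently, the combination `first_pair + 2 * second_pair` equals 8 for any values of the two parameters, which is why at most one of the two pairs can be removed.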

In the $\Lambda_3$ theory the local mass of the ghost goes as

$$ m^2_{\rm ghost}(x) \sim \frac{\Lambda_3^6}{\Phi^c(x)\,\partial^2\Phi^c(x)} , \tag{23} $$

so that the ghost enters in the effective theory at a distance from the source much bigger than the Vainshtein radius,

$$ R_{\rm ghost} \sim \frac{1}{\Lambda_3}\left(\frac{M_*}{M_P}\right)^{1/2} \gg R_V^{(3)} \sim \frac{1}{\Lambda_3}\left(\frac{M_*}{M_P}\right)^{1/3} . \tag{24} $$

These results should be compared with their analogues in the $\Lambda_5$ theory, eqs. (12) and (13). Again at the Vainshtein scale the mass of the ghost is of order of the inverse radius and we cannot proceed further towards the source.
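As a rough numerical illustration of eqs. (23) and (24), the sketch below first checks the scalings in units where Λ3 = 1 and then plugs in order-of-magnitude numbers for the Sun with m_g ∼ H_0, the example quoted in the concluding remarks. It assumes the linear background Φc ∼ (M∗/MP)/r outside the Vainshtein radius and drops all O(1) factors. The physical constants are approximate values chosen for this illustration (in particular M_P is taken to be the non-reduced Planck mass), and the function names are hypothetical.

```python
# --- Scaling check in units Lambda_3 = 1 ------------------------------
# Assumption: linear background outside the Vainshtein radius,
# Phi_c ~ x / r and d^2 Phi_c ~ x / r^3, with x = M_* / M_P.
def ghost_mass_sq(r, x):
    """Local ghost mass squared from eq. (23), up to O(1) factors."""
    phi_c = x / r
    d2_phi_c = x / r**3
    return 1.0 / (phi_c * d2_phi_c)

def r_ghost(x):
    """Distance at which the ghost mass drops to the cut-off, eq. (24)."""
    return x ** 0.5

def r_vainshtein(x):
    """Vainshtein radius of the Lambda_3 theory, eq. (24)."""
    return x ** (1.0 / 3.0)

# --- Rough numbers for the Sun with m_g ~ H_0 --------------------------
# Approximate inputs in natural units (eV); order of magnitude only.
HBAR_C_EV_M = 1.973e-7   # hbar * c in eV * m
M_PLANCK_EV = 1.22e28    # Planck mass
M_SUN_EV = 1.12e66       # solar mass
H0_EV = 1.5e-33          # present Hubble rate

lambda3_ev = (H0_EV**2 * M_PLANCK_EV) ** (1.0 / 3.0)   # (m_g^2 M_P)^(1/3)
inv_lambda3_km = HBAR_C_EV_M / lambda3_ev / 1e3
x_sun = M_SUN_EV / M_PLANCK_EV

r_ghost_km = inv_lambda3_km * r_ghost(x_sun)
r_v_km = inv_lambda3_km * r_vainshtein(x_sun)

print(f"1/Lambda_3 ~ {inv_lambda3_km:.1e} km")
print(f"R_ghost    ~ {r_ghost_km:.1e} km")
print(f"R_V        ~ {r_v_km:.1e} km")
```

With these inputs `ghost_mass_sq(r_ghost(x), x)` is of order one (the ghost mass equals the cut-off), while `ghost_mass_sq(r_vainshtein(x), x)` equals `1 / r_vainshtein(x)**2`, which is the statement that at the Vainshtein scale the ghost mass is of order of the inverse radius.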

There is a final subtlety that needs to be addressed. Is the presence of a 4-derivative kinetic term enough to claim that there is a ghost? After all, the argument of sect. 2 strictly applies only to a Lorentz-invariant background. It is clear that if the derivatives acting on the fluctuation $\varphi$ are contracted with a background tensor field different from $\eta_{\mu\nu}$ the situation can be very different. For instance, if the quadratic Lagrangian for $\varphi$ involves 4 space derivatives but only 2 time derivatives there is no room for an independent propagating extra scalar ($\chi$, in the language of sect. 2), and thus there is no ghost, provided that the $\dot\varphi^2$ term has the healthy sign. Indeed, in most astrophysical situations macroscopic sources have non-relativistic velocities $v \ll 1$; the background $\phi^b$ field they generate is therefore essentially constant in time, its time derivatives being suppressed by positive powers of $v$ with respect to its spatial gradients. This, in a term like $\partial_\mu\partial_\nu\phi^b\,\partial^\mu\partial^\rho\varphi\,\partial_\rho\partial^\nu\varphi$ for instance, can suppress the magnitude of terms with 4 time derivatives. Although this is a parametric increase of the ghost mass, for typical astrophysical velocities $v \sim 10^{-4} - 10^{-3}$ the regime of validity of the theory is not significantly widened: inside the Vainshtein radius the theory breaks down at $r \sim R_V^{(5)}\,v^{4/5} \sim (10^{-3} - 10^{-2})\,R_V^{(5)}$, where the ghost mass becomes of order of the inverse radius.
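The exponent 4/5 can be recovered by a rough power counting. The following is only a sketch, resting on assumptions that are not spelled out in the paragraph above: that the four-time-derivative term is suppressed by $v^2$ (two time derivatives acting on the nearly static background), that inside the Vainshtein radius of the $\Lambda_5$ theory the cubic self-interaction dominates the background equation, so that $(\partial^2\Phi^c)^2/\Lambda_5^5 \sim M_*/(M_P\,r)$, and that $R_V^{(5)} \sim \Lambda_5^{-1}(M_*/M_P)^{1/5}$. The unsuppressed ghost mass $\Lambda_5^5/\partial^2\Phi^c$ is the one quoted in eq. (33). O(1) factors are dropped throughout.

```latex
\begin{align*}
m^2_{\rm ghost}(r) &\sim \frac{1}{v^2}\,\frac{\Lambda_5^5}{\partial^2\Phi^c(r)} ,
\qquad
\partial^2\Phi^c(r) \sim \left(\frac{\Lambda_5^5\, M_*}{M_P\, r}\right)^{1/2}
\quad \big(r \ll R_V^{(5)}\big) , \\[4pt]
m^2_{\rm ghost}(r) \sim \frac{1}{r^2}
&\;\Longrightarrow\;
r^{5/2} \sim \frac{v^2}{\Lambda_5^{5/2}}\left(\frac{M_*}{M_P}\right)^{1/2}
\;\Longrightarrow\;
r \sim v^{4/5}\,\frac{1}{\Lambda_5}\left(\frac{M_*}{M_P}\right)^{1/5}
\sim v^{4/5}\,R_V^{(5)} .
\end{align*}
```

For $v = 10^{-4}$ and $v = 10^{-3}$ the factor $v^{4/5}$ is about $6\times10^{-4}$ and $4\times10^{-3}$ respectively, consistent at the order-of-magnitude level with the range quoted in the text.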

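⁶ Inside brackets with $\phi$ we indicate the matrix $\partial_\mu\partial_\nu\phi$. For example the term $[h\phi][\phi]^2$ should be read as $\eta^{\mu\alpha}\eta^{\nu\beta}h_{\mu\nu}\,\partial_\alpha\partial_\beta\phi\,(\Box\phi)^2$.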


6 Unitary gauge description

In the previous section we argued in the Goldstone language that Fierz-Pauli massive gravity (together with all its infinitely many higher order extensions) is unavoidably plagued by ghosts around a slightly curved background. We now want to see how the sickness of the theory is interpreted in the unitary gauge. In order to take into account the presence of a (slightly) curved background we need to consider the field equations at second order in $h_{\mu\nu}$,

$$ G_{\mu\nu} + m_g^2\left[a\,h_{\mu\nu} + b\,h\,\eta_{\mu\nu} + O(h_{\mu\nu}^2)\right] = 0 , \tag{25} $$

where the quadratic terms come both from the mass term (that we take generic for the moment) and from additional higher order interactions. As is well known, the invariance of the Einstein-Hilbert action under diffeomorphisms $g_{\mu\nu} \to g_{\mu\nu} + \nabla_\mu\epsilon_\nu + \nabla_\nu\epsilon_\mu$,

$$ \delta S_{EH} = 2\int d^4x\,\frac{\delta S_{EH}}{\delta g_{\mu\nu}}\,\nabla_\mu\epsilon_\nu = -2\int d^4x\,\sqrt{-g}\;\epsilon_\nu\,\nabla_\mu\!\left(\frac{1}{\sqrt{-g}}\,\frac{\delta S_{EH}}{\delta g_{\mu\nu}}\right) = 0 , \tag{26} $$

implies the contracted Bianchi identities $\nabla^\mu G_{\mu\nu} = 0$. As a consequence, from eq. (25) we get the four constraints

$$ \nabla^\mu\left[a\,h_{\mu\nu} + b\,h\,\eta_{\mu\nu} + O(h_{\mu\nu}^2)\right] = 0 , \tag{27} $$

which reduce to six the number of propagating components of $h_{\mu\nu}$. The presence of these constraints is ensured by the gauge-invariance of the ‘kinetic’ action. This is in complete analogy to what happens in the theory of a massive vector particle, where the gauge invariance of the kinetic term implies the constraint $\partial_\mu A^\mu = 0$, thus reducing to three the number of propagating degrees of freedom.

If we now restrict to a linear analysis, for the particular Fierz-Pauli choice of the mass term ($b = -a$) we have an additional constraint equation. In fact the linearized Einstein tensor satisfies

$$ G^{\ell\,\mu}{}_{\mu} = \partial^\mu\partial^\nu\left(h_{\mu\nu} - h\,\eta_{\mu\nu}\right) , \tag{28} $$

so that eq. (27) forces $G^\mu{}_\mu$ to vanish at linear order, and thus the trace of the Einstein equations eq. (25) becomes a constraint for the trace mode,

$$ h = 0 . \tag{29} $$

In the end one is left with five propagating degrees of freedom, the correct number for a massive spin-2 particle. However, it is clear that this last constraint is fundamentally different from the previous four, since, unlike them, it is not ensured by any symmetry but is based on a precise tuning in the structure of the mass term and on the identity eq. (28), valid at linear order. It is then natural to expect that it does not survive in a curved background. The ghost we found in the Goldstone language is nothing but this hidden sixth mode that, although constrained in flat space, starts propagating around a non-zero background. Let us check that the energy scale at which the additional mode appears is indeed the same in the two descriptions.
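The linear counting can be summarized in a few lines. The sketch below only uses eqs. (25), (27) and (28) with $b = -a$. The massive vector (Proca) equation shown for comparison is the standard one, quoted here purely as an illustration of the analogy; its overall signs depend on conventions and do not affect the resulting constraint.

```latex
\begin{align*}
\text{massive vector:}\quad
& \partial_\mu F^{\mu\nu} - m^2 A^\nu = 0
\;\xrightarrow{\;\partial_\nu\;}\;
m^2\,\partial_\nu A^\nu = 0
&& (4 - 1 = 3 \text{ d.o.f.}) \\[4pt]
\text{eq. (27), linear order, } b=-a:\quad
& \partial^\mu\!\left(h_{\mu\nu} - h\,\eta_{\mu\nu}\right) = 0
&& (10 - 4 = 6 \text{ components}) \\[4pt]
\text{trace of eq. (25), } b=-a:\quad
& \underbrace{G^{\ell\,\mu}{}_{\mu}}_{=\,0\ \text{by (27), (28)}}
  + \, m_g^2\,(a + 4b)\,h \;=\; -3a\,m_g^2\,h \;=\; 0
&& (6 - 1 = 5 \text{ d.o.f.})
\end{align*}
```

The last step is the only one that relies on the tuning $b = -a$: for any other choice the divergence of eq. (27) does not reproduce the combination appearing in eq. (28), and the trace equation remains dynamical.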

At quadratic order $G^\mu{}_\mu$ will not vanish anymore on the equations of motion, instead it will be of the form $G^\mu{}_\mu \sim \partial^2 h_{\mu\nu}^2$. Therefore at quadratic order the constraint eq. (29) becomes a dynamical equation,

$$ O(\partial^2 h_{\mu\nu}^2) + m_g^2\left[h + O(h_{\mu\nu}^2)\right] = 0 . \tag{30} $$

If we now write $h_{\mu\nu}$ as the sum of the background field and fluctuations around it we see that the equation above describes the propagation of a mode with mass

$$ m^2_{\text{6th-mode}}(x) \sim \frac{m_g^2}{H_{\mu\nu}(x)} , \tag{31} $$

where $H_{\mu\nu}$ is the background metric. Notice that in the limit of zero background the mass goes to infinity and the mode decouples, as expected from the linear analysis⁷.

In order to compare this with our results in the Goldstone language we must relate the background $H_{\mu\nu}(x)$ in the unitary gauge to the Goldstone background $\Phi^c(x)$. This is easily done by getting rid of the Goldstone, i.e. by performing a gauge transformation with parameter $\epsilon_\mu = \frac{1}{m_g^2 M_{Pl}}\,\partial_\mu\Phi^c$. Considering for definiteness the case of a spherical source, the background in unitary gauge is therefore the sum of the usual Schwarzschild solution of GR $H^S_{\mu\nu}(r)$ (corrected by the kinetic mixing with $\Phi$, hence the vDVZ discontinuity) and of the pure-gauge longitudinal contribution $\frac{1}{m_g^2 M_{Pl}}\,\partial_\mu\partial_\nu\Phi^c(r)$. Both $H^S_{\mu\nu}$ and $\frac{1}{M_{Pl}}\Phi^c$ scale as $R_S/r$ outside the Vainshtein radius, so that the pure-gauge contribution coming from the Goldstone is the dominant one in unitary gauge,

$$ H_{\mu\nu}(r) \sim \frac{1}{m_g^2 M_{Pl}}\,\partial_\mu\partial_\nu\Phi^c(r) \sim \frac{1}{(m_g r)^2}\,\frac{R_S}{r} \gg H^S_{\mu\nu}(r) . \tag{32} $$

This means that the mass of the ‘sixth mode’ eq. (31) is dominated, as expected, by the $\Phi$ background,

$$ m^2_{\text{6th-mode}}(x) \sim \frac{m_g^4 M_{Pl}}{\partial_\mu\partial_\nu\Phi^c} \sim \frac{\Lambda_5^5}{\partial^2\Phi^c} , \tag{33} $$

which is exactly the mass of the ghost eq. (12) we found in the Goldstone language!

7 The sixth mode in the ADM formalism

As we did in the Goldstone language we now show directly in unitary gauge that it is not possible to forbid the propagation of the sixth mode by adding properly tuned higher order terms to the action. We do this in the Hamiltonian formalism, where the counting of degrees of freedom is explicit. Let us introduce the ADM variables $\{N, N_j, \hat g_{ij}\}$ [19],

$$ N \equiv 1/\sqrt{-g^{00}} , \qquad N_j \equiv g_{0j} , \tag{34} $$

where $\hat g_{ij}$ is the 3D metric induced on spatial hypersurfaces of constant $t$. It is well known that in GR $N$ and $N_j$ are not dynamical fields: the Einstein-Hilbert Lagrangian does not contain their time derivatives, so that their conjugate momenta vanish identically. Moreover in the Hamiltonian they appear linearly, as Lagrange multipliers. Therefore their equations of motion are really constraint equations for the other degrees of freedom $\hat g_{ij}$ and their conjugate momenta $\pi^{ij}$, rather than equations for $N$ and $N_j$. These are the so-called momentum and Hamiltonian constraints. The Hamiltonian system is thus reduced to two independent $(q,p)$ pairs, which describe the two graviton modes.

If we now perturb GR by adding a mass term for $h_{\mu\nu}$, or generic interactions involving only $h_{\mu\nu}$ and not its derivatives, the Lagrangian still does not contain time derivatives of $N$ and $N_j$. However in general $N$ and $N_j$ now do not appear linearly in the Hamiltonian. In this case their equations of motion determine their values rather than constraining other degrees of freedom. This raises to six the number of d.o.f. of massive gravity. The Fierz-Pauli tuning of the mass term precisely corresponds to setting to zero the $N^2$ term in the action, so that (at quadratic order) the variation with respect to $N$ still gives a constraint equation, eliminating the unwanted ‘sixth mode’. We want to see if similar tunings can work at all orders.

⁷ A similar result has been obtained in ref. [18], where it has been interpreted as an instability of arbitrarily short time-scale of Minkowski space. This conclusion is not justified from our effective theory point of view.

Expressed in terms of the ADM variables the metric fluctuation $h_{\mu\nu} = g_{\mu\nu} - \eta_{\mu\nu}$ takes the form

$$ h_{\mu\nu} = \begin{pmatrix} 1 - N^2 + \hat g^{kl}N_kN_l & N_j \\ N_i & h_{ij} \end{pmatrix} , \tag{35} $$

where $h_{ij} \equiv \hat g_{ij} - \delta_{ij}$, and $\hat g^{ij}$ is the inverse of $\hat g_{ij}$. We are going to work perturbatively in $\delta N \equiv N - 1$, $N_j$ and $h_{ij}$, so from now on spatial indices are contracted with the Euclidean 3D metric $\delta_{ij}$. Notice that, given the non-linear relation between $h_{\mu\nu}$ and the ADM variables, a generic n-th order expression in $h_{\mu\nu}$ also contributes to orders higher than n when expressed in ADM variables.
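The parametrization of eq. (35) can be cross-checked against the definition of the lapse in eq. (34). The sketch below is an illustration rather than part of the original argument: it builds the metric $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$ from arbitrary sample values of the ADM variables (these numbers are made up), inverts it exactly, and verifies that $-1/g^{00}$ gives back $N^2$. It also checks the standard relation $g^{0i} = \hat g^{ij}N_j/N^2$, which is not stated in the text. The helper `inverse` is a hypothetical name.

```python
from fractions import Fraction as F

def inverse(m):
    """Invert a square matrix of Fractions by Gauss-Jordan elimination."""
    n = len(m)
    aug = [list(row) + [F(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Made-up sample values for the ADM variables (any non-degenerate choice works).
N = F(5, 4)                                # lapse
shift = [F(1, 3), F(-1, 5), F(2, 7)]       # N_j = g_0j
g3 = [[F(3, 2), F(1, 10), F(0)],           # spatial metric \hat g_ij
      [F(1, 10), F(1), F(1, 5)],
      [F(0), F(1, 5), F(4, 5)]]

g3_inv = inverse(g3)
shift_up = [sum(g3_inv[i][j] * shift[j] for j in range(3)) for i in range(3)]
shift_sq = sum(shift[i] * shift_up[i] for i in range(3))

# Metric implied by eq. (35): g_00 = -1 + h_00 = -N^2 + \hat g^{kl} N_k N_l.
g00 = -N**2 + shift_sq
g = [[g00] + shift] + [[shift[i]] + g3[i] for i in range(3)]
g_inv = inverse(g)

lapse_sq_from_inverse = -1 / g_inv[0][0]   # should reproduce N^2, eq. (34)
mixed_ok = all(g_inv[0][i + 1] == shift_up[i] / N**2 for i in range(3))
print(lapse_sq_from_inverse == N**2, mixed_ok)
```

Exact rational arithmetic is used so that the comparison is an identity rather than a floating-point approximation.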

Quadratic Terms. At quadratic order in $h_{\mu\nu}$ the most general Lagrangian is⁸

$$ \mathcal{L}_2 = a_2[h^2] + b_2[h]^2 . \tag{36} $$

By plugging eq. (35) in this expression we find the term proportional to $\delta N^2$,

$$ \mathcal{L}_2 \supset 4(a_2 + b_2)\,\delta N^2 , \tag{37} $$

hence the Fierz-Pauli tuning $b_2 = -a_2$ to set it to zero. The coefficient $a_2$ fixes the mass of the graviton, and for our purposes we can take it to be 1. At quadratic level (in ADM variables) the problem is solved, but $\mathcal{L}_2$ also contributes to third and fourth order terms; in particular it contains a term $-2h_{ii}\,\delta N^2$, which upon variation with respect to $\delta N$ gives rise to an equation for $\delta N$ itself rather than to a constraint. We are therefore forced to introduce cubic terms in $h_{\mu\nu}$.

Cubic Terms. The cubic Lagrangian is

$$ \mathcal{L}_3 = a_3[h^3] + b_3[h][h^2] + c_3[h]^3 . \tag{38} $$

Cubic terms in ADM variables come both from this and from $\mathcal{L}_2$. In particular, those involving more than one $\delta N$ are

$$ \mathcal{L}_2 + \mathcal{L}_3 \supset (12c_3 + 4b_3 - 2)\,h_{ii}\,\delta N^2 + 8(a_3 + b_3 + c_3)\,\delta N^3 . \tag{39} $$

We can set both of them to zero by choosing

$$ a_3 = 2c_3 - \tfrac{1}{2} , \qquad b_3 = \tfrac{1}{2} - 3c_3 . \tag{40} $$

This agrees with what we found in the Goldstone language, and again the coefficient $c_3$ is still undetermined. Now we are forced to introduce quartic terms in $h_{\mu\nu}$ to cancel undesired quartic terms containing $\delta N^n$ coming both from $\mathcal{L}_2$ and $\mathcal{L}_3$.
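The quadratic and cubic tunings can be verified by brute-force expansion in a simplified configuration. The sketch below is a check under stated assumptions, not the computation of the paper: it sets $N_j = 0$ and $h_{ij} = {\rm diag}(u,0,0)$, so that $h_{ii} = u$ and the mixed tensor $h^\mu{}_\nu$ is diagonal with entries $(x, u, 0, 0)$, $x = 2\delta N + \delta N^2$. Polynomials in $(\delta N, u)$ are stored as dictionaries, and all function names are illustrative.

```python
from fractions import Fraction as F
from collections import defaultdict

# Polynomials in (d, u) stored as {(power_of_d, power_of_u): coefficient},
# with d = delta N and u = h_ii in the simplified configuration.
def mul(p, q):
    out = defaultdict(F)
    for (i, j), a in p.items():
        for (k, l), b in q.items():
            out[(i + k, j + l)] += a * b
    return dict(out)

def add(*terms):
    """Linear combination: add((c1, p1), (c2, p2), ...)."""
    out = defaultdict(F)
    for coeff, poly in terms:
        for key, a in poly.items():
            out[key] += coeff * a
    return dict(out)

def power(p, n):
    out = {(0, 0): F(1)}
    for _ in range(n):
        out = mul(out, p)
    return out

# From eq. (35) with N_j = 0: h_00 = 1 - N^2 = -(2 d + d^2), so that
# h^0_0 = x = 2 d + d^2 when indices are raised with eta.
x = {(1, 0): F(2), (2, 0): F(1)}
u = {(0, 1): F(1)}

def tr(n):
    """[h^n] = x^n + u^n in this configuration."""
    return add((1, power(x, n)), (1, power(u, n)))

def dangerous_terms(a2, b2, a3, b3, c3):
    """Coefficients of dN^2, h_ii dN^2 and dN^3 in L2 + L3, eqs. (36), (38)."""
    h1, h2, h3 = tr(1), tr(2), tr(3)
    lag = add((a2, h2), (b2, mul(h1, h1)),
              (a3, h3), (b3, mul(h1, h2)), (c3, power(h1, 3)))
    return (lag.get((2, 0), F(0)), lag.get((2, 1), F(0)), lag.get((3, 0), F(0)))

# With a2 = 1, b2 = -1 and the choice of eq. (40), all three should vanish
# for any value of c3.
for c3 in (F(0), F(1, 3), F(-2), F(7, 5)):
    a3, b3 = 2 * c3 - F(1, 2), F(1, 2) - 3 * c3
    print(c3, dangerous_terms(F(1), F(-1), a3, b3, c3))
```

Calling `dangerous_terms` with the Fierz-Pauli values $a_2 = 1$, $b_2 = -1$ but generic cubic coefficients returns $(0,\ 12c_3 + 4b_3 - 2,\ 8(a_3 + b_3 + c_3))$, which is the content of eq. (39); the restricted configuration cannot distinguish $h_{ii}^2$ from $h_{ij}h_{ij}$, so it is not sufficient for the quartic analysis.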

Quartic Terms. The quartic Lagrangian is

$$ \mathcal{L}_4 = a_4[h^4] + b_4[h][h^3] + c_4[h^2]^2 + d_4[h]^2[h^2] + e_4[h]^4 . \tag{41} $$

Again, we are only interested in terms involving powers of $\delta N$ larger than 1. For symmetry reasons these must be of the form

$$ \mathcal{L}_2 + \mathcal{L}_3 + \mathcal{L}_4 \supset \left(A\,h_{ii}^2 + B\,h_{ij}h_{ij} + C\,N_jN_j\right)\delta N^2 + D\,h_{ii}\,\delta N^3 + E\,\delta N^4 . \tag{42} $$

⁸ In this section all contractions are done with the flat metric $\eta_{\mu\nu}$ and there is no $\sqrt{-g}$ in the action. Different conventions are equivalent in unitary gauge: they just reshuffle the coefficients in the expansion in $h_{\mu\nu}$.


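After a straightforward but lengthy computation we find the relationship between the coefficients $(A,\dots,E)$ and $(c_3,a_4,\dots,e_4)$,

$$ \begin{pmatrix} A \\ B \\ C \\ D \\ E \end{pmatrix} = \begin{pmatrix} 3 & 0 & 0 & 0 & 4 & 24 \\ -3 & 0 & 0 & 8 & 4 & 0 \\ 0 & -16 & -12 & -16 & -8 & 0 \\ 0 & 0 & 8 & 0 & 16 & 32 \\ 0 & 16 & 16 & 16 & 16 & 16 \end{pmatrix} \cdot \begin{pmatrix} c_3 \\ a_4 \\ b_4 \\ c_4 \\ d_4 \\ e_4 \end{pmatrix} + \begin{pmatrix} 0 \\ 1/2 \\ 1/2 \\ 2 \\ 0 \end{pmatrix} . \tag{43} $$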

We would like to set the vector (A,...,E) to zero. Since we have 6 free coefficients (c3,a4,...,e4)

to choose and only 5 conditions to satisfy, one naively expects this to be possible and one of the free

coefficients to remain undetermined. On the contrary, it is impossible. This is because the matrix above

has rank 4 and the space spanned by it does not contain the inhomogeneous term. There is no way of

choosing (c3,a4,...,e4) to make the unwanted expression eq. (42) vanish.
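The rank statement is easy to check by exact linear algebra. The sketch below uses the matrix and inhomogeneous term as read from eq. (43), with rows ordered as $(A,B,C,D,E)$ and columns as $(c_3,a_4,b_4,c_4,d_4,e_4)$. The helper `rank` and the left null vector `w` are introduced here for illustration and do not appear in the text.

```python
from fractions import Fraction as F

# Matrix and inhomogeneous term of eq. (43).
M = [
    [3,   0,   0,   0,  4, 24],
    [-3,  0,   0,   8,  4,  0],
    [0, -16, -12, -16, -8,  0],
    [0,   0,   8,   0, 16, 32],
    [0,  16,  16,  16, 16, 16],
]
v = [F(0), F(1, 2), F(1, 2), F(2), F(0)]

def rank(rows):
    """Rank by exact Gaussian elimination over the rationals."""
    rows = [[F(x) for x in row] for row in rows]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                f = rows[i][col] / rows[r][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

# Setting (A, ..., E) = 0 means solving M c = -v; this is possible only if
# the augmented matrix has the same rank as M.
augmented = [row + [-vi] for row, vi in zip(M, v)]
rank_M, rank_aug = rank(M), rank(augmented)

# A left null vector of M makes the obstruction explicit: w . M = 0
# while w . v is non-zero.
w = [0, 0, 2, -1, 2]
w_dot_columns = [sum(w[i] * M[i][j] for i in range(5)) for j in range(6)]
w_dot_v = sum(wi * vi for wi, vi in zip(w, v))
print(rank_M, rank_aug, w_dot_columns, w_dot_v)
```

In words, the combination $2C - D + 2E$ is independent of all six coefficients and equal to the non-zero value `w_dot_v`, so $C$, $D$ and $E$ cannot vanish simultaneously.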

In summary, we tried to tune all interactions $h^n_{\mu\nu}$ in order to keep the Hamiltonian linear in $N$ (or equivalently in $\delta N$), this to ensure the presence of a constraint equation that eliminates the troublesome sixth degree of freedom. We found that when fourth order terms are taken into account this tuning is impossible. This agrees with our result in the Goldstone language, namely that it is impossible to tune fourth order interactions to avoid the propagation of a ghost.

Notice however that the ADM analysis we sketched in this section is useful to explicitly count the

number of degrees of freedom but, unlike the Goldstone approach, gives us no clue on the typical mass of

these modes. From the effective field theory point of view this additional information is crucial: a massive

mode with a mass above the cutoff can be consistently discarded, even if it is a ghost or a tachyon. Also

we have not checked in this language that the sixth mode is a ghost. To do so one should compute the

Hamiltonian and see that it is not positive definite. This approach is rather cumbersome and it has been carried out in ref. [2] only for a limited set of non-linear terms, namely functions of the Fierz-Pauli mass term: $f(h_{\mu\nu}^2 - h^2)$.

8 Concluding remarks

In this paper we have shown that in massive gravity the classical solutions around a source are plagued by ghost instabilities. This holds for any choice of the non-linear terms one can add to the Fierz-Pauli action. It is known that massive gravity is an effective field theory whose UV cut-off can be pushed at most up to $\Lambda_3 = (m_g^2 M_P)^{1/3}$. In this optimal case the ghost instability enters in the effective field theory at a distance from the source $R_{\rm ghost} \sim 1/\Lambda_3 \cdot (M_*/M_P)^{1/2}$. This distance is huge, much bigger than the Vainshtein radius $R_V \sim 1/\Lambda_3 \cdot (M_*/M_P)^{1/3}$. For instance taking $m_g \sim H_0$, for an astrophysical source like the Sun $R_{\rm ghost} \sim 10^{22}$ km, of order of the cosmological horizon! One could optimistically postulate that new physics enters at energies of order of the (local) ghost mass and cures the instability; even under this hypothesis, when the mass of the ghost becomes of order of the inverse distance from the source there is no way of proceeding further without specifying the UV completion of the theory. This happens at the Vainshtein radius (of order $10^{16}$ km for the Sun); as the vDVZ discontinuity can be cured only inside the Vainshtein radius, General Relativity is never a good approximation.

One could argue that the very low cut-off of massive gravity is already bad enough to disregard the theory. For instance, if one assumes that the theory has a generic series of higher dimension operators suppressed by the cut-off, these will become important when evaluated on a classical solution at huge length scales, parametrically bigger than the inverse cut-off [9]. This implies that it is impossible to calculate the classical solution around a source without specifying the UV completion. However this problem depends on the high energy completion of the theory, and its precise formulation is subtler than what one can argue at first sight. For example in the DGP model, which in this respect is very similar to massive gravity, one can make consistent assumptions on the higher dimension terms which make predictions independent of the UV completion [13]. Of course a prerequisite for this to work is that the classical theory is free from pathologies and instabilities. As we have shown this is not the case in massive gravity: ghost instabilities are unavoidable. The theory is inconsistent already at the classical level, before taking into account quantum effects.

Acknowledgments

We warmly thank N. Arkani-Hamed, S. Dubovsky, M. Luty, F. Piazza, L. Pilo, R. Rattazzi, M. D. Schwartz,

R. Sundrum, T. Wiseman, and A. Zaffaroni for useful discussions and comments. We also would like to

thank the CERN Theoretical Physics Division for hospitality during this project.

References

[1] M. Fierz and W. Pauli, “On Relativistic Wave Equations For Particles Of Arbitrary Spin In An Electromagnetic Field,” Proc. Roy. Soc. Lond. A 173 (1939) 211.

[2] D. G. Boulware and S. Deser, “Can Gravitation Have A Finite Range?,” Phys. Rev. D 6, 3368 (1972).

[3] H. van Dam and M. J. G. Veltman, “Massive And Massless Yang-Mills And Gravitational Fields,” Nucl. Phys. B 22 (1970) 397.

[4] V. I. Zakharov, “Linearized gravitation theory and the graviton mass,” Sov. Phys. JETP Lett. 12 (1970) 312.

[5] A. I. Vainshtein, “To The Problem Of Nonvanishing Gravitation Mass,” Phys. Lett. B 39 (1972) 393.

[6] C. Deffayet, G. R. Dvali, G. Gabadadze and A. I. Vainshtein, “Nonperturbative continuity in graviton mass versus perturbative discontinuity,” Phys. Rev. D 65, 044026 (2002) [hep-th/0106001].

[7] A. Gruzinov, “On the graviton mass,” astro-ph/0112246.

[8] M. Porrati, “Fully covariant van Dam-Veltman-Zakharov discontinuity, and absence thereof,” Phys. Lett. B 534, 209 (2002) [hep-th/0203014].

[9] N. Arkani-Hamed, H. Georgi and M. D. Schwartz, “Effective field theory for massive gravitons and gravity in theory space,” Annals Phys. 305 (2003) 96 [hep-th/0210184].

[10] M. D. Schwartz, “Constructing gravitational dimensions,” Phys. Rev. D 68, 024029 (2003) [hep-th/0303114].

[11] G. R. Dvali, G. Gabadadze and M. Porrati, “4D gravity on a brane in 5D Minkowski space,” Phys. Lett. B 485 (2000) 208 [hep-th/0005016].

[12] M. A. Luty, M. Porrati and R. Rattazzi, “Strong interactions and stability in the DGP model,” JHEP 0309 (2003) 029 [hep-th/0303116].

[13] A. Nicolis and R. Rattazzi, “Classical and quantum consistency of the DGP model,” JHEP 0406 (2004) 059 [hep-th/0404159].

[14] N. Arkani-Hamed, H. C. Cheng, M. A. Luty and S. Mukohyama, “Ghost condensation and a consistent infrared modification of gravity,” JHEP 0405, 074 (2004) [hep-th/0312099].

[15] V. A. Rubakov, “Lorentz-violating graviton masses: Getting around ghosts, low strong coupling scale and VDVZ discontinuity,” hep-th/0407104.

[16] S. L. Dubovsky, “Phases of massive gravity,” JHEP 0410, 076 (2004) [hep-th/0409124].

[17] R. Rattazzi, “A new dimension at ultra large scales and its price,” talk at SUSY2K, unpublished, http://wwwth.cern.ch/susy2k/susy2kfinalprog.html.

[18] G. Gabadadze and A. Gruzinov, “Graviton mass or cosmological constant?,” hep-th/0312074.

[19] R. Arnowitt, S. Deser and C. W. Misner, “Canonical Variables for General Relativity,” Phys. Rev. 117, 1595 (1960); R. Arnowitt, S. Deser and C. W. Misner, “The Dynamics Of General Relativity,” gr-qc/0405109.