New light on Hensel’s lemma

David Brink

March 2006

Abstract: The historical development of Hensel’s lemma is briefly discussed (section 1). Using Newton polygons, a simple proof of a general Hensel’s lemma for separable polynomials over Henselian fields is given (section 3). For polynomials over algebraically closed, valued fields, best possible results on continuity of roots (section 4) and continuity of factors (section 6) are demonstrated. Using this and a general Krasner’s lemma (section 7), we give a short proof of a general Hensel’s lemma and show that it is, in a certain sense, best possible (section 8). All valuations here are non-archimedean and of arbitrary rank. The article is practically self-contained.

1 Introduction and historical remarks

The p-adic numbers were introduced in 1904 by Hensel in Neue Grundlagen der Arithmetik. In the same article, Hensel showed that if a monic polynomial $f$ with integral p-adic coefficients has an approximate factorisation $f \approx gh$, meaning that the coefficients of the difference $f - gh$ are p-adically smaller than the discriminant of $f$, then there exists an exact factorisation $f = g^* h^*$. Four years later, in 1908, Hensel gave a somewhat more general result in his book Theorie der algebraischen Zahlen, where $f$ is no longer assumed monic, and the discriminant of $f$ is replaced by the squared resultant of $g$ and $h$.

Since then, many variations and generalisations of Hensel’s result have been found, some of which bear only little resemblance to the original. Confusingly, all these theorems are known today as “Hensel’s lemma”. We mention here the most important. Kürschák (1913) introduced real valuations on the abstract fields recently defined by Steinitz and indicated that Hensel’s arguments would carry over to complete, non-archimedean valued fields. Rychlík (1919) undertook these generalisations explicitly. Krull (1932) introduced general valuations, gave a new concept of completeness, and showed that a weak Hensel’s lemma ($g$ and $h$ are assumed relatively prime modulo the valuation ideal) holds for such fields. Nagata (1953) showed that if a weak Hensel’s lemma holds in some field with a valuation $v$, then the original Hensel’s lemma holds too under the extra assumption that $v(f - gh) - 2v(\mathrm{Res}(g, h))$ is not contained in the maximal convex subgroup of the value group not containing $v(\mathrm{Res}(g, h))$. Rim (1957) and Rayner (1958) proved that the unique extension property implies weak forms of Hensel’s lemma. Ribenboim (1985) showed the logical equivalence between these and other “Hensel’s lemmas”. The reader is referred to the very interesting paper of Roquette (2002) regarding the history of Hensel’s lemma and valuation theory in general.

In the present paper, a new proof of Hensel’s lemma is presented that generalises the original in another direction, namely with respect to the accuracy of the approximate factorisation. It will be seen that the discriminant and the resultant disappear completely. They are replaced by two new polynomial invariants, here called the separant and the bipartitionant. The core of the proof is an analysis of the continuous behaviour of the roots of a polynomial as functions of the coefficients. These arguments, in contrast to earlier proofs, work equally well for arbitrary as for real valuations and make Nagata’s extra assumption superfluous. The only thing we need is that the valuation has the unique extension property.

After proving his lemma in Hensel (1908), Hensel demonstrated the following: If the p-adic polynomial $F$ of degree $\nu$ has an approximate root $\xi_0$ satisfying

$$\rho > \max_{i = 2, 3, \dots, \nu} \frac{i\rho' - \rho^{(i)}}{i - 1} \tag{1}$$

where $\rho$ is the value of $F(\xi_0)$, and $\rho^{(i)}$ is the value of $F^{(i)}(\xi_0)/i!$, then Newton approximation gives an exact root $\xi$, provided that the values $\rho', \rho'', \dots, \rho^{(\nu)}$ remain unchanged during the approximation process. In a short note from 1924, Rella showed the last condition to follow from (1). Our general Hensel’s lemma will be seen to cover this Hensel-Rella criterion.

As noted by Rella in 1927, the existence of $\xi$ is an almost immediate consequence of the Newton polygon method, a ubiquitous theme of this article. The p-adic Newton polygon was introduced by Dumas already in 1906 and later studied by Kürschák, Rella, and Ostrowski, but surprisingly never mentioned by Hensel.

2 Valuations, Newton polygons, and the unique extension property

Consider a field $K$. By a valuation on $K$ we understand a map $v$ from $K$ into a totally ordered, additively written abelian group with infinity $\Gamma \cup \{\infty\}$ satisfying $v(0) = \infty$, $v(x) \in \Gamma$ if $x \ne 0$, $v(xy) = v(x) + v(y)$, and the strong triangle inequality $v(x + y) \ge \min\{v(x), v(y)\}$. In this situation, the pair $(K, v)$ is called a valued field, $v(x)$ is called the value of $x \in K$, and $x$ is called integral if $v(x) \ge 0$. If $\Gamma$ is order-isomorphic to a subgroup of $\mathbb{R}^+$, the additive group of real numbers, the valuation is called real (the term “rank 1” is also standard). Sometimes we will use that $\Gamma$ has division from $\mathbb{N}$. This may indeed be assumed without loss of generality, for we can always embed $\Gamma$ into some larger group $\Gamma'$ having that property. For a polynomial $f = a_0 X^n + a_1 X^{n-1} + \dots + a_n$ with coefficients in $K$, we define $v(f) := \min\{v(a_0), \dots, v(a_n)\}$.
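As a concrete illustration of these axioms, the following minimal sketch implements the p-adic valuation on the rational numbers and the induced value $v(f)$ of a polynomial. The function names `vp` and `v_poly` and the sample numbers are assumptions made for this example only; they do not appear in the article.

```python
from fractions import Fraction
import math

def vp(x, p):
    """p-adic value of a rational number x, with v(0) = +infinity."""
    x = Fraction(x)
    if x == 0:
        return math.inf
    n, d, v = x.numerator, x.denominator, 0
    while n % p == 0:      # powers of p in the numerator count positively
        n //= p
        v += 1
    while d % p == 0:      # powers of p in the denominator count negatively
        d //= p
        v -= 1
    return v

def v_poly(coeffs, p):
    """v(f) := minimum of the values of the coefficients of f."""
    return min(vp(c, p) for c in coeffs)

x, y = Fraction(12), Fraction(5, 8)
# multiplicativity: v(xy) = v(x) + v(y)
assert vp(x * y, 2) == vp(x, 2) + vp(y, 2)
# strong triangle inequality: v(x + y) >= min{v(x), v(y)}
assert vp(x + y, 2) >= min(vp(x, 2), vp(y, 2))
# f = X^3 - 6X^2 + 8X has integral coefficients and v(f) = 0
assert v_poly([1, -6, 8, 0], 2) == 0
```

The value group here is the integers, so this is a real (rank 1) valuation; the article itself allows value groups of arbitrary rank.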

The Newton polygon is a simple, yet powerful tool in valuation theory. It seems to have been always restricted to the case of real valuations, so we give here a definition for arbitrary valuations in the above sense. Consider a polynomial $f = a_0 X^n + a_1 X^{n-1} + \dots + a_n$ of degree $n > 0$ with coefficients and roots in a valued field $(K, v)$. Usually, it is difficult to compute the roots by means of the coefficients, but in contrast to this, it is easy to compute the values of the roots by means of the values of the coefficients. Define $f$’s Newton polygon as the maximal convex map $\mathrm{NP} : \{0, 1, \dots, n\} \to \Gamma \cup \{\infty\}$ satisfying $\mathrm{NP}(i) \le v(a_i)$ for all $i$. By “convex” is understood the obvious, i.e. that $2 \cdot \mathrm{NP}(i) \le \mathrm{NP}(i-1) + \mathrm{NP}(i+1)$ for all $i \ne 0, n$. The differences $\mathrm{NP}(i) - \mathrm{NP}(i-1)$, with the convention $\infty - \infty = \infty$, are the slopes of $\mathrm{NP}$. They form an increasing sequence. Now write $f = a_0 \cdot \prod_{i=1}^{n} (X - \alpha_i)$ such that $v(\alpha_1) \le \dots \le v(\alpha_n)$. Then $v(\alpha_i) = \mathrm{NP}(i) - \mathrm{NP}(i-1)$ for all $i = 1, \dots, n$. In words, the values of the roots of a polynomial equal the slopes of its Newton polygon. The conceptually easy, but notationally cumbersome proof expresses the $a_i$ as elementary symmetric functions in the $\alpha_i$ whereupon the $v(a_i)$ are computed from the $v(\alpha_i)$ using the strong triangle inequality.
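The statement that root values equal Newton polygon slopes can be checked on small cases. The sketch below computes the maximal convex map below the points $(i, v(a_i))$ as a lower convex hull, for the 2-adic valuation on integers. The helper names (`v2`, `newton_polygon`, `slopes`) and the restriction to integer-valued valuations are assumptions of this illustration, not part of the article.

```python
from fractions import Fraction
import math

def v2(x):
    """2-adic value of an integer, with v2(0) = +infinity."""
    if x == 0:
        return math.inf
    v = 0
    while x % 2 == 0:
        x //= 2
        v += 1
    return v

def newton_polygon(values):
    """Largest convex map NP on {0, ..., n} with NP(i) <= values[i].

    values[0] (the value of the leading coefficient) must be finite.
    """
    n = len(values) - 1
    pts = [(i, Fraction(v)) for i, v in enumerate(values) if v != math.inf]
    hull = []
    for px, py in pts:
        # drop the last vertex while it lies on or above the chord to the new point
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            if (y2 - y1) * (px - x1) >= (py - y1) * (x2 - x1):
                hull.pop()
            else:
                break
        hull.append((px, py))
    np_vals = [math.inf] * (n + 1)      # beyond the last finite point NP is infinite
    np_vals[hull[0][0]] = hull[0][1]
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        for i in range(x1 + 1, x2 + 1):  # interpolate linearly along each edge
            np_vals[i] = y1 + (y2 - y1) * Fraction(i - x1, x2 - x1)
    return np_vals

def slopes(np_vals):
    """Differences NP(i) - NP(i-1), using the convention inf - inf = inf."""
    return [math.inf if b == math.inf else b - a
            for a, b in zip(np_vals, np_vals[1:])]

# f = X(X - 2)(X - 4) = X^3 - 6X^2 + 8X over the 2-adic numbers
coeffs = [1, -6, 8, 0]
roots = [0, 2, 4]
np_f = newton_polygon([v2(c) for c in coeffs])
assert slopes(np_f) == sorted(v2(r) for r in roots)   # slopes 1, 2, infinity
```

Exact rational arithmetic is used because slopes may be fractional even when all coefficient values are integers.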

We call a valued field $(K, v)$ Henselian if it has the unique extension property, i.e. if $v$ has a unique extension (also denoted $v$) to the algebraic closure $\tilde{K}$ of $K$. Note that the existence of a valuation extension is automatic with this definition. The unique extension property is, as a matter of fact, equivalent to many (maybe all) variants of Hensel’s lemma, see for instance Ribenboim (1985). We actually only use a certain consequence of the unique extension property, namely this: any $K$-automorphism $\sigma$ of $\tilde{K}$ is isometric with respect to $v$ (since otherwise $v \circ \sigma$ would be an extension different from $v$). The slopes of the Newton polygon of an irreducible polynomial over a Henselian field are thus all the same, an observation due to Ostrowski (1935).

3 The separant and the “separable Hensel’s lemma”

For a monic polynomial $f = \prod_{k=1}^{n} (X - \alpha_k)$ of degree $n > 1$ with roots in a valued field $(K, v)$, we define the polynomial invariant

$$S = \max\{v(f'(\alpha_k)) + v(\alpha_k - \alpha_l) \mid k \ne l\}$$

and call it $f$’s separant. Note $f'(\alpha_k) = \prod_{l \ne k} (\alpha_k - \alpha_l)$ and that $S < \infty$ iff $f$ is separable (i.e. $f$ has no multiple roots). A monic polynomial with integral coefficients has integral roots. So if $f$ has integral coefficients, $S$ is less than or equal to the value of $f$’s discriminant $\mathrm{disc}(f) = \prod_{k<l} (\alpha_k - \alpha_l)^2$. Therefore, the following “separable Hensel’s lemma” generalises the Hensel’s lemma of 1904.

Theorem 1 (separable Hensel’s lemma). Let $f$ and $f^*$ be monic polynomials of common degree $n > 1$ with integral coefficients in a Henselian field $(K, v)$. Assume $v(f - f^*) > S$ where $S$ is the separant of $f$. Then $f$ and $f^*$ are both separable, and we may write $f = \prod_{k=1}^{n} (X - \alpha_k)$ and $f^* = \prod_{k=1}^{n} (X - \alpha_k^*)$ such that $K(\alpha_k) = K(\alpha_k^*)$ for all $k$.

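Proof. Since $S$ is finite, $f$ is separable. Write $f = \prod_{k=1}^{n} (X - \alpha_k)$ and fix a $k$. The Newton polygon $\mathrm{NP}$ of $f(X + \alpha_k)$ has $\mathrm{NP}(n) = \infty$ and $\mathrm{NP}(n-1) = v(f'(\alpha_k))$. The root $\alpha_k$ is integral, and therefore $v(f(X + \alpha_k) - f^*(X + \alpha_k)) = v(f - f^*)$. Consequently, the assumption $v(f - f^*) > S$ implies that the Newton polygon $\mathrm{NP}^*$ of $f^*(X + \alpha_k)$ satisfies

$$\mathrm{NP}^*(i) = \begin{cases} \mathrm{NP}(i) & \text{for } i < n \\ v(f^*(\alpha_k)) > S & \text{for } i = n \end{cases}$$

Hence, $f^*$ has a root $\alpha_k^*$ with $v(\alpha_k^* - \alpha_k) = \mathrm{NP}^*(n) - \mathrm{NP}^*(n-1) > S - v(f'(\alpha_k)) \ge v(\alpha_k - \alpha_l)$ for all $l$ different from $k$.

This way we get $n$ distinct roots $\alpha_1^*, \dots, \alpha_n^*$ of $f^*$ such that $v(\alpha_k^* - \alpha_k) > v(\alpha_k - \alpha_l)$ for all distinct $k$ and $l$. Now Krasner’s lemma (see section 7) gives $K(\alpha_k) = K(\alpha_k^*)$ for all $k$. Naturally, $f^* = \prod_{k=1}^{n} (X - \alpha_k^*)$.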

So if a polynomial $f$ is separable, then any other polynomial $f^*$ having coefficients sufficiently close to those of $f$ has the same factorisation as $f$. This fails to be true if $f$ has multiple roots. Over the field of dyadic numbers $\mathbb{Q}_2$, for instance, $f = X^2$ is reducible, but $f^* = X^2 + 2^\nu$ is irreducible for any $\nu$.

Example. Consider the polynomial $f = X(X - 2)(X - 4) = X^3 - 6X^2 + 8X$ over $\mathbb{Q}_2$. It has separant $S = 5$ (whereas the value of the discriminant is 8). Hence, the polynomial $f^* = f + 2^\nu$ has 3 distinct roots in $\mathbb{Q}_2$ for all $\nu > 5$. For $\nu = 5$, however, $f^*$ has an irreducible quadratic factor over $\mathbb{Q}_2$, showing that the bound $v(f - f^*) > S$ is best possible.
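The two numbers quoted in this example can be recomputed directly from the definitions. The following sketch evaluates the separant and the value of the discriminant from a list of integer roots, using the 2-adic valuation. The function names are assumptions of this illustration, and the roots are taken to be ordinary integers for simplicity.

```python
import math

def v2(x):
    """2-adic value of an integer, with v2(0) = +infinity."""
    if x == 0:
        return math.inf
    v = 0
    while x % 2 == 0:
        x //= 2
        v += 1
    return v

def separant(roots):
    """S = max{ v(f'(a_k)) + v(a_k - a_l) : k != l } for f = prod (X - a_k)."""
    best = -math.inf
    for k, ak in enumerate(roots):
        # f'(a_k) is the product of the differences a_k - a_l over l != k
        deriv = math.prod(ak - al for l, al in enumerate(roots) if l != k)
        for l, al in enumerate(roots):
            if l != k:
                best = max(best, v2(deriv) + v2(ak - al))
    return best

def discriminant_value(roots):
    """Value of disc(f) = prod_{k<l} (a_k - a_l)^2."""
    return sum(2 * v2(roots[k] - roots[l])
               for k in range(len(roots)) for l in range(k + 1, len(roots)))

roots = [0, 2, 4]                          # f = X(X - 2)(X - 4)
assert separant(roots) == 5
assert discriminant_value(roots) == 8
assert separant([0, 0, 2]) == math.inf     # a multiple root gives S = infinity
```

The last assertion reflects the remark that $S < \infty$ exactly when $f$ is separable.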

4 Error functions and continuity of roots

Consider two monic polynomials $f$ and $f^*$ of common degree $n > 1$ with coefficients in an algebraically closed, valued field $(K, v)$. Since the coefficients of a polynomial can be expressed as elementary symmetric functions of the roots, the coefficients depend continuously on the roots. More precisely, if we write $f = \prod_{k=1}^{n} (X - \alpha_k)$ and $f^* = \prod_{k=1}^{n} (X - \alpha_k^*)$ in any way, then $v(f - f^*) \ge \min\{v(\alpha_1 - \alpha_1^*), \dots, v(\alpha_n - \alpha_n^*)\}$.
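The inequality can be tried out numerically. The sketch below expands two monic polynomials from integral roots and compares $v(f - f^*)$ with the minimum of the root distances in the 2-adic valuation. The perturbed roots 8, 18, 36 are arbitrary sample data chosen for this illustration, and integrality of the roots is an assumption of the example.

```python
import math

def v2(x):
    """2-adic value of an integer, with v2(0) = +infinity."""
    if x == 0:
        return math.inf
    v = 0
    while x % 2 == 0:
        x //= 2
        v += 1
    return v

def poly_from_roots(roots):
    """Coefficients (highest degree first) of prod (X - r)."""
    coeffs = [1]
    for r in roots:
        # multiply the current polynomial by (X - r)
        coeffs = [a - r * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

roots = [0, 2, 4]
roots_star = [8, 18, 36]          # each root moved by 8, 16, 32 respectively

f = poly_from_roots(roots)
f_star = poly_from_roots(roots_star)

v_diff = min(v2(a - b) for a, b in zip(f, f_star))              # v(f - f*)
root_bound = min(v2(a - b) for a, b in zip(roots, roots_star))  # min v(a_k - a_k*)
assert v_diff >= root_bound
```

Here both sides equal 3, so the inequality is attained for this particular pairing of the roots.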

The opposite, that the roots depend continuously on the coefficients, is less evident; it is not even clear what is to be understood by such a statement. The known results in this direction are of a qualitative nature and do not work well for polynomials with multiple roots.

Define the error function of the root $\alpha$ of $f$ as the map $\Phi : \Gamma \cup \{\infty\} \to \Gamma \cup \{\infty\}$ given by

$$\Phi(x) = \sum_{l=1}^{n} \min\{x, v(\alpha - \alpha_l)\}.$$

It is a strictly increasing, piecewise linear (i.e. piecewise of the form $x \mapsto \nu x + \gamma$), bijective (since $\Gamma$ is assumed to have division from $\mathbb{N}$) map with decreasing slopes $\nu$ from the set $\{1, \dots, n\}$. If $\Psi$ is the error function of the root $\beta$ of $f$, the strong triangle inequality gives

$$\Phi(x) = \Psi(x) \text{ for all } x \le v(\alpha - \beta). \tag{2}$$

Using error functions, we can now bound the error on the roots of a polynomial caused by an error on the coefficients.
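To make the error function tangible, the following sketch builds $\Phi$ and its inverse for a root of a polynomial with integer roots under the 2-adic valuation. The inverse is found by solving the linear equation on each piece between breakpoints. The name `make_error_function` and the piecewise search are choices made for this illustration and are not taken from the article.

```python
from fractions import Fraction
import math

def v2(x):
    """2-adic value of an integer, with v2(0) = +infinity."""
    if x == 0:
        return math.inf
    v = 0
    while x % 2 == 0:
        x //= 2
        v += 1
    return v

def make_error_function(root, roots):
    """Return (phi, phi_inv) for the given root of prod (X - r), r in roots."""
    dists = [v2(root - a) for a in roots]   # contains +infinity for the root itself

    def phi(x):
        return sum(min(x, d) for d in dists)

    def phi_inv(y):
        finite = sorted(d for d in dists if d != math.inf)
        n = len(dists)
        for j in range(len(finite) + 1):
            # on this piece, phi(x) = (n - j) * x + sum of the j smallest distances
            x = Fraction(y - sum(finite[:j]), n - j)
            lower_ok = j == 0 or x >= finite[j - 1]
            upper_ok = j == len(finite) or x <= finite[j]
            if lower_ok and upper_ok:
                return x
        raise ValueError("no preimage found")

    return phi, phi_inv

roots = [0, 2, 4]                       # f = X(X - 2)(X - 4)
phi1, phi1_inv = make_error_function(0, roots)
phi2, phi2_inv = make_error_function(2, roots)

assert phi1(2) == 5 and phi1_inv(5) == 2
assert phi1_inv(4) == Fraction(3, 2)
# relation (2): the error functions of the roots 0 and 2 agree up to v(0 - 2) = 1
assert all(phi1(x) == phi2(x) for x in (0, Fraction(1, 2), 1))
```

The slope on each piece is the number of roots at distance greater than $x$ from the chosen root, which is why the slopes decrease as $x$ grows.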

Theorem 2 (continuity of roots). Let $f$ and $f^*$ be monic polynomials of common degree $n > 1$ with integral coefficients in an algebraically closed, valued field $(K, v)$. We may then write $f = \prod_{k=1}^{n} (X - \alpha_k)$ and $f^* = \prod_{k=1}^{n} (X - \alpha_k^*)$ such that $v(\alpha_k - \alpha_k^*) \ge \Phi_k^{-1}(v(f - f^*))$ for each $k$. Here $\Phi_k$ denotes the error function of the root $\alpha_k$ of $f$.

Proof. Write $f = \prod_{k=1}^{n} (X - \alpha_k)$ and put $\rho_k := \Phi_k^{-1}(v(f - f^*))$ for each $k$. We may assume $0 < v(f - f^*) < \infty$, and hence $0 < \rho_k < \infty$ for each $k$, since otherwise the claim is trivial. We show, for each $k$, that $f$ and $f^*$ have the same number of roots (counted with multiplicity) in the ball $\{x \in K \mid v(x - \alpha_k) \ge \rho_k\}$. It will then follow (for instance by assuming $\rho_1 \ge \rho_2 \ge \dots$ and then choosing $\alpha_1^*, \alpha_2^*, \dots$, in that order, such that, for each $k$, $\alpha_k^*$ is a root of $f^* / \prod_{l=1}^{k-1} (X - \alpha_l^*)$ and has $v(\alpha_k^* - \alpha_k) \ge \rho_k$) that we can write $f^* = \prod_{k=1}^{n} (X - \alpha_k^*)$ such that $v(\alpha_k^* - \alpha_k) \ge \rho_k$ for each $k$.

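So fix a $k$. Let $\mu$ be the number of indices $l$ with $v(\alpha_l - \alpha_k) < \rho_k$. We must show that the number of indices $l$ with $v(\alpha_l^* - \alpha_k) < \rho_k$ is also $\mu$. Consider the Newton polygon $\mathrm{NP}$ of $f(X + \alpha_k) = X^n + a_1 X^{n-1} + \dots + a_n$. The slopes of $\mathrm{NP}$ are $v(\alpha_1 - \alpha_k), \dots, v(\alpha_n - \alpha_k)$, in increasing order. So $\mathrm{NP}(i) - \mathrm{NP}(i-1) < \rho_k$ for $i \le \mu$, and $\mathrm{NP}(i) - \mathrm{NP}(i-1) \ge \rho_k$ for $i > \mu$. Let $\ell$ be the “line through the point $p = (\mu, v(a_\mu))$ with slope $\rho_k$”, i.e. the map $\{0, \dots, n\} \to \Gamma$ given by $\ell(i) = (i - \mu)\rho_k + v(a_\mu)$. Then $\mathrm{NP}(i) > \ell(i)$ for $i < \mu$, $\mathrm{NP}(\mu) = \ell(\mu)$, and $\mathrm{NP}(i) \ge \ell(i)$ for $i > \mu$ (see figure).

[Figure: the Newton polygon $\mathrm{NP}$ and the line $\ell$ through the points $p = (\mu, v(a_\mu))$ and $q = (n, \ell(n))$.]

If we can show the same for the Newton polygon $\mathrm{NP}^*$ of $f^*(X + \alpha_k)$, we are done. Consider to this end the point $q = (n, \ell(n))$ on $\ell$ and compute $\ell(n)$. Since the roots of $f(X + \alpha_k)$ are the differences $\alpha_l - \alpha_k$, and $v(a_\mu) = \mathrm{NP}(\mu)$ is the sum of the $\mu$ smallest slopes,

$$\begin{aligned} \ell(n) &= (n - \mu)\rho_k + v(a_\mu) = \sum_{l \,:\, v(\alpha_l - \alpha_k) \ge \rho_k} \rho_k + \sum_{l \,:\, v(\alpha_l - \alpha_k) < \rho_k} v(\alpha_l - \alpha_k) \\ &= \sum_{l=1}^{n} \min\{\rho_k, v(\alpha_l - \alpha_k)\} = \Phi_k(\rho_k) = v(f - f^*). \end{aligned}$$

Since $\alpha_k$ is integral, $v(f(X + \alpha_k) - f^*(X + \alpha_k)) = v(f - f^*)$. It follows that $\mathrm{NP}^*(i) = \mathrm{NP}(i)$ for $i \le \mu$, and $\mathrm{NP}^*(i) \ge \ell(i)$ for $i > \mu$. This finishes the proof.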

Heuristically, Theorem 2 says that if a root $\alpha$ of $f$ is far away from the other roots, then an error on the coefficients of $f$ causes an error on $\alpha$ of equal or smaller magnitude; however, the proximity of other roots makes $\alpha$ more sensitive to errors on the coefficients. Let us note a consequence of Theorem 2 illustrating this. Fix a $k$, and let $\mu$ be the root multiplicity of $\alpha_k$ in $f$ modulo the valuation ideal. This means that $v(\alpha_k - \alpha_l)$ is 0 for all but $\mu$ values of $l$. Hence $\Phi_k(v(\alpha_k - \alpha_k^*)) = \sum_{l=1}^{n} \min\{v(\alpha_k - \alpha_k^*), v(\alpha_k - \alpha_l)\} \le \mu \cdot v(\alpha_k - \alpha_k^*)$. Since Theorem 2 gives $\Phi_k(v(\alpha_k - \alpha_k^*)) \ge v(f - f^*)$, we obtain

$$v(\alpha_k - \alpha_k^*) \ge v(f - f^*)/\mu. \tag{3}$$

In particular, $v(\alpha_k - \alpha_k^*) \ge v(f - f^*)/n$ holds for all $k$. In light of (3), we might say that the root $\alpha_k$, as a function of $f$’s coefficients, satisfies a Lipschitz condition of order $1/\mu$.

We conclude the section with a typical example where the bound given by

Theorem 2 is best possible.

Example. Consider again the polynomial $f = X(X - 2)(X - 4)$ over the field of dyadic numbers $\mathbb{Q}_2$. The roots $\alpha_1 = 0$ and $\alpha_3 = 4$ have the same error function $\Phi_1 = \Phi_3 : \gamma \mapsto \gamma + \min\{\gamma, 2\} + \min\{\gamma, 1\}$. The root $\alpha_2 = 2$ has error function $\Phi_2 : \gamma \mapsto \gamma + 2 \cdot \min\{\gamma, 1\}$. They are shown in Figures 1 and 2.

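[Figure 1: graph of $\Phi_1 = \Phi_3$. Figure 2: graph of $\Phi_2$. Figure 3: Newton polygons $\mathrm{NP}$ and $\mathrm{NP}^*$. In each panel the horizontal axis is marked 1 to 3 and the vertical axis 1 to 6.]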

Now put $f^* = f + 2^\nu$ with some $\nu \ge 0$. By Theorem 2, we may write $f^* = (X - \alpha_1^*)(X - \alpha_2^*)(X - \alpha_3^*)$ such that $v(\alpha_k - \alpha_k^*) \ge \Phi_k^{-1}(\nu)$ for $k = 1, 2, 3$. If $\alpha^*$ is a root of $f^*$ maximally close to $\alpha_1 = 0$, then $v(\alpha^*)$ is the maximal slope of the Newton polygon $\mathrm{NP}^*$ of $f^*$. Figure 3 shows the Newton polygon $\mathrm{NP}$ of $f$ (solid line) and $\mathrm{NP}^*$ for some values of $\nu$ (dotted lines). It is seen that $v(\alpha^*)$ equals $\Phi_1^{-1}(\nu)$, and hence $v(\alpha_1^*) = \Phi_1^{-1}(\nu)$. Similarly, one sees $v(\alpha_k - \alpha_k^*) = \Phi_k^{-1}(\nu)$ for $k = 2, 3$. So Theorem 2 gives in fact an optimal bound.
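The claim that the maximal slope of $\mathrm{NP}^*$ equals $\Phi_1^{-1}(\nu)$ can be checked for a range of $\nu$ without computing any 2-adic roots. The sketch below uses the fact that the last slope of a lower convex hull ending at $(n, v_n)$ is the largest chord slope into that point, and then verifies $\Phi_1(\text{slope}) = \nu$. The helper names and the tested range $0 \le \nu \le 12$ are assumptions of this illustration.

```python
from fractions import Fraction
import math

# distances from the root 0 of f = X(X - 2)(X - 4): v(0 - 0), v(0 - 2), v(0 - 4)
ROOT_DISTANCES = [math.inf, 1, 2]

def phi1(x):
    """Error function of the root 0, i.e. x + min{x, 2} + min{x, 1}."""
    return sum(min(x, d) for d in ROOT_DISTANCES)

def max_slope(values):
    """Last (largest) slope of the Newton polygon with finite coefficient values."""
    n = len(values) - 1
    return max(Fraction(values[n] - values[i], n - i) for i in range(n))

checked = []
for nu in range(0, 13):
    # f* = X^3 - 6X^2 + 8X + 2^nu has coefficient values 0, 1, 3, nu
    s = max_slope([0, 1, 3, nu])
    assert phi1(s) == nu          # so s equals the inverse error function at nu
    checked.append((nu, s))
```

For instance, $\nu = 4$ gives the maximal slope $3/2$, and every $\nu \ge 5$ gives $\nu - 3$, in line with the three linear pieces of $\Phi_1$.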

Finally note that, for $\nu > 5$, each root $\alpha_k^*$ of $f^*$ is closer to $\alpha_k$ than to either of the two other roots of $f$. This agrees with Theorem 1 and the fact that $f$ has separant $S = 5$.

5 The bipartitionant and the induced factorisation

Consider a monic polynomial $f$ of degree $n > 1$ with coefficients in an algebraically closed, valued field $(K, v)$ and write $f = \prod_{k=1}^{n} (X - \alpha_k)$. Let $I$ and $J$ be disjoint, non-empty sets with union $\{1, 2, \dots, n\}$ and put $g = \prod_{i \in I} (X - \alpha_i)$ and $h = \prod_{j \in J} (X - \alpha_j)$ so that $f = gh$. Define the bipartitionant of the polynomials $g$ and $h$ as

$$B := \max\{\Phi_i(v(\alpha_i - \alpha_j)) \mid i \in I, j \in J\}$$

where $\Phi_i$ is the error function of the root $\alpha_i$ of $f$. Clearly, $B < \infty$ iff $g$ and $h$ are relatively prime. Equation (2) implies

$$B = \max\{\Phi_j(v(\alpha_i - \alpha_j)) \mid i \in I, j \in J\},$$

showing that the definition is symmetric in $g$ and $h$. The crucial property of the bipartitionant is this:

Lemma 3. Suppose the coefficients of $f$ are integral. Let $f^*$ be another monic polynomial of degree $n$ with integral coefficients in $K$, and assume $v(f - f^*) > B$. Then we may write $f^* = \prod_{k=1}^{n} (X - \alpha_k^*)$ such that $v(\alpha_i - \alpha_i^*), v(\alpha_j - \alpha_j^*) > v(\alpha_i - \alpha_j)$ and thereby $v(\alpha_i - \alpha_j) = v(\alpha_i^* - \alpha_j^*)$ for all $i \in I$ and all $j \in J$.

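Proof. Write $f^* = \prod_{k=1}^{n} (X - \alpha_k^*)$ as in Theorem 2. Then $v(\alpha_i - \alpha_i^*) \ge \Phi_i^{-1}(v(f - f^*)) > \Phi_i^{-1}(B) \ge v(\alpha_i - \alpha_j)$ and $v(\alpha_j - \alpha_j^*) \ge \Phi_j^{-1}(v(f - f^*)) > \Phi_j^{-1}(B) \ge v(\alpha_i - \alpha_j)$ for all $i \in I$ and $j \in J$. The strong triangle inequality gives $v(\alpha_i - \alpha_j) = v(\alpha_i^* - \alpha_j^*)$.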

So in the situation of Lemma 3, the roots of $f^*$ may be “bipartitioned” into two sets $\{\alpha_i^* \mid i \in I\}$ and $\{\alpha_j^* \mid j \in J\}$. This bipartitioning only depends on the factorisation $f = gh$ and not on the representation $f^* = \prod_{k=1}^{n} (X - \alpha_k^*)$ from Theorem 2. We say that the factorisation $f^* = g^* h^*$ where $g^* := \prod_{i \in I} (X - \alpha_i^*)$ and $h^* := \prod_{j \in J} (X - \alpha_j^*)$ is induced by the factorisation $f = gh$.

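How does one compute $B$? If $i_0 \in I$ and $j_0 \in J$ are such that $B = \Phi_{i_0}(v(\alpha_{i_0} - \alpha_{j_0}))$, then

$$v(\alpha_{i_0} - \alpha_{j_0}) = \max\{v(\alpha_i - \alpha_{j_0}) \mid i \in I\} = \max\{v(\alpha_{i_0} - \alpha_j) \mid j \in J\} \tag{4}$$

since the $\Phi$’s are strictly increasing. If, in turn, $i_0 \in I$ and $j_0 \in J$ satisfy (4), then

$$\begin{aligned} \Phi_{i_0}(v(\alpha_{i_0} - \alpha_{j_0})) &= \sum_{k=1}^{n} \min\{v(\alpha_{i_0} - \alpha_{j_0}), v(\alpha_{i_0} - \alpha_k)\} \\ &= \sum_{i \in I} \min\{v(\alpha_{i_0} - \alpha_{j_0}), v(\alpha_{i_0} - \alpha_i)\} + \sum_{j \in J} \min\{v(\alpha_{i_0} - \alpha_{j_0}), v(\alpha_{i_0} - \alpha_j)\} \\ &= \sum_{i \in I} v(\alpha_i - \alpha_{j_0}) + \sum_{j \in J} v(\alpha_{i_0} - \alpha_j) \\ &= v(g(\alpha_{j_0})) + v(h(\alpha_{i_0})) \end{aligned}$$

where the third equality requires (4), the strong triangle inequality, and some consideration. Now conclude

$$B = \max\{v(g(\alpha_{j_0})) + v(h(\alpha_{i_0})) \mid i_0 \in I \text{ and } j_0 \in J \text{ satisfy (4)}\}. \tag{5}$$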

Since the bipartitionant replaces twice the value of the resultant $\mathrm{Res}(g, h) = \prod_{i,j} (\alpha_i - \alpha_j)$ in our Hensel’s lemma (Theorem 8), it is of interest to compare these two invariants, and from (5) follows immediately

$$B \le \sum_{j \in J} v(g(\alpha_j)) + \sum_{i \in I} v(h(\alpha_i)) = 2v(\mathrm{Res}(g, h))$$

when $f$ has integral coefficients.
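The comparison between $B$ and $2v(\mathrm{Res}(g, h))$ can be reproduced for the two dyadic factorisations used as examples in this article, $X^2 \cdot (X-2)(X-4)$ and $X^8 \cdot (X+2)^8$. The sketch below evaluates both invariants straight from their definitions in terms of the roots. The function names are assumptions of this illustration, and roots are restricted to ordinary integers.

```python
import math

def v2(x):
    """2-adic value of an integer, with v2(0) = +infinity."""
    if x == 0:
        return math.inf
    v = 0
    while x % 2 == 0:
        x //= 2
        v += 1
    return v

def bipartitionant(g_roots, h_roots):
    """B = max{ Phi_i(v(a_i - a_j)) : a_i a root of g, a_j a root of h }."""
    roots = g_roots + h_roots

    def phi(alpha, x):
        # error function of the root alpha of f = g * h
        return sum(min(x, v2(alpha - a)) for a in roots)

    return max(phi(ai, v2(ai - aj)) for ai in g_roots for aj in h_roots)

def resultant_value(g_roots, h_roots):
    """v(Res(g, h)) for monic g, h: sum of v(a_i - a_j) over all pairs."""
    return sum(v2(ai - aj) for ai in g_roots for aj in h_roots)

# g = X^2, h = (X - 2)(X - 4)
assert bipartitionant([0, 0], [2, 4]) == 7
assert 2 * resultant_value([0, 0], [2, 4]) == 12

# g = X^8, h = (X + 2)^8
assert bipartitionant([0] * 8, [-2] * 8) == 16
assert 2 * resultant_value([0] * 8, [-2] * 8) == 128
```

In both cases $B$ is well below $2v(\mathrm{Res}(g, h))$, which is the gap between the hypothesis of the general lemma and that of the 1908 version.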

6 Continuity of factors

There is a remarkable analogue to the continuity of roots that could be called continuity of factors. In words it says that if there is a factorisation $f = gh$ such that the roots of $g$ are far away from the roots of $h$ (but possibly close to each other), then an error on the coefficients of $f$ causes an error on the coefficients of $g$ which is in general smaller than the error caused on the roots of $g$ individually. It should be noted that the main part of Hensel’s lemma is proved in the next section without the results of this section.

Consider two monic polynomials $f, f^*$ of common degree $n > 1$ with integral coefficients in an algebraically closed, valued field $(K, v)$, and write $f = \prod_{k=1}^{n} (X - \alpha_k)$. Let $I$ and $J$ be disjoint, non-empty sets with union $\{1, 2, \dots, n\}$ and put $g = \prod_{i \in I} (X - \alpha_i)$ and $h = \prod_{j \in J} (X - \alpha_j)$. Let us call $g$ an isolated factor of $f$ if

$$\forall i, i' \in I \ \forall j \in J : v(\alpha_i - \alpha_{i'}) > v(\alpha_i - \alpha_j)$$

i.e. if there is a ball in $K$ containing all roots of $g$ and no roots of $h$.

Lemma 4 (continuity of isolated factors). Assume $v(f - f^*) > B$ where $B$ is the bipartitionant of $g$ and $h$, and consider the induced factorisation $f^* = g^* h^*$. If $g$ is an isolated factor of $f$, then $v(g - g^*) \ge v(f - f^*) - B + \max\{v(\alpha_i - \alpha_j) \mid i \in I, j \in J\}$.

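Proof. The idea is to use a general form of Newton approximation to come from $g$ to $g^*$. We may assume $g(0) = 0$ by a change of variable. Put $\nu = \deg(g)$ and $\mu = \deg(h)$. We may then further assume $g = \prod_{i=1}^{\nu} (X - \alpha_i)$, $h = \prod_{j=\nu+1}^{n} (X - \alpha_j)$, and

$$\infty = v(\alpha_1) \ge v(\alpha_2) \ge \dots \ge v(\alpha_\nu) > v(\alpha_{\nu+1}) \ge \dots \ge v(\alpha_n)$$

since $g$ is isolated. Thus, $u := \max\{v(\alpha_i - \alpha_j) \mid i \in I, j \in J\}$ equals $v(\alpha_{\nu+1})$, and $B$ equals $\nu \cdot u + v(h(0))$ by (5).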

Define three polynomial sequences $(g_m)_{m \in \mathbb{N}}$, $(h_m)_{m \in \mathbb{N}}$, and $(r_m)_{m \in \mathbb{N}}$ recursively like this: Put $g_1 := g$. Given $g_m$, define $h_m$ and $r_m$ such that $f^* = g_m h_m + r_m$ and $\deg(r_m) < \nu$. Given $g_m$, $h_m$, and $r_m$, define $g_{m+1} := g_m + r_m / h_m(0)$.

The difficulty of the proof lies in finding the right thing to prove. For a fixed $m$ and for $i = 1, \dots, \nu$, let $a_i$ and $c_i$ be the values of the coefficients of the terms of degree $\nu - i$ in $g_m$ and $r_m$, respectively. We claim:

(A) $a_i \ge iu + \Delta$ where $\Delta := \min\{v(\alpha_\nu) - u, v(f - f^*) - B\}$.

(B) The Newton polygon of $h_m$ equals the Newton polygon $\mathrm{NP}$ of $h$.

(C) $c_i \ge v(h(0)) + iu + k_i \Delta$ where $k_i := \max\{k \in \mathbb{N} \mid k < (m + i + \nu - 1)/\nu\}$.

The claims are shown by induction on $m$. Assume $m = 1$ for the induction start. All roots of $g_1 = g$ have value at least $v(\alpha_\nu)$, and hence

$$a_i \ge i \cdot v(\alpha_\nu) \ge i(u + \Delta) \ge iu + \Delta.$$

This shows (A). Write $f^* - f = g h' + r'$ with $\deg(r') < \nu$. Then $f^* = g(h + h') + r'$ and thus $h_1 = h + h'$ and $r_1 = r'$. Also, $v(h'), v(r') \ge v(f - f^*)$. Adding $h'$ to $h$ does not change the Newton polygon since $v(h') > B \ge v(h(0)) = \mathrm{NP}(\mu)$. This shows (B). Finally,

$$c_i \ge v(r') \ge v(f - f^*) \ge B + \Delta \ge v(h(0)) + iu + k_i \Delta$$

since $k_i = 1$ for $m = 1$, showing (C).

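For the induction step, assume (A), (B), and (C) hold for some $m$, and let (A’), (B’), and (C’) be the statements corresponding to $m + 1$. (A’) follows immediately from (A) and (C). Note $f^* = g_{m+1} h_m - (h_m / h_m(0) - 1) r_m$ and hence $h_{m+1} = h_m + h'$ and $r_{m+1} = r'$ if we write

$$-(h_m / h_m(0) - 1) r_m = g_{m+1} h' + r' \tag{6}$$

with $\deg(r') < \nu$. Let $d_i$ be the value of the coefficient of the term of degree $n - i$ in the left hand side of (6). Using (A), (B), and (C) gives

$$\begin{aligned} d_1 &\ge \mathrm{NP}(0) + u + k_1 \Delta \\ d_2 &\ge \mathrm{NP}(1) + u + k_1 \Delta \\ &\ \,\vdots \\ d_\mu &\ge \mathrm{NP}(\mu - 1) + u + k_1 \Delta \\ d_{\mu+1} &\ge \mathrm{NP}(\mu - 1) + 2u + k_2 \Delta \\ &\ \,\vdots \\ d_{n-1} &\ge \mathrm{NP}(\mu - 1) + \nu u + k_\nu \Delta \\ \infty = d_n &\ge \mathrm{NP}(\mu - 1) + (\nu + 1) u + k_{\nu+1} \Delta \end{aligned}$$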

The algorithm of polynomial division resulting in the expression (6) consists of a number of steps in each of which a monomial times $g_{m+1}$ is subtracted from $-(h_m / h_m(0) - 1) r_m$. The key observation is that, in each step, the values of the coefficients of the remainder satisfy the same inequalities as the $d_i$. Let $b_i'$ be the value of the coefficient of the term of degree $\mu - i$ in $h'$. Then

$$\begin{aligned} b_1' &\ge \mathrm{NP}(0) + u + k_1 \Delta > \mathrm{NP}(0) + u \ge \mathrm{NP}(1) \\ &\ \,\vdots \\ b_\mu' &\ge \mathrm{NP}(\mu - 1) + u + k_1 \Delta > \mathrm{NP}(\mu - 1) + u \ge \mathrm{NP}(\mu) \end{aligned}$$

Hence $h_{m+1} = h_m + h'$ has $\mathrm{NP}$ as its Newton polygon, showing (B’). Let $c_i'$ be the value of the coefficient of the term of degree $\nu - i$ in $r'$. Then

$$\begin{aligned} c_1' &\ge \mathrm{NP}(\mu - 1) + 2u + k_2 \Delta = v(h(0)) + u + k_2 \Delta \\ &\ \,\vdots \\ c_\nu' &\ge \mathrm{NP}(\mu - 1) + (\nu + 1) u + k_{\nu+1} \Delta = v(h(0)) + \nu u + k_{\nu+1} \Delta \end{aligned}$$

This shows (C’) and finishes the induction step.

By (C), $v(r_m) \to \infty$ and hence $g_m h_m \to f^*$. By the continuity of roots, the roots of $g_m h_m$ converge to the roots of $f^*$ (in a multiplicity-respecting way). By assumption, the roots of $g$ have values $> u$, whereas the roots of $h$ have values $\le u$. Lemma 3 then gives that the roots of $g^*$ have values $> u$, whereas the roots of $h^*$ have values $\le u$. By (A), the roots of $g_m$ have values $> u$. It follows that the roots of $g_m$ converge to the roots of $g^*$, and thereby the coefficients converge too: $g_m \to g^*$. Finally, $g^* = g + \sum_{m=1}^{\infty} r_m / h_m(0)$ and therefore by (C),

$$\begin{aligned} v(g - g^*) &\ge \min\{v(r_m) - v(h(0)) \mid m \in \mathbb{N}\} \\ &\ge u + \Delta \\ &\ge v(f - f^*) - B + \max\{v(\alpha_i - \alpha_j) \mid i \in I, j \in J\}. \end{aligned}$$

Let us show that Lemma 4 coincides with the Hensel-Rella criterion when $g$ is linear. Given is a polynomial $F$ with an approximate root $\xi_0$. Put $g = X - \xi_0$ and $h = (F - F(\xi_0))/(X - \xi_0)$. Then the left hand side of (1) is the value of $F(\xi_0) = F - gh$, and it can be seen that the right hand side of (1) equals the bipartitionant of $g$ and $h$. Hence, the $g_m$ converge to a polynomial $g^* = X - \xi$ dividing $F$. In the proof of Lemma 4, we could as well have defined $g_{m+1}$ as $g_m + r_m / h_m(\xi_m)$ where $\xi_m$ is any root of $g_m$ (or any other element sufficiently close to 0). With this definition and with linear $g$, the approximation process becomes identical with usual Newton approximation.

Theorem 5 (continuity of factors). Let $f$ and $f^*$ be monic polynomials of common degree $n > 1$ with integral coefficients in an algebraically closed, valued field $(K, v)$. Consider a monic factorisation $f = gh$, and let $B$ be the bipartitionant of $g$ and $h$. Assume $v(f - f^*) > B$, and let $f^* = g^* h^*$ be the induced factorisation. Then $v(g - g^*), v(h - h^*) \ge v(f - f^*) - B$.

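Proof. Write $g = g_1 \cdots g_r$ such that each $g_l$ is a maximal (with respect to divisibility) monic factor of $g$ which is an isolated factor of $f$. The bipartitionant of $g_l$ and $\tilde{g}_l := f / g_l$ is

$$\begin{aligned} B_l &:= \max\{\Phi_i(v(\alpha_i - \alpha_j)) \mid g_l(\alpha_i) = \tilde{g}_l(\alpha_j) = 0\} \\ &= \max\{\Phi_i(v(\alpha_i - \alpha_j)) \mid g_l(\alpha_i) = 0, j \in J\} \end{aligned}$$

(the last equality follows from the maximality of $g_l$), implying

$$\begin{aligned} B &= \max\{\Phi_i(v(\alpha_i - \alpha_j)) \mid i \in I, j \in J\} \\ &= \max\{B_1, \dots, B_r\}. \end{aligned}$$

Lemma 4 gives

$$\begin{aligned} v(g - g^*) &\ge \min\{v(f - f^*) - B_l + \Phi_i^{-1}(B_l) \mid l = 1, \dots, r,\ g_l(\alpha_i) = 0\} \\ &\ge \min\{v(f - f^*) - B_l \mid l = 1, \dots, r\} \\ &= v(f - f^*) - B. \end{aligned}$$

The inequality for $v(h - h^*)$ can be proved the same way, but also follows directly by dividing $f^*$ by $g^*$.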

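Example. Consider the polynomial $f = X^2 (X - 2)(X - 4) = X^4 - 6X^3 + 8X^2$ over the field of dyadic numbers $\mathbb{Q}_2$. The error function of the double root $\alpha_1 = \alpha_2 = 0$ is $\Phi(\gamma) = 2 \cdot \gamma + \min\{\gamma, 1\} + \min\{\gamma, 2\}$. The bipartitionant of the factors $g = X^2$ and $h = (X - 2)(X - 4)$ is $B = v(g(4)) + v(h(0)) = 7$. Let $f^* = f + 2^\nu$ with $\nu > 7$ and consider the induced factorisation $f^* = g^* h^*$. By Lemma 4, $v(g - g^*) \ge \nu - 5$. The figure shows the inverse error function $\Phi^{-1}$ and the line $\nu \mapsto \nu - 5$ (dotted):

[Figure: graph of $\Phi^{-1}$ together with the dotted line $\nu \mapsto \nu - 5$; horizontal axis marked 1 to 10, vertical axis marked 1 to 5.]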

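Let us compute $v(g - g^*)$ precisely. The Newton polygon of $f^*$ shows that the roots $\alpha_1^*, \dots, \alpha_4^*$ of $f^*$ have values $v(\alpha_1^*) = v(\alpha_2^*) = (\nu - 3)/2$, $v(\alpha_3^*) = 1$, and $v(\alpha_4^*) = 2$. We have

$$g^* = (X - \alpha_1^*)(X - \alpha_2^*) = X^2 - (\alpha_1^* + \alpha_2^*) X + \alpha_1^* \alpha_2^*.$$

From the above follows $v(\alpha_1^* \alpha_2^*) = \nu - 3$. It is more tricky to compute $v(\alpha_1^* + \alpha_2^*)$. To this end, consider the polynomial $f^*(X - \alpha_1^*)$. It has roots $2\alpha_1^*$, $\alpha_1^* + \alpha_2^*$, $\alpha_1^* + \alpha_3^*$, $\alpha_1^* + \alpha_4^*$ and constant term $f^*(-\alpha_1^*) = f^*(\alpha_1^*) + 12(\alpha_1^*)^3 = 12(\alpha_1^*)^3$, since $f^*(X) - f^*(-X) = -12X^3$ and $f^*(\alpha_1^*) = 0$. Thus,

$$\begin{aligned} v(\alpha_1^* + \alpha_2^*) &= v(f^*(-\alpha_1^*)) - v(2\alpha_1^*) - v(\alpha_1^* + \alpha_3^*) - v(\alpha_1^* + \alpha_4^*) \\ &= (3\nu - 5)/2 - (\nu - 1)/2 - 1 - 2 \\ &= \nu - 5. \end{aligned}$$

Conclude $v(g - g^*) = \min\{\nu - 5, \nu - 3\} = \nu - 5$.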
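The value $v(g - g^*) = \nu - 5$ can also be observed by running the approximation process $g_{m+1} := g_m + r_m / h_m(0)$ from the proof of Lemma 4 in exact rational arithmetic. The sketch below does this for the sample choice $\nu = 8$ and six iteration steps. Both of these numbers, as well as the helper names, are assumptions of this illustration; the computation only approximates $g^*$ and does not replace the argument above.

```python
from fractions import Fraction
import math

def v2(x):
    """2-adic value of a rational number, with v2(0) = +infinity."""
    x = Fraction(x)
    if x == 0:
        return math.inf
    n, d, v = x.numerator, x.denominator, 0
    while n % 2 == 0:
        n //= 2
        v += 1
    while d % 2 == 0:
        d //= 2
        v -= 1
    return v

def divmod_monic(num, den):
    """Divide num by the monic polynomial den (coefficients, highest degree first)."""
    num = [Fraction(c) for c in num]
    quot = []
    for i in range(len(num) - len(den) + 1):
        q = num[i]
        quot.append(q)
        for j, d in enumerate(den):
            num[i + j] -= q * d       # subtract q * X^k * den
    return quot, num[len(quot):]      # quotient and remainder of degree < deg(den)

nu = 8
f_star = [1, -6, 8, 0, 2**nu]         # f* = X^2 (X - 2)(X - 4) + 2^nu
g_start = [Fraction(1), Fraction(0), Fraction(0)]   # g_1 = g = X^2

g = list(g_start)
for m in range(1, 7):
    h, r = divmod_monic(f_star, g)    # f* = g_m h_m + r_m
    # g_{m+1} = g_m + r_m / h_m(0); h[-1] is the constant term h_m(0)
    g = [g[0]] + [gc + rc / h[-1] for gc, rc in zip(g[1:], r)]

h, r = divmod_monic(f_star, g)        # remainder left after the last update
value_of_change = min(v2(a - b) for a, b in zip(g, g_start))
value_of_remainder = min(v2(c) for c in r)

assert value_of_change == nu - 5      # matches v(g - g*) = nu - 5
assert value_of_remainder >= nu       # g_m h_m is already 2-adically close to f*
```

The first two steps give $g_2 = X^2 + 32$ and $g_3 = X^2 - 8X - 32/3$, whose linear and constant coefficients already have the values $\nu - 5 = 3$ and $\nu - 3 = 5$ predicted by the computation above.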

The moral of the story is that the bound on the coefficients of $g^*$ given by Lemma 4 is best possible (contrary to that of Theorem 5) and better than the bound on the roots of $g^*$ given by Theorem 2.

One may wonder if there is also “continuity of factors” when $v(f - f^*) \le B$, i.e. if there is a bound on the error on the coefficients of $g$ better than the bound on the error on the roots of $g$. That is not likely to be the case. For when $v(f - f^*) \le B$, it is no longer possible to bipartition the roots of $f^*$ as in Lemma 3. In other words, the factorisation $f = gh$ no longer gives rise to a natural factorisation $f^* = g^* h^*$. This view is supported by the observation that, in the limit $v(f - f^*) = B$, the bound on the error on $g$ in the example above coincides with the bound on the error on the roots of $g$.

7 Krasner’s lemma

The well-known Krasner’s lemma (see Corollaire 1, page 190 of Ribenboim (1968), for instance) was in fact found by Ostrowski already in 1917. We give here a generalisation that will be used in the next section.

Theorem 6 (lemma à la Krasner). Consider a monic polynomial $f^* = \prod_{k=1}^{n} (X - \alpha_k^*)$ of degree $n > 1$ with coefficients in a Henselian field $(K, v)$ and roots in the algebraic closure $\tilde{K}$. Let $I$ and $J$ be two disjoint, non-empty sets with union $\{1, \dots, n\}$. Moreover, consider a polynomial $g = \prod_{i \in I} (X - \alpha_i)$ with coefficients and roots in $\tilde{K}$. Assume

$$\forall i \in I \ \forall j \in J : v(\alpha_i - \alpha_i^*) > v(\alpha_i^* - \alpha_j^*). \tag{7}$$

Then the coefficients of the polynomials $g^* := \prod_{i \in I} (X - \alpha_i^*)$ and $h^* := \prod_{j \in J} (X - \alpha_j^*)$ are contained in the field extension of $K$ generated by the coefficients of $g$.

Proof. Part A. First some preliminary observations. From (7) follows at once that $g^*$ and $h^*$ are relatively prime. Since $f^* = g^* h^*$, the coefficients of $g^*$ generate the same extension of $K$ as the coefficients of $h^*$. We may assume without loss of generality, and will do so, that $g$ has coefficients in $K$. What is left to prove is that $g^*$ has coefficients in $K$.

Now let $K_{\mathrm{sep}}$ be the separable algebraic closure of $K$. Since $K_{\mathrm{sep}}$ is a separably closed field, every irreducible polynomial over $K_{\mathrm{sep}}$ has only one (possibly multiple) root. Since $g^* h^*$ has coefficients in $K_{\mathrm{sep}}$, and $g^*$ and $h^*$ are relatively prime, it follows that $g^*$ and $h^*$ have coefficients in $K_{\mathrm{sep}}$.

We show in part B that every $K$-automorphism $\sigma$ on $\tilde{K}$ permutes the roots of $g^*$. Hence, every such $\sigma$ fixes the coefficients of $g^*$. The coefficients of $g^*$ are therefore purely inseparable over $K$.

Since the coefficients of $g^*$ are both separable and purely inseparable over $K$, they do in fact belong to $K$.

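Part B. Let $\sigma$ be a $K$-automorphism on $\tilde{K}$. Consider the sets $A = \{\alpha_i \mid i \in I\}$, $A^* = \{\alpha_i^* \mid i \in I\}$, and $A^{**} = \{\alpha_j^* \mid j \in J\}$. Note that $A \cup A^*$ and $A^{**}$ are disjoint by (7). Since $g$ and $f^*$ have coefficients in $K$, $\sigma$ is a “multiplicity-preserving” permutation on both $A$ and $A^* \cup A^{**}$. Since $K$ is Henselian, $\sigma$ is isometric. We show that (7) implies that $\sigma$ permutes $A^*$ and $A^{**}$ individually. This is really a lemma on finite ultra-metric spaces.

For $\alpha \in A$, let $\mathcal{B}(\alpha)$ be the maximal ball in the finite ultra-metric space $A \cup A^* \cup A^{**}$ containing $\alpha$ and being contained in $A \cup A^*$. Then (7) implies

$$\forall i \in I : \alpha_i \in \mathcal{B}(\alpha) \Leftrightarrow \alpha_i^* \in \mathcal{B}(\alpha). \tag{8}$$

Every $\alpha^* \in A^*$ is thereby contained in some $\mathcal{B}(\alpha)$, so we are done if we can show $\sigma(\mathcal{B}(\alpha)) \subseteq A \cup A^*$.

For any $\alpha_i \in A \cap \sigma(\mathcal{B}(\alpha))$, the balls $\sigma(\mathcal{B}(\alpha))$ and $\mathcal{B}(\alpha_i)$ have non-empty intersection (both contain $\alpha_i$), hence one is contained in the other. If there is an $\alpha_i \in A \cap \sigma(\mathcal{B}(\alpha))$ such that $\sigma(\mathcal{B}(\alpha)) \subseteq \mathcal{B}(\alpha_i)$, then $\sigma(\mathcal{B}(\alpha)) \subseteq A \cup A^*$ and we are done. So assume from now on $\sigma(\mathcal{B}(\alpha)) \supset \mathcal{B}(\alpha_i)$ for all $\alpha_i \in A \cap \sigma(\mathcal{B}(\alpha))$.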

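For a subset $X$ of $A \cup A^* \cup A^{**}$, let $\#X$ denote $X$’s cardinality “counted with multiplicity”, i.e.

$$\#X := |\{i \in I \mid \alpha_i \in X\}| + |\{k \in I \cup J \mid \alpha_k^* \in X\}|.$$

We then have

$$\#\mathcal{B}(\alpha) = 2 \cdot |\{i \in I \mid \alpha_i \in \mathcal{B}(\alpha)\}|$$

by (8). Since $\sigma$ preserves multiplicity and permutes $A$,

$$\#\mathcal{B}(\alpha) = \#\sigma(\mathcal{B}(\alpha)) \quad \text{and} \quad |\{i \in I \mid \alpha_i \in \mathcal{B}(\alpha)\}| = |\{i \in I \mid \alpha_i \in \sigma(\mathcal{B}(\alpha))\}|$$

hold. For $i \in I$ with $\alpha_i \in \sigma(\mathcal{B}(\alpha))$, (8) implies $\alpha_i^* \in \mathcal{B}(\alpha_i) \subset \sigma(\mathcal{B}(\alpha))$ and hence

$$|\{i \in I \mid \alpha_i^* \in \sigma(\mathcal{B}(\alpha))\}| \ge |\{i \in I \mid \alpha_i \in \sigma(\mathcal{B}(\alpha))\}|.$$

Putting everything together gives

$$\begin{aligned} \#\sigma(\mathcal{B}(\alpha)) &= |\{i \in I \mid \alpha_i \in \sigma(\mathcal{B}(\alpha))\}| + |\{k \in I \cup J \mid \alpha_k^* \in \sigma(\mathcal{B}(\alpha))\}| \\ &\ge 2 \cdot |\{i \in I \mid \alpha_i \in \sigma(\mathcal{B}(\alpha))\}| + |\{j \in J \mid \alpha_j^* \in \sigma(\mathcal{B}(\alpha))\}| \\ &= 2 \cdot |\{i \in I \mid \alpha_i \in \mathcal{B}(\alpha)\}| + |\{j \in J \mid \alpha_j^* \in \sigma(\mathcal{B}(\alpha))\}| \\ &= \#\mathcal{B}(\alpha) + |\{j \in J \mid \alpha_j^* \in \sigma(\mathcal{B}(\alpha))\}| \\ &= \#\sigma(\mathcal{B}(\alpha)) + |\{j \in J \mid \alpha_j^* \in \sigma(\mathcal{B}(\alpha))\}|. \end{aligned}$$

Finally, conclude $|\{j \in J \mid \alpha_j^* \in \sigma(\mathcal{B}(\alpha))\}| = 0$, i.e. $\sigma(\mathcal{B}(\alpha)) \subseteq A \cup A^*$.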

Theorem 6 has an immediate corollary which itself reduces to the usual Krasner’s lemma when the element $a$ is separable over $K$:

Corollary 7. Consider a Henselian field $K$ and let $a$ and $b$ be elements in the algebraic closure $\tilde{K}$. Assume $b$ is closer to $a$ than to any of $a$’s conjugates. Then $K(b)$ contains the coefficients of the polynomial $(X - a)^\mu$ where $\mu$ is the root multiplicity of $a$ in its minimal polynomial over $K$.

Remark. In the application of Theorem 6 in the proof of Theorem 8 below, we also have a polynomial $h = \prod_{j \in J} (X - \alpha_j)$ satisfying

$$\forall i \in I \ \forall j \in J : v(\alpha_j - \alpha_j^*) > v(\alpha_i^* - \alpha_j^*) \tag{9}$$

and such that $gh$ has coefficients in $K$. In this situation, part B of the proof of Theorem 6 can be replaced by the following simpler argument: Assume for a contradiction that there are $i \in I$ and $j \in J$ such that $\sigma(\alpha_i^*) = \alpha_j^*$. By symmetry, we may assume $v(\alpha_i - \alpha_i^*) \ge v(\alpha_j - \alpha_j^*)$. Then $\sigma(\alpha_i) = \alpha_{i'}$ for some $i' \in I$. Since $\sigma$ is isometric, $v(\alpha_{i'} - \alpha_j^*) = v(\alpha_i - \alpha_i^*) \ge v(\alpha_j - \alpha_j^*)$. But now $v(\alpha_{i'} - \alpha_j) \ge v(\alpha_j - \alpha_j^*)$, in contradiction with (9).

8 Hensel’s lemma

We can now state and prove the promised general Hensel’s lemma.

Theorem 8 (monic Hensel’s lemma). Consider two monic polynomials $f$ and $f^*$ of common degree $n > 1$ with integral coefficients in a Henselian field $(K, v)$. Let there be given a factorisation $f = gh$ with monic $g$ and $h$. Assume $v(f - f^*) > B$ where $B$ is the bipartitionant of $g$ and $h$. Then there is a factorisation $f^* = g^* h^*$ where $g^*$ and $h^*$ are monic and have integral coefficients, $\deg(g^*) = \deg(g)$, $\deg(h^*) = \deg(h)$, and $v(g - g^*), v(h - h^*) \ge v(f - f^*) - B$.

Proof. Consider the induced factorisation $f^* = g^* h^*$. The factors $g^*$ and $h^*$ have coefficients in $K$ by Lemma 3 and Theorem 6. The bound on $v(g - g^*)$ and $v(h - h^*)$ follows from Theorem 5.

Example. Consider the polynomial $f^* = X^8 (X + 2)^8 + 2^\nu$ with $\nu \ge 0$ over the field of dyadic numbers $\mathbb{Q}_2$. The bipartitionant of $g = X^8$ and $h = (X + 2)^8$ is $B = 16$. By Theorem 8, $f^*$ is reducible for all $\nu > 16$. More precisely, there is in this case a monic factorisation $f^* = g^* h^*$ with $v(g - g^*), v(h - h^*) \ge \nu - 16$ (using Lemma 4 instead of Theorem 5 gives in fact $v(g - g^*), v(h - h^*) \ge \nu - 15$). It can be shown that $f^*$ is irreducible for $\nu = 0, 1, 3, 4, 5, 7, 8, 9, 11, 12, 13, 15, 16$, implying that the bound $v(f - f^*) > B$ is best possible. The dyadic value of the resultant of $g$ and $h$ is 64, so the Hensel’s lemma of 1908 gives a factorisation $f^* = g^* h^*$ with $v(g - g^*), v(h - h^*) \ge \nu - 64$ for $\nu > 128$.

To make life as easy as possible, we have so far solely studied monic polynomials having integral coefficients. This is indeed the situation in almost all applications of Hensel’s lemma. Also, when a given non-monic polynomial $F^*$ has an approximate factorisation satisfying the conditions of the non-monic Hensel’s lemma, the reducibility of $F^*$ follows immediately from the observation that the Newton polygon of $F^*$ is not a straight line.

Nevertheless, we now turn our attention to the non-monic case. The proof of the following theorem is entirely analogous to that of the monic Hensel’s lemma, but the presence of non-monic polynomials forces us to reexamine the proofs of earlier theorems.

Theorem 9 (Hensel’s lemma, final form). Consider two polynomials $F$ and $F^*$ of common degree $n > 1$ with integral coefficients in a Henselian field $(K, v)$ and with the same leading coefficient $c$. Let there be given a factorisation $F = gH$ where $g$ is monic and has integral coefficients, and $H$ is primitive, i.e. $v(H) = 0$. Assume $v(F - F^*) > \max\{0, B + v(c)\}$ where $B$ is the bipartitionant of $g$ and $c^{-1} H$. Then there is a factorisation $F^* = g^* H^*$ where $g^*$ is monic and has integral coefficients, $H^*$ is primitive, $\deg(g^*) = \deg(g)$, $\deg(H^*) = \deg(H)$, and $v(g - g^*), v(H - H^*) \ge v(F - F^*) - \max\{0, B + v(c)\}$.

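Proof. First introduce monic polynomials $f := c^{-1} F$, $f^* := c^{-1} F^*$, and $h := c^{-1} H$. Note $f = gh$, $v(f - f^*) = v(F - F^*) - v(c)$, and thus $v(f - f^*) > \max\{-v(c), B\}$.

Write $f = \prod_{k=1}^{n} (X - \alpha_k)$ and let $I$ and $J$ be the sets with $g = \prod_{i \in I} (X - \alpha_i)$ and $h = \prod_{j \in J} (X - \alpha_j)$. Put $\rho_i := \Phi_i^{-1}(v(f - f^*))$ for each $i \in I$. Note $\Phi_i(0) = \sum_{l=1}^{n} \min\{0, v(\alpha_i - \alpha_l)\} = -v(c) < v(f - f^*)$ and hence $0 < \rho_i$.

The proof of Theorem 2, word for word, shows that $f$ and $f^*$ have the same number of roots (counted with multiplicity) in the ball $\{x \in \tilde{K} \mid v(x - \alpha_i) \ge \rho_i\}$ for any $i \in I$. It follows that we can write $f^* = \prod_{k=1}^{n} (X - \alpha_k^*)$ such that $v(\alpha_i - \alpha_i^*) \ge \rho_i$ for each $i \in I$. We have $v(\alpha_i - \alpha_j) \le \Phi_i^{-1}(B) < \rho_i$ for $i \in I$ and $j \in J$, and therefore $v(\alpha_i^* - \alpha_j^*) < \rho_i$ for $i \in I$ and $j \in J$. Conclude $v(\alpha_i - \alpha_i^*) > v(\alpha_i^* - \alpha_j^*)$ for all $i \in I$ and $j \in J$.

By Theorem 6, $g^* := \prod_{i \in I} (X - \alpha_i^*)$ and $h^* := \prod_{j \in J} (X - \alpha_j^*)$ have coefficients in $K$. Reexamination of the proofs of Lemma 4 and Theorem 5 shows $v(g - g^*) \ge v(f - f^*) - \max\{-v(c), B\}$. Now put $H^* := c h^*$.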

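Notice that the resultant of $g$ and $H$ has value

$$\begin{aligned} v(\mathrm{Res}(g, H)) &= \deg(g) \cdot v(c) + v(\mathrm{Res}(g, h)) \\ &= \deg(g) \cdot v(c) + \sum_{i \in I, j \in J} v(\alpha_i - \alpha_j) \\ &= \sum_{i \in I, j \in J} \max\{0, v(\alpha_i - \alpha_j)\}. \end{aligned}$$

By (5), the bipartitionant of $g$ and $h$ is $B = \sum_{i \in I} v(\alpha_i - \alpha_{j_0}) + \sum_{j \in J} v(\alpha_{i_0} - \alpha_j)$ for suitable $i_0 \in I$ and $j_0 \in J$. There follows $\max\{0, B + v(c)\} \le 2v(\mathrm{Res}(g, H))$. Hence, Theorem 9 generalises the Hensel’s lemma of 1908 as well as its later reincarnations mentioned in section 1.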

References

[1] G. Dumas, Sur quelques cas d’irréductibilité des polynomes à coefficients rationnels, J. Math. Pures Appl. 61 (1906), 191–258.

[2] K. Hensel, Neue Grundlagen der Arithmetik, J. Reine Angew. Math. 127 (1904), 51–84.

[3] K. Hensel, Theorie der algebraischen Zahlen, Teubner, Leipzig, 1908.

[4] W. Krull, Allgemeine Bewertungstheorie, J. Reine Angew. Math. 167 (1932), 160–196.

[5] J. Kürschák, Über Limesbildung und allgemeine Körpertheorie, J. Reine Angew. Math. 142 (1913), 211–253.

[6] M. Nagata, On the Theory of Henselian Rings, Nagoya Math. J. 5 (1953), 45–57.

[7] A. Ostrowski, Untersuchungen zur arithmetischen Theorie der Körper, Math. Z. 39 (1935), 269–404.

[8] F. J. Rayner, Relatively Complete Fields, Proc. Edinburgh Math. Soc. 11 (1958), 131–133.

[9] T. Rella, Zur Newtonschen Approximationsmethode in der Theorie der p-adischen Gleichungswurzeln, J. Reine Angew. Math. 153 (1924), 111–112.

[10] T. Rella, Ordnungsbestimmungen in Integritätsbereichen und Newtonsche Polygone, J. Reine Angew. Math. 158 (1927), 33–48.

[11] P. Ribenboim, Théorie des valuations, Les presses de l’Université de Montréal, Montreal, 1968.

[12] P. Ribenboim, Equivalent forms of Hensel’s lemma, Expo. Math. 3 (1985), 3–24.

[13] D. S. Rim, Relatively complete fields, Duke Math. J. 24 (1957), 197–200.

[14] P. Roquette, History of Valuation Theory. Part I. In: F. V. Kuhlmann, S. Kuhlmann, M. Marshall (ed.), Valuation Theory and its applications, vol. 1, Fields Inst. Commun. 32 (2002), 291–355.

[15] K. Rychlík, Zur Bewertungstheorie der algebraischen Körper, J. Reine Angew. Math. 153 (1924), 94–107.