# New light on Hensel's lemma

## Abstract

The historical development of Hensel's lemma is briefly discussed (Section 1). Using Newton polygons, a simple proof of a general Hensel's lemma for separable polynomials over Henselian fields is given (Section 3). For polynomials over algebraically closed, valued fields, best possible results on continuity of roots (Section 4) and continuity of factors (Section 6) are demonstrated. Using this and a general Krasner's lemma (Section 7), we give a short proof of a general Hensel's lemma and show that it is, in a certain sense, best possible (Section 8). All valuations here are non-Archimedean and of arbitrary rank. The article is practically self-contained.
David Brink

March 2006
## 1 Introduction and historical remarks
The p-adic numbers were introduced in 1904 by Hensel in *Neue Grundlagen der Arithmetik*. In the same article, Hensel showed that if a monic polynomial $f$ with integral p-adic coefficients has an approximate factorisation $f \approx gh$, meaning that the coefficients of the difference $f - gh$ are p-adically smaller than the discriminant of $f$, then there exists an exact factorisation $f = gh$. Four years later, in 1908, Hensel gave a somewhat more general result in his book *Theorie der algebraischen Zahlen*, where $f$ is no longer assumed monic, and the discriminant of $f$ is replaced by the squared resultant of $g$ and $h$.

Since then, many variations and generalisations of Hensel's result have been found, some of which bear only little resemblance to the original. Confusingly, all these theorems are known today as "Hensel's lemma". We mention here the most important. Kürschák (1913) introduced real valuations on the abstract fields recently defined by Steinitz and indicated that Hensel's arguments would carry over to complete, non-Archimedean valued fields. Rychlík (1919) undertook these generalisations explicitly. Krull (1932) introduced general valuations, gave a new concept of completeness, and showed that a weak Hensel's lemma ($g$ and $h$ are assumed relatively prime modulo the valuation ideal) holds for such fields. Nagata (1953) showed that if a weak Hensel's lemma holds in some field with a valuation $v$, then the original Hensel's lemma holds too under the extra assumption that $v(f - gh) - 2v(\mathrm{Res}(g, h))$ is not contained in the maximal convex subgroup of the value group not containing $v(\mathrm{Res}(g, h))$. Rim (1957) and Rayner (1958) proved that the unique extension property implies weak forms of Hensel's lemma. Ribenboim (1985) showed the logical equivalence between these and other "Hensel's lemmas". The reader is referred to the very interesting paper of Roquette (2002) regarding the history of Hensel's lemma and valuation theory in general.
In the present paper, a new proof of Hensel's lemma is presented that generalises the original in another direction, namely with respect to the accuracy of the approximate factorisation. It will be seen that the discriminant and the resultant disappear completely. They are replaced by two new polynomial invariants, here called the separant and the bipartitionant. The core of the proof is an analysis of the continuous behaviour of the roots of a polynomial as functions of the coefficients. These arguments, in contrast to earlier proofs, work equally well for arbitrary as for real valuations and make Nagata's extra assumption superfluous. The only thing we need is that the valuation has the unique extension property.
After proving his lemma in Hensel (1908), Hensel demonstrated the following: If the p-adic polynomial $F$ of degree $\nu$ has an approximate root $\xi_0$ satisfying

$$\rho \;>\; \max\Big\{\, \frac{i\rho' - \rho^{(i)}}{i-1} \;\Big|\; i = 2, 3, \ldots, \nu \,\Big\} \tag{1}$$

where $\rho$ is the value of $F(\xi_0)$, and $\rho^{(i)}$ is the value of $F^{(i)}(\xi_0)/i!$, then Newton approximation gives an exact root $\xi$, provided that the values $\rho', \rho'', \ldots, \rho^{(\nu)}$ remain unchanged during the approximation process. In a short note from 1924, Rella showed the last condition to follow from (1). Our general Hensel's lemma will be seen to cover this Hensel-Rella criterion.

As noted by Rella in 1927, the existence of $\xi$ is an almost immediate consequence of the Newton polygon method, a ubiquitous theme of this article. The p-adic Newton polygon was introduced by Dumas already in 1906 and later studied by Kürschák, Rella, and Ostrowski, but surprisingly never mentioned by Hensel.
## 2 Valuations, Newton polygons, and the unique extension property
Consider a field $K$. By a valuation on $K$ we understand a map $v$ from $K$ into a totally ordered, additively written abelian group with infinity $\Gamma \cup \{\infty\}$ satisfying $v(0) = \infty$, $v(x) \in \Gamma$ if $x \neq 0$, $v(xy) = v(x) + v(y)$, and the strong triangle inequality $v(x+y) \geq \min\{v(x), v(y)\}$. In this situation, the pair $(K, v)$ is called a valued field, $v(x)$ is called the value of $x \in K$, and $x$ is called integral if $v(x) \geq 0$. If $\Gamma$ is order-isomorphic to a subgroup of $\mathbb{R}$, the valuation is called real (the term "rank 1" is also standard). Sometimes we will use that $\Gamma$ has division from $\mathbb{N}$. This may indeed be assumed without loss of generality, for we can always embed $\Gamma$ into some larger group $\Gamma'$ having that property. For a polynomial $f = a_0X^n + a_1X^{n-1} + \cdots + a_n$ with coefficients in $K$, we define $v(f) := \min\{v(a_0), \ldots, v(a_n)\}$.
The Newton polygon is a simple, yet powerful tool in valuation theory. It seems to have been always restricted to the case of real valuations, so we give here a definition for arbitrary valuations in the above sense. Consider a polynomial $f = a_0X^n + a_1X^{n-1} + \cdots + a_n$ of degree $n > 0$ with coefficients and roots in a valued field $(K, v)$. Usually, it is difficult to compute the roots by means of the coefficients, but in contrast to this, it is easy to compute the values of the roots by means of the values of the coefficients. Define $f$'s Newton polygon as the maximal convex map $\mathrm{NP} : \{0, 1, \ldots, n\} \to \Gamma \cup \{\infty\}$ satisfying $\mathrm{NP}(i) \leq v(a_i)$ for all $i$. By "convex" is understood the obvious, i.e. that $2 \cdot \mathrm{NP}(i) \leq \mathrm{NP}(i-1) + \mathrm{NP}(i+1)$ for all $i \neq 0, n$. The differences $\mathrm{NP}(i) - \mathrm{NP}(i-1)$, with the convention $\infty - \infty = \infty$, are the slopes of $\mathrm{NP}$. They form an increasing sequence. Now write $f = a_0 \cdot \prod_{i=1}^n (X - \alpha_i)$ such that $v(\alpha_1) \leq \cdots \leq v(\alpha_n)$. Then $v(\alpha_i) = \mathrm{NP}(i) - \mathrm{NP}(i-1)$ for all $i = 1, \ldots, n$. In words, the values of the roots of a polynomial equal the slopes of its Newton polygon. The conceptually easy, but notationally cumbersome proof expresses the $a_i$ as elementary symmetric functions in the $\alpha_i$, whereupon the $v(a_i)$ are computed from the $v(\alpha_i)$ using the strong triangle inequality.
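The slope rule above is easy to check mechanically over $\mathbb{Q}_2$. The following Python sketch is our own illustration (the helper names `v2` and `newton_slopes` are not from the paper); it computes the slopes of the 2-adic Newton polygon directly from the coefficient values:

```python
from fractions import Fraction
from math import inf

def v2(x):
    """2-adic valuation of an integer; v2(0) = infinity."""
    if x == 0:
        return inf
    n = 0
    while x % 2 == 0:
        x //= 2
        n += 1
    return n

def newton_slopes(coeffs):
    """Slopes of the Newton polygon of f = a0 X^n + ... + an over Q_2,
    i.e. the 2-adic values of the roots in increasing order."""
    vals = [v2(a) for a in coeffs]
    n = len(coeffs) - 1
    slopes, i = [], 0
    while i < n:
        # next vertex of the polygon: the j > i minimising the slope to (j, v(a_j))
        best_j, best_s = i + 1, None
        for j in range(i + 1, n + 1):
            s = inf if vals[j] == inf else Fraction(vals[j] - vals[i], j - i)
            if best_s is None or s < best_s:
                best_j, best_s = j, s
        slopes += [best_s] * (best_j - i)
        i = best_j
    return slopes

# f = X(X-2)(X-4) = X^3 - 6X^2 + 8X: the roots 2, 4, 0 have 2-adic values 1, 2, inf
assert newton_slopes([1, -6, 8, 0]) == [1, 2, inf]
```

For $f = X^2 + 2$, for example, the sketch returns the slope $1/2$ twice, in agreement with the two roots of value $\tfrac{1}{2}$.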
We call a valued field $(K, v)$ Henselian if it has the unique extension property, i.e. if $v$ has a unique extension (also denoted $v$) to the algebraic closure $\tilde{K}$ of $K$. Note that the existence of a valuation extension is automatic with this definition. The unique extension property is, as a matter of fact, equivalent to many (maybe all) variants of Hensel's lemma, see for instance Ribenboim (1985). We actually only use a certain consequence of the unique extension property, namely this: any $K$-automorphism $\sigma$ of $\tilde{K}$ is isometric with respect to $v$ (since otherwise $v \circ \sigma$ would be an extension different from $v$). The slopes of the Newton polygon of an irreducible polynomial over a Henselian field are thus all the same, an observation due to Ostrowski (1935).
## 3 The separant and the "separable Hensel's lemma"
For a monic polynomial $f = \prod_{k=1}^n (X - \alpha_k)$ of degree $n > 1$ with roots in a valued field $(K, v)$, we define the polynomial invariant

$$\mathcal{S} = \max\{\, v(f'(\alpha_k)) + v(\alpha_k - \alpha_l) \mid k \neq l \,\}$$

and call it $f$'s separant. Note $f'(\alpha_k) = \prod_{l \neq k} (\alpha_k - \alpha_l)$ and that $\mathcal{S} < \infty$ iff $f$ is separable (i.e. $f$ has no multiple roots). A monic polynomial with integral coefficients has integral roots. So if $f$ has integral coefficients, $\mathcal{S}$ is less than or equal to the value of $f$'s discriminant $\mathrm{disc}(f) = \prod_{k<l} (\alpha_k - \alpha_l)^2$. Therefore, the following "separable Hensel's lemma" generalises the Hensel's lemma of 1904.
**Theorem 1 (separable Hensel's lemma).** Let $f$ and $f^*$ be monic polynomials of common degree $n > 1$ with integral coefficients in a Henselian field $(K, v)$. Assume $v(f - f^*) > \mathcal{S}$ where $\mathcal{S}$ is the separant of $f$. Then $f$ and $f^*$ are both separable, and we may write $f = \prod_{k=1}^n (X - \alpha_k)$ and $f^* = \prod_{k=1}^n (X - \alpha^*_k)$ such that $K(\alpha_k) = K(\alpha^*_k)$ for all $k$.
*Proof.* Since $\mathcal{S}$ is finite, $f$ is separable. Write $f = \prod_{k=1}^n (X - \alpha_k)$ and fix a $k$. The Newton polygon $\mathrm{NP}$ of $f(X + \alpha_k)$ has $\mathrm{NP}(n) = \infty$ and $\mathrm{NP}(n-1) = v(f'(\alpha_k))$. The root $\alpha_k$ is integral, and therefore $v(f^*(X + \alpha_k) - f(X + \alpha_k)) = v(f^* - f)$. Consequently, the assumption $v(f^* - f) > \mathcal{S}$ implies that the Newton polygon $\mathrm{NP}^*$ of $f^*(X + \alpha_k)$ satisfies

$$\mathrm{NP}^*(i) = \begin{cases} \mathrm{NP}(i) & \text{for } i < n \\ v(f^*(\alpha_k)) > \mathcal{S} & \text{for } i = n \end{cases}$$

Hence, $f^*$ has a root $\alpha^*_k$ with $v(\alpha^*_k - \alpha_k) = \mathrm{NP}^*(n) - \mathrm{NP}^*(n-1) > \mathcal{S} - v(f'(\alpha_k)) \geq v(\alpha_k - \alpha_l)$ for all $l$ different from $k$.

This way we get $n$ distinct roots $\alpha^*_1, \ldots, \alpha^*_n$ of $f^*$ such that $v(\alpha^*_k - \alpha_k) > v(\alpha_k - \alpha_l)$ for all distinct $k$ and $l$. Now Krasner's lemma (see Section 7) gives $K(\alpha_k) = K(\alpha^*_k)$ for all $k$. Naturally, $f^* = \prod_{k=1}^n (X - \alpha^*_k)$.
So if a polynomial $f$ is separable, then any other polynomial $f^*$ having coefficients sufficiently close to those of $f$ has the same factorisation as $f$. This fails to be true if $f$ has multiple roots. Over the field of dyadic numbers $\mathbb{Q}_2$, for instance, $f = X^2$ is reducible, but $f^* = X^2 + 2^\nu$ is irreducible for any $\nu$.
**Example.** Consider the polynomial $f = X(X-2)(X-4) = X^3 - 6X^2 + 8X$ over $\mathbb{Q}_2$. It has separant $\mathcal{S} = 5$ (whereas the value of the discriminant is 8). Hence, the polynomial $f^* = f + 2^\nu$ has 3 distinct roots in $\mathbb{Q}_2$ for all $\nu > 5$. For $\nu = 5$, however, $f^*$ has an irreducible quadratic factor over $\mathbb{Q}_2$, showing that the bound $v(f - f^*) > \mathcal{S}$ is best possible.
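The numbers in this example are easy to verify by machine. The following Python check is our own sketch (the names `v2` and `separant` are not from the paper); it evaluates the definition of $\mathcal{S}$ under the 2-adic valuation:

```python
from math import inf

def v2(x):
    """2-adic valuation of an integer; v2(0) = infinity."""
    if x == 0:
        return inf
    n = 0
    while x % 2 == 0:
        x //= 2
        n += 1
    return n

def separant(roots):
    """S = max{ v(f'(a_k)) + v(a_k - a_l) : k != l } for the monic
    polynomial with the given distinct roots, 2-adic valuation."""
    best = -inf
    for k, ak in enumerate(roots):
        vfp = sum(v2(ak - al) for l, al in enumerate(roots) if l != k)
        for l, al in enumerate(roots):
            if l != k:
                best = max(best, vfp + v2(ak - al))
    return best

roots = [0, 2, 4]                  # f = X(X-2)(X-4)
assert separant(roots) == 5
# value of the discriminant = sum over k of v(f'(a_k)):
assert sum(sum(v2(a - b) for b in roots if b != a) for a in roots) == 8
```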
## 4 Error functions and continuity of roots
Consider two monic polynomials $f$ and $f^*$ of common degree $n > 1$ with coefficients in an algebraically closed, valued field $(K, v)$. Since the coefficients of a polynomial can be expressed as elementary symmetric functions of the roots, the coefficients depend continuously on the roots. More precisely, if we write $f = \prod_{k=1}^n (X - \alpha_k)$ and $f^* = \prod_{k=1}^n (X - \alpha^*_k)$ in any way, then $v(f - f^*) \geq \min\{v(\alpha_1 - \alpha^*_1), \ldots, v(\alpha_n - \alpha^*_n)\}$.

The opposite, that the roots depend continuously on the coefficients, is less evident – it is not even clear what is to be understood by such a statement. The known results in this direction are of a qualitative nature and do not work well for polynomials with multiple roots.
Define the error function of the root $\alpha$ of $f$ as the map $\Phi : \Gamma \cup \{\infty\} \to \Gamma \cup \{\infty\}$ given by

$$\Phi(x) = \sum_{l=1}^n \min\{x, v(\alpha - \alpha_l)\}.$$

It is a strictly increasing, piecewise linear (i.e. piecewise of the form $x \mapsto \nu x + \gamma$), bijective (since $\Gamma$ is assumed to have division from $\mathbb{N}$) map with decreasing slopes $\nu$ from the set $\{1, \ldots, n\}$. If $\Psi$ is the error function of the root $\beta$ of $f$, the strong triangle inequality gives

$$\Phi(x) = \Psi(x) \quad \text{for all } x \leq v(\alpha - \beta). \tag{2}$$

Using error functions, we can now bound the error on the roots of a polynomial caused by an error on the coefficients.
**Theorem 2 (continuity of roots).** Let $f$ and $f^*$ be monic polynomials of common degree $n > 1$ with integral coefficients in an algebraically closed, valued field $(K, v)$. We may then write $f = \prod_{k=1}^n (X - \alpha_k)$ and $f^* = \prod_{k=1}^n (X - \alpha^*_k)$ such that $v(\alpha_k - \alpha^*_k) \geq \Phi_k^{-1}(v(f - f^*))$ for each $k$. Here $\Phi_k$ denotes the error function of the root $\alpha_k$ of $f$.
*Proof.* Write $f = \prod_{k=1}^n (X - \alpha_k)$ and put $\rho_k := \Phi_k^{-1}(v(f - f^*))$ for each $k$. We may assume $0 < v(f - f^*) < \infty$, and hence $0 < \rho_k < \infty$ for each $k$, since otherwise the claim is trivial. We show, for each $k$, that $f$ and $f^*$ have the same number of roots (counted with multiplicity) in the ball $\{x \in K \mid v(x - \alpha_k) \geq \rho_k\}$. It will then follow (for instance by assuming $\rho_1 \geq \rho_2 \geq \cdots$ and then choosing $\alpha^*_1, \alpha^*_2, \ldots$ in that order, such that, for each $k$, $\alpha^*_k$ is a root of $f^*/\prod_{l=1}^{k-1}(X - \alpha^*_l)$ and has $v(\alpha^*_k - \alpha_k) \geq \rho_k$) that we can write $f^* = \prod_{k=1}^n (X - \alpha^*_k)$ such that $v(\alpha^*_k - \alpha_k) \geq \rho_k$ for each $k$.

So fix a $k$. Let $\mu$ be the number of indices $l$ with $v(\alpha_l - \alpha_k) < \rho_k$. We must show that the number of indices $l$ with $v(\alpha^*_l - \alpha_k) < \rho_k$ is also $\mu$. Consider the Newton polygon $\mathrm{NP}$ of $f(X + \alpha_k) = X^n + a_1X^{n-1} + \cdots + a_n$. The slopes of $\mathrm{NP}$ are $v(\alpha_1 - \alpha_k), \ldots, v(\alpha_n - \alpha_k)$, in increasing order. So $\mathrm{NP}(i) - \mathrm{NP}(i-1) < \rho_k$ for $i \leq \mu$, and $\mathrm{NP}(i) - \mathrm{NP}(i-1) \geq \rho_k$ for $i > \mu$. Let $\ell$ be the "line through the point $p = (\mu, v(a_\mu))$ with slope $\rho_k$", i.e. the map $\{0, \ldots, n\} \to \Gamma$ given by $\ell(i) = (i - \mu)\rho_k + v(a_\mu)$. Then $\mathrm{NP}(i) > \ell(i)$ for $i < \mu$, $\mathrm{NP}(\mu) = \ell(\mu)$, and $\mathrm{NP}(i) \geq \ell(i)$ for $i > \mu$ (see figure).

[Figure: the Newton polygon $\mathrm{NP}$ and the line $\ell$ through the points $p = (\mu, v(a_\mu))$ and $q = (n, \ell(n))$.]

If we can show the same for the Newton polygon $\mathrm{NP}^*$ of $f^*(X + \alpha_k)$, we are done. Consider to this end the point $q = (n, \ell(n))$ on $\ell$ and compute $\ell(n)$:

$$\ell(n) = (n - \mu)\rho_k + v(a_\mu) = \sum_{l \,:\, v(\alpha_l - \alpha_k) \geq \rho_k} \rho_k \;+ \sum_{l \,:\, v(\alpha_l - \alpha_k) < \rho_k} v(\alpha_l - \alpha_k) = \sum_{l=1}^n \min\{\rho_k, v(\alpha_l - \alpha_k)\} = \Phi_k(\rho_k) = v(f - f^*)$$

Since $\alpha_k$ is integral, $v(f^*(X + \alpha_k) - f(X + \alpha_k)) = v(f^* - f)$. It follows that $\mathrm{NP}^*(i) = \mathrm{NP}(i)$ for $i \leq \mu$, and $\mathrm{NP}^*(i) \geq \ell(i)$ for $i > \mu$. This finishes the proof.
Heuristically, Theorem 2 says: if a root $\alpha$ of $f$ is far away from the other roots, then an error on the coefficients of $f$ causes an error on $\alpha$ of equal or smaller magnitude; however, the proximity of other roots makes $\alpha$ more sensitive to errors on the coefficients. Let us note a consequence of Theorem 2 illustrating this. Fix a $k$, and let $\mu$ be the root multiplicity of $\alpha_k$ in $f$ modulo the valuation ideal. This means that $v(\alpha_k - \alpha_l)$ is $0$ for all but $\mu$ values of $l$. Hence $\Phi_k(v(\alpha_k - \alpha^*_k)) = \sum_{l=1}^n \min\{v(\alpha_k - \alpha^*_k), v(\alpha_k - \alpha_l)\} \leq \mu \cdot v(\alpha_k - \alpha^*_k)$ and thus

$$v(\alpha_k - \alpha^*_k) \geq v(f - f^*)/\mu. \tag{3}$$

In particular, $v(\alpha_k - \alpha^*_k) \geq v(f - f^*)/n$ holds for all $k$. In light of (3), we might say that the root $\alpha_k$, as a function of $f$'s coefficients, satisfies a Lipschitz condition of order $1/\mu$.
We conclude the section with a typical example where the bound given by Theorem 2 is best possible.
**Example.** Consider again the polynomial $f = X(X-2)(X-4)$ over the field of dyadic numbers $\mathbb{Q}_2$. The roots $\alpha_1 = 0$ and $\alpha_3 = 4$ have the same error function $\Phi_1 = \Phi_3 : \gamma \mapsto \gamma + \min\{\gamma, 2\} + \min\{\gamma, 1\}$. The root $\alpha_2 = 2$ has error function $\Phi_2 : \gamma \mapsto \gamma + 2 \cdot \min\{\gamma, 1\}$. They are shown in Figures 1 and 2.

[Figures 1 and 2: graphs of the error functions $\Phi_1 = \Phi_3$ and $\Phi_2$. Figure 3: the Newton polygon $\mathrm{NP}$ of $f$ together with the polygons $\mathrm{NP}^*$ of $f^*$ for some values of $\nu$.]
Now put $f^* = f + 2^\nu$ with some $\nu \geq 0$. By Theorem 2, we may write $f^* = (X - \alpha^*_1)(X - \alpha^*_2)(X - \alpha^*_3)$ such that $v(\alpha_k - \alpha^*_k) \geq \Phi_k^{-1}(\nu)$ for $k = 1, 2, 3$. If $\alpha^*$ is a root of $f^*$ maximally close to $\alpha_1 = 0$, then $v(\alpha^*)$ is the maximal slope of the Newton polygon $\mathrm{NP}^*$ of $f^*$. Figure 3 shows the Newton polygon $\mathrm{NP}$ of $f$ (solid line) and $\mathrm{NP}^*$ for some values of $\nu$ (dotted lines). It is seen that $v(\alpha^*)$ equals $\Phi_1^{-1}(\nu)$, and hence $v(\alpha_1 - \alpha^*_1) = \Phi_1^{-1}(\nu)$. Similarly, one sees $v(\alpha_k - \alpha^*_k) = \Phi_k^{-1}(\nu)$ for $k = 2, 3$. So Theorem 2 gives in fact an optimal bound.

Finally note that, for $\nu > 5$, each root $\alpha^*_k$ of $f^*$ is closer to $\alpha_k$ than to either of the two other roots of $f$. This agrees with Theorem 1 and the fact that $f$ has separant $\mathcal{S} = 5$.
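The optimality claim can be tested numerically. The Python sketch below is our own illustration (the helpers `v2`, `phi`, and `newton_slopes` are not from the paper); it evaluates $\Phi_1$ and compares $\Phi_1^{-1}(\nu)$ with the maximal Newton polygon slope of $f^* = f + 2^\nu$ for $\nu = 9$:

```python
from fractions import Fraction
from math import inf

def v2(x):
    """2-adic valuation of an integer; v2(0) = infinity."""
    if x == 0:
        return inf
    n = 0
    while x % 2 == 0:
        x //= 2
        n += 1
    return n

def phi(alpha, roots, gamma):
    """Error function of the root alpha of f = prod(X - alpha_l):
    Phi(gamma) = sum over l of min{gamma, v(alpha - alpha_l)}."""
    return sum(min(gamma, v2(alpha - al)) for al in roots)

def newton_slopes(coeffs):
    """Slopes of the 2-adic Newton polygon (= values of the roots, increasing)."""
    vals = [v2(a) for a in coeffs]
    n = len(coeffs) - 1
    slopes, i = [], 0
    while i < n:
        best_j, best_s = i + 1, None
        for j in range(i + 1, n + 1):
            s = inf if vals[j] == inf else Fraction(vals[j] - vals[i], j - i)
            if best_s is None or s < best_s:
                best_j, best_s = j, s
        slopes += [best_s] * (best_j - i)
        i = best_j
    return slopes

roots = [0, 2, 4]                 # f = X(X-2)(X-4)
# Phi_1(gamma) = gamma + min{gamma,1} + min{gamma,2}, e.g. Phi_1(6) = 9
assert phi(0, roots, 6) == 9
# For nu = 9: the root of f* = f + 2^9 closest to alpha_1 = 0 has value
# Phi_1^{-1}(9) = 6, which is exactly the maximal slope of NP* of f*.
assert max(newton_slopes([1, -6, 8, 2**9])) == 6
```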
## 5 The bipartitionant and the induced factorisation
Consider a monic polynomial $f$ of degree $n > 1$ with coefficients in an algebraically closed, valued field $(K, v)$ and write $f = \prod_{k=1}^n (X - \alpha_k)$. Let $I$ and $J$ be disjoint, non-empty sets with union $\{1, 2, \ldots, n\}$ and put $g = \prod_{i \in I}(X - \alpha_i)$ and $h = \prod_{j \in J}(X - \alpha_j)$ so that $f = gh$. Define the bipartitionant of the polynomials $g$ and $h$ as

$$\mathcal{B} := \max\{\, \Phi_i(v(\alpha_i - \alpha_j)) \mid i \in I, \, j \in J \,\}$$

where $\Phi_i$ is the error function of the root $\alpha_i$ of $f$. Clearly, $\mathcal{B} < \infty$ iff $g$ and $h$ are relatively prime. Equation (2) implies

$$\mathcal{B} = \max\{\, \Phi_j(v(\alpha_i - \alpha_j)) \mid i \in I, \, j \in J \,\},$$

showing that the definition is symmetric in $g$ and $h$. The crucial property of the bipartitionant is this:
**Lemma 3.** Suppose the coefficients of $f$ are integral. Let $f^*$ be another monic polynomial of degree $n$ with integral coefficients in $K$, and assume $v(f - f^*) > \mathcal{B}$. Then we may write $f^* = \prod_{k=1}^n (X - \alpha^*_k)$ such that $v(\alpha_i - \alpha^*_i), v(\alpha_j - \alpha^*_j) > v(\alpha_i - \alpha_j)$ and thereby $v(\alpha_i - \alpha_j) = v(\alpha^*_i - \alpha^*_j)$ for all $i \in I$ and all $j \in J$.

*Proof.* Write $f^* = \prod_{k=1}^n (X - \alpha^*_k)$ as in Theorem 2. Then $v(\alpha_i - \alpha^*_i) \geq \Phi_i^{-1}(v(f - f^*)) > \Phi_i^{-1}(\mathcal{B}) \geq v(\alpha_i - \alpha_j)$ and $v(\alpha_j - \alpha^*_j) \geq \Phi_j^{-1}(v(f - f^*)) > \Phi_j^{-1}(\mathcal{B}) \geq v(\alpha_i - \alpha_j)$ for all $i \in I$ and $j \in J$. The strong triangle inequality gives $v(\alpha_i - \alpha_j) = v(\alpha^*_i - \alpha^*_j)$.
So in the situation of Lemma 3, the roots of $f^*$ may be "bipartitioned" into two sets $\{\alpha^*_i \mid i \in I\}$ and $\{\alpha^*_j \mid j \in J\}$. This bipartitioning only depends on the factorisation $f = gh$ and not on the representation $f^* = \prod_{k=1}^n (X - \alpha^*_k)$ from Theorem 2. We say that the factorisation $f^* = g^*h^*$ where $g^* := \prod_{i \in I}(X - \alpha^*_i)$ and $h^* := \prod_{j \in J}(X - \alpha^*_j)$ is induced by the factorisation $f = gh$.
How does one compute $\mathcal{B}$? If $i_0 \in I$ and $j_0 \in J$ are such that $\mathcal{B} = \Phi_{i_0}(v(\alpha_{i_0} - \alpha_{j_0}))$, then

$$v(\alpha_{i_0} - \alpha_{j_0}) = \max\{v(\alpha_i - \alpha_{j_0}) \mid i \in I\} = \max\{v(\alpha_{i_0} - \alpha_j) \mid j \in J\} \tag{4}$$

since the $\Phi$'s are strictly increasing. If, in turn, $i_0 \in I$ and $j_0 \in J$ satisfy (4), then

$$\begin{aligned}
\Phi_{i_0}(v(\alpha_{i_0} - \alpha_{j_0})) &= \sum_{k=1}^n \min\{v(\alpha_{i_0} - \alpha_{j_0}), v(\alpha_{i_0} - \alpha_k)\} \\
&= \sum_{i \in I} \min\{v(\alpha_{i_0} - \alpha_{j_0}), v(\alpha_{i_0} - \alpha_i)\} + \sum_{j \in J} \min\{v(\alpha_{i_0} - \alpha_{j_0}), v(\alpha_{i_0} - \alpha_j)\} \\
&= \sum_{i \in I} v(\alpha_i - \alpha_{j_0}) + \sum_{j \in J} v(\alpha_{i_0} - \alpha_j) \\
&= v(g(\alpha_{j_0})) + v(h(\alpha_{i_0}))
\end{aligned}$$

where the third equality requires (4), the strong triangle inequality, and some consideration. Now conclude

$$\mathcal{B} = \max\{\, v(g(\alpha_{j_0})) + v(h(\alpha_{i_0})) \mid i_0 \in I \text{ and } j_0 \in J \text{ satisfy (4)} \,\}. \tag{5}$$
Since the bipartitionant replaces twice the value of the resultant $\mathrm{Res}(g, h) = \prod_{i,j} (\alpha_i - \alpha_j)$ in our Hensel's lemma (Theorem 8), it is of interest to compare these two invariants, and from (5) follows immediately

$$\mathcal{B} \leq \sum_{j \in J} v(g(\alpha_j)) + \sum_{i \in I} v(h(\alpha_i)) = 2v(\mathrm{Res}(g, h))$$

when $f$ has integral coefficients.
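The two expressions for $\mathcal{B}$ can be compared on a concrete case. The following Python sketch is our own (the names `phi` and `bipartitionant` are not from the paper); it computes the bipartitionant of $g = X^2$ and $h = (X-2)(X-4)$ over $\mathbb{Q}_2$ from the definition and checks it against (5) and against $2v(\mathrm{Res}(g,h))$:

```python
from math import inf

def v2(x):
    """2-adic valuation of an integer; v2(0) = infinity."""
    if x == 0:
        return inf
    n = 0
    while x % 2 == 0:
        x //= 2
        n += 1
    return n

def phi(alpha, roots, gamma):
    """Error function Phi of the root alpha of f = prod(X - alpha_l)."""
    return sum(min(gamma, v2(alpha - al)) for al in roots)

def bipartitionant(g_roots, h_roots):
    """B = max{ Phi_i(v(a_i - a_j)) : a_i root of g, a_j root of h }."""
    roots = g_roots + h_roots
    return max(phi(ai, roots, v2(ai - aj))
               for ai in g_roots for aj in h_roots)

g_roots, h_roots = [0, 0], [2, 4]        # f = X^2 (X-2)(X-4)
B = bipartitionant(g_roots, h_roots)
assert B == 7                            # matches (5): v(g(4)) + v(h(0)) = 4 + 3
# and B <= 2 v(Res(g, h)) = 12:
assert B <= 2 * sum(v2(ai - aj) for ai in g_roots for aj in h_roots)
```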
## 6 Continuity of factors
There is a remarkable analogue to the continuity of roots that could be called continuity of factors. In words it says: if there is a factorisation $f = gh$ such that the roots of $g$ are far away from the roots of $h$ (but possibly close to each other), then an error on the coefficients of $f$ causes an error on the coefficients of $g$ which is in general smaller than the error caused on the roots of $g$ individually. It should be noted that the main part of Hensel's lemma is proved in the next section without the results of this section.

Consider two monic polynomials $f, f^*$ of common degree $n > 1$ with integral coefficients in an algebraically closed, valued field $(K, v)$, and write $f = \prod_{k=1}^n (X - \alpha_k)$. Let $I$ and $J$ be disjoint, non-empty sets with union $\{1, 2, \ldots, n\}$ and put $g = \prod_{i \in I}(X - \alpha_i)$ and $h = \prod_{j \in J}(X - \alpha_j)$. Let us call $g$ an isolated factor of $f$ if

$$\forall i, i' \in I \;\; \forall j \in J : \; v(\alpha_i - \alpha_{i'}) > v(\alpha_i - \alpha_j),$$

i.e. if there is a ball in $K$ containing all roots of $g$ and no roots of $h$.
**Lemma 4 (continuity of isolated factors).** Assume $v(f - f^*) > \mathcal{B}$ where $\mathcal{B}$ is the bipartitionant of $g$ and $h$, and consider the induced factorisation $f^* = g^*h^*$. If $g$ is an isolated factor of $f$, then $v(g - g^*) \geq v(f - f^*) - \mathcal{B} + \max\{v(\alpha_i - \alpha_j) \mid i \in I, \, j \in J\}$.
*Proof.* The idea is to use a general form of Newton approximation to come from $g$ to $g^*$. We may assume $g(0) = 0$ by a change of variable. Put $\nu = \deg(g)$ and $\mu = \deg(h)$. We may then further assume $g = \prod_{i=1}^\nu (X - \alpha_i)$, $h = \prod_{j=\nu+1}^n (X - \alpha_j)$, and

$$\infty = v(\alpha_1) \geq v(\alpha_2) \geq \cdots \geq v(\alpha_\nu) > v(\alpha_{\nu+1}) \geq \cdots \geq v(\alpha_n)$$

since $g$ is isolated. Thus, $u := \max\{v(\alpha_i - \alpha_j) \mid i \in I, \, j \in J\}$ equals $v(\alpha_{\nu+1})$, and $\mathcal{B}$ equals $\nu \cdot u + v(h(0))$ by (5).

Define three polynomial sequences $(g_m)_{m \in \mathbb{N}}$, $(h_m)_{m \in \mathbb{N}}$, and $(r_m)_{m \in \mathbb{N}}$ recursively like this: Put $g_1 := g$. Given $g_m$, define $h_m$ and $r_m$ such that $f^* = g_m h_m + r_m$ and $\deg(r_m) < \nu$. Given $g_m$, $h_m$, and $r_m$, define $g_{m+1} := g_m + r_m/h_m(0)$.

The difficulty of the proof lies in finding the right thing to prove. For a fixed $m$ and for $i = 1, \ldots, \nu$, let $a_i$ and $c_i$ be the values of the coefficients of the terms of degree $\nu - i$ in $g_m$ and $r_m$, respectively. We claim:

(A) $a_i \geq iu + \Delta$ where $\Delta := \min\{v(\alpha_\nu) - u, \, v(f - f^*) - \mathcal{B}\}$.

(B) The Newton polygon of $h_m$ equals the Newton polygon $\mathrm{NP}$ of $h$.

(C) $c_i \geq v(h(0)) + iu + k_i\Delta$ where $k_i := \max\{k \in \mathbb{N} \mid k < (m + i + \nu - 1)/\nu\}$.

The claims are shown by induction on $m$. Assume $m = 1$ for the induction start. All roots of $g_1 = g$ have value at least $v(\alpha_\nu)$, and hence

$$a_i \geq i \cdot v(\alpha_\nu) \geq i(u + \Delta) \geq iu + \Delta.$$

This shows (A). Write $f^* - f = gh' + r'$ with $\deg(r') < \nu$. Then $f^* = g(h + h') + r'$ and thus $h_1 = h + h'$ and $r_1 = r'$. Also, $v(h'), v(r') \geq v(f^* - f)$. Adding $h'$ to $h$ does not change the Newton polygon since $v(h') > \mathcal{B} \geq v(h(0)) = \mathrm{NP}(\mu)$. This shows (B). Finally,

$$c_i \geq v(r') \geq v(f^* - f) \geq \mathcal{B} + \Delta \geq v(h(0)) + iu + k_i\Delta$$

since $k_i = 1$ for $m = 1$, showing (C).

For the induction step, assume (A), (B), and (C) hold for some $m$, and let (A'), (B'), and (C') be the statements corresponding to $m + 1$. (A') follows immediately from (A) and (C). Note $f^* = g_{m+1}h_m - (h_m/h_m(0) - 1)r_m$ and hence $h_{m+1} = h_m + h'$ and $r_{m+1} = r'$ if we write

$$(h_m/h_m(0) - 1)r_m = g_{m+1}h' + r' \tag{6}$$

with $\deg(r') < \nu$. Let $d_i$ be the value of the coefficient of the term of degree $n - i$ in the left-hand side of (6). Using (A), (B), and (C) gives

$$\begin{aligned}
d_1 &\geq \mathrm{NP}(0) + u + k_1\Delta \\
d_2 &\geq \mathrm{NP}(1) + u + k_1\Delta \\
&\;\;\vdots \\
d_\mu &\geq \mathrm{NP}(\mu - 1) + u + k_1\Delta \\
d_{\mu+1} &\geq \mathrm{NP}(\mu - 1) + 2u + k_2\Delta \\
&\;\;\vdots \\
d_{n-1} &\geq \mathrm{NP}(\mu - 1) + \nu u + k_\nu\Delta \\
d_n &\geq \mathrm{NP}(\mu - 1) + (\nu + 1)u + k_{\nu+1}\Delta
\end{aligned}$$

The algorithm of polynomial division resulting in the expression (6) consists of a number of steps in each of which a monomial times $g_{m+1}$ is subtracted from $(h_m/h_m(0) - 1)r_m$. The key observation is that, in each step, the values of the coefficients of the remainder satisfy the same inequalities as the $d_i$. Let $b'_i$ be the value of the coefficient of the term of degree $\mu - i$ in $h'$. Then

$$\begin{aligned}
b'_1 &\geq \mathrm{NP}(0) + u + k_1\Delta > \mathrm{NP}(0) + u \geq \mathrm{NP}(1) \\
&\;\;\vdots \\
b'_\mu &\geq \mathrm{NP}(\mu - 1) + u + k_1\Delta > \mathrm{NP}(\mu - 1) + u \geq \mathrm{NP}(\mu)
\end{aligned}$$

Hence $h_{m+1} = h_m + h'$ has $\mathrm{NP}$ as its Newton polygon, showing (B'). Let $c'_i$ be the value of the coefficient of the term of degree $\nu - i$ in $r'$. Then

$$\begin{aligned}
c'_1 &\geq \mathrm{NP}(\mu - 1) + 2u + k_2\Delta = v(h(0)) + u + k_2\Delta \\
&\;\;\vdots \\
c'_\nu &\geq \mathrm{NP}(\mu - 1) + (\nu + 1)u + k_{\nu+1}\Delta = v(h(0)) + \nu u + k_{\nu+1}\Delta
\end{aligned}$$

This shows (C') and finishes the induction step.

By (C), $v(r_m) \to \infty$ and hence $g_m h_m \to f^*$. By the continuity of roots, the roots of $g_m h_m$ converge to the roots of $f^*$ (in a multiplicity-respecting way). By assumption, the roots of $g$ have values $> u$, whereas the roots of $h$ have values $\leq u$. Lemma 3 then gives that the roots of $g^*$ have values $> u$, whereas the roots of $h^*$ have values $\leq u$. By (A), the roots of $g_m$ have values $> u$. It follows that the roots of $g_m$ converge to the roots of $g^*$, and thereby the coefficients converge too: $g_m \to g^*$. Finally, $g^* = g + \sum_{m=1}^\infty r_m/h_m(0)$ and therefore by (C),

$$v(g - g^*) \geq \min\{v(r_m) - v(h(0)) \mid m \in \mathbb{N}\} \geq u + \Delta \geq v(f - f^*) - \mathcal{B} + \max\{v(\alpha_i - \alpha_j) \mid i \in I, \, j \in J\}.$$
Let us show that Lemma 4 coincides with the Hensel-Rella criterion when $g$ is linear. Given is a polynomial $F$ with an approximate root $\xi_0$. Put $g = X - \xi_0$ and $h = (F - F(\xi_0))/(X - \xi_0)$. Then the left-hand side of (1) is the value of $F(\xi_0) = F - gh$, and it can be seen that the right-hand side of (1) equals the bipartitionant of $g$ and $h$. Hence, the $g_m$ converge to a polynomial $g^* = X - \xi$ dividing $F$. In the proof of Lemma 4, we could as well have defined $g_{m+1}$ as $g_m + r_m/h_m(\xi_m)$ where $\xi_m$ is any root of $g_m$ (or any other element sufficiently close to $0$). With this definition and with linear $g$, the approximation process becomes identical with usual Newton approximation.
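The approximation process of Lemma 4 can be carried out with exact rational arithmetic. The Python sketch below is our own illustration (the helper names are not from the paper); it runs the recursion $g_{m+1} = g_m + r_m/h_m(0)$ on this section's example $f = X^2(X-2)(X-4)$, $f^* = f + 2^{10}$, $g = X^2$, and exhibits the coefficients of $g_m$ reaching the 2-adic values $\nu - 5 = 5$ and $\nu - 3 = 7$ computed in the example below:

```python
from fractions import Fraction
from math import inf

def v2(x):
    """2-adic valuation of a rational number; v2(0) = infinity."""
    x = Fraction(x)
    if x == 0:
        return inf
    n, num, den = 0, abs(x.numerator), x.denominator
    while num % 2 == 0:
        num //= 2
        n += 1
    while den % 2 == 0:
        den //= 2
        n -= 1
    return n

def polydiv(f, g):
    """Division f = q*g + r with deg r < deg g (coefficients, highest degree first)."""
    f = [Fraction(c) for c in f]
    g = [Fraction(c) for c in g]
    if len(f) < len(g):
        return [Fraction(0)], f
    q, r = [], f
    for _ in range(len(f) - len(g) + 1):
        c = r[0] / g[0]
        q.append(c)
        r = ([r[i] - c * g[i] for i in range(len(g))] + r[len(g):])[1:]
    return q, r

def newton_steps(fstar, g, steps):
    """The recursion g_{m+1} = g_m + r_m/h_m(0) from the proof of Lemma 4,
    where fstar = g_m*h_m + r_m. Returns [g_1, g_2, ...]."""
    gs = [[Fraction(c) for c in g]]
    for _ in range(steps):
        gm = gs[-1]
        hm, rm = polydiv(fstar, gm)
        h0 = hm[-1]                              # h_m(0)
        pad = len(gm) - len(rm)
        gs.append([gm[i] + (rm[i - pad] / h0 if i >= pad else 0)
                   for i in range(len(gm))])
    return gs

# f = X^2 (X-2)(X-4) = X^4 - 6X^3 + 8X^2 and f* = f + 2^10  (nu = 10)
gs = newton_steps([1, -6, 8, 0, 2**10], [1, 0, 0], 2)
g3 = gs[-1]                                      # g_3 = X^2 - (32/5) X - 128/15
assert v2(g3[1]) == 5 and v2(g3[2]) == 7         # = nu - 5 and nu - 3
```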
**Theorem 5 (continuity of factors).** Let $f$ and $f^*$ be monic polynomials of common degree $n > 1$ with integral coefficients in an algebraically closed, valued field $(K, v)$. Consider a monic factorisation $f = gh$, and let $\mathcal{B}$ be the bipartitionant of $g$ and $h$. Assume $v(f - f^*) > \mathcal{B}$, and let $f^* = g^*h^*$ be the induced factorisation. Then $v(g - g^*), v(h - h^*) \geq v(f - f^*) - \mathcal{B}$.

*Proof.* Write $g = g_1 \cdots g_r$ such that each $g_l$ is a maximal (with respect to divisibility) monic factor of $g$ which is an isolated factor of $f$. The bipartitionant of $g_l$ and $\tilde{g}_l := f/g_l$ is

$$\begin{aligned}
\mathcal{B}_l &:= \max\{\Phi_i(v(\alpha_i - \alpha_j)) \mid g_l(\alpha_i) = \tilde{g}_l(\alpha_j) = 0\} \\
&= \max\{\Phi_i(v(\alpha_i - \alpha_j)) \mid g_l(\alpha_i) = 0, \, j \in J\}
\end{aligned}$$

(the last equality follows from the maximality of $g_l$), implying

$$\mathcal{B} = \max\{\Phi_i(v(\alpha_i - \alpha_j)) \mid i \in I, \, j \in J\} = \max\{\mathcal{B}_1, \ldots, \mathcal{B}_r\}.$$

Lemma 4 gives

$$\begin{aligned}
v(g - g^*) &\geq \min\{v(f - f^*) - \mathcal{B}_l + \Phi_i^{-1}(\mathcal{B}_l) \mid l = 1, \ldots, r, \; g_l(\alpha_i) = 0\} \\
&\geq \min\{v(f - f^*) - \mathcal{B}_l \mid l = 1, \ldots, r\} \\
&= v(f - f^*) - \mathcal{B}.
\end{aligned}$$

The inequality for $v(h - h^*)$ can be proved the same way, but also follows directly by dividing $f^*$ by $g^*$.
**Example.** Consider the polynomial $f = X^2(X-2)(X-4) = X^4 - 6X^3 + 8X^2$ over the field of dyadic numbers $\mathbb{Q}_2$. The error function of the double root $\alpha_1 = \alpha_2 = 0$ is $\Phi(\gamma) = 2 \cdot \gamma + \min\{\gamma, 1\} + \min\{\gamma, 2\}$. The bipartitionant of the factors $g = X^2$ and $h = (X-2)(X-4)$ is $\mathcal{B} = v(g(4)) + v(h(0)) = 7$. Let $f^* = f + 2^\nu$ with $\nu > 7$ and consider the induced factorisation $f^* = g^*h^*$. By Lemma 4, $v(g - g^*) \geq \nu - 5$. The figure shows the inverse error function $\Phi^{-1}$ and the line $\nu \mapsto \nu - 5$ (dotted).

[Figure: the graph of the inverse error function $\Phi^{-1}$ together with the dotted line $\nu \mapsto \nu - 5$.]

Let us compute $v(g - g^*)$ precisely. The Newton polygon of $f^*$ shows that the roots $\alpha^*_1, \ldots, \alpha^*_4$ of $f^*$ have values $v(\alpha^*_1) = v(\alpha^*_2) = (\nu - 3)/2$, $v(\alpha^*_3) = 1$, and $v(\alpha^*_4) = 2$. We have

$$g^* = (X - \alpha^*_1)(X - \alpha^*_2) = X^2 - (\alpha^*_1 + \alpha^*_2)X + \alpha^*_1\alpha^*_2.$$

From the above follows $v(\alpha^*_1\alpha^*_2) = \nu - 3$. It is more tricky to compute $v(\alpha^*_1 + \alpha^*_2)$. To this end, consider the polynomial $f^*(X - \alpha^*_1)$. It has roots $2\alpha^*_1$, $\alpha^*_1 + \alpha^*_2$, $\alpha^*_1 + \alpha^*_3$, $\alpha^*_1 + \alpha^*_4$ and constant term $f^*(-\alpha^*_1) = f^*(\alpha^*_1) + 12(\alpha^*_1)^3 = 12(\alpha^*_1)^3$. Thus,

$$\begin{aligned}
v(\alpha^*_1 + \alpha^*_2) &= v(f^*(-\alpha^*_1)) - v(2\alpha^*_1) - v(\alpha^*_1 + \alpha^*_3) - v(\alpha^*_1 + \alpha^*_4) \\
&= (3\nu - 5)/2 - (\nu - 1)/2 - 1 - 2 \\
&= \nu - 5.
\end{aligned}$$

Conclude $v(g - g^*) = \min\{\nu - 5, \nu - 3\} = \nu - 5$.
The moral of the story is that the bound on the coefficients of $g^*$ given by Lemma 4 is best possible (contrary to that of Theorem 5) and better than the bound on the roots of $g^*$ given by Theorem 2.

One may wonder if there is also "continuity of factors" when $v(f - f^*) \leq \mathcal{B}$, i.e. if there is a bound on the error on the coefficients of $g^*$ better than the bound on the error on the roots of $g^*$. That is not likely to be the case. For when $v(f - f^*) \leq \mathcal{B}$, it is no longer possible to bipartition the roots of $f^*$ as in Lemma 3. In other words, the factorisation $f = gh$ no longer gives rise to a natural factorisation $f^* = g^*h^*$. This view is supported by the observation that, in the limit $v(f - f^*) = \mathcal{B}$, the bound on the error on $g^*$ in the example above coincides with the bound on the error on the roots of $g^*$.
## 7 Krasner's lemma
The well-known Krasner's lemma (see Corollaire 1, page 190 of Ribenboim (1968), for instance) was in fact found by Ostrowski already in 1917. We give here a generalisation that will be used in the next section.
**Theorem 6 (lemma à la Krasner).** Consider a monic polynomial $f^* = \prod_{k=1}^n (X - \alpha^*_k)$ of degree $n > 1$ with coefficients in a Henselian field $(K, v)$ and roots in the algebraic closure $\tilde{K}$. Let $I$ and $J$ be two disjoint, non-empty sets with union $\{1, \ldots, n\}$. Moreover, consider a polynomial $g = \prod_{i \in I}(X - \alpha_i)$ with coefficients and roots in $\tilde{K}$. Assume

$$\forall i \in I \;\; \forall j \in J : \; v(\alpha_i - \alpha^*_i) > v(\alpha^*_i - \alpha^*_j). \tag{7}$$

Then the coefficients of the polynomials $g^* := \prod_{i \in I}(X - \alpha^*_i)$ and $h^* := \prod_{j \in J}(X - \alpha^*_j)$ are contained in the field extension of $K$ generated by the coefficients of $g$.
*Proof.* Part A. First some preliminary observations. From (7) follows at once that $g^*$ and $h^*$ are relatively prime. Since $f^* = g^*h^*$, the coefficients of $g^*$ generate the same extension of $K$ as the coefficients of $h^*$. We may assume without loss of generality – and will do so – that $g$ has coefficients in $K$. What is left to prove is that $g^*$ has coefficients in $K$.

Now let $K^{\mathrm{sep}}$ be the separable algebraic closure of $K$. Since $K^{\mathrm{sep}}$ is a separably closed field, every irreducible polynomial over $K^{\mathrm{sep}}$ has only one (possibly multiple) root. Since $g^*h^*$ has coefficients in $K^{\mathrm{sep}}$, and $g^*$ and $h^*$ are relatively prime, it follows that $g^*$ and $h^*$ have coefficients in $K^{\mathrm{sep}}$.

We show in Part B that every $K$-automorphism $\sigma$ of $\tilde{K}$ permutes the roots of $g^*$. Hence, every such $\sigma$ fixes the coefficients of $g^*$. The coefficients of $g^*$ are therefore purely inseparable over $K$. Since the coefficients of $g^*$ are both separable and purely inseparable over $K$, they do in fact belong to $K$.

Part B. Let $\sigma$ be a $K$-automorphism of $\tilde{K}$. Consider the sets $A = \{\alpha_i \mid i \in I\}$, $A^* = \{\alpha^*_i \mid i \in I\}$, and $A^{**} = \{\alpha^*_j \mid j \in J\}$. Note that $A \cup A^*$ and $A^{**}$ are disjoint by (7). Since $g$ and $f^*$ have coefficients in $K$, $\sigma$ is a "multiplicity-preserving" permutation of both $A$ and $A^* \cup A^{**}$. Since $K$ is Henselian, $\sigma$ is isometric. We show that (7) implies that $\sigma$ permutes $A^*$ and $A^{**}$ individually. This is really a lemma on finite ultra-metric spaces.

For $\alpha \in A$, let $B(\alpha)$ be the maximal ball in the finite ultra-metric space $A \cup A^* \cup A^{**}$ containing $\alpha$ and being contained in $A \cup A^*$. Then (7) implies

$$\forall i \in I : \; \alpha_i \in B(\alpha) \Leftrightarrow \alpha^*_i \in B(\alpha). \tag{8}$$

Every $\alpha^* \in A^*$ is thereby contained in some $B(\alpha)$, so we are done if we can show $\sigma(B(\alpha)) \subseteq A \cup A^*$.

For any $\alpha_i \in A \cap \sigma(B(\alpha))$, the balls $\sigma(B(\alpha))$ and $B(\alpha_i)$ have non-empty intersection (both contain $\alpha_i$), hence one is contained in the other. If there is an $\alpha_i \in A \cap \sigma(B(\alpha))$ such that $\sigma(B(\alpha)) \subseteq B(\alpha_i)$, then $\sigma(B(\alpha)) \subseteq A \cup A^*$ and we are done. So assume from now on $\sigma(B(\alpha)) \supset B(\alpha_i)$ for all $\alpha_i \in A \cap \sigma(B(\alpha))$.

For a subset $X$ of $A \cup A^* \cup A^{**}$, let $\#X$ denote $X$'s cardinality "counted with multiplicity", i.e.

$$\#X := |\{i \in I \mid \alpha_i \in X\}| + |\{k \in I \cup J \mid \alpha^*_k \in X\}|.$$

We then have

$$\#B(\alpha) = 2 \cdot |\{i \in I \mid \alpha_i \in B(\alpha)\}|$$

by (8). Since $\sigma$ preserves multiplicity and permutes $A$,

$$\#B(\alpha) = \#\sigma(B(\alpha)) \quad \text{and} \quad |\{i \in I \mid \alpha_i \in B(\alpha)\}| = |\{i \in I \mid \alpha_i \in \sigma(B(\alpha))\}|$$

hold. For $i \in I$ with $\alpha_i \in \sigma(B(\alpha))$, (8) implies $\alpha^*_i \in B(\alpha_i) \subseteq \sigma(B(\alpha))$ and hence

$$|\{i \in I \mid \alpha^*_i \in \sigma(B(\alpha))\}| \geq |\{i \in I \mid \alpha_i \in \sigma(B(\alpha))\}|.$$

Putting everything together gives

$$\begin{aligned}
\#\sigma(B(\alpha)) &= |\{i \in I \mid \alpha_i \in \sigma(B(\alpha))\}| + |\{k \in I \cup J \mid \alpha^*_k \in \sigma(B(\alpha))\}| \\
&\geq 2 \cdot |\{i \in I \mid \alpha_i \in \sigma(B(\alpha))\}| + |\{j \in J \mid \alpha^*_j \in \sigma(B(\alpha))\}| \\
&= 2 \cdot |\{i \in I \mid \alpha_i \in B(\alpha)\}| + |\{j \in J \mid \alpha^*_j \in \sigma(B(\alpha))\}| \\
&= \#B(\alpha) + |\{j \in J \mid \alpha^*_j \in \sigma(B(\alpha))\}| \\
&= \#\sigma(B(\alpha)) + |\{j \in J \mid \alpha^*_j \in \sigma(B(\alpha))\}|.
\end{aligned}$$

Finally, conclude $|\{j \in J \mid \alpha^*_j \in \sigma(B(\alpha))\}| = 0$, i.e. $\sigma(B(\alpha)) \subseteq A \cup A^*$.
Theorem 6 has an immediate corollary which itself reduces to the usual Krasner's lemma when the element $a$ is separable over $K$:

**Corollary 7.** Consider a Henselian field $K$ and let $a$ and $b$ be elements in the algebraic closure $\tilde{K}$. Assume $b$ is closer to $a$ than to any of $a$'s conjugates. Then $K(b)$ contains the coefficients of the polynomial $(X - a)^\mu$ where $\mu$ is the root multiplicity of $a$ in its minimal polynomial over $K$.
**Remark.** In the application of Theorem 6 in the proof of Theorem 8 below, we also have a polynomial $h = \prod_{j \in J}(X - \alpha_j)$ satisfying

$$\forall i \in I \;\; \forall j \in J : \; v(\alpha_j - \alpha^*_j) > v(\alpha^*_i - \alpha^*_j) \tag{9}$$

and such that $gh$ has coefficients in $K$. In this situation, Part B of the proof of Theorem 6 can be replaced by the following simpler argument: Assume for a contradiction that there are $i \in I$ and $j \in J$ such that $\sigma(\alpha^*_i) = \alpha^*_j$. By symmetry, we may assume $v(\alpha_i - \alpha^*_i) \geq v(\alpha_j - \alpha^*_j)$. Then $\sigma(\alpha_i) = \alpha_{i'}$ for some $i' \in I$. Since $\sigma$ is isometric, $v(\alpha_{i'} - \alpha^*_j) = v(\alpha_i - \alpha^*_i) \geq v(\alpha_j - \alpha^*_j)$. But now $v(\alpha_{i'} - \alpha_j) \geq v(\alpha_j - \alpha^*_j)$, in contradiction with (9).
## 8 Hensel's lemma
We can now state and prove the promised general Hensel’s lemma.
**Theorem 8 (monic Hensel's lemma).** Consider two monic polynomials $f$ and $f^*$ of common degree $n > 1$ with integral coefficients in a Henselian field $(K, v)$. Let there be given a factorisation $f = gh$ with monic $g$ and $h$. Assume $v(f - f^*) > \mathcal{B}$ where $\mathcal{B}$ is the bipartitionant of $g$ and $h$. Then there is a factorisation $f^* = g^*h^*$ where $g^*$ and $h^*$ are monic and have integral coefficients, $\deg(g^*) = \deg(g)$, $\deg(h^*) = \deg(h)$, and $v(g - g^*), v(h - h^*) \geq v(f - f^*) - \mathcal{B}$.

*Proof.* Consider the induced factorisation $f^* = g^*h^*$. The factors $g^*$ and $h^*$ have coefficients in $K$ by Lemma 3 and Theorem 6. The bound on $v(g - g^*)$ and $v(h - h^*)$ follows from Theorem 5.
**Example.** Consider the polynomial $f^* = X^8(X + 2)^8 + 2^\nu$ with $\nu \geq 0$ over the field of dyadic numbers $\mathbb{Q}_2$. The bipartitionant of $g = X^8$ and $h = (X + 2)^8$ is $\mathcal{B} = 16$. By Theorem 8, $f^*$ is reducible for all $\nu > 16$. More precisely, there is in this case a monic factorisation $f^* = g^*h^*$ with $v(g - g^*), v(h - h^*) \geq \nu - 16$ (using Lemma 4 instead of Theorem 5 gives in fact $v(g - g^*), v(h - h^*) \geq \nu - 15$). It can be shown that $f^*$ is irreducible for $\nu = 0, 1, 3, 4, 5, 7, 8, 9, 11, 12, 13, 15, 16$, implying that the bound $v(f - f^*) > \mathcal{B}$ is best possible. The dyadic value of the resultant of $g$ and $h$ is 64, so the Hensel's lemma of 1908 gives a factorisation $f^* = g^*h^*$ with $v(g - g^*), v(h - h^*) \geq \nu - 64$ for $\nu > 128$.
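The invariants in this example are small enough to check by machine. This Python sketch is our own (with the same naive 2-adic helpers as in the earlier sketches); it recomputes $\mathcal{B} = 16$ from the definition and the value 64 of the resultant:

```python
from math import inf

def v2(x):
    """2-adic valuation of an integer; v2(0) = infinity."""
    if x == 0:
        return inf
    n = 0
    while x % 2 == 0:
        x //= 2
        n += 1
    return n

def phi(alpha, roots, gamma):
    """Error function Phi of the root alpha of f = prod(X - alpha_l)."""
    return sum(min(gamma, v2(alpha - al)) for al in roots)

def bipartitionant(g_roots, h_roots):
    """B = max{ Phi_i(v(a_i - a_j)) : a_i root of g, a_j root of h }."""
    roots = g_roots + h_roots
    return max(phi(ai, roots, v2(ai - aj))
               for ai in g_roots for aj in h_roots)

g_roots, h_roots = [0] * 8, [-2] * 8             # g = X^8, h = (X+2)^8
assert bipartitionant(g_roots, h_roots) == 16    # B = 16
# value of Res(g, h) = prod over i, j of (alpha_i - alpha_j) = 2^64:
assert sum(v2(ai - aj) for ai in g_roots for aj in h_roots) == 64
```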
To make life as easy as possible, we have so far solely studied monic polynomials having integral coefficients. This is indeed the situation in almost all applications of Hensel's lemma. Also, when a given non-monic polynomial $F^*$ has an approximate factorisation satisfying the conditions of the non-monic Hensel's lemma, the reducibility of $F^*$ follows immediately from the observation that the Newton polygon of $F^*$ is not a straight line.

Nevertheless, we now turn our attention to the non-monic case. The proof of the following theorem is entirely analogous to that of the monic Hensel's lemma, but the presence of non-monic polynomials forces us to reexamine the proofs of earlier theorems.
**Theorem 9 (Hensel's lemma, final form).** Consider two polynomials $F$ and $F^*$ of common degree $n > 1$ with integral coefficients in a Henselian field $(K, v)$ and with the same leading coefficient $c$. Let there be given a factorisation $F = gH$ where $g$ is monic and has integral coefficients, and $H$ is primitive, i.e. $v(H) = 0$. Assume $v(F - F^*) > \max\{0, \mathcal{B} + v(c)\}$ where $\mathcal{B}$ is the bipartitionant of $g$ and $c^{-1}H$. Then there is a factorisation $F^* = g^*H^*$ where $g^*$ is monic and has integral coefficients, $H^*$ is primitive, $\deg(g^*) = \deg(g)$, $\deg(H^*) = \deg(H)$, and $v(g - g^*), v(H - H^*) \geq v(F - F^*) - \max\{0, \mathcal{B} + v(c)\}$.

*Proof.* First introduce monic polynomials $f := c^{-1}F$, $f^* := c^{-1}F^*$, and $h := c^{-1}H$. Note $f = gh$, $v(f - f^*) = v(F - F^*) - v(c)$, and thus $v(f - f^*) > \max\{-v(c), \mathcal{B}\}$.

Write $f = \prod_{k=1}^n (X - \alpha_k)$ and let $I$ and $J$ be the sets with $g = \prod_{i \in I}(X - \alpha_i)$ and $h = \prod_{j \in J}(X - \alpha_j)$. Put $\rho_i := \Phi_i^{-1}(v(f - f^*))$ for each $i \in I$. Note $\Phi_i(0) = \sum_{l=1}^n \min\{0, v(\alpha_i - \alpha_l)\} = -v(c) < v(f - f^*)$ and hence $0 < \rho_i$.

The proof of Theorem 2, word for word, shows that $f$ and $f^*$ have the same number of roots (counted with multiplicity) in the ball $\{x \in \tilde{K} \mid v(x - \alpha_i) \geq \rho_i\}$ for any $i \in I$. It follows that we can write $f^* = \prod_{k=1}^n (X - \alpha^*_k)$ such that $v(\alpha_i - \alpha^*_i) \geq \rho_i$ for each $i \in I$. We have $v(\alpha_i - \alpha_j) \leq \Phi_i^{-1}(\mathcal{B}) < \rho_i$ for $i \in I$ and $j \in J$, and therefore $v(\alpha^*_i - \alpha^*_j) < \rho_i$ for $i \in I$ and $j \in J$. Conclude $v(\alpha_i - \alpha^*_i) > v(\alpha^*_i - \alpha^*_j)$ for all $i \in I$ and $j \in J$.

By Theorem 6, $g^* := \prod_{i \in I}(X - \alpha^*_i)$ and $h^* := \prod_{j \in J}(X - \alpha^*_j)$ have coefficients in $K$. Reexamination of the proofs of Lemma 4 and Theorem 5 shows $v(g - g^*) \geq v(f - f^*) - \max\{-v(c), \mathcal{B}\}$. Now put $H^* := ch^*$.

Notice that the resultant of $g$ and $H$ has value

$$\begin{aligned}
v(\mathrm{Res}(g, H)) &= \deg(g) \cdot v(c) + v(\mathrm{Res}(g, h)) \\
&= \deg(g) \cdot v(c) + \sum_{i \in I, \, j \in J} v(\alpha_i - \alpha_j) \\
&= \sum_{i \in I, \, j \in J} \max\{0, v(\alpha_i - \alpha_j)\}
\end{aligned}$$

By (5), the bipartitionant of $g$ and $h$ is $\mathcal{B} = \sum_{i \in I} v(\alpha_i - \alpha_{j_0}) + \sum_{j \in J} v(\alpha_{i_0} - \alpha_j)$ for suitable $i_0 \in I$ and $j_0 \in J$. There follows $\max\{0, \mathcal{B} + v(c)\} \leq 2v(\mathrm{Res}(g, H))$. Hence, Theorem 9 generalises the Hensel's lemma of 1908 as well as its later reincarnations mentioned in Section 1.
## References

- G. Dumas, Sur quelques cas d'irréductibilité des polynomes à coefficients rationnels, J. Math. Pures Appl. 61 (1906), 191–258.
- K. Hensel, Neue Grundlagen der Arithmetik, J. Reine Angew. Math. 127 (1904), 51–84.
- K. Hensel, Theorie der algebraischen Zahlen, Teubner, Leipzig, 1908.
- W. Krull, Allgemeine Bewertungstheorie, J. Reine Angew. Math. 167 (1932), 160–196.
- J. Kürschák, Über Limesbildung und allgemeine Körpertheorie, J. Reine Angew. Math. 142 (1913), 211–253.
- M. Nagata, On the Theory of Henselian Rings, Nagoya Math. J. 5 (1953), 45–57.
- A. Ostrowski, Untersuchungen zur arithmetischen Theorie der Körper, Math. Z. 39 (1935), 269–404.
- F. J. Rayner, Relatively Complete Fields, Proc. Edinburgh Math. Soc. 11 (1958), 131–133.
- T. Rella, Zur Newtonschen Approximationsmethode in der Theorie der p-adischen Gleichungswurzeln, J. Reine Angew. Math. 153 (1924), 111–112.
- T. Rella, Ordnungsbestimmungen in Integritätsbereichen und Newtonsche Polygone, J. Reine Angew. Math. 158 (1927), 33–48.
- P. Ribenboim, Théorie des valuations, Les Presses de l'Université de Montréal, Montreal, 1968.
- P. Ribenboim, Equivalent forms of Hensel's lemma, Expo. Math. 3 (1985), 3–24.
- D. S. Rim, Relatively complete fields, Duke Math. J. 24 (1957), 197–200.
- P. Roquette, History of Valuation Theory. Part I. In: F.-V. Kuhlmann, S. Kuhlmann, M. Marshall (eds.), Valuation Theory and its Applications, vol. 1, Fields Inst. Commun. 32 (2002), 291–355.
- K. Rychlík, Zur Bewertungstheorie der algebraischen Körper, J. Reine Angew. Math. 153 (1924), 94–107.