Article

New light on Hensel's lemma

Abstract

The historical development of Hensel's lemma is briefly discussed (Section 1). Using Newton polygons, a simple proof of a general Hensel's lemma for separable polynomials over Henselian fields is given (Section 3). For polynomials over algebraically closed, valued fields, best possible results on continuity of roots (Section 4) and continuity of factors (Section 6) are demonstrated. Using this and a general Krasner's lemma (Section 7), we give a short proof of a general Hensel's lemma and show that it is, in a certain sense, best possible (Section 8). All valuations here are non-Archimedean and of arbitrary rank. The article is practically self-contained.
David Brink
March 2006
1 Introduction and historical remarks
The $p$-adic numbers were introduced in 1904 by Hensel in Neue Grundlagen der Arithmetik. In the same article, Hensel showed that if a monic polynomial $f$ with integral $p$-adic coefficients has an approximate factorisation $f \approx gh$, meaning that the coefficients of the difference $f - gh$ are $p$-adically smaller than the discriminant of $f$, then there exists an exact factorisation $f = g^* h^*$. Four years later, in 1908, Hensel gave a somewhat more general result in his book Theorie der algebraischen Zahlen, where $f$ is no longer assumed monic, and the discriminant of $f$ is replaced by the squared resultant of $g$ and $h$.
Since then, many variations and generalisations of Hensel’s result have been found, some of which bear little resemblance to the original. Confusingly, all these theorems are known today as “Hensel’s lemma”. We mention here the most important. Kürschák (1913) introduced real valuations on the abstract fields recently defined by Steinitz and indicated that Hensel’s arguments would carry over to complete, non-archimedean valued fields. Rychlík (1919) undertook these generalisations explicitly. Krull (1932) introduced general valuations, gave a new concept of completeness, and showed that a weak Hensel’s lemma ($g$ and $h$ are assumed relatively prime modulo the valuation ideal) holds for such fields. Nagata (1953) showed that if a weak Hensel’s lemma holds in some field with a valuation $v$, then the original Hensel’s lemma holds too under the extra assumption that $v(f - gh) - 2v(\mathrm{Res}(g, h))$ is not contained in the maximal convex subgroup of the value group not containing $v(\mathrm{Res}(g, h))$. Rim (1957) and Rayner (1958) proved that the unique extension property implies weak forms of Hensel’s lemma. Ribenboim (1985) showed the logical equivalence between these and other “Hensel’s lemmas”. The reader is referred to the very interesting paper of Roquette (2002) regarding the history of Hensel’s lemma and valuation theory in general.
In the present paper, a new proof of Hensel’s lemma is presented that gener-
alises the original in another direction, namely with respect to the accuracy of the
approximate factorisation. It will be seen that the discriminant and the resultant
disappear completely. They are replaced by two new polynomial invariants, here
called the separant and the bipartitionant. The core of the proof is an analysis
of the continuous behaviour of the roots of a polynomial as functions of the co-
efficients. These arguments, in contrast to earlier proofs, work equally well for
arbitrary as for real valuations and make Nagata’s extra assumption superfluous.
The only thing we need is that the valuation has the unique extension property.
After proving his lemma in Hensel (1908), Hensel demonstrated the following: If the $p$-adic polynomial $F$ of degree $\nu$ has an approximate root $\xi_0$ satisfying
$$\rho > \max\left\{ \frac{i\rho' - \rho^{(i)}}{i - 1} \;\middle|\; i = 2, 3, \dots, \nu \right\} \qquad (1)$$
where $\rho$ is the value of $F(\xi_0)$, and $\rho^{(i)}$ is the value of $F^{(i)}(\xi_0)/i!$, then Newton approximation gives an exact root $\xi$, provided that the values $\rho', \rho'', \dots, \rho^{(\nu)}$ remain unchanged during the approximation process. In a short note from 1924, Rella showed the last condition to follow from (1). Our general Hensel’s lemma will be seen to cover this Hensel-Rella criterion.
As noted by Rella in 1927, the existence of $\xi$ is an almost immediate consequence of the Newton polygon method, a ubiquitous theme of this article. The $p$-adic Newton polygon was introduced by Dumas as early as 1906 and later studied by Kürschák, Rella, and Ostrowski, but surprisingly never mentioned by Hensel.
2 Valuations, Newton polygons, and the unique extension property
Consider a field $K$. By a valuation on $K$ we understand a map $v$ from $K$ into a totally ordered, additively written abelian group with infinity $\Gamma \cup \{\infty\}$ satisfying $v(0) = \infty$, $v(x) \in \Gamma$ if $x \ne 0$, $v(xy) = v(x) + v(y)$, and the strong triangle inequality $v(x + y) \ge \min\{v(x), v(y)\}$. In this situation, the pair $(K, v)$ is called a valued field, $v(x)$ is called the value of $x \in K$, and $x$ is called integral if $v(x) \ge 0$. If $\Gamma$ is order-isomorphic to a subgroup of the additive group $(\mathbb{R}, +)$, the valuation is called real (the term “rank 1” is also standard). Sometimes we will use that $\Gamma$ has division from $\mathbb{N}$, i.e. that $\Gamma$ is divisible. This may indeed be assumed without loss of generality, for we can always embed $\Gamma$ into some larger group $\Gamma'$ having that property. For a polynomial $f = a_0X^n + a_1X^{n-1} + \cdots + a_n$ with coefficients in $K$, we define $v(f) := \min\{v(a_0), \dots, v(a_n)\}$.
The Newton polygon is a simple, yet powerful tool in valuation theory. It seems to have been always restricted to the case of real valuations, so we give here a definition for arbitrary valuations in the above sense. Consider a polynomial $f = a_0X^n + a_1X^{n-1} + \cdots + a_n$ of degree $n > 0$ with coefficients and roots in a valued field $(K, v)$. Usually, it is difficult to compute the roots by means of the coefficients, but in contrast to this, it is easy to compute the values of the roots by means of the values of the coefficients. Define $f$’s Newton polygon as the maximal convex map $\mathrm{NP} : \{0, 1, \dots, n\} \to \Gamma \cup \{\infty\}$ satisfying $\mathrm{NP}(i) \le v(a_i)$ for all $i$. By “convex” is understood the obvious, i.e. that $2 \cdot \mathrm{NP}(i) \le \mathrm{NP}(i-1) + \mathrm{NP}(i+1)$ for all $i \ne 0, n$. The differences $\mathrm{NP}(i) - \mathrm{NP}(i-1)$, with the convention $\infty - \infty = \infty$, are the slopes of $\mathrm{NP}$. They form an increasing sequence. Now write $f = a_0 \cdot \prod_{i=1}^{n}(X - \alpha_i)$ such that $v(\alpha_1) \le \cdots \le v(\alpha_n)$. Then $v(\alpha_i) = \mathrm{NP}(i) - \mathrm{NP}(i-1)$ for all $i = 1, \dots, n$. In words, the values of the roots of a polynomial equal the slopes of its Newton polygon. The conceptually easy, but notationally cumbersome proof expresses the $a_i$ as elementary symmetric functions in the $\alpha_i$, whereupon the $v(a_i)$ are computed from the $v(\alpha_i)$ using the strong triangle inequality.
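The definition above can be turned into a small computation. The following is a minimal sketch, assuming the value group is contained in the rationals and that the values $v(a_0), \dots, v(a_n)$ are supplied directly as a list; the function names `newton_polygon` and `slopes` are illustrative and do not come from the article.

```python
from fractions import Fraction
from math import inf


def newton_polygon(values):
    """Return [NP(0), ..., NP(n)] for the coefficient values v(a_0), ..., v(a_n).

    NP is the largest convex map with NP(i) <= values[i]. Use `inf` for a
    zero coefficient; values[0] must be finite (a_0 != 0).
    """
    n = len(values) - 1
    np_vals = [values[0]]
    i = 0                                   # index of the current vertex
    while i < n:
        finite = [j for j in range(i + 1, n + 1) if values[j] != inf]
        if not finite:                      # only zero coefficients remain
            np_vals += [inf] * (n - i)
            break
        # Next vertex: smallest slope seen from vertex i (ties: farthest point).
        slope, neg_j = min(
            (Fraction(values[j] - np_vals[i], j - i), -j) for j in finite
        )
        j = -neg_j
        np_vals += [np_vals[i] + slope * (t - i) for t in range(i + 1, j + 1)]
        i = j
    return np_vals


def slopes(np_vals):
    """Slopes NP(i) - NP(i-1), with the convention inf - inf = inf."""
    return [inf if b == inf else b - a for a, b in zip(np_vals, np_vals[1:])]
```

For instance, $X^3 - 6X^2 + 8X$ over $\mathbb{Q}_2$ has coefficient values $0, 1, 3, \infty$, and `slopes(newton_polygon([0, 1, 3, inf]))` returns the slopes $1, 2, \infty$, which are the 2-adic values of the roots $2$, $4$ and $0$.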
We call a valued field $(K, v)$ Henselian if it has the unique extension property, i.e. if $v$ has a unique extension (also denoted $v$) to the algebraic closure $\tilde{K}$ of $K$. Note that the existence of a valuation extension is automatic with this definition. The unique extension property is, as a matter of fact, equivalent to many (maybe all) variants of Hensel’s lemma, see for instance Ribenboim (1985). We actually only use a certain consequence of the unique extension property, namely this: any $K$-automorphism $\sigma$ of $\tilde{K}$ is isometric with respect to $v$ (since otherwise $v \circ \sigma$ would be an extension different from $v$). The slopes of the Newton polygon of an irreducible polynomial over a Henselian field are thus all the same, an observation due to Ostrowski (1935).
3 The separant and the “separable Hensel’s lemma”
For a monic polynomial $f = \prod_{k=1}^{n}(X - \alpha_k)$ of degree $n > 1$ with roots in a valued field $(K, v)$, we define the polynomial invariant
$$\mathcal{S} = \max\{v(f'(\alpha_k)) + v(\alpha_k - \alpha_l) \mid k \ne l\}$$
and call it $f$’s separant. Note $f'(\alpha_k) = \prod_{l \ne k}(\alpha_k - \alpha_l)$ and that $\mathcal{S} < \infty$ iff $f$ is separable (i.e. $f$ has no multiple roots). A monic polynomial with integral coefficients has integral roots. So if $f$ has integral coefficients, $\mathcal{S}$ is less than or equal to the value of $f$’s discriminant $\mathrm{disc}(f) = \prod_{k<l}(\alpha_k - \alpha_l)^2$. Therefore, the following “separable Hensel’s lemma” generalises the Hensel’s lemma of 1904.
Theorem 1 (separable Hensel’s lemma). Let $f$ and $f^*$ be monic polynomials of common degree $n > 1$ with integral coefficients in a Henselian field $(K, v)$. Assume $v(f - f^*) > \mathcal{S}$ where $\mathcal{S}$ is the separant of $f$. Then $f$ and $f^*$ are both separable, and we may write $f = \prod_{k=1}^{n}(X - \alpha_k)$ and $f^* = \prod_{k=1}^{n}(X - \alpha_k^*)$ such that $K(\alpha_k) = K(\alpha_k^*)$ for all $k$.
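Proof. Since $\mathcal{S}$ is finite, $f$ is separable. Write $f = \prod_{k=1}^{n}(X - \alpha_k)$ and fix a $k$. The Newton polygon $\mathrm{NP}$ of $f(X + \alpha_k)$ has $\mathrm{NP}(n) = \infty$ and $\mathrm{NP}(n-1) = v(f'(\alpha_k))$. The root $\alpha_k$ is integral, and therefore $v(f(X + \alpha_k) - f^*(X + \alpha_k)) = v(f - f^*)$. Consequently, the assumption $v(f - f^*) > \mathcal{S}$ implies that the Newton polygon $\mathrm{NP}^*$ of $f^*(X + \alpha_k)$ satisfies
$$\mathrm{NP}^*(i) = \begin{cases} \mathrm{NP}(i) & \text{for } i < n \\ v(f^*(\alpha_k)) > \mathcal{S} & \text{for } i = n \end{cases}$$
Hence, $f^*$ has a root $\alpha_k^*$ with $v(\alpha_k^* - \alpha_k) = \mathrm{NP}^*(n) - \mathrm{NP}^*(n-1) > \mathcal{S} - v(f'(\alpha_k)) \ge v(\alpha_k - \alpha_l)$ for all $l$ different from $k$.
This way we get $n$ distinct roots $\alpha_1^*, \dots, \alpha_n^*$ of $f^*$ such that $v(\alpha_k^* - \alpha_k) > v(\alpha_k - \alpha_l)$ for all distinct $k$ and $l$. Now Krasner’s lemma (see section 7) gives $K(\alpha_k) = K(\alpha_k^*)$ for all $k$. Naturally, $f^* = \prod_{k=1}^{n}(X - \alpha_k^*)$.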
So if a polynomial $f$ is separable, then any other polynomial $f^*$ having coefficients sufficiently close to those of $f$ has the same factorisation as $f$. This fails to be true if $f$ has multiple roots. Over the field of dyadic numbers $\mathbb{Q}_2$, for instance, $f = X^2$ is reducible, but $f^* = X^2 + 2^\nu$ is irreducible for any $\nu$.
Example. Consider the polynomial $f = X(X-2)(X-4) = X^3 - 6X^2 + 8X$ over $\mathbb{Q}_2$. It has separant $\mathcal{S} = 5$ (whereas the value of the discriminant is 8). Hence, the polynomial $f^* = f + 2^\nu$ has 3 distinct roots in $\mathbb{Q}_2$ for all $\nu > 5$. For $\nu = 5$, however, $f^*$ has an irreducible quadratic factor over $\mathbb{Q}_2$, showing that the bound $v(f - f^*) > \mathcal{S}$ is best possible.
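The two numbers quoted in the example can be recomputed directly from the roots $0, 2, 4$. The sketch below assumes all roots are rational, so that the $p$-adic value is elementary to evaluate; `vp`, `separant` and `disc_value` are hypothetical helper names introduced for this illustration.

```python
from fractions import Fraction
from math import inf


def vp(x, p=2):
    """p-adic value of a rational number x (inf for x == 0)."""
    x = Fraction(x)
    if x == 0:
        return inf
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v


def separant(roots, p=2):
    """S = max over k != l of v(f'(a_k)) + v(a_k - a_l), for f = prod (X - a_k)."""
    best = -inf
    for k, ak in enumerate(roots):
        # v(f'(a_k)) is the sum of v(a_k - a_l) over l != k
        v_deriv = sum(vp(ak - al, p) for l, al in enumerate(roots) if l != k)
        for l, al in enumerate(roots):
            if l != k:
                best = max(best, v_deriv + vp(ak - al, p))
    return best


def disc_value(roots, p=2):
    """Value of the discriminant prod_{k<l} (a_k - a_l)^2."""
    return 2 * sum(vp(a - b, p)
                   for i, a in enumerate(roots) for b in roots[i + 1:])
```

With `roots = [0, 2, 4]` this gives separant 5 and discriminant value 8, matching the example. A polynomial with a repeated root yields `inf`, in line with the remark that $\mathcal{S} < \infty$ iff $f$ is separable.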
4 Error functions and continuity of roots
Consider two monic polynomials $f$ and $f^*$ of common degree $n > 1$ with coefficients in an algebraically closed, valued field $(K, v)$. Since the coefficients of a polynomial can be expressed as elementary symmetric functions of the roots, the coefficients depend continuously on the roots. More precisely, if we write $f = \prod_{k=1}^{n}(X - \alpha_k)$ and $f^* = \prod_{k=1}^{n}(X - \alpha_k^*)$ in any way, then $v(f - f^*) \ge \min\{v(\alpha_1 - \alpha_1^*), \dots, v(\alpha_n - \alpha_n^*)\}$.
The opposite, that the roots depend continuously on the coefficients, is less evident; it is not even clear what is to be understood by such a statement. The known results in this direction are of a qualitative nature and do not work well for polynomials with multiple roots.
Define the error function of the root $\alpha$ of $f$ as the map $\Phi : \Gamma \cup \{\infty\} \to \Gamma \cup \{\infty\}$ given by
$$\Phi(x) = \sum_{l=1}^{n} \min\{x, v(\alpha - \alpha_l)\}.$$
It is a strictly increasing, piecewise linear (i.e. piecewise of the form $x \mapsto \nu x + \gamma$), bijective (since $\Gamma$ is assumed to have division from $\mathbb{N}$) map with decreasing slopes $\nu$ from the set $\{1, \dots, n\}$. If $\Psi$ is the error function of the root $\beta$ of $f$, the strong triangle inequality gives
$$\Phi(x) = \Psi(x) \quad \text{for all } x \le v(\alpha - \beta). \qquad (2)$$
Using error functions, we can now bound the error on the roots of a polynomial caused by an error on the coefficients.
Theorem 2 (continuity of roots). Let $f$ and $f^*$ be monic polynomials of common degree $n > 1$ with integral coefficients in an algebraically closed, valued field $(K, v)$. We may then write $f = \prod_{k=1}^{n}(X - \alpha_k)$ and $f^* = \prod_{k=1}^{n}(X - \alpha_k^*)$ such that $v(\alpha_k - \alpha_k^*) \ge \Phi_k^{-1}(v(f - f^*))$ for each $k$. Here $\Phi_k$ denotes the error function of the root $\alpha_k$ of $f$.
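Proof. Write $f = \prod_{k=1}^{n}(X - \alpha_k)$ and put $\rho_k := \Phi_k^{-1}(v(f - f^*))$ for each $k$. We may assume $0 < v(f - f^*) < \infty$, and hence $0 < \rho_k < \infty$ for each $k$, since otherwise the claim is trivial. We show, for each $k$, that $f$ and $f^*$ have the same number of roots (counted with multiplicity) in the ball $\{x \in K \mid v(x - \alpha_k) \ge \rho_k\}$. It will then follow (for instance by assuming $\rho_1 \ge \rho_2 \ge \dots$ and then choosing $\alpha_1^*, \alpha_2^*, \dots$, in that order, such that, for each $k$, $\alpha_k^*$ is a root of $f^*/\prod_{l=1}^{k-1}(X - \alpha_l^*)$ and has $v(\alpha_k^* - \alpha_k) \ge \rho_k$) that we can write $f^* = \prod_{k=1}^{n}(X - \alpha_k^*)$ such that $v(\alpha_k^* - \alpha_k) \ge \rho_k$ for each $k$.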
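So fix a $k$. Let $\mu$ be the number of indices $l$ with $v(\alpha_l - \alpha_k) < \rho_k$. We must show that the number of indices $l$ with $v(\alpha_l^* - \alpha_k) < \rho_k$ is also $\mu$. Consider the Newton polygon $\mathrm{NP}$ of $f(X + \alpha_k) = X^n + a_1X^{n-1} + \cdots + a_n$. The slopes of $\mathrm{NP}$ are $v(\alpha_1 - \alpha_k), \dots, v(\alpha_n - \alpha_k)$, in increasing order. So $\mathrm{NP}(i) - \mathrm{NP}(i-1) < \rho_k$ for $i \le \mu$, and $\mathrm{NP}(i) - \mathrm{NP}(i-1) \ge \rho_k$ for $i > \mu$. Let $\ell$ be the “line through the point $p = (\mu, v(a_\mu))$ with slope $\rho_k$”, i.e. the map $\{0, \dots, n\} \to \Gamma$ given by $\ell(i) = (i - \mu)\rho_k + v(a_\mu)$. Then $\mathrm{NP}(i) > \ell(i)$ for $i < \mu$, $\mathrm{NP}(\mu) = \ell(\mu)$, and $\mathrm{NP}(i) \ge \ell(i)$ for $i > \mu$ (see figure).
[Figure: the Newton polygon $\mathrm{NP}$ and the line $\ell$ through the points $p = (\mu, v(a_\mu))$ and $q = (n, \ell(n))$.]
If we can show the same for the Newton polygon $\mathrm{NP}^*$ of $f^*(X + \alpha_k)$, we are done. Consider to this end the point $q = (n, \ell(n))$ on $\ell$ and compute $\ell(n)$:
$$\begin{aligned} \ell(n) &= (n - \mu)\rho_k + v(a_\mu) = \sum_{l \,:\, v(\alpha_l - \alpha_k) \ge \rho_k} \rho_k + \sum_{l \,:\, v(\alpha_l - \alpha_k) < \rho_k} v(\alpha_l - \alpha_k) \\ &= \sum_{l=1}^{n} \min\{\rho_k, v(\alpha_l - \alpha_k)\} = \Phi_k(\rho_k) = v(f - f^*). \end{aligned}$$
Since $\alpha_k$ is integral, $v(f(X + \alpha_k) - f^*(X + \alpha_k)) = v(f - f^*)$. It follows that $\mathrm{NP}^*(i) = \mathrm{NP}(i)$ for $i \le \mu$, and $\mathrm{NP}^*(i) \ge \ell(i)$ for $i > \mu$. This finishes the proof.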
Heuristically, Theorem 2 says that if a root $\alpha$ of $f$ is far away from the other roots, then an error on the coefficients of $f$ causes an error on $\alpha$ of equal or smaller magnitude; however, the proximity of other roots makes $\alpha$ more sensitive to errors on the coefficients. Let us note a consequence of Theorem 2 illustrating this. Fix a $k$, and let $\mu$ be the root multiplicity of $\alpha_k$ in $f$ modulo the valuation ideal. This means that $v(\alpha_k - \alpha_l)$ is 0 for all but $\mu$ values of $l$. Hence $\Phi_k(v(\alpha_k - \alpha_k^*)) = \sum_{l=1}^{n} \min\{v(\alpha_k - \alpha_k^*), v(\alpha_k - \alpha_l)\} \le \mu \cdot v(\alpha_k - \alpha_k^*)$ and thus
$$v(\alpha_k - \alpha_k^*) \ge v(f - f^*)/\mu. \qquad (3)$$
In particular, $v(\alpha_k - \alpha_k^*) \ge v(f - f^*)/n$ holds for all $k$. In light of (3), we might say that the root $\alpha_k$, as a function of $f$’s coefficients, satisfies a Lipschitz condition of order $1/\mu$.
We conclude the section with a typical example where the bound given by Theorem 2 is best possible.
Example. Consider again the polynomial $f = X(X-2)(X-4)$ over the field of dyadic numbers $\mathbb{Q}_2$. The roots $\alpha_1 = 0$ and $\alpha_3 = 4$ have the same error function $\Phi_1 = \Phi_3 : \gamma \mapsto \gamma + \min\{\gamma, 2\} + \min\{\gamma, 1\}$. The root $\alpha_2 = 2$ has error function $\Phi_2 : \gamma \mapsto \gamma + 2 \cdot \min\{\gamma, 1\}$. They are shown in Figures 1 and 2.
[Figure 1: graph of $\Phi_1 = \Phi_3$. Figure 2: graph of $\Phi_2$. Figure 3: Newton polygons $\mathrm{NP}$ and $\mathrm{NP}^*$.]
Now put $f^* = f + 2^\nu$ with some $\nu \ge 0$. By Theorem 2, we may write $f^* = (X - \alpha_1^*)(X - \alpha_2^*)(X - \alpha_3^*)$ such that $v(\alpha_k - \alpha_k^*) \ge \Phi_k^{-1}(\nu)$ for $k = 1, 2, 3$. If $\alpha^*$ is a root of $f^*$ maximally close to $\alpha_1 = 0$, then $v(\alpha^*)$ is the maximal slope of the Newton polygon $\mathrm{NP}^*$ of $f^*$. Figure 3 shows the Newton polygon $\mathrm{NP}$ of $f$ (solid line) and $\mathrm{NP}^*$ for some values of $\nu$ (dotted lines). It is seen that $v(\alpha^*)$ equals $\Phi_1^{-1}(\nu)$, and hence $v(\alpha_1^*) = \Phi_1^{-1}(\nu)$. Similarly, one sees $v(\alpha_k - \alpha_k^*) = \Phi_k^{-1}(\nu)$ for $k = 2, 3$. So Theorem 2 gives in fact an optimal bound.
Finally note that, for $\nu > 5$, each root $\alpha_k^*$ of $f^*$ is closer to $\alpha_k$ than to either of the two other roots of $f$. This agrees with Theorem 1 and the fact that $f$ has separant $\mathcal{S} = 5$.
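The optimality claim of this example can be checked numerically. The sketch below is an illustration under two assumptions: the value group is rational, and the distances $v(\alpha - \alpha_l)$ are passed in by hand (with `inf` for the root itself). It inverts the error function piece by piece and compares the result with the largest slope of the Newton polygon of $f^*(X + \alpha_k)$; `phi_inverse` and `max_slope` are hypothetical names.

```python
from fractions import Fraction
from math import inf


def phi_inverse(y, dists):
    """Solve sum(min(x, d) for d in dists) == y for x (dists must contain inf)."""
    acc = 0                                   # sum of the distances already below x
    ds = sorted(dists)
    for idx, d in enumerate(ds):
        x = Fraction(y - acc, len(ds) - idx)  # candidate on the current linear piece
        if x <= d:
            return x
        acc += d
    raise ValueError("dists must contain inf")


def max_slope(values):
    """Largest slope of the Newton polygon for values v(a_0), ..., v(a_n).

    The last edge ends at (n, v(a_n)), so its slope is the maximum of
    (v(a_n) - v(a_i)) / (n - i); v(a_n) must be finite.
    """
    n = len(values) - 1
    return max(Fraction(values[n] - values[i], n - i)
               for i in range(n) if values[i] != inf)


# Root alpha_1 = 0: f* = X^3 - 6X^2 + 8X + 2^nu has coefficient values 0, 1, 3, nu.
for nu in range(12):
    assert max_slope([0, 1, 3, nu]) == phi_inverse(nu, [inf, 1, 2])
```

The loop confirms, for $\nu = 0, \dots, 11$, that the root of $f^*$ closest to $0$ has value exactly $\Phi_1^{-1}(\nu)$.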
5 The bipartitionant and the induced factorisation
Consider a monic polynomial $f$ of degree $n > 1$ with coefficients in an algebraically closed, valued field $(K, v)$ and write $f = \prod_{k=1}^{n}(X - \alpha_k)$. Let $I$ and $J$ be disjoint, non-empty sets with union $\{1, 2, \dots, n\}$ and put $g = \prod_{i \in I}(X - \alpha_i)$ and $h = \prod_{j \in J}(X - \alpha_j)$ so that $f = gh$. Define the bipartitionant of the polynomials $g$ and $h$ as
$$\mathcal{B} := \max\{\Phi_i(v(\alpha_i - \alpha_j)) \mid i \in I,\ j \in J\}$$
where $\Phi_i$ is the error function of the root $\alpha_i$ of $f$. Clearly, $\mathcal{B} < \infty$ iff $g$ and $h$ are relatively prime. Equation (2) implies
$$\mathcal{B} = \max\{\Phi_j(v(\alpha_i - \alpha_j)) \mid i \in I,\ j \in J\},$$
showing that the definition is symmetric in $g$ and $h$. The crucial property of the bipartitionant is this:
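Lemma 3. Suppose the coefficients of $f$ are integral. Let $f^*$ be another monic polynomial of degree $n$ with integral coefficients in $K$, and assume $v(f - f^*) > \mathcal{B}$. Then we may write $f^* = \prod_{k=1}^{n}(X - \alpha_k^*)$ such that $v(\alpha_i - \alpha_i^*),\ v(\alpha_j - \alpha_j^*) > v(\alpha_i - \alpha_j)$ and thereby $v(\alpha_i - \alpha_j) = v(\alpha_i^* - \alpha_j^*)$ for all $i \in I$ and all $j \in J$.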
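Proof. Write $f^* = \prod_{k=1}^{n}(X - \alpha_k^*)$ as in Theorem 2. Then $v(\alpha_i - \alpha_i^*) \ge \Phi_i^{-1}(v(f - f^*)) > \Phi_i^{-1}(\mathcal{B}) \ge v(\alpha_i - \alpha_j)$ and $v(\alpha_j - \alpha_j^*) \ge \Phi_j^{-1}(v(f - f^*)) > \Phi_j^{-1}(\mathcal{B}) \ge v(\alpha_i - \alpha_j)$ for all $i \in I$ and $j \in J$. The strong triangle inequality gives $v(\alpha_i - \alpha_j) = v(\alpha_i^* - \alpha_j^*)$.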
So in the situation of Lemma 3, the roots of $f^*$ may be “bipartitioned” into two sets $\{\alpha_i^* \mid i \in I\}$ and $\{\alpha_j^* \mid j \in J\}$. This bipartitioning only depends on the factorisation $f = gh$ and not on the representation $f^* = \prod_{k=1}^{n}(X - \alpha_k^*)$ from Theorem 2. We say that the factorisation $f^* = g^*h^*$ where $g^* := \prod_{i \in I}(X - \alpha_i^*)$ and $h^* := \prod_{j \in J}(X - \alpha_j^*)$ is induced by the factorisation $f = gh$.
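How does one compute $\mathcal{B}$? If $i_0 \in I$ and $j_0 \in J$ are such that $\mathcal{B} = \Phi_{i_0}(v(\alpha_{i_0} - \alpha_{j_0}))$, then
$$v(\alpha_{i_0} - \alpha_{j_0}) = \max\{v(\alpha_i - \alpha_{j_0}) \mid i \in I\} = \max\{v(\alpha_{i_0} - \alpha_j) \mid j \in J\} \qquad (4)$$
since the $\Phi$’s are strictly increasing. If, in turn, $i_0 \in I$ and $j_0 \in J$ satisfy (4), then
$$\begin{aligned} \Phi_{i_0}(v(\alpha_{i_0} - \alpha_{j_0})) &= \sum_{k=1}^{n} \min\{v(\alpha_{i_0} - \alpha_{j_0}), v(\alpha_{i_0} - \alpha_k)\} \\ &= \sum_{i \in I} \min\{v(\alpha_{i_0} - \alpha_{j_0}), v(\alpha_{i_0} - \alpha_i)\} + \sum_{j \in J} \min\{v(\alpha_{i_0} - \alpha_{j_0}), v(\alpha_{i_0} - \alpha_j)\} \\ &= \sum_{i \in I} v(\alpha_i - \alpha_{j_0}) + \sum_{j \in J} v(\alpha_{i_0} - \alpha_j) \\ &= v(g(\alpha_{j_0})) + v(h(\alpha_{i_0})) \end{aligned}$$
where the third equality requires (4), the strong triangle inequality, and some consideration. Now conclude
$$\mathcal{B} = \max\{v(g(\alpha_{j_0})) + v(h(\alpha_{i_0})) \mid i_0 \in I \text{ and } j_0 \in J \text{ satisfy (4)}\}. \qquad (5)$$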
Since the bipartitionant replaces twice the value of the resultant $\mathrm{Res}(g, h) = \prod_{i,j}(\alpha_i - \alpha_j)$ in our Hensel’s lemma (Theorem 8), it is of interest to compare these two invariants, and from (5) follows immediately
$$\mathcal{B} \le \sum_{j \in J} v(g(\alpha_j)) + \sum_{i \in I} v(h(\alpha_i)) = 2v(\mathrm{Res}(g, h))$$
when $f$ has integral coefficients.
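As a sanity check of the definition and of formula (5), both can be evaluated for factorisations with rational roots. The sketch below is only an illustration under that assumption (roots in $\mathbb{Q}$, dyadic valuation by default); all function names are hypothetical.

```python
from fractions import Fraction
from math import inf


def vp(x, p=2):
    """p-adic value of a rational number x (inf for x == 0)."""
    x = Fraction(x)
    if x == 0:
        return inf
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v


def bipartitionant(g_roots, h_roots, p=2):
    """B = max Phi_i(v(a_i - a_j)) over roots a_i of g and a_j of h (the definition)."""
    roots = g_roots + h_roots
    return max(sum(min(vp(a - b, p), vp(a - r, p)) for r in roots)
               for a in g_roots for b in h_roots)


def bipartitionant_via_5(g_roots, h_roots, p=2):
    """B computed from formula (5): max of v(g(a_j0)) + v(h(a_i0)) over pairs with (4)."""
    best = -inf
    for a in g_roots:
        for b in h_roots:
            d = vp(a - b, p)
            if d == max(vp(x - b, p) for x in g_roots) == max(vp(a - y, p) for y in h_roots):
                v_g = sum(vp(b - x, p) for x in g_roots)   # v(g(b))
                v_h = sum(vp(a - y, p) for y in h_roots)   # v(h(a))
                best = max(best, v_g + v_h)
    return best


def resultant_value(g_roots, h_roots, p=2):
    """v(Res(g, h)) as the sum of v(a_i - a_j)."""
    return sum(vp(a - b, p) for a in g_roots for b in h_roots)
```

For $g = X^2$ and $h = (X-2)(X-4)$ over $\mathbb{Q}_2$ both functions give 7, while $2v(\mathrm{Res}(g, h)) = 12$; for $g = X^8$ and $h = (X+2)^8$ they give 16 against $2v(\mathrm{Res}(g, h)) = 128$.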
6 Continuity of factors
There is a remarkable analogue to the continuity of roots that could be called
continuity of factors. In words it says, if there is a factorisation f=gh such
that the roots of gare far away from the roots of h(but possibly close to each
other), then an error on the coefficients of fcauses an error on the coefficients of
gwhich is in general smaller than the error caused on the roots of gindividually.
It should be noted that the main part of Hensel’s lemma is proved in the next
section without the results of this section.
Consider two monic polynomials $f, f^*$ of common degree $n > 1$ with integral coefficients in an algebraically closed, valued field $(K, v)$, and write $f = \prod_{k=1}^{n}(X - \alpha_k)$. Let $I$ and $J$ be disjoint, non-empty sets with union $\{1, 2, \dots, n\}$ and put $g = \prod_{i \in I}(X - \alpha_i)$ and $h = \prod_{j \in J}(X - \alpha_j)$. Let us call $g$ an isolated factor of $f$ if
$$\forall\, i, i' \in I\ \ \forall\, j \in J:\quad v(\alpha_i - \alpha_{i'}) > v(\alpha_i - \alpha_j)$$
i.e. if there is a ball in $K$ containing all roots of $g$ and no roots of $h$.
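Lemma 4 (continuity of isolated factors). Assume $v(f - f^*) > \mathcal{B}$ where $\mathcal{B}$ is the bipartitionant of $g$ and $h$, and consider the induced factorisation $f^* = g^*h^*$. If $g$ is an isolated factor of $f$, then $v(g - g^*) \ge v(f - f^*) - \mathcal{B} + \max\{v(\alpha_i - \alpha_j) \mid i \in I,\ j \in J\}$.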
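Proof. The idea is to use a general form of Newton approximation to come from $g$ to $g^*$. We may assume $g(0) = 0$ by a change of variable. Put $\nu = \deg(g)$ and $\mu = \deg(h)$. We may then further assume $g = \prod_{i=1}^{\nu}(X - \alpha_i)$, $h = \prod_{j=\nu+1}^{n}(X - \alpha_j)$, and
$$\infty = v(\alpha_1) \ge v(\alpha_2) \ge \cdots \ge v(\alpha_\nu) > v(\alpha_{\nu+1}) \ge \cdots \ge v(\alpha_n)$$
since $g$ is isolated. Thus, $u := \max\{v(\alpha_i - \alpha_j) \mid i \in I,\ j \in J\}$ equals $v(\alpha_{\nu+1})$, and $\mathcal{B}$ equals $\nu \cdot u + v(h(0))$ by (5).
Define three polynomial sequences $(g_m)_{m \in \mathbb{N}}$, $(h_m)_{m \in \mathbb{N}}$, and $(r_m)_{m \in \mathbb{N}}$ recursively like this: Put $g_1 := g$. Given $g_m$, define $h_m$ and $r_m$ such that $f^* = g_m h_m + r_m$ and $\deg(r_m) < \nu$. Given $g_m$, $h_m$, and $r_m$, define $g_{m+1} := g_m + r_m/h_m(0)$.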
The difficulty of the proof lies in finding the right thing to prove. For a fixed $m$ and for $i = 1, \dots, \nu$, let $a_i$ and $c_i$ be the values of the coefficients of the terms of degree $\nu - i$ in $g_m$ and $r_m$, respectively. We claim:
(A) $a_i \ge iu + \Delta$ where $\Delta := \min\{v(\alpha_\nu) - u,\ v(f - f^*) - \mathcal{B}\}$.
(B) The Newton polygon of $h_m$ equals the Newton polygon $\mathrm{NP}$ of $h$.
(C) $c_i \ge v(h(0)) + iu + k_i\Delta$ where $k_i := \max\{k \in \mathbb{N} \mid k < (m + i + \nu - 1)/\nu\}$.
The claims are shown by induction on $m$. Assume $m = 1$ for the induction start. All roots of $g_1 = g$ have value at least $v(\alpha_\nu)$, and hence
$$a_i \ge i \cdot v(\alpha_\nu) \ge i(u + \Delta) \ge iu + \Delta.$$
This shows (A). Write $f^* - f = gh' + r'$ with $\deg(r') < \nu$. Then $f^* = g(h + h') + r'$ and thus $h_1 = h + h'$ and $r_1 = r'$. Also, $v(h'), v(r') \ge v(f - f^*)$. Adding $h'$ to $h$ does not change the Newton polygon since $v(h') > \mathcal{B} \ge v(h(0)) = \mathrm{NP}(\mu)$. This shows (B). Finally,
$$c_i \ge v(r') \ge v(f - f^*) \ge \mathcal{B} + \Delta \ge v(h(0)) + iu + k_i\Delta$$
since $k_i = 1$ for $m = 1$, showing (C).
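For the induction step, assume (A), (B), and (C) hold for some $m$, and let (A’), (B’), and (C’) be the statements corresponding to $m + 1$. (A’) follows immediately from (A) and (C). Note $f^* = g_{m+1}h_m - (h_m/h_m(0) - 1)r_m$ and hence $h_{m+1} = h_m + h'$ and $r_{m+1} = r'$ if we write
$$-(h_m/h_m(0) - 1)r_m = g_{m+1}h' + r' \qquad (6)$$
with $\deg(r') < \nu$. Let $d_i$ be the value of the coefficient of the term of degree $n - i$ in the left hand side of (6). Using (A), (B), and (C) gives
$$\begin{aligned} d_1 &\ge \mathrm{NP}(0) + u + k_1\Delta \\ d_2 &\ge \mathrm{NP}(1) + u + k_1\Delta \\ &\ \,\vdots \\ d_\mu &\ge \mathrm{NP}(\mu - 1) + u + k_1\Delta \\ d_{\mu+1} &\ge \mathrm{NP}(\mu - 1) + 2u + k_2\Delta \\ &\ \,\vdots \\ d_{n-1} &\ge \mathrm{NP}(\mu - 1) + \nu u + k_\nu\Delta \\ \infty = d_n &\ge \mathrm{NP}(\mu - 1) + (\nu + 1)u + k_{\nu+1}\Delta \end{aligned}$$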
The algorithm of polynomial division resulting in the expression (6) consists of a number of steps in each of which a monomial times $g_{m+1}$ is subtracted from $-(h_m/h_m(0) - 1)r_m$. The key observation is that, in each step, the values of the coefficients of the remainder satisfy the same inequalities as the $d_i$. Let $b'_i$ be the value of the coefficient of the term of degree $\mu - i$ in $h'$. Then
$$\begin{aligned} b'_1 &\ge \mathrm{NP}(0) + u + k_1\Delta > \mathrm{NP}(0) + u \ge \mathrm{NP}(1) \\ &\ \,\vdots \\ b'_\mu &\ge \mathrm{NP}(\mu - 1) + u + k_1\Delta > \mathrm{NP}(\mu - 1) + u \ge \mathrm{NP}(\mu) \end{aligned}$$
Hence $h_{m+1} = h_m + h'$ has $\mathrm{NP}$ as its Newton polygon, showing (B’). Let $c'_i$ be the value of the coefficient of the term of degree $\nu - i$ in $r'$. Then
$$\begin{aligned} c'_1 &\ge \mathrm{NP}(\mu - 1) + 2u + k_2\Delta = v(h(0)) + u + k_2\Delta \\ &\ \,\vdots \\ c'_\nu &\ge \mathrm{NP}(\mu - 1) + (\nu + 1)u + k_{\nu+1}\Delta = v(h(0)) + \nu u + k_{\nu+1}\Delta \end{aligned}$$
This shows (C’) and finishes the induction step.
By (C), $v(r_m) \to \infty$ and hence $g_m h_m \to f^*$. By the continuity of roots, the roots of $g_m h_m$ converge to the roots of $f^*$ (in a multiplicity-respecting way). By assumption, the roots of $g$ have values $> u$, whereas the roots of $h$ have values $\le u$. Lemma 3 then gives that the roots of $g^*$ have values $> u$, whereas the roots of $h^*$ have values $\le u$. By (A), the roots of $g_m$ have values $> u$. It follows that the roots of $g_m$ converge to the roots of $g^*$, and thereby the coefficients converge too: $g_m \to g^*$. Finally, $g^* = g + \sum_{m=1}^{\infty} r_m/h_m(0)$ and therefore by (C),
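$$\begin{aligned} v(g - g^*) &\ge \min\{v(r_m) - v(h(0)) \mid m \in \mathbb{N}\} \\ &\ge u + \Delta \\ &= v(f - f^*) - \mathcal{B} + \max\{v(\alpha_i - \alpha_j) \mid i \in I,\ j \in J\}. \end{aligned}$$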
Let us show that Lemma 4 coincides with the Hensel-Rella criterion when $g$ is linear. Given is a polynomial $F$ with an approximate root $\xi_0$. Put $g = X - \xi_0$ and $h = (F - F(\xi_0))/(X - \xi_0)$. Then the left hand side of (1) is the value of $F(\xi_0) = F - gh$, and it can be seen that the right hand side of (1) equals the bipartitionant of $g$ and $h$. Hence, the $g_m$ converge to a polynomial $g^* = X - \xi$ dividing $F$. In the proof of Lemma 4, we could as well have defined $g_{m+1}$ as $g_m + r_m/h_m(\xi_m)$ where $\xi_m$ is any root of $g_m$ (or any other element sufficiently close to 0). With this definition and with linear $g$, the approximation process becomes identical with usual Newton approximation.
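To see the Hensel-Rella criterion and Newton approximation at work, here is a minimal sketch for polynomials with rational coefficients. It reads criterion (1) as $\rho > \max\{(i\rho' - \rho^{(i)})/(i-1)\}$, which is an assumption about the intended formula, and it tracks the 2-adic value of $F(\xi_m)$ along the Newton iterates. The helper names are hypothetical, and $F'(\xi_m) \ne 0$ is assumed throughout.

```python
from fractions import Fraction
from math import inf


def vp(x, p=2):
    """p-adic value of a rational number x (inf for x == 0)."""
    x = Fraction(x)
    if x == 0:
        return inf
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v


def taylor_coefficients(coeffs, xi):
    """Return [F(xi), F'(xi), F''(xi)/2!, ...]; coeffs are listed highest degree first."""
    c = [Fraction(a) for a in coeffs]
    out = []
    while c:
        # Horner step: divide by (X - xi); the remainder is the next Taylor coefficient.
        for k in range(1, len(c)):
            c[k] += xi * c[k - 1]
        out.append(c.pop())
    return out


def hensel_rella_holds(coeffs, xi, p=2):
    """Check rho > max((i*rho' - rho^(i)) / (i - 1)) for i = 2, ..., deg F."""
    rho = [vp(t, p) for t in taylor_coefficients(coeffs, Fraction(xi))]
    bound = max(Fraction(i * rho[1] - rho[i], i - 1)
                for i in range(2, len(rho)) if rho[i] != inf)
    return rho[0] > bound


def newton_values(coeffs, xi, steps, p=2):
    """Values v(F(xi_m)) along the Newton iteration xi_{m+1} = xi_m - F(xi_m)/F'(xi_m)."""
    xi = Fraction(xi)
    out = []
    for _ in range(steps):
        t = taylor_coefficients(coeffs, xi)
        out.append(vp(t[0], p))
        if t[0] == 0:
            break
        xi -= t[0] / t[1]
    return out
```

For $F = X^3 - 6X^2 + 8X + 2^6$ and $\xi_0 = 0$ the criterion holds and the values of $F(\xi_m)$ increase along the iteration; for $2^5$ in place of $2^6$ the criterion fails, in line with the example of Section 3.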
Theorem 5 (continuity of factors). Let $f$ and $f^*$ be monic polynomials of common degree $n > 1$ with integral coefficients in an algebraically closed, valued field $(K, v)$. Consider a monic factorisation $f = gh$, and let $\mathcal{B}$ be the bipartitionant of $g$ and $h$. Assume $v(f - f^*) > \mathcal{B}$, and let $f^* = g^*h^*$ be the induced factorisation. Then $v(g - g^*),\ v(h - h^*) \ge v(f - f^*) - \mathcal{B}$.
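Proof. Write $g = g_1 \cdots g_r$ such that each $g_l$ is a maximal (with respect to divisibility) monic factor of $g$ which is an isolated factor of $f$. The bipartitionant of $g_l$ and $\tilde{g}_l := f/g_l$ is
$$\begin{aligned} \mathcal{B}_l &:= \max\{\Phi_i(v(\alpha_i - \alpha_j)) \mid g_l(\alpha_i) = \tilde{g}_l(\alpha_j) = 0\} \\ &= \max\{\Phi_i(v(\alpha_i - \alpha_j)) \mid g_l(\alpha_i) = 0,\ j \in J\} \end{aligned}$$
(the last equality follows from the maximality of $g_l$), implying
$$\mathcal{B} = \max\{\Phi_i(v(\alpha_i - \alpha_j)) \mid i \in I,\ j \in J\} = \max\{\mathcal{B}_1, \dots, \mathcal{B}_r\}.$$
Lemma 4 gives
$$\begin{aligned} v(g - g^*) &\ge \min\{v(f - f^*) - \mathcal{B}_l + \Phi_i^{-1}(\mathcal{B}_l) \mid l = 1, \dots, r,\ g_l(\alpha_i) = 0\} \\ &\ge \min\{v(f - f^*) - \mathcal{B}_l \mid l = 1, \dots, r\} \\ &= v(f - f^*) - \mathcal{B}. \end{aligned}$$
The inequality for $v(h - h^*)$ can be proved the same way, but also follows directly by dividing $f^*$ by $g^*$.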
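Example. Consider the polynomial $f = X^2(X-2)(X-4) = X^4 - 6X^3 + 8X^2$ over the field of dyadic numbers $\mathbb{Q}_2$. The error function of the double root $\alpha_1 = \alpha_2 = 0$ is $\Phi(\gamma) = 2\gamma + \min\{\gamma, 1\} + \min\{\gamma, 2\}$. The bipartitionant of the factors $g = X^2$ and $h = (X-2)(X-4)$ is $\mathcal{B} = v(g(4)) + v(h(0)) = 7$. Let $f^* = f + 2^\nu$ with $\nu > 7$ and consider the induced factorisation $f^* = g^*h^*$. By Lemma 4, $v(g - g^*) \ge \nu - 5$. The figure shows the inverse error function $\Phi^{-1}$ and the line $\nu \mapsto \nu - 5$ (dotted):
[Figure: graph of the inverse error function $\Phi^{-1}$ together with the dotted line $\nu \mapsto \nu - 5$.]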
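Let us compute $v(g - g^*)$ precisely. The Newton polygon of $f^*$ shows that the roots $\alpha_1^*, \dots, \alpha_4^*$ of $f^*$ have values $v(\alpha_1^*) = v(\alpha_2^*) = (\nu - 3)/2$, $v(\alpha_3^*) = 1$, and $v(\alpha_4^*) = 2$. We have
$$g^* = (X - \alpha_1^*)(X - \alpha_2^*) = X^2 - (\alpha_1^* + \alpha_2^*)X + \alpha_1^*\alpha_2^*.$$
From the above follows $v(\alpha_1^*\alpha_2^*) = \nu - 3$. It is more tricky to compute $v(\alpha_1^* + \alpha_2^*)$. To this end, consider the polynomial $f^*(X - \alpha_1^*)$. It has roots $2\alpha_1^*$, $\alpha_1^* + \alpha_2^*$, $\alpha_1^* + \alpha_3^*$, $\alpha_1^* + \alpha_4^*$ and constant term $f^*(-\alpha_1^*) = f^*(\alpha_1^*) + 12(\alpha_1^*)^3 = 12(\alpha_1^*)^3$. Thus,
$$\begin{aligned} v(\alpha_1^* + \alpha_2^*) &= v(f^*(-\alpha_1^*)) - v(2\alpha_1^*) - v(\alpha_1^* + \alpha_3^*) - v(\alpha_1^* + \alpha_4^*) \\ &= (3\nu - 5)/2 - (\nu - 1)/2 - 1 - 2 \\ &= \nu - 5. \end{aligned}$$
Conclude $v(g - g^*) = \min\{\nu - 5, \nu - 3\} = \nu - 5$.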
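The value $\nu - 5$ can also be observed by running the recursion $g_{m+1} := g_m + r_m/h_m(0)$ from the proof of Lemma 4 in exact rational arithmetic. The following is a sketch for $\nu = 8$ only, with polynomials stored as coefficient lists (highest degree first); the helper names are hypothetical, and a handful of iterations is used purely for illustration since the exact limit $g^*$ is not rational.

```python
from fractions import Fraction
from math import inf


def vp(x, p=2):
    """p-adic value of a rational number x (inf for x == 0)."""
    x = Fraction(x)
    if x == 0:
        return inf
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v


def divmod_monic(num, den):
    """Divide num by the monic polynomial den; return (quotient, remainder)."""
    num = [Fraction(a) for a in num]
    steps = len(num) - len(den) + 1
    quot = []
    for k in range(steps):
        c = num[k]
        quot.append(c)
        for t, d in enumerate(den):
            num[k + t] -= c * d
    return quot, num[steps:]


def lemma4_iterates(f_star, g, steps):
    """Return [g_1, ..., g_{steps+1}] with g_{m+1} = g_m + r_m / h_m(0)."""
    g = [Fraction(a) for a in g]
    out = [g]
    for _ in range(steps):
        h, r = divmod_monic(f_star, g)           # f* = g_m * h_m + r_m, deg r_m < deg g
        pad = [Fraction(0)] * (len(g) - len(r))  # align r_m with the coefficients of g_m
        g = [a + b / h[-1] for a, b in zip(g, pad + r)]   # h[-1] is h_m(0)
        out.append(g)
    return out


nu = 8
f_star = [1, -6, 8, 0, 2 ** nu]                  # X^4 - 6X^3 + 8X^2 + 2^nu
iterates = lemma4_iterates(f_star, [1, 0, 0], 5)
# value of g_m - g, i.e. the minimum of the values of the coefficient differences
errors = [min(vp(a - b) for a, b in zip(gm, iterates[0])) for gm in iterates[1:]]
```

In this run the second iterate is $X^2 + 32$, and from the third iterate on the value of $g_m - g$ stays at $3 = \nu - 5$, the value computed above.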
The moral of the story is that the bound on the coefficients of $g^*$ given by Lemma 4 is best possible (contrary to that of Theorem 5) and better than the bound on the roots of $g^*$ given by Theorem 2.
One may wonder if there is also “continuity of factors” when $v(f - f^*) \le \mathcal{B}$, i.e. if there is a bound on the error on the coefficients of $g^*$ better than the bound on the error on the roots of $g^*$. That is not likely to be the case. For when $v(f - f^*) \le \mathcal{B}$, it is no longer possible to bipartition the roots of $f^*$ as in Lemma 3. In other words, the factorisation $f = gh$ no longer gives rise to a natural factorisation $f^* = g^*h^*$. This view is supported by the observation that, in the limit $v(f - f^*) = \mathcal{B}$, the bound on the error on $g^*$ in the example above coincides with the bound on the error on the roots of $g^*$.
7 Krasner’s lemma
The well-known Krasner’s lemma (see Corollaire 1, page 190 of Ribenboim (1968),
for instance) was in fact found by Ostrowski already in 1917. We give here a gen-
eralisation that will be used in the next section.
Theorem 6 (lemma à la Krasner). Consider a monic polynomial $f^* = \prod_{k=1}^{n}(X - \alpha_k^*)$ of degree $n > 1$ with coefficients in a Henselian field $(K, v)$ and roots in the algebraic closure $\tilde{K}$. Let $I$ and $J$ be two disjoint, non-empty sets with union $\{1, \dots, n\}$. Moreover, consider a polynomial $g = \prod_{i \in I}(X - \alpha_i)$ with coefficients and roots in $\tilde{K}$. Assume
$$\forall\, i \in I\ \ \forall\, j \in J:\quad v(\alpha_i - \alpha_i^*) > v(\alpha_i^* - \alpha_j^*). \qquad (7)$$
Then the coefficients of the polynomials $g^* := \prod_{i \in I}(X - \alpha_i^*)$ and $h^* := \prod_{j \in J}(X - \alpha_j^*)$ are contained in the field extension of $K$ generated by the coefficients of $g$.
Proof. Part A. First some preliminary observations. From (7) follows at once that $g^*$ and $h^*$ are relatively prime. Since $f^* = g^*h^*$, the coefficients of $g^*$ generate the same extension of $K$ as the coefficients of $h^*$. We may assume without loss of generality, and will do so, that $g$ has coefficients in $K$. What is left to prove is that $g^*$ has coefficients in $K$.
Now let $K^{\mathrm{sep}}$ be the separable algebraic closure of $K$. Since $K^{\mathrm{sep}}$ is a separably closed field, every irreducible polynomial over $K^{\mathrm{sep}}$ has only one (possibly multiple) root. Since $g^*h^*$ has coefficients in $K^{\mathrm{sep}}$, and $g^*$ and $h^*$ are relatively prime, it follows that $g^*$ and $h^*$ have coefficients in $K^{\mathrm{sep}}$.
We show in part B that every $K$-automorphism $\sigma$ on $\tilde{K}$ permutes the roots of $g^*$. Hence, every such $\sigma$ fixes the coefficients of $g^*$. The coefficients of $g^*$ are therefore purely inseparable over $K$.
Since the coefficients of $g^*$ are both separable and purely inseparable over $K$, they do in fact belong to $K$.
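Part B. Let $\sigma$ be a $K$-automorphism on $\tilde{K}$. Consider the sets $A = \{\alpha_i \mid i \in I\}$, $A^* = \{\alpha_i^* \mid i \in I\}$, and $A^{**} = \{\alpha_j^* \mid j \in J\}$. Note that $A \cup A^*$ and $A^{**}$ are disjoint by (7). Since $g$ and $f^*$ have coefficients in $K$, $\sigma$ is a “multiplicity-preserving” permutation on both $A$ and $A^* \cup A^{**}$. Since $K$ is Henselian, $\sigma$ is isometric. We show that (7) implies that $\sigma$ permutes $A^*$ and $A^{**}$ individually. This is really a lemma on finite ultra-metric spaces.
For $\alpha \in A$, let $B(\alpha)$ be the maximal ball in the finite ultra-metric space $A \cup A^* \cup A^{**}$ containing $\alpha$ and being contained in $A \cup A^*$. Then (7) implies
$$\forall\, i \in I:\quad \alpha_i \in B(\alpha) \Leftrightarrow \alpha_i^* \in B(\alpha). \qquad (8)$$
Every $\alpha^* \in A^*$ is thereby contained in some $B(\alpha)$, so we are done if we can show $\sigma(B(\alpha)) \subseteq A \cup A^*$.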
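For any $\alpha_i \in A \cap \sigma(B(\alpha))$, the balls $\sigma(B(\alpha))$ and $B(\alpha_i)$ have non-empty intersection (both contain $\alpha_i$), hence one is contained in the other. If there is an $\alpha_i \in A \cap \sigma(B(\alpha))$ such that $\sigma(B(\alpha)) \subseteq B(\alpha_i)$, then $\sigma(B(\alpha)) \subseteq A \cup A^*$ and we are done. So assume from now on $\sigma(B(\alpha)) \supset B(\alpha_i)$ for all $\alpha_i \in A \cap \sigma(B(\alpha))$.
For a subset $X$ of $A \cup A^* \cup A^{**}$, let $\#X$ denote $X$’s cardinality “counted with multiplicity”, i.e.
$$\#X := |\{i \in I \mid \alpha_i \in X\}| + |\{k \in I \cup J \mid \alpha_k^* \in X\}|.$$
We then have
$$\#B(\alpha) = 2 \cdot |\{i \in I \mid \alpha_i \in B(\alpha)\}|$$
by (8). Since $\sigma$ preserves multiplicity and permutes $A$,
$$\#B(\alpha) = \#\sigma(B(\alpha)) \quad \text{and} \quad |\{i \in I \mid \alpha_i \in B(\alpha)\}| = |\{i \in I \mid \alpha_i \in \sigma(B(\alpha))\}|$$
hold. For $i \in I$ with $\alpha_i \in \sigma(B(\alpha))$, (8) implies $\alpha_i^* \in B(\alpha_i) \subset \sigma(B(\alpha))$ and hence
$$|\{i \in I \mid \alpha_i^* \in \sigma(B(\alpha))\}| \ge |\{i \in I \mid \alpha_i \in \sigma(B(\alpha))\}|.$$
Putting everything together gives
$$\begin{aligned} \#\sigma(B(\alpha)) &= |\{i \in I \mid \alpha_i \in \sigma(B(\alpha))\}| + |\{k \in I \cup J \mid \alpha_k^* \in \sigma(B(\alpha))\}| \\ &\ge 2 \cdot |\{i \in I \mid \alpha_i \in \sigma(B(\alpha))\}| + |\{j \in J \mid \alpha_j^* \in \sigma(B(\alpha))\}| \\ &= 2 \cdot |\{i \in I \mid \alpha_i \in B(\alpha)\}| + |\{j \in J \mid \alpha_j^* \in \sigma(B(\alpha))\}| \\ &= \#B(\alpha) + |\{j \in J \mid \alpha_j^* \in \sigma(B(\alpha))\}| \\ &= \#\sigma(B(\alpha)) + |\{j \in J \mid \alpha_j^* \in \sigma(B(\alpha))\}|. \end{aligned}$$
Finally, conclude $|\{j \in J \mid \alpha_j^* \in \sigma(B(\alpha))\}| = 0$, i.e. $\sigma(B(\alpha)) \subseteq A \cup A^*$.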
Theorem 6 has an immediate corollary which itself reduces to the usual Krasner’s lemma when the element $a$ is separable over $K$:
Corollary 7. Consider a Henselian field $K$ and let $a$ and $b$ be elements in the algebraic closure $\tilde{K}$. Assume $b$ is closer to $a$ than to any of $a$’s conjugates. Then $K(b)$ contains the coefficients of the polynomial $(X - a)^\mu$ where $\mu$ is the root multiplicity of $a$ in its minimal polynomial over $K$.
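Remark. In the application of Theorem 6 in the proof of Theorem 8 below, we also have a polynomial $h = \prod_{j \in J}(X - \alpha_j)$ satisfying
$$\forall\, i \in I\ \ \forall\, j \in J:\quad v(\alpha_j - \alpha_j^*) > v(\alpha_i^* - \alpha_j^*) \qquad (9)$$
and such that $gh$ has coefficients in $K$. In this situation, part B of the proof of Theorem 6 can be replaced by the following simpler argument: Assume for a contradiction that there are $i \in I$ and $j \in J$ such that $\sigma(\alpha_i^*) = \alpha_j^*$. By symmetry, we may assume $v(\alpha_i - \alpha_i^*) \ge v(\alpha_j - \alpha_j^*)$. Then $\sigma(\alpha_i) = \alpha_{i'}$ for some $i' \in I$. Since $\sigma$ is isometric, $v(\alpha_{i'} - \alpha_j^*) = v(\alpha_i - \alpha_i^*) \ge v(\alpha_j - \alpha_j^*)$. But now $v(\alpha_{i'} - \alpha_j) \ge v(\alpha_j - \alpha_j^*)$, in contradiction with (9).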
8 Hensel’s lemma
We can now state and prove the promised general Hensel’s lemma.
Theorem 8 (monic Hensel’s lemma). Consider two monic polynomials $f$ and $f^*$ of common degree $n > 1$ with integral coefficients in a Henselian field $(K, v)$. Let there be given a factorisation $f = gh$ with monic $g$ and $h$. Assume $v(f - f^*) > \mathcal{B}$ where $\mathcal{B}$ is the bipartitionant of $g$ and $h$. Then there is a factorisation $f^* = g^*h^*$ where $g^*$ and $h^*$ are monic and have integral coefficients, $\deg(g^*) = \deg(g)$, $\deg(h^*) = \deg(h)$, and $v(g - g^*),\ v(h - h^*) \ge v(f - f^*) - \mathcal{B}$.
Proof. Consider the induced factorisation $f^* = g^*h^*$. The factors $g^*$ and $h^*$ have coefficients in $K$ by Lemma 3 and Theorem 6. The bound on $v(g - g^*)$ and $v(h - h^*)$ follows from Theorem 5.
Example. Consider the polynomial $f^* = X^8(X + 2)^8 + 2^\nu$ with $\nu \ge 0$ over the field of dyadic numbers $\mathbb{Q}_2$. The bipartitionant of $g = X^8$ and $h = (X + 2)^8$ is $\mathcal{B} = 16$. By Theorem 8, $f^*$ is reducible for all $\nu > 16$. More precisely, there is in this case a monic factorisation $f^* = g^*h^*$ with $v(g - g^*),\ v(h - h^*) \ge \nu - 16$ (using Lemma 4 instead of Theorem 5 gives in fact $v(g - g^*),\ v(h - h^*) \ge \nu - 15$). It can be shown that $f^*$ is irreducible for $\nu = 0, 1, 3, 4, 5, 7, 8, 9, 11, 12, 13, 15, 16$, implying that the bound $v(f - f^*) > \mathcal{B}$ is best possible. The dyadic value of the resultant of $g$ and $h$ is 64, so the Hensel’s lemma of 1908 gives a factorisation $f^* = g^*h^*$ with $v(g - g^*),\ v(h - h^*) \ge \nu - 64$ for $\nu > 128$.
To make life as easy as possible, we have so far solely studied monic polynomials having integral coefficients. This is indeed the situation in almost all applications of Hensel’s lemma. Also, when a given non-monic polynomial $F$ has an approximate factorisation satisfying the conditions of the non-monic Hensel’s lemma, the reducibility of $F$ follows immediately from the observation that the Newton polygon of $F$ is not a straight line.
Nevertheless, we now turn our attention to the non-monic case. The proof of
the following theorem is entirely analogous to that of the monic Hensel’s lemma,
but the presence of non-monic polynomials forces us to reexamine the proofs of
earlier theorems.
Theorem 9 (Hensel’s lemma, final form). Consider two polynomials $F$ and $F^*$ of common degree $n > 1$ with integral coefficients in a Henselian field $(K, v)$ and with the same leading coefficient $c$. Let there be given a factorisation $F = gH$ where $g$ is monic and has integral coefficients, and $H$ is primitive, i.e. $v(H) = 0$. Assume $v(F - F^*) > \max\{0, \mathcal{B} + v(c)\}$ where $\mathcal{B}$ is the bipartitionant of $g$ and $c^{-1}H$. Then there is a factorisation $F^* = g^*H^*$ where $g^*$ is monic and has integral coefficients, $H^*$ is primitive, $\deg(g^*) = \deg(g)$, $\deg(H^*) = \deg(H)$, and $v(g - g^*),\ v(H - H^*) \ge v(F - F^*) - \max\{0, \mathcal{B} + v(c)\}$.
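Proof. First introduce monic polynomials $f := c^{-1}F$, $f^* := c^{-1}F^*$, and $h := c^{-1}H$. Note $f = gh$, $v(f - f^*) = v(F - F^*) - v(c)$, and thus $v(f - f^*) > \max\{-v(c), \mathcal{B}\}$.
Write $f = \prod_{k=1}^{n}(X - \alpha_k)$ and let $I$ and $J$ be the sets with $g = \prod_{i \in I}(X - \alpha_i)$ and $h = \prod_{j \in J}(X - \alpha_j)$. Put $\rho_i := \Phi_i^{-1}(v(f - f^*))$ for each $i \in I$. Note $\Phi_i(0) = \sum_{l=1}^{n} \min\{0, v(\alpha_i - \alpha_l)\} = -v(c) < v(f - f^*)$ and hence $0 < \rho_i$.
The proof of Theorem 2, word for word, shows that $f$ and $f^*$ have the same number of roots (counted with multiplicity) in the ball $\{x \in \tilde{K} \mid v(x - \alpha_i) \ge \rho_i\}$ for any $i \in I$. It follows that we can write $f^* = \prod_{k=1}^{n}(X - \alpha_k^*)$ such that $v(\alpha_i - \alpha_i^*) \ge \rho_i$ for each $i \in I$. We have $v(\alpha_i - \alpha_j) \le \Phi_i^{-1}(\mathcal{B}) < \rho_i$ for $i \in I$ and $j \in J$, and therefore $v(\alpha_i^* - \alpha_j^*) < \rho_i$ for $i \in I$ and $j \in J$. Conclude $v(\alpha_i - \alpha_i^*) > v(\alpha_i^* - \alpha_j^*)$ for all $i \in I$ and $j \in J$.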
By Theorem 6, $g^* := \prod_{i \in I} (X - \alpha_i^*)$ and
$h^* := \prod_{j \in J} (X - \alpha_j^*)$ have coefficients in $K$. Reexamination
of the proofs of Lemma 4 and Theorem 5 shows
$v(g - g^*) \ge v(f - f^*) - \max\{-v(c), B\}$. Now put $H^* := ch^*$.
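Notice that the resultant of $g$ and $H$ has value
\[
\begin{aligned}
v(\operatorname{Res}(g, H)) &= \deg(g) \cdot v(c) + v(\operatorname{Res}(g, h)) \\
&= \deg(g) \cdot v(c) + \sum_{i \in I,\, j \in J} v(\alpha_i - \alpha_j) \\
&= \sum_{i \in I,\, j \in J} \max\{0, v(\alpha_i - \alpha_j)\}.
\end{aligned}
\]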
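The chain of equalities can be tested numerically in a toy case. The sketch below is ours and rests on stated assumptions: the rationals with the 2-adic valuation stand in for $(K, v)$, the resultant is computed as the determinant of the Sylvester matrix, and the example $g = X - 1$, $H = (2X - 1)(X - 3)$ with $c = 2$ is hypothetical rather than taken from the text. The names `vp` and `sylvester_resultant` are illustrative.

```python
from fractions import Fraction


def vp(x, p):
    """p-adic valuation of a non-zero rational number x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("v(0) is infinite")
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v


def sylvester_resultant(a, b):
    """Resultant of a and b, given as coefficient lists, highest degree first."""
    m, n = len(a) - 1, len(b) - 1
    size = m + n
    zero = Fraction(0)
    rows = []
    for i in range(n):  # n shifted copies of a
        rows.append([zero] * i + [Fraction(x) for x in a] + [zero] * (n - 1 - i))
    for i in range(m):  # m shifted copies of b
        rows.append([zero] * i + [Fraction(x) for x in b] + [zero] * (m - 1 - i))
    # Determinant by exact Gaussian elimination over the rationals.
    det = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if rows[r][col] != 0), None)
        if pivot is None:
            return zero
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            det = -det
        det *= rows[col][col]
        for r in range(col + 1, size):
            factor = rows[r][col] / rows[col][col]
            for k in range(col, size):
                rows[r][k] -= factor * rows[col][k]
    return det


p, c = 2, 2
g = [1, -1]                              # g = X - 1, monic and integral
H = [2, -7, 3]                           # H = (2X - 1)(X - 3), primitive
h = [Fraction(x, c) for x in H]          # h = H / c, monic with roots 1/2 and 3
res_gH = sylvester_resultant(g, H)
res_gh = sylvester_resultant(g, h)

lhs = vp(res_gH, p)
middle = (len(g) - 1) * vp(c, p) + vp(res_gh, p)
rhs = sum(max(0, vp(1 - root, p)) for root in (Fraction(1, 2), Fraction(3)))
print(lhs, middle, rhs)
```

In this example the negative contribution $v(1 - 1/2) = -1$ is exactly cancelled by $\deg(g) \cdot v(c) = 1$, which is the mechanism behind the last equality.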
By (5), the bipartitionant of $g$ and $h$ is
$B = \sum_{i \in I} v(\alpha_i - \alpha_{j_0}) + \sum_{j \in J} v(\alpha_{i_0} - \alpha_j)$
for suitable $i_0 \in I$ and $j_0 \in J$. There follows
$\max\{0, B + v(c)\} \le 2v(\operatorname{Res}(g, H))$.
Hence, Theorem 9 generalises the Hensel’s lemma of 1908 as well as its later
reincarnations mentioned in section 1.
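The inequality can be derived in two steps from the formulas above. The following is our sketch of the omitted computation. It uses that $\sum_{j \in J} \min\{0, v(\alpha_{i_0} - \alpha_j)\} = \Phi_{i_0}(0) = -v(c)$, as noted in the proof, and that every term $\max\{0, v(\alpha_i - \alpha_j)\}$ is non-negative, so each single sum is bounded by the double sum:

```latex
\begin{align*}
  B + v(c)
    &= \sum_{i \in I} v(\alpha_i - \alpha_{j_0})
       + \Bigl( v(c) + \sum_{j \in J} v(\alpha_{i_0} - \alpha_j) \Bigr) \\
    &\le \sum_{i \in I} \max\{0,\, v(\alpha_i - \alpha_{j_0})\}
       + \sum_{j \in J} \max\{0,\, v(\alpha_{i_0} - \alpha_j)\} \\
    &\le 2 \sum_{i \in I,\, j \in J} \max\{0,\, v(\alpha_i - \alpha_j)\}
     = 2\, v(\operatorname{Res}(g, H)).
\end{align*}
```

Since the double sum is itself non-negative, the bound also holds for the term $0$ in the maximum.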
References
[1] G. Dumas, Sur quelques cas d’irréductibilité des polynomes à coefficients
rationnels, J. Math. Pures Appl. 61 (1906), 191–258.
[2] K. Hensel, Neue Grundlagen der Arithmetik, J. Reine Angew. Math. 127
(1904), 51–84.
[3] K. Hensel, Theorie der algebraischen Zahlen, Teubner, Leipzig, 1908.
[4] W. Krull, Allgemeine Bewertungstheorie, J. Reine Angew. Math. 167 (1932),
160–196.
[5] J. Kürschák, Über Limesbildung und allgemeine Körpertheorie, J. Reine
Angew. Math. 142 (1913), 211–253.
[6] M. Nagata, On the Theory of Henselian Rings, Nagoya Math. J. 5 (1953),
45–57.
[7] A. Ostrowski, Untersuchungen zur arithmetischen Theorie der Körper,
Math. Z. 39 (1935), 269–404.
[8] F. J. Rayner, Relatively Complete Fields, Proc. Edinburgh Math. Soc. 11
(1958), 131–133.
[9] T. Rella, Zur Newtonschen Approximationsmethode in der Theorie der p-
adischen Gleichungswurzeln, J. Reine Angew. Math. 153 (1924), 111–112.
[10] T. Rella, Ordnungsbestimmungen in Integritätsbereichen und Newtonsche
Polygone, J. Reine Angew. Math. 158 (1927), 33–48.
[11] P. Ribenboim, Théorie des valuations, Les presses de l’Université de
Montréal, Montreal, 1968.
[12] P. Ribenboim, Equivalent forms of Hensel’s lemma, Expo. Math. 3 (1985),
3–24.
[13] D. S. Rim, Relatively complete fields, Duke Math. J. 24 (1957), 197–200.
[14] P. Roquette, History of Valuation Theory. Part I. In: F. V. Kuhlmann, S.
Kuhlmann, M. Marshall (ed.), Valuation Theory and its applications, vol. 1,
Fields Inst. Commun. 32 (2002), 291–355.
[15] K. Rychlík, Zur Bewertungstheorie der algebraischen Körper, J. Reine
Angew. Math. 153 (1924), 94–107.