Scaling-invariant functions versus positively homogeneous functions


Abstract

Scaling-invariant functions preserve the order of points when the points are scaled by the same positive scalar (with respect to a unique reference point). Composites of strictly monotonic functions with positively homogeneous functions are scaling-invariant with respect to zero. We prove in this paper that the reverse is true for large classes of scaling-invariant functions. Specifically, we give necessary and sufficient conditions for scaling-invariant functions to be composites of a strictly monotonic function with a positively homogeneous function. We also study sublevel sets of scaling-invariant functions, generalizing well-known properties of positively homogeneous functions.

Keywords: scaling-invariant function · positively homogeneous function · compact level set
JOTA manuscript No.
(will be inserted by the editor)
Cheikh Toure · Armand Gissler · Anne Auger · Nikolaus Hansen
Received: date / Accepted: date
Mathematics Subject Classification (2000) 49J52 ·54C35
Contents
1 Introduction
2 Preliminaries
3 Scaling invariant functions as composite of strictly monotonic functions with positively homogeneous functions
  3.1 Continuous SI functions
  3.2 Sufficient and necessary condition to be the composite of a PH function
4 Level sets of SI functions
  4.1 Identical sublevel sets
  4.2 Compactness of the sublevel sets
  4.3 Sufficient Condition for Lebesgue Negligible Level Sets
  4.4 Balls containing and balls contained in sublevel sets
  4.5 A generalization of a weak formulation of Euler's homogeneous function theorem
  4.6 Compact neighborhoods of level sets with non-vanishing gradient
A Bijection Theorem

Inria and CMAP, Ecole Polytechnique, IP Paris, France
firstname.lastname@inria.fr
cheikh.toure@polytechnique.edu
1 Introduction
A function $f : \mathbb{R}^n \to \mathbb{R}$ is scaling-invariant (SI) with respect to a reference point $x^\star \in \mathbb{R}^n$ if for all $x, y \in \mathbb{R}^n$ and $\rho > 0$:
$$f(x^\star + x) \le f(x^\star + y) \iff f(x^\star + \rho x) \le f(x^\star + \rho y), \tag{1}$$
that is, the $f$-order of any two points is invariant under a multiplicative change of their distance to the reference point: the order only depends on their direction and their relative distance to the reference. Scaling-invariant functions appear naturally when studying the convergence of comparison-based optimization algorithms where the update of the state of the algorithm uses $f$ only through comparisons of candidate solutions [3,7]. A famous example of a comparison-based optimization algorithm is the Nelder-Mead method [15].
A function $p : \mathbb{R}^n \to \mathbb{R}$ is positively homogeneous (PH) with degree $\alpha > 0$ ($\mathrm{PH}_\alpha$) if for all $x \in \mathbb{R}^n$ and $\rho > 0$:
$$p(\rho x) = \rho^\alpha p(x). \tag{2}$$
Positively homogeneous functions are scaling-invariant with respect to $x^\star = 0$. We also consider that $x \mapsto p(x - x^\star)$ is positively homogeneous w.r.t. $x^\star$ when $p$ is positively homogeneous. Linear functions, norms, and convex quadratic functions are positively homogeneous. We can define PH functions piecewise on cones or half-lines, because a function is $\mathrm{PH}_\alpha$ if and only if (2) is satisfied within each cone or half-line (which is not the case with SI functions where $x$ and $y$ in (1) can belong to different cones). For example, the function $p : \mathbb{R}^n \to \mathbb{R}$ defined as $p(x) = x_1$ if $x_1 x_2 > 0$ and $p(x) = 0$ otherwise, is $\mathrm{PH}_1$. Positively homogeneous functions, and in particular increasing positively homogeneous functions, are well-studied in the context of Monotonic Analysis [6, 16, 17] or nonsmooth analysis and nonsmooth optimization [8]. Specifically, non-linear programming problems where the objective function and constraints are positively homogeneous are analyzed in [12], whereas saddle representations of continuous positively homogeneous functions by linear functions are established in [9]. The (left) composition of a PH function with a strictly monotonic function is SI, while this composite function is in general not PH. One of the questions we investigate in this paper is to which extent SI functions and composites of PH functions with strictly monotonic functions are the same. We prove that a continuous SI function is always the composite of a strictly monotonic function with a PH function. We give necessary and sufficient conditions for an SI function to be the composite of a strictly monotonic function with a PH function in the general case.
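As a numerical illustration of the statement that a composite $\varphi \circ p$ is SI but in general not PH, the following minimal sketch tests condition (1) on random samples. The particular choices $p = \|\cdot\|$ and $\varphi = \log(1 + \cdot)$, the sample size and the helper names are assumptions made for this example only; they do not come from the paper, and a finite random test is of course not a proof.

```python
import math
import random

def p(x):
    """A PH function of degree 1 (the Euclidean norm)."""
    return math.sqrt(sum(xi * xi for xi in x))

def phi(t):
    """A strictly increasing, non-homogeneous transformation on [0, inf)."""
    return math.log1p(t)

def f(x):
    """Composite phi o p: scaling-invariant w.r.t. 0, but not PH."""
    return phi(p(x))

def order_is_preserved(func, x, y, rho):
    """One instance of condition (1) with reference point 0."""
    sx = [rho * xi for xi in x]
    sy = [rho * yi for yi in y]
    return (func(x) <= func(y)) == (func(sx) <= func(sy))

rng = random.Random(0)
samples = [
    ([rng.gauss(0, 1) for _ in range(3)],
     [rng.gauss(0, 1) for _ in range(3)],
     rng.uniform(0.01, 100.0))
    for _ in range(1000)
]
si_holds = all(order_is_preserved(f, x, y, rho) for x, y, rho in samples)

# For a PH function of degree alpha, f(2x) / f(x) would equal 2**alpha for every x.
ratio_a = f([2.0, 0.0, 0.0]) / f([1.0, 0.0, 0.0])
ratio_b = f([4.0, 0.0, 0.0]) / f([2.0, 0.0, 0.0])
```

Here `si_holds` collects the order comparisons before and after scaling, while the two ratios differ, so this `f` cannot satisfy (2) for any degree.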
Only level sets or sublevel sets matter to determine the difficulty of an SI problem optimized with a comparison-based algorithm. We investigate different properties of level sets, thereby generalizing properties that are known for PH functions, including a formulation of the Euler homogeneous function theorem that holds for PH functions.
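Notations: We denote by $\mathbb{R}_+$ the interval $[0, +\infty)$, $\mathbb{R}_- = (-\infty, 0]$, $\mathbb{Z}$ the set of all integers, $\mathbb{Z}_+$ the set of all non-negative integers and $\mathbb{Q}$ the set of rational numbers. The Euclidean norm is denoted by $\|\cdot\|$. For $x \in \mathbb{R}^n$ and $\rho > 0$, we denote by $\mathcal{B}(x, \rho) = \{y \in \mathbb{R}^n ;\ \|x - y\| < \rho\}$ the open ball centered at $x$ and of radius $\rho$, $\overline{\mathcal{B}}(x, \rho)$ its closure and $\mathcal{S}(x, \rho)$ its boundary. When they are centered at $0$, we denote $\mathcal{B}_\rho = \mathcal{B}(0, \rho)$, $\overline{\mathcal{B}}_\rho = \overline{\mathcal{B}}(0, \rho)$ and $\mathcal{S}_\rho = \mathcal{S}(0, \rho)$. For an interval $I \subset \mathbb{R}$ and a function $\varphi : I \to \mathbb{R}$, we use the terminology of strictly increasing (respectively strictly decreasing) if for all $a, b \in I$ with $a < b$, $\varphi(a) < \varphi(b)$ (respectively $\varphi(a) > \varphi(b)$). For a real number $\rho$ and a subset $A \subset \mathbb{R}^n$, we define $\rho A = \{\rho x ;\ x \in A\}$. For a function $f$, we denote by $\mathrm{Im}(f)$ the image of $f$.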
2 Preliminaries
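Given a function $f : \mathbb{R}^n \to \mathbb{R}$ and $x \in \mathbb{R}^n$, we denote the level set going through $x$ as $\mathcal{L}_{f,x} = \{y \in \mathbb{R}^n,\ f(y) = f(x)\}$ and the sublevel set as $\mathcal{L}^{\le}_{f,x} = \{y \in \mathbb{R}^n,\ f(y) \le f(x)\}$.

If $f$ is SI with respect to $x^\star$, then the function $x \mapsto f(x + x^\star) - f(x^\star)$ is scaling invariant with respect to $0$. Hence, if a function $f$ is SI, we assume in the following that $f$ is SI with respect to the reference point $0$ and that $f(0) = 0$, without loss of generality.

It follows immediately from (1) that if $x$ and $y$ belong to the same level set, then $\rho x$ and $\rho y$ belong to the same level set. Hence the level sets of $x$ and $\rho x$ are scaled from one another, i.e. $\mathcal{L}_{f,\rho x} = \rho \mathcal{L}_{f,x}$.

Similarly, since for any $x, y \in \mathbb{R}^n$ and $\rho > 0$, $f(\frac{y}{\rho}) \le f(x)$ if and only if $f(y) \le f(\rho x)$,
$$\mathcal{L}^{\le}_{f,\rho x} = \rho \mathcal{L}^{\le}_{f,x}, \quad \text{and} \quad \mathcal{L}_{f,\rho x} = \rho \mathcal{L}_{f,x}. \tag{3}$$
These properties are visualized in Figure 1.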
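Given an SI function $f$, we define surjective restrictions of $f$ to half-lines along a vector $x \in \mathbb{R}^n$ as
$$f_x : t \in [0, \infty) \mapsto f(tx). \tag{4}$$
It is immediate to see that the $f_x$ are also SI.¹ However, $f$ may not be SI even when all $f_x$ are.²

¹ This directly follows because for $s, t \in \mathbb{R}_+$ and $\rho > 0$, $f_x(t) \le f_x(s) \iff f(tx) \le f(sx) \iff f(\rho t x) \le f(\rho s x) \iff f_x(\rho t) \le f_x(\rho s)$.
² For example, define $f : \mathbb{R} \to \mathbb{R}$ as $t \mapsto t$ on $\mathbb{R}_+$ and $t \mapsto t^2$ on $\mathbb{R}_-$. Then $f_1(t) = t$ and $f_{-1}(t) = t^2$, for $t \in \mathbb{R}_+$, are both SI and even PH with degree 1 and 2, respectively. But $f$ is not SI, and hence also not PH, because $f(\frac{1}{2}) = \frac{1}{2} > \frac{1}{4} = f(-\frac{1}{2})$ but $f(4 \times \frac{1}{2}) = 2 < 4 = f(4 \times (-\frac{1}{2}))$.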
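The counterexample of footnote 2 can be checked directly. The sketch below only evaluates the function at the points named in the footnote; the helper name `f_dir` for the restriction $f_x$ is an assumption of this example.

```python
def f(t):
    """t on R_+, t^2 on R_- (the function of footnote 2)."""
    return t if t >= 0 else t * t

def f_dir(x):
    """Restriction f_x : t in [0, inf) -> f(t * x)."""
    return lambda t: f(t * x)

x, y, rho = 0.5, -0.5, 4.0
order_before = f(x) <= f(y)              # compares 1/2 with 1/4
order_after = f(rho * x) <= f(rho * y)   # compares 2 with 4

# Each restriction is positively homogeneous on its own half-line.
ph1_on_positive = f_dir(1.0)(3.0 * 2.0) == 3.0 * f_dir(1.0)(2.0)
ph2_on_negative = f_dir(-1.0)(3.0 * 2.0) == 3.0 ** 2 * f_dir(-1.0)(2.0)
```

The order of the two points flips under the scaling by $\rho = 4$, which is exactly the failure of (1), even though both restrictions behave homogeneously.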
Fig. 1: Level sets of SI functions with respect to the red star $x^\star$. The four functions are strictly increasing transformations of $x \mapsto p(x - x^\star)$ where $p$ is a PH function. From left to right: $p(x) = \|x\|$; $p(x) = x^\top A x$ for $A$ symmetric positive definite; $p(x) = \left(\sum_i \sqrt{|x_i|}\right)^2$, the $\frac{1}{2}$-norm; a randomly generated SI function from a “smoothly” randomly perturbed sphere function. The first two functions from the left have convex sublevel sets, contrary to the last two.
Scaling invariant functions have at most one isolated local optimum [3], where an isolated local optimum, say, an isolated argmin, $x$, for a function $g : \mathbb{R}^n \to \mathbb{R}$ is defined by the existence of $\epsilon > 0$ such that for all $y \in \mathcal{B}(x, \epsilon) \setminus \{x\}$, $g(y) > g(x)$. This result is recalled in the following proposition.

Proposition 2.1 (see [3, Proposition 3.2]) Let $f$ be an SI function. Then $f$ can admit an isolated local optimum only at $0$, where $f(0) = 0$, and this local optimum is also the global optimum. In addition, the functions $f_x$ cannot admit a local plateau, i.e., a ball where the function is locally constant, unless the function is equal to $0$ everywhere.
We characterize in the following the functions $f_x$ of an SI function $f$ under different conditions.

Proposition 2.2 If $f$ is a continuous SI function on $\mathbb{R}^n$, then for all $x \in \mathbb{R}^n$, $f_x$ is either constant equal to $0$ or strictly monotonic.
More specifically, if $\varphi : \mathbb{R}_+ \to \mathbb{R}$ is a 1-dimensional continuous SI function, then $\varphi$ is either constant equal to $0$ or strictly monotonic.

Proof Assume that $\varphi$ is not strictly monotonic on $[0, \infty)$. Then $\varphi$ is not strictly monotonic on $(0, \infty)$. Since continuous injective functions are strictly monotonic, $\varphi$ is not injective on $(0, \infty)$. Therefore there exist $0 < s < t$ such that $\varphi(s) = \varphi(t)$. By scaling-invariance, it follows that $\varphi(\frac{s}{t}) = \varphi(1)$. It follows iteratively that for all integers $k > 0$, $\varphi\left((\frac{s}{t})^k\right) = \varphi(1)$. Taking the limit for $k \to \infty$, we obtain by continuity of $\varphi$ at $0$ that $\varphi(0) = \varphi(1)$. Thereby by scaling-invariance again, it follows for all $\rho > 0$ that $\varphi(0) = \varphi(\rho)$. Hence we have shown that if $\varphi$ is not strictly monotonic, it is a constant function.
Now if $f$ is a continuous SI function on $\mathbb{R}^n$ and $x \in \mathbb{R}^n$, then $f_x$ is also scaling invariant and continuous. Then it follows that $f_x$ is either constant or strictly monotonic. □
We deduce from Proposition 2.2 the next corollary.

Corollary 2.1 Let $f$ be a continuous SI function. If $f$ has a local optimum at $x$, then for all $t \ge 0$, $f(tx) = f(0)$. In particular, if $f$ has a global argmin (resp. argmax), then $0$ is a global argmin (resp. argmax).

Proof Assume that there exists a local optimum at $x$. Then $f_x$ has a local optimum at $1$. Therefore $f_x$ is not strictly monotonic, and thanks to Proposition 2.2, $f_x$ is necessarily a constant function. In other words, $f(tx) = f(0)$ for all $t \ge 0$. □
We derive another proposition with the same conclusions as Proposition 2.2 but under a different assumption. We start by showing the following lemma.

Lemma 2.1 Let $\varphi : \mathbb{R}_+ \to \mathbb{R}$ be an SI function continuous at $0$ and strictly monotonic on a non-empty interval $I \subset \mathbb{R}_+$. Then $\varphi$ is strictly monotonic.

Proof Assume without loss of generality that $\varphi$ is strictly increasing on $I$ and that $I = [a, b]$ with $0 < a < b$, up to replacing $I$ with a subset of $I$. Denote $\rho = \frac{b}{a}$. Then $\{[\rho^k, \rho^{k+1}]\}_{k \in \mathbb{Z}}$ covers $(0, \infty)$. To prove that $\varphi$ is strictly increasing on $(0, \infty)$, it is enough to prove that $\varphi$ is strictly increasing on $[\rho^k, \rho^{k+1}]$ for all integers $k$.
Let $k$ be an integer and $(x, y)$ two real numbers such that $\rho^k \le x < y \le \rho^{k+1}$. Then $a \le \frac{a x}{\rho^k} < \frac{a y}{\rho^k} \le a \rho = b$. Therefore $\varphi(\frac{a x}{\rho^k}) < \varphi(\frac{a y}{\rho^k})$. And by scaling-invariance, $\varphi(x) < \varphi(y)$.
With the continuity at $0$, it follows that $\varphi$ is strictly increasing on $\mathbb{R}_+$. □
We derive from Lemma 2.1 the following proposition.

Proposition 2.3 Let $f$ be an SI function continuous at $0$. Assume that each $f_x$ is either strictly monotonic or constant on some non-empty interval. Then for all $x \in \mathbb{R}^n$, $f_x$ is either constant equal to $0$ or strictly monotonic.

Note that the continuity of the function $f_x$ alone does not suffice to conclude that $f_x$ is either constant or strictly monotonic on some non-empty interval. Indeed there exist 1-D continuous functions (even differentiable functions) that are not monotonic on any non-empty interval [4, 5, 10].
For the sake of completeness, we construct SI functions on $\mathbb{R}_+$ that are not monotonic on any non-empty interval. The construction of such functions is based on the nonlinear solutions of the Cauchy functional equation: for all $x, y \in \mathbb{R}$, $g(x + y) = g(x) + g(y)$, called Hamel functions [11]. A Hamel function $g$ also satisfies $g(q_1 x + q_2 y) = q_1 g(x) + q_2 g(y)$ for all real numbers $x, y$ and rational numbers $q_1, q_2$ [1, Chapter 2]. Since $g$ is nonlinear, there exist real numbers $x$ and $y$ such that the vectors $\{(x, g(x)), (y, g(y))\}$ form a basis of $\mathbb{R}^2$ over the field $\mathbb{R}$. Then the graph of $g$, which is a vector subspace of $\mathbb{R}^2$ over the field $\mathbb{Q}$, contains $\{q_1 \cdot (x, g(x)) + q_2 \cdot (y, g(y)) ;\ (q_1, q_2) \in \mathbb{Q}^2\}$, which is dense in $\mathbb{R}^2$. Therefore a 1-D Hamel function is highly pathological, since its graph is dense in $\mathbb{R}^2$.
Lemma 2.2 There exist SI functions on $\mathbb{R}_+$ that are neither monotonic nor continuous on any non-empty interval.

Proof We start by choosing a nonlinear solution of Cauchy's functional equation denoted by $g : \mathbb{R} \to \mathbb{R}$, knowing that there are uncountably many ways to pick such a $g$ [11]. Then for all real numbers $a$ and $b$, $g(a + b) = g(a) + g(b)$. And since $g$ is not linear, we also know that $g$ is neither continuous nor monotonic on any interval [11]. Let us define $f = \exp \circ\, g \circ \log$ on $(0, \infty)$ and $f(0) = 0$. Then $f(x) > 0$ for all $x > 0$ and $f$ is still not monotonic on any non-empty interval. We also have for all $\rho > 0$ and $x > 0$, $f(\rho x) = \exp(g(\log(x) + \log(\rho))) = \exp(g(\log(x))) \exp(g(\log(\rho))) = f(x) f(\rho)$. Since $f(\rho) > 0$, this last result gives the scaling-invariance property. □
Based on Lemma 2.2, we derive the next proposition.

Proposition 2.4 There exist SI functions $f$ on $\mathbb{R}^n$ such that for all non-zero $x$, $f_x$ is neither monotonic nor continuous on any non-empty interval.

Proof Based on Lemma 2.2, there exists $\varphi : \mathbb{R}_+ \to \mathbb{R}$, SI on $\mathbb{R}_+$, which is neither monotonic nor continuous on any non-empty interval. We construct $f$ as follows. For all $x \in \mathbb{R}^n$, $f(x) = \varphi(\|x\|)$. Then $f$ is SI because for $x, y \in \mathbb{R}^n$ and for $\rho > 0$, $f(x) \le f(y) \iff \varphi(\|x\|) \le \varphi(\|y\|) \iff \varphi(\rho \|x\|) \le \varphi(\rho \|y\|) \iff f(\rho x) \le f(\rho y)$. In addition, for a non-zero $x$ and $t \ge 0$, $f_x(t) = f(tx) = \varphi(t \|x\|)$ and then $f_x$ is neither monotonic nor continuous on any non-empty interval. □
Now assume that $f$ is a continuous scaling invariant function and that we can write $f = \varphi \circ g$ where $\varphi$ is a continuous bijection and $g$ is a positively homogeneous function. As a direct consequence of Theorem A.1, $\varphi^{-1}$ is continuous if $\varphi$ is a continuous bijection defined on an interval. Therefore $g = \varphi^{-1} \circ f$ is also continuous. This result is stated in the following corollary.

Corollary 2.2 Let $f$ be a continuous SI function, $\varphi$ a continuous bijection defined on an interval in $\mathbb{R}$ and $p$ a positively homogeneous function such that $f = \varphi \circ p$. Then $p$ is also continuous.
3 Scaling invariant functions as composite of strictly monotonic functions with positively homogeneous functions

As underlined in the introduction, compositions of strictly monotonic functions with positively homogeneous functions are scaling-invariant (SI) functions. We investigate in this section under which conditions the converse is true, that is, when SI functions are compositions of strictly monotonic functions with PH functions. Section 3.1 shows that continuity is a sufficient condition, whereas Section 3.2 gives a necessary and sufficient condition on $f$ to be decomposable in this way.
3.1 Continuous SI functions

We prove in this section a main result of the paper: any continuous SI function $f$ can be written as $f = \varphi \circ p$ where $p$ is $\mathrm{PH}_1$ and $\varphi$ is a homeomorphism (and in particular strictly monotonically increasing and continuous). The proof relies on the following proposition where we do not yet assume that $f$ is continuous, but only that the restrictions of $f$ to the half-lines originating in $0$, the $f_x$ functions, are.
Proposition 3.1 Let $f$ be an SI function such that for any $x \in \mathbb{R}^n$, $f_x$ as defined in (4) is either constant or strictly monotonic and continuous. Then for all $\alpha > 0$, there exist a $\mathrm{PH}_\alpha$ function $p$ and a strictly increasing, continuous bijection (thus a homeomorphism) $\varphi$ such that $f = \varphi \circ p$. For a non-zero $f$ and $\alpha > 0$, the choice of $(\varphi, p)$ is unique up to a left composition of $p$ with a piece-wise linear function.
(i) In addition, if all non-constant $f_x$ have the same monotonicity for all $x \in \mathbb{R}^n$, then for any $x_0 \in \mathbb{R}^n$ such that $f(x_0) \ne 0$, the homeomorphism $\varphi$ corresponding to a $\mathrm{PH}_1$ function can be chosen as $f_{x_0}$ and is at least as smooth as $f$.
(ii) Otherwise, there exist $x_1, x_{-1} \in \mathbb{R}^n$ such that $f_{x_1}$ is strictly increasing and $f_{x_{-1}}$ is strictly decreasing. And for any such $x_1$ and $x_{-1}$ we can choose as $\varphi$ the function equal to $f_{x_1}$ on $\mathbb{R}_+$ and equal to $t \mapsto f_{x_{-1}}(-t)$ on $\mathbb{R}_-$.
Proof Let $f$ be an SI function such that for any $x \in \mathbb{R}^n$, $f_x$ is either a constant or a strictly monotonic continuous function.
In the case where the $f_x$ are constant for all $x \in \mathbb{R}^n$, $f = 0$ and therefore we can take $p_\alpha = 0$ as a candidate for a continuous $\mathrm{PH}_\alpha$ function and $\varphi_\alpha : t \mapsto t$ as the candidate for the corresponding homeomorphism.
From now on, at least one of the $\{f_x\}_{x \in \mathbb{R}^n}$ is non-constant. We now split the proof in two parts: the case where all the non-constant $f_x$ have the same monotonicity, and the case where there exist $x_1, x_{-1} \in \mathbb{R}^n$ such that $f_{x_1}$ is strictly increasing and $f_{x_{-1}}$ is strictly decreasing.
Part 1. Assume here that all the non-constant $f_x$ have the same monotonicity. Up to the transformation $f \mapsto -f$, we can assume without loss of generality that they are increasing. Therefore $0$ is a global argmin and, since we have assumed $f(0) = 0$, $f(x) \ge 0$ for all $x \in \mathbb{R}^n$. Then there exists $x_0 \in \mathbb{R}^n$ such that $f(x_0) > 0$.
For any $x \in \mathcal{L}_{f,x_0} = \{y \in \mathbb{R}^n,\ f(y) = f(x_0)\}$, and any $\lambda > 0$ different from $1$, $\lambda x \notin \mathcal{L}_{f,x_0}$. Indeed, as $x \in \mathcal{L}_{f,x_0}$, we know from the assumption on the $f_x$ that $f_x$ is strictly increasing on $\mathbb{R}_+$, since $f_x$ cannot be constant equal to $0$.
Moreover, for all $x \in \mathbb{R}^n$ such that $f(x) \ne 0$, there exists $\lambda > 0$ such that $\lambda x \in \mathcal{L}_{f,x_0}$. Indeed, if $f(x) < f(x_0)$, the intermediate value theorem applied to the continuous function $f_{x_0}$ shows that there exists $0 < t < 1$ such that $f(t x_0) = f_{x_0}(t) = f(x)$, and then $f(\frac{1}{t} x) = f(x_0)$. And by interchanging $x$ and $x_0$, the same argument holds if $f(x) > f(x_0)$.
The last two paragraphs ensure that for all $x$ such that $f(x) \ne 0$, there exists a unique positive number $\lambda_x$ such that $\lambda_x x \in \mathcal{L}_{f,x_0}$. Let us define the function $p$ for all $x \in \mathbb{R}^n$ as follows: if $f(x) \ne 0$ then $p(x) = \frac{1}{\lambda_x}$, otherwise $p(x) = 0$. We prove in the following that $p$ is $\mathrm{PH}_1$.
Let $x \in \mathbb{R}^n$ and $\rho > 0$. If $f(x) = 0$ (hence $f(\rho x) = 0$), then $p(\rho x) = 0 = \rho\, p(x)$. Otherwise $f(x) > 0$ (hence $f(\rho x) > 0$), and $p(\rho x) = \frac{\rho}{\lambda_x}$ since $\frac{\lambda_x}{\rho}$ is the (unique) positive number such that $\frac{\lambda_x}{\rho} \rho x = \lambda_x x \in \mathcal{L}_{f,x_0}$. And thereby $p(\rho x) = \rho\, p(x)$.
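We prove that $f = f_{x_0} \circ p$, where $f_{x_0}$ is a continuous strictly increasing function and $p$ is $\mathrm{PH}_1$. Let $x \in \mathbb{R}^n$. If $f(x) = 0$, then $p(x) = 0$, and then $f(x) = 0 = f(0) = f_{x_0}(0) = (f_{x_0} \circ p)(x)$. Otherwise, we have by construction that $\frac{x}{p(x)} \in \mathcal{L}_{f,x_0}$. Therefore $f(\frac{x}{p(x)}) = f(x_0)$ and then $f(x) = f(p(x) x_0) = f_{x_0}(p(x))$. By Theorem A.1, $\varphi = f_{x_0}$ is a homeomorphism. Let $\alpha > 0$, and define $\tilde{\varphi} : t \mapsto \varphi(t^{1/\alpha})$ and $\tilde{p} = p^\alpha$. Then $\tilde{p}$ is $\mathrm{PH}_\alpha$, $\tilde{\varphi}$ is a homeomorphism and $f = \tilde{\varphi} \circ \tilde{p}$.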
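Assume that we have two pairs of solutions $(\varphi, p)$ and $(\bar{\varphi}, \bar{p})$ such that $f = \varphi \circ p = \bar{\varphi} \circ \bar{p}$ where $\varphi, \bar{\varphi}$ are homeomorphisms and $p, \bar{p}$ are $\mathrm{PH}_\alpha$. For all $t > 0$ and $x \in \mathbb{R}^n$, we have for instance $p(tx) = t^\alpha p(x)$. Therefore $\mathrm{Im}(p) = \mathbb{R}_+$. Denote $\psi = \bar{\varphi}^{-1} \circ \varphi$. For all $\lambda > 0$ and $x \in \mathbb{R}^n$, $\psi(\lambda^\alpha p(x)) = \psi(p(\lambda x)) = \bar{p}(\lambda x) = \lambda^\alpha \psi(p(x))$. Hence $\psi$ is $\mathrm{PH}_1$ on $\mathbb{R}_+$. For all $t > 0$, $\psi(t) = t\, \psi(1)$. Therefore $\psi$ is linear.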
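Part 2. Assume now that there exist $x_1, x_{-1} \in \mathbb{R}^n$ such that $f_{x_1}$ is strictly increasing and $f_{x_{-1}}$ is strictly decreasing. Then $f(x_1) > 0$ and $f(x_{-1}) < 0$. Then thanks to the intermediate value theorem, if $f(x) > 0$, there exists a unique positive number $\lambda_x$ such that $\lambda_x x \in \mathcal{L}_{f,x_1}$, and if $f(x) < 0$, there exists a unique positive number $\lambda_x$ such that $\lambda_x x \in \mathcal{L}_{f,x_{-1}}$. We now define $p$ for all $x \in \mathbb{R}^n$ as follows: if $f(x) = 0$ then $p(x) = 0$, if $f(x) > 0$ then $p(x) = \frac{1}{\lambda_x}$, and finally if $f(x) < 0$ then $p(x) = -\frac{1}{\lambda_x}$. Let us show that $p$ is $\mathrm{PH}_1$. Indeed for any $\rho > 0$ and $x \in \mathbb{R}^n$, if $f(x) = 0$ (hence $f(\rho x) = 0$), then $p(\rho x) = 0 = \rho\, p(x)$. If $f(x) > 0$ (hence $f(\rho x) > 0$), then $p(\rho x) = \frac{\rho}{\lambda_x} = \rho\, p(x)$ since $\frac{\lambda_x}{\rho}$ is the (unique) positive number such that $\frac{\lambda_x}{\rho} \rho x = \lambda_x x \in \mathcal{L}_{f,x_1}$. And finally if $f(x) < 0$ (hence $f(\rho x) < 0$), then $p(\rho x) = -\frac{\rho}{\lambda_x} = \rho\, p(x)$ since $\frac{\lambda_x}{\rho}$ is the (unique) positive number such that $\frac{\lambda_x}{\rho} \rho x = \lambda_x x \in \mathcal{L}_{f,x_{-1}}$. Hence $p$ is $\mathrm{PH}_1$.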
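We now define the function $\varphi : \mathbb{R} \to \mathbb{R}$ such that if $t \ge 0$, $\varphi(t) = f_{x_1}(t)$ and if $t \le 0$, $\varphi(t) = f_{x_{-1}}(-t)$. Then $\varphi$ is well defined ($f_{x_1}(0) = 0 = f_{x_{-1}}(0)$), continuous and strictly increasing.
Let $x \in \mathbb{R}^n$. If $f(x) = 0$, then $p(x) = 0$, and then $f(x) = 0 = (\varphi \circ p)(x)$. If $f(x) > 0$, $\varphi(p(x)) = f_{x_1}(p(x)) = f(p(x) x_1) = f(x)$ since $\frac{x}{p(x)} \in \mathcal{L}_{f,x_1}$. And finally if $f(x) < 0$, $\varphi(p(x)) = f_{x_{-1}}(-p(x)) = f(-p(x) x_{-1}) = f(x)$ since $-\frac{x}{p(x)} = \lambda_x x \in \mathcal{L}_{f,x_{-1}}$. Thereby, $f = \varphi \circ p$. Theorem A.1 ensures that $\varphi$ is a homeomorphism. By defining for all $\alpha > 0$, $\tilde{\varphi}(t) = \varphi(t^{1/\alpha})$ if $t \ge 0$, $\tilde{\varphi}(t) = \varphi(-(-t)^{1/\alpha})$ if $t < 0$, $\tilde{p}(x) = p(x)^\alpha$ if $p(x) \ge 0$ and $\tilde{p}(x) = -(-p(x))^\alpha$ if $p(x) < 0$, it follows that $f = \tilde{\varphi} \circ \tilde{p}$.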
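Assume here again that we have two pairs of solutions $(\varphi, p)$ and $(\bar{\varphi}, \bar{p})$ such that $f = \varphi \circ p = \bar{\varphi} \circ \bar{p}$ where $\varphi, \bar{\varphi}$ are homeomorphisms and $p, \bar{p}$ are $\mathrm{PH}_\alpha$. For all $t > 0$ and $x \in \mathbb{R}^n$, we have $p(tx) = t^\alpha p(x)$. Therefore $\mathrm{Im}(p) = \mathbb{R}$ since $p(x_1)$ and $p(x_{-1})$ have opposite signs. Denote $\psi = \bar{\varphi}^{-1} \circ \varphi$. For all $\lambda > 0$ and $x \in \mathbb{R}^n$, $\psi(\lambda^\alpha p(x)) = \psi(p(\lambda x)) = \bar{p}(\lambda x) = \lambda^\alpha \psi(p(x))$. Hence $\psi$ is $\mathrm{PH}_1$ on $\mathbb{R}$. For all $t > 0$, $\psi(t) = t\, \psi(1)$ and $\psi(-t) = t\, \psi(-1)$. Therefore, depending on the values of $\psi(1)$ and $\psi(-1)$, $\psi$ is either linear or piece-wise linear. □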
We now use the previous proposition to prove that a continuous SI function is a homeomorphic transformation of a continuous $\mathrm{PH}_1$ function. The proof relies on the result that for a continuous SI function, the $f_x$ are either constant or strictly monotonic and continuous (see Proposition 2.2). We distinguish the case where $f$ has a global optimum as in Proposition 3.1 (i) and the case where $f$ does not have a global optimum as in Proposition 3.1 (ii). Overall the following result holds.
Theorem 3.1 Let $f$ be a continuous SI function. Then for all $\alpha > 0$, there exist a continuous $\mathrm{PH}_\alpha$ function $p$ and a strictly increasing and continuous bijection (thus a homeomorphism) $\varphi$ such that $f = \varphi \circ p$.
For a non-zero $f$ and $\alpha > 0$, the choice of $(\varphi, p)$ is unique up to a left composition of $p$ with a piece-wise linear function. If $f$ admits a global optimum, then $0$ is also a global optimum and for any $x_0 \in \mathbb{R}^n$ such that $f(x_0) \ne 0$, the homeomorphism $\varphi$ corresponding to a $\mathrm{PH}_1$ function can be chosen as $f_{x_0}$ and is at least as smooth as $f$.
If $f$ does not admit a global optimum, then there exist $x_1, x_{-1} \in \mathbb{R}^n$ such that $f(x_1) > 0$ and $f(x_{-1}) < 0$, and for any such $x_1$ and $x_{-1}$, the homeomorphism $\varphi$ corresponding to a $\mathrm{PH}_1$ function can be chosen as the function equal to $f_{x_1}$ on $\mathbb{R}_+$ and equal to $t \mapsto f_{x_{-1}}(-t)$ on $\mathbb{R}_-$.
Proof Let $f$ be a continuous SI function. Thanks to Proposition 2.2, for all $x \in \mathbb{R}^n$, $f_x$ is either constant equal to $0$ or strictly monotonic.
Part 1. Assume that $f$ has a global optimum. Corollary 2.1 shows that $0$ is also a global optimum. Then we can apply Proposition 3.1 in the case where the non-constant $f_x$ have the same monotonicity. Let $x_0 \in \mathbb{R}^n$ be such that $f(x_0) \ne 0$ and define $\varphi = f_{x_0}$. Then $f = \varphi \circ p$ and $\varphi$ is a homeomorphism. That settles the continuity of the $\mathrm{PH}_1$ function $\varphi^{-1} \circ f$ thanks to Corollary 2.2.
Part 2. Assume in this part that $f$ has no global optimum. Since $0$ is not a global optimum, we can find $x_1$ and $x_{-1}$ such that $f(x_1) > 0$ and $f(x_{-1}) < 0$. Therefore $f_{x_1}$ is strictly increasing and $f_{x_{-1}}$ is strictly decreasing. We apply Proposition 3.1 in the case where the non-constant $f_x$ do not have the same monotonicity. If $\varphi$ is the function equal to $f_{x_1}$ on $\mathbb{R}_+$ and to $t \mapsto f_{x_{-1}}(-t)$ on $\mathbb{R}_-$, then $f = \varphi \circ p$ where $\varphi$ is a homeomorphism. That settles the continuity of the $\mathrm{PH}_1$ function $\varphi^{-1} \circ f$ thanks to Corollary 2.2. For all $\alpha > 0$, the unique construction of $(\varphi, p)$ up to a piece-wise linear function in both parts is a consequence of Proposition 3.1. □
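The construction of Part 1 (a unique $\lambda_x$ with $\lambda_x x \in \mathcal{L}_{f,x_0}$, then $p(x) = 1/\lambda_x$ and $\varphi = f_{x_0}$) can be sketched numerically. In the sketch below the test function `f`, the reference point `X0`, the tolerance and the use of bisection to locate $\lambda_x$ are all assumptions made for illustration; the paper only uses the intermediate value theorem and does not prescribe a numerical method.

```python
import math

def f(x):
    """An assumed continuous SI function with unique global argmin at 0."""
    return math.log1p(x[0] ** 2 + 4.0 * x[1] ** 2)

X0 = (1.0, 0.0)  # assumed reference point with f(X0) != 0

def scale(t, x):
    return tuple(t * xi for xi in x)

def lam(x, tol=1e-13):
    """Approximate the unique lambda > 0 with f(lambda * x) = f(X0) by bisection."""
    target = f(X0)
    lo, hi = 0.0, 1.0
    while f(scale(hi, x)) < target:      # enlarge the bracket until it contains lambda
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if f(scale(mid, x)) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def p(x):
    """PH_1 function of the proof: p(x) = 1 / lambda_x, and p(x) = 0 where f(x) = 0."""
    return 0.0 if f(x) == 0.0 else 1.0 / lam(x)

def phi(t):
    """phi = f_{X0} : t -> f(t * X0)."""
    return f(scale(t, X0))

x = (0.3, -0.7)
decomposition_error = abs(phi(p(x)) - f(x))           # f = phi o p
homogeneity_error = abs(p(scale(3.0, x)) - 3.0 * p(x))  # p(3x) = 3 p(x)
```

For this particular `f`, the recovered $p$ agrees up to the bisection tolerance with $x \mapsto \sqrt{x_1^2 + 4 x_2^2}$, which is one way to sanity-check the sketch.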
3.2 Sufficient and necessary condition to be the composite of a PH function

We have seen in the previous section that a continuous SI function can be written as $\varphi \circ p$ with $\varphi$ strictly monotonic and $p$ PH. Relaxing continuity, we prove in the next theorem a necessary and sufficient condition under which an SI function is the composite of a PH function with a strictly monotonic function.
Theorem 3.2 Let $f$ be an SI function. There exist a $\mathrm{PH}_1$ function $p$ and a strictly monotonic function $\varphi$ such that $f = \varphi \circ p$ if and only if for all $x \in \mathbb{R}^n$, the function $f_x$ is either constant or strictly monotonic and the strictly increasing $f_x$ share the same image (i.e., if $\lambda \in \mathbb{R}$ is reached for one of these functions, then it is reached for all of them) and the strictly decreasing ones too.
For a non-zero $f$, up to a left composition of $p$ with a piece-wise linear function, the choice of $(\varphi, p)$ is unique.
Proof We prove first the forward implication. Suppose there is a $\mathrm{PH}_1$ function $p$ and a strictly monotonic function $\varphi$ such that $f = \varphi \circ p$. Consider $x \in \mathbb{R}^n$. Either $p(x) = 0$ and then for any $t \ge 0$ we have that $p(tx) = 0$, so that $f_x(t) = f(tx) = \varphi(p(tx)) = \varphi(0)$ and $f_x$ is constant on $\mathbb{R}_+$, or $p(x) \ne 0$, and then $t \in \mathbb{R}_+ \mapsto p(tx) = t\, p(x)$ is strictly monotonic, and $f_x(t) = \varphi(p(tx))$ is strictly monotonic too on $\mathbb{R}_+$. Moreover, consider $x_1 \ne x_2$ such that $f_{x_1}$ and $f_{x_2}$ are increasing. Then $p(x_1)$ and $p(x_2)$ are of the same sign, so there is some $t > 0$ such that $p(x_1) = t\, p(x_2) = p(t x_2)$, so the functions $s \mapsto f(s x_1)$ and $s \mapsto f(s t x_2)$ are equal, so the functions $f_{x_1}$ and $f_{x_2}$ take the same values. The same applies to the strictly decreasing functions.
We now prove the backward implication. Suppose that the functions $f_x$ are either constant or strictly monotonic and the increasing ones share the same values and the decreasing ones too.
If all the $f_x$ are constant, then for all $x \in \mathbb{R}^n$, $f(x) = f(0) = 0$ and it is enough to write $f = \varphi \circ p$ with $\varphi : t \mapsto t$ on $\mathbb{R}_+$ and $p = 0$. We assume from now on that at least one $f_x$ is not constant.
Consider that all the non-constant $f_x$ have the same monotonicity. Let us choose $x_0$ such that $f(x_0) \ne 0$. Then for all $x$ such that $f_x$ is non-constant, $f_x$ and $f_{x_0}$ have the same monotonicity. Since they have the same image and are injective, there exists a unique $\lambda_x > 0$ such that $\lambda_x x \in \mathcal{L}_{f,x_0}$. We then define $p$ and $\varphi$ as in Part 1 of Theorem 3.1 to ensure that $f = \varphi \circ p$ where $p$ is $\mathrm{PH}_1$ and $\varphi$ is strictly monotonic.
Consider finally that the non-constant $f_x$ do not all have the same monotonicity. Then there exist $x_1$ and $x_{-1}$ such that $f(x_1) > 0$ and $f(x_{-1}) < 0$. Then, thanks to the assumption that all increasing $f_x$ share the same values and the strictly decreasing $f_x$ too, if $f(x) > 0$, then there exists a unique positive number $\lambda_x$ such that $\lambda_x x \in \mathcal{L}_{f,x_1} = \{y \in \mathbb{R}^n,\ f(y) = f(x_1)\}$, and if $f(x) < 0$, then there exists a unique (thanks to the assumption of strict monotonicity for the non-constant $f_x$) positive number $\lambda_x$ such that $\lambda_x x \in \mathcal{L}_{f,x_{-1}}$. Therefore, we can define $p$ as in Theorem 3.1. As before, $p$ is $\mathrm{PH}_1$. Define also the function $\varphi : \mathbb{R} \to \mathbb{R}$ as in Theorem 3.1. It is still increasing, but not necessarily continuous. Then, as in Theorem 3.1, $f = \varphi \circ p$.
The proof of the uniqueness of $(\varphi, p)$ up to a piece-wise real linear function is similar to the proof in Proposition 3.1. □
Complementing Theorem 3.2, we construct an example of an SI function $f$ that cannot be decomposed as $f = \varphi \circ p$, because the strictly increasing $f_x$ do not share the same image. Define $f$ such that for all $x \in \mathbb{R}^n$, $f(x) = \tanh(x_1)$ if the first coordinate $x_1 \ge 0$ and $f(x) = 1 + \exp(-x_1)$ otherwise. Then $f$ is SI and if $x_1 \ne 0$, $f_x$ is strictly increasing. However, for all $x$ such that $x_1 > 0$, $\mathrm{Im}(f_x) = [0, 1)$, whereas for $x$ such that $x_1 < 0$, $\mathrm{Im}(f_x) = \{0\} \cup (2, \infty)$.
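The example above can be probed numerically. The following sketch takes the second branch as $1 + \exp(-x_1)$, which is the reading consistent with the stated image $\{0\} \cup (2, \infty)$; the grid, the scaling factors and the helper name `f_dir` are assumptions of this illustration, and a finite check does not replace the argument in the text.

```python
import math

def f(x):
    """f(x) = tanh(x_1) if x_1 >= 0, and 1 + exp(-x_1) otherwise."""
    x1 = x[0]
    return math.tanh(x1) if x1 >= 0 else 1.0 + math.exp(-x1)

def f_dir(x):
    """Restriction f_x : t -> f(t x) on [0, inf)."""
    return lambda t: f([t * xi for xi in x])

# Order preservation (1) on a small grid of first coordinates and scalings.
grid = [(-3.0 + 0.5 * i,) for i in range(13)]   # x_1 in {-3, -2.5, ..., 3}
rhos = [0.25, 0.5, 2.0]
si_on_grid = all(
    (f(x) <= f(y)) == (f([r * x[0]]) <= f([r * y[0]]))
    for x in grid for y in grid for r in rhos
)

# Values of two strictly increasing restrictions along t = 0, 1e-6, ..., 10.
ts = [0.0] + [10.0 ** k for k in range(-6, 2)]
vals_pos = [f_dir((1.0,))(t) for t in ts]    # direction with x_1 > 0
vals_neg = [f_dir((-1.0,))(t) for t in ts]   # direction with x_1 < 0
```

Both restrictions are increasing along the sampled values of $t$, yet the first stays below $1$ while the second jumps from $0$ to values above $2$, so they cannot share the same image.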
4 Level sets of SI functions
Scaling-invariant functions appear naturally when studying the convergence
of comparison-based optimization algorithms [3]. In this specific context, the
difficulty of a problem is entirely determined by its level sets whose properties
are studied in this section.
4.1 Identical sublevel sets
Level sets and sublevel sets of a function $f$ remain unchanged if we compose the function with a strictly increasing function $\varphi$ since
$$f(x) \le f(y) \iff \varphi(f(x)) \le \varphi(f(y)). \tag{5}$$
We prove in the next theorem that two arbitrary functions $f$ and $p$ have the same sublevel sets if and only if $f = \varphi \circ p$ where $\varphi$ is strictly increasing.

Theorem 4.1 Two functions $f$ and $p$ have the same sublevel sets if and only if there exists a strictly increasing function $\varphi$ such that $f = \varphi \circ p$.
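Proof If $f = \varphi \circ p$ with $\varphi$ strictly increasing, since sublevel sets are invariant by $\varphi$, $f$ and $p$ have the same sublevel sets. Now assume that $f$ and $p$ have the same sublevel sets. Then for all $x \in \mathbb{R}^n$, there exists $T(x) \in \mathbb{R}^n$ such that $\mathcal{L}^{\le}_{f,x} = \mathcal{L}^{\le}_{p,T(x)}$. In other words, for all $y \in \mathbb{R}^n$, $f(y) \le f(x) \iff p(y) \le p(T(x))$. We define the function
$$\phi : \mathrm{Im}(f) \to \mathrm{Im}(p), \qquad f(x) \mapsto p(T(x)).$$
The function $\phi$ is well-defined because for $x, y \in \mathbb{R}^n$ such that $f(x) = f(y)$, $\mathcal{L}^{\le}_{f,x} = \mathcal{L}^{\le}_{f,y}$. And since $\mathcal{L}^{\le}_{f,x} = \mathcal{L}^{\le}_{p,T(x)}$ and $\mathcal{L}^{\le}_{f,y} = \mathcal{L}^{\le}_{p,T(y)}$, then $\mathcal{L}^{\le}_{p,T(x)} = \mathcal{L}^{\le}_{p,T(y)}$, and then $p(T(x)) = p(T(y))$. Therefore $\phi(f(x)) = \phi(f(y))$. By construction we have that $\phi \circ f = p \circ T$.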
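Let us show that $p \circ T = p$. We have $\mathcal{L}^{\le}_{p \circ T,x} = \mathcal{L}^{\le}_{f,x} = \mathcal{L}^{\le}_{p,T(x)}$. Then $x \in \mathcal{L}^{\le}_{p,T(x)}$ and then $p(x) \le p(T(x))$. Therefore $p \le p \circ T$. In addition, for all $y \in \mathbb{R}^n$, there exists $x$ such that $\mathcal{L}^{\le}_{p,y} = \mathcal{L}^{\le}_{f,x} = \mathcal{L}^{\le}_{p,T(x)} = \mathcal{L}^{\le}_{p \circ T,x}$. Then $y \in \mathcal{L}^{\le}_{p \circ T,x}$, which implies that $p(T(y)) \le p(T(x))$. Moreover, $\mathcal{L}^{\le}_{p,y} = \mathcal{L}^{\le}_{p,T(x)}$, therefore $p(T(x)) = p(y)$. Thereby $p(T(y)) \le p(y)$, and then $p \circ T \le p$. Finally $p \circ T = p$. Hence $\phi \circ f = p$.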
Let us prove now that $\phi$ is strictly increasing. Consider $x, y \in \mathbb{R}^n$ such that $f(x) < f(y)$. Then $\mathcal{L}^{\le}_{f,x} \subset \mathcal{L}^{\le}_{f,y}$ with a strict inclusion, which means that $\mathcal{L}^{\le}_{p,T(x)} \subset \mathcal{L}^{\le}_{p,T(y)}$ with a strict inclusion. Thereby $p(T(x)) < p(T(y))$, i.e. $\phi(f(x)) < \phi(f(y))$. Hence $\phi$ is strictly increasing. And up to restricting $\phi$ to its image, we can assume without loss of generality that $\phi$ is a strictly increasing bijection. We finally denote $\varphi = \phi^{-1}$ and it follows that $f = \varphi \circ p$. □
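The map $\phi : f(x) \mapsto p(x)$ built in the proof can be tabulated on a finite sample. In the sketch below the functions `p` and `f`, the random sample and the variable names are assumptions chosen for illustration; the check only concerns the sampled points.

```python
import random

def p(x):
    """A PH_1 function (a weighted 1-norm, chosen for illustration)."""
    return abs(x[0]) + 2.0 * abs(x[1])

def f(x):
    """A function with the same sublevel sets as p, here t -> t**3 + t applied to p."""
    t = p(x)
    return t ** 3 + t

rng = random.Random(1)
points = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(300)]

# Same sublevel sets on the sample: f(y) <= f(x) exactly when p(y) <= p(x).
same_sublevel_sets = all(
    (f(y) <= f(x)) == (p(y) <= p(x)) for x in points for y in points
)

# Tabulate phi : f(x) -> p(x) on the sample, sorted by f-value,
# and check that larger f-values are mapped to larger p-values.
table = sorted((f(x), p(x)) for x in points)
phi_table_is_increasing = all(
    a[1] < b[1] for a, b in zip(table, table[1:]) if a[0] < b[0]
)
```

Sorting by the $f$-value and reading off the $p$-value is the finite analogue of defining $\phi$ on $\mathrm{Im}(f)$; its inverse plays the role of $\varphi$ in the theorem.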
Theorem 4.1 and Theorem 3.2 both give equivalence conditions for an SI function $f$ to be equal to $\varphi \circ p$ where $\varphi$ is strictly increasing and $p$ is positively homogeneous.³ One condition is that there exists a PH function with the same sublevel sets as $f$, while the other condition is that the $f_x$ are either constant or strictly monotonic, and the strictly increasing and decreasing ones have the same image, respectively.
4.2 Compactness of the sublevel sets
Compactness of sublevel sets is relevant for analyzing step-size adaptive randomized search algorithms [2,13]. We investigate here how compactness properties shown for positively homogeneous functions extend to scaling-invariant functions. For an SI function $f$, we have $\mathcal{L}^{\le}_{f,tx} = t \mathcal{L}^{\le}_{f,x}$. Since $\psi : y \mapsto ty$ is a homeomorphism, we have that $\psi(\mathcal{L}^{\le}_{f,x})$ equals $t \mathcal{L}^{\le}_{f,x}$ and is compact if and only if $\mathcal{L}^{\le}_{f,x}$ is compact. Therefore, for all $t > 0$:
$$\mathcal{L}^{\le}_{f,tx} \text{ is compact if and only if } \mathcal{L}^{\le}_{f,x} \text{ is compact.} \tag{6}$$
Furthermore, if $p$ is a lower semi-continuous positively homogeneous function such that $p(x) > 0$ for all nonzero $x$, then the sublevel sets of $p$ are compact [2, Lemma 2.7]. We recall it with all the details in the following proposition:

Proposition 4.1 ([2, Lemma 2.7]) Let $p$ be a positively homogeneous function with degree $\alpha > 0$ and $p(x) > 0$ for all $x \ne 0$ (or equivalently $0$ is the unique global argmin of $p$) and $p(x)$ finite for every $x \in \mathbb{R}^n$. Then for every $x \in \mathbb{R}^n$, the following holds:
(i) $\lim_{t \to 0} p(tx) = 0$ and for all $x \ne 0$ the function $p_x : t \in [0, \infty) \mapsto p(tx) \in \mathbb{R}_+$ is continuous, strictly increasing and converges to $\infty$ when $t$ goes to $\infty$.
(ii) If $p$ is lower semi-continuous, the sublevel set $\mathcal{L}^{\le}_{p,x}$ is compact.

³ Note that in Theorem 3.2, we can assume without loss of generality that $\varphi$ is always strictly increasing by replacing if needed $\varphi$ by $t \mapsto \varphi(-t)$ and $p$ by $-p$.
We prove a similar theorem for lower semi-continuous SI functions $f$ with
continuous $f_x$ functions, showing in particular that the uniqueness of the
global argmin is equivalent to the above items. Note that we need to assume the
continuity of the functions $f_x$, while this property is unconditionally
satisfied for positively homogeneous functions, where $p_x(t) = t^{\alpha} p_x(1)$ for
all $x \in \mathbb{R}^n$ and for all $t > 0$.
Theorem 4.2 Let $f$ be SI. Then the conditions
- $f(x) > 0$ for all $x \neq 0$ and
- $0$ is the unique global argmin
are equivalent. Let $f$ be additionally lower semi-continuous and for all
$x \in \mathbb{R}^n$, $f_x$ is continuous on a neighborhood of $0$. Then the following
are equivalent:
(i) $0$ is the unique global argmin.
(ii) For any $x \in \mathbb{R}^n \setminus \{0\}$, the function $f_x$ is strictly increasing.
(iii) The sublevel sets $\mathcal{L}^{\leq}_{f,x}$ for all $x$ are compact.
Proof Since, w.l.o.g., $f$ is given such that $f(0) = 0$, its unique global argmin
is $0$ if and only if $f(x) > 0$ for all $x \neq 0$. We first prove that
(i) $\Rightarrow$ (ii): Let $x \in \mathbb{R}^n \setminus \{0\}$. Assume (for the sake of
contradiction) that there exist $0 < t_1 < t_2$ such that $f_x(t_1) = f_x(t_2)$.
Then by scaling-invariance, $f_x(1) = f_x\left(\frac{t_1}{t_2}\right)$. It follows by
multiplying iteratively by $\frac{t_1}{t_2}$ that for all $k \in \mathbb{Z}_+$,
$f_x(1) = f_x\left(\left(\frac{t_1}{t_2}\right)^k\right)$. Therefore if we take the limit
when $k \to \infty$, it follows thanks to the continuity of $f_x$ at $0$ that
$f(x) = f_x(1) = f(0)$, which contradicts the assumption (i). Hence, $f_x$ is an
injective function. In addition, there exists $\epsilon > 0$ such that $f_x$ is
continuous on $[0, \epsilon]$. Therefore $f_x$ is an injective continuous function
on $[0, \epsilon]$, which implies that $f_x$ is a strictly monotonic function on
$[0, \epsilon]$. Lemma 2.1 implies therefore that $f_x$ is strictly monotonic. And
since $0$ is an argmin of $f_x$, $f_x$ is strictly increasing.
(ii) $\Rightarrow$ (iii): $f$ is lower semi-continuous on the compact $\mathcal{S}_1$, so
it reaches its minimum on that compact: there exists $s \in \mathcal{S}_1$ such that
$f(s) = \min_{z \in \mathcal{S}_1} f(z)$. Also, since sublevel sets of lower
semi-continuous functions are closed, $\mathcal{L}^{\leq}_{f,s}$ is closed. Now let us
show that it is also bounded.
If $y \in \mathcal{L}^{\leq}_{f,s} \setminus \{0\}$, then
$f(y) \leq f(s) \leq f\left(\frac{y}{\|y\|}\right)$. And since $f_y$ is strictly
increasing, we obtain that $1 \leq \frac{1}{\|y\|}$, thereby $\|y\| \leq 1$. We have
shown that $\mathcal{L}^{\leq}_{f,s} \subset \mathcal{B}_1$.
Then $\mathcal{L}^{\leq}_{f,s}$ is a compact set, as it is a closed and bounded subset of
$\mathbb{R}^n$. By (6), it follows that $\mathcal{L}^{\leq}_{f,ts}$ is compact for all $t > 0$.
For all $x \in \mathbb{R}^n \setminus \{0\}$, $f(x) > f(0)$ thanks to (ii). Then
$\mathcal{L}^{\leq}_{f,0} = \{0\}$ and is compact.
Let $x \in \mathbb{R}^n \setminus \{0\}$. Then there exists $\epsilon > 0$ such that
$f_{x/\|x\|}$ is continuous on $[0, \epsilon]$. We have that
$f_{x/\|x\|}(0) = 0 < f(s) \leq f_{x/\|x\|}(1)$, then by scaling-invariance,
$f_{x/\|x\|}(0) < f(\epsilon s) \leq f_{x/\|x\|}(\epsilon)$. Therefore by the
intermediate value theorem applied to $f_{x/\|x\|}$ continuous on $[0, \epsilon]$,
there exists $t \in (0, \epsilon]$ such that $f_{x/\|x\|}(t) = f(\epsilon s)$. Then
$\mathcal{L}^{\leq}_{f,\, t x/\|x\|} = \mathcal{L}^{\leq}_{f,\epsilon s}$ and is compact. We
apply again (6) to observe that $\mathcal{L}^{\leq}_{f,x}$ is compact.
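(iii) $\Rightarrow$ (i): Let $x \in \mathcal{L}^{\leq}_{f,0}$, then
$tx \in \mathcal{L}^{\leq}_{f,0}$ for all $t \geq 0$, and then
$\{tx\}_{t \in \mathbb{R}_+} \subset \mathcal{L}^{\leq}_{f,0}$ which is a compact set. This is
only possible if $x = 0$, otherwise the set $\{tx\}_{t \in \mathbb{R}_+}$ would not be
bounded. Hence $\mathcal{L}^{\leq}_{f,0} = \{0\}$ which implies that $0$ is the unique
global argmin. □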
We derive from Theorem 4.2 the next corollary, stating that for a continuously
differentiable SI function with a unique global argmin, the intersection of any
half-line of origin 0 and a level set is a singleton.
Corollary 4.1 Let $f$ be a continuously differentiable SI function with $0$ as
unique global argmin. Then for all $x \in \mathbb{R}^n$, any half-line of origin $0$
intersects $\mathcal{L}_{f,x}$ at a unique point.
Proof For all non-zero $x$, Theorem 4.2 ensures that $f_x$ is strictly increasing.
Therefore for a non-zero $x$, $f_x$ is injective, and the intersection of a level
set and a half-line of origin $0$ contains at most one point. In addition the
$f_x$ share the same image for all non-zero $x$. Then for two non-zero vectors
$x, y$, there exists $t \geq 0$ such that $f_y(t) = f_x(1)$. In other words, there
exists $t \geq 0$ such that $ty \in \mathcal{L}_{f,x}$. We end this proof by noticing that
$\mathcal{L}_{f,0} = \{0\}$ and then intersects any half-line of origin $0$ only at $0$. □
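The statement of Corollary 4.1 suggests a simple way to locate the intersection point numerically: since $t \mapsto f(ty)$ is continuous and strictly increasing, a bisection on $t$ finds the unique crossing. The sketch below is only an illustration; the function `f` and the helper `crossing` are hypothetical choices, not taken from the paper.

```python
import math

def f(x):
    # Hypothetical continuously differentiable SI function: phi(p(x)) with
    # p(x) = sqrt(x1^2 + 4*x2^2) (PH of degree 1) and phi(t) = t^3.
    return math.sqrt(x[0] ** 2 + 4 * x[1] ** 2) ** 3

def crossing(x, y, tol=1e-13):
    # Find the unique t >= 0 with f(t*y) = f(x), using that t -> f(t*y)
    # is continuous and strictly increasing for y != 0.
    target = f(x)
    lo, hi = 0.0, 1.0
    while f([hi * c for c in y]) < target:
        hi *= 2.0
    while hi - lo > tol * max(hi, 1.0):
        mid = 0.5 * (lo + hi)
        if f([mid * c for c in y]) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x = [1.0, 1.0]
y = [2.0, 1.0]
t = crossing(x, y)
print(t, f([t * c for c in y]), f(x))
```

For this particular `f`, the crossing can also be computed in closed form as the ratio of the two values of `p`, which gives an independent check of the bisection.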
4.3 Sufficient Condition for Lebesgue Negligible Level Sets
We assume that $f$ is a lower semi-continuous SI function admitting a unique
global argmin and that all $f_x$ are continuous, and prove that $f$ has Lebesgue
negligible level sets.
Proposition 4.2 Let $f$ be an SI function with $0$ as unique global argmin.
Assume also that $f$ is lower semi-continuous and for all $x \in \mathbb{R}^n$, $f_x$ is
continuous. Then the level sets of $f$ are Lebesgue negligible.
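Proof Let $x \in \mathbb{R}^n$. Let us denote by $\mu$ the Lebesgue measure. For all
$t > 0$, $\mu(\mathcal{L}_{f,tx}) = \mu(t\,\mathcal{L}_{f,x}) = t^n \mu(\mathcal{L}_{f,x})$, thanks
to (3). Therefore, if $t \geq 1$, $\mu(\mathcal{L}_{f,tx}) \geq \mu(\mathcal{L}_{f,x})$. In
addition, for all $k \geq 1$,
$\mathcal{L}_{f,(1+\frac{1}{k})x} \subset \{y \in \mathbb{R}^n, f(x) \leq f(y) \leq f(2x)\} \subset \mathcal{L}^{\leq}_{f,2x}$,
because if $x \neq 0$, $f_x$ is strictly increasing thanks to Theorem 4.2. For the
same reason, these level sets are pairwise disjoint for different values of $k$.
The same theorem implies that $\mathcal{L}^{\leq}_{f,2x}$ is compact and hence
$\mu(\mathcal{L}^{\leq}_{f,2x}) < \infty$.
It follows that
$$\sum_{k=1}^{\infty} \mu(\mathcal{L}_{f,x}) \leq \sum_{k=1}^{\infty} \mu\left(\mathcal{L}_{f,(1+\frac{1}{k})x}\right) \leq \mu\left(\mathcal{L}^{\leq}_{f,2x}\right) < \infty.$$
Hence, $\mu(\mathcal{L}_{f,x}) = 0$. □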
4.4 Balls containing and balls contained in sublevel sets
The sublevel sets of continuous PH functions include and are embedded in balls
whose construction is scaling-invariant. Given that continuous SI functions
are monotonic transformations of PH functions, those properties are naturally
transferred to SI functions. This is what we formalize in this section.
From the definition of a PH function with degree $\alpha$, we have
$p(x) = \|x\|^{\alpha} p(x/\|x\|)$ for all $x \neq 0$. Therefore, $p$ is continuous on
$\mathbb{R}^n \setminus \{0\}$ if and only if $p$ is continuous on $\mathcal{S}_1$. For such
$p$, we denote $m_p = \min_{x \in \mathcal{S}_1} p(x)$ and
$M_p = \max_{x \in \mathcal{S}_1} p(x)$. We have the following propositions:
Proposition 4.3 ([2, Lemma 2.8]) Let $p$ be a PH function with degree $\alpha$
such that $p(x) > 0$ for all $x \neq 0$. Assume that $p$ is continuous on
$\mathcal{S}_1$, then for all $x \neq 0$, the following holds
$$\|x\|\, m_p^{1/\alpha} \leq p(x)^{1/\alpha} \leq \|x\|\, M_p^{1/\alpha}. \tag{7}$$
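Proposition 4.4 ([2, Lemma 2.9]) Let $p$ be a PH function with degree $\alpha$
such that $p(x) > 0$ for all $x \neq 0$. Assume that $p$ is continuous on
$\mathcal{S}_1$. Then for all $\rho > 0$, the ball centered in $0$ and of radius $\rho$ is
included in the sublevel set of degree $\rho^{\alpha} M_p$, i.e.
$\mathcal{B}(0, \rho) \subset \mathcal{L}^{\leq}_{p,\rho x_{M_p}}$, with $p(x_{M_p}) = M_p$. For all
$x \neq 0$, the sublevel set of degree $p(x)$ is included into the ball centered in
$0$ and of radius $(p(x)/m_p)^{1/\alpha}$, i.e.
$$\mathcal{L}^{\leq}_{p,x} \subset \mathcal{B}\left(0, \left(\frac{p(x)}{m_p}\right)^{1/\alpha}\right).$$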
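The norm bounds $\|x\| m_p^{1/\alpha} \leq p(x)^{1/\alpha} \leq \|x\| M_p^{1/\alpha}$ of Proposition 4.3, which also drive the ball inclusions of Proposition 4.4, can be illustrated on a concrete example. The Python sketch below assumes a hypothetical quadratic form in two variables (degree $\alpha = 2$), for which the extrema on the unit sphere are simply the smallest and largest coefficient; the coefficients `A`, `B` and all names are illustrative.

```python
import math
import random

ALPHA = 2.0
A, B = 0.5, 3.0  # hypothetical positive coefficients

def p(x):
    # PH function of degree 2: p(t*x) = t**2 * p(x) for t > 0.
    return A * x[0] ** 2 + B * x[1] ** 2

def norm(x):
    return math.hypot(x[0], x[1])

# On the unit sphere, p takes values in [min(A, B), max(A, B)].
m_p, M_p = min(A, B), max(A, B)

random.seed(1)
bounds_hold = True
for _ in range(1000):
    x = [random.uniform(-5, 5), random.uniform(-5, 5)]
    r = norm(x)
    root = p(x) ** (1 / ALPHA)
    tol = 1e-12 * (1 + r)  # slack for floating-point rounding
    bounds_hold = bounds_hold and (r * m_p ** (1 / ALPHA) <= root + tol)
    bounds_hold = bounds_hold and (root <= r * M_p ** (1 / ALPHA) + tol)
print(bounds_hold)
```

The lower bound is the one that confines a sublevel set inside a ball, while the upper bound places a ball inside a sublevel set.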
We can generalize both propositions to continuous scaling-invariant functions
using Theorem 3.1.
Proposition 4.5 Let $f$ be a continuous SI function such that $f(x) > 0$ for
all $x \neq 0$. Then there exist an increasing homeomorphism $\phi$ on $\mathbb{R}_+$ and
two positive numbers $0 < m \leq M$ such that
(i) for all $x \neq 0$,
$\phi\left(\phi^{-1}(m)\|x\|\right) \leq f(x) \leq \phi\left(\phi^{-1}(M)\|x\|\right)$,
(ii) for all $\rho > 0$, the ball centered in $0$ and of radius $\rho$ is included in
the sublevel set of degree $\phi(\rho\,\phi^{-1}(M))$, i.e.
$\mathcal{B}(0, \rho) \subset \mathcal{L}^{\leq}_{f,\rho x_M}$ with
$f(\rho x_M) = \phi\left(\rho\,\phi^{-1}(M)\right)$,
(iii) for all $x \neq 0$, the sublevel set of degree $f(x)$ is included into the
ball centered in $0$ and of radius $\frac{\phi^{-1}(f(x))}{\phi^{-1}(m)}$, i.e.
$$\mathcal{L}^{\leq}_{f,x} \subset \mathcal{B}\left(0, \frac{\phi^{-1}(f(x))}{\phi^{-1}(m)}\right). \tag{8}$$
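Proof Thanks to Theorem 3.1, we can write $f = f_{x_0} \circ p$ where $x_0 \neq 0$,
$p$ is PH$_1$ and $\phi$, defined as $f_{x_0}$, is an increasing homeomorphism. Then
$p = \phi^{-1} \circ f$ and hence verifies: for all $x \neq 0$, $p(x) > 0$. Define $m$
and $M$ as $m = \phi(m_p)$ and $M = \phi(M_p)$, where
$m_p = \min_{x \in \mathcal{S}_1} p(x)$ and $M_p = \max_{x \in \mathcal{S}_1} p(x)$.
For $x \neq 0$, Proposition 4.3 ensures that $m_p \|x\| \leq p(x) \leq M_p \|x\|$.
Taking the image of this inequality with respect to $\phi$ proves (i).
For all $\rho > 0$, $\mathcal{B}(0, \rho) \subset \mathcal{L}^{\leq}_{p,\rho x_{M_p}}$, with
$p(x_{M_p}) = M_p$. Since sublevel sets are invariant with respect to an
increasing bijection, it follows that
$\mathcal{L}^{\leq}_{p,\rho x_{M_p}} = \mathcal{L}^{\leq}_{f,\rho x_{M_p}}$. In addition,
$f(\rho x_{M_p}) = \phi(p(\rho x_{M_p})) = \phi(\rho\, p(x_{M_p})) = \phi(\rho\,\phi^{-1}(M))$
such that we have proven (ii).
Let $x \neq 0$. Again by invariance of the sublevel set,
$\mathcal{L}^{\leq}_{p,x} = \mathcal{L}^{\leq}_{f,x}$. And Proposition 4.4 says that
$\mathcal{L}^{\leq}_{p,x} \subset \mathcal{B}\left(0, \frac{p(x)}{m_p}\right)$. We obtain the
result with the facts that $p = \phi^{-1} \circ f$ and $m_p = \phi^{-1}(m)$. □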
4.5 A generalization of a weak formulation of Euler’s homogeneous function
theorem
For a function $p : \mathbb{R}^n \to \mathbb{R}$ continuously differentiable on
$\mathbb{R}^n \setminus \{0\}$, Euler's homogeneous function theorem states that $p$ is
PH with degree $\alpha$ if and only if, for all $x \neq 0$,
$$\alpha\, p(x) = \nabla p(x) \cdot x. \tag{9}$$
If in addition $p$ is continuously differentiable on $\mathbb{R}^n$, then
$\alpha\, p(0) = 0 = \nabla p(0) \cdot 0$. Along with (9), this latter equation implies
that at each point $y$ of a level set $\mathcal{L}_{p,x}$, the scalar product between
$\nabla p(y)$ and $y$ is constant equal to $\nabla p(x) \cdot x$, or that the level sets
of $p$ and of the function $x \mapsto \nabla p(x) \cdot x$ are the same, that is, the
level sets of a continuously differentiable PH function satisfy
$$\mathcal{L}_{p,x} = \mathcal{L}_{z \mapsto \nabla p(z) \cdot z,\, x} = \{y \in \mathbb{R}^n, \nabla p(y) \cdot y = \nabla p(x) \cdot x\}. \tag{10}$$
We call this a weak formulation of Euler's homogeneous function theorem.
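Euler's identity (9) lends itself to a quick numerical sanity check with finite differences. The sketch below is only an illustration under stated assumptions: the PH function `p` of degree 3/2 and the helpers `grad` and `euler_gap` are hypothetical and do not appear in the paper.

```python
import math

ALPHA = 1.5

def p(x):
    # Hypothetical PH function of degree 3/2: sqrt(|x1|^3 + |x2|^3),
    # so that p(t*x) = t**1.5 * p(x) for t > 0.
    return math.sqrt(abs(x[0]) ** 3 + abs(x[1]) ** 3)

def grad(fun, x, h=1e-6):
    # Central finite-difference approximation of the gradient of fun at x.
    g = []
    for i in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[i] += h
        xm[i] -= h
        g.append((fun(xp) - fun(xm)) / (2 * h))
    return g

def euler_gap(x):
    # |alpha * p(x) - grad p(x) . x|, which is zero in exact arithmetic.
    g = grad(p, x)
    return abs(ALPHA * p(x) - sum(gi * xi for gi, xi in zip(g, x)))

points = [[1.0, 2.0], [-0.5, 0.3], [3.0, -1.5]]
gaps = [euler_gap(x) for x in points]
print(gaps)
```

The printed gaps are not exactly zero because of the finite-difference and rounding errors, but they should be small compared with the values of `p`.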
If $f$ is a continuous SI function, we can write $f$ as $\phi \circ p$ where $p$ is PH
and $\phi$ is a homeomorphism, according to Theorem 3.1. We have the following
proposition in the case where $\phi$ and $p$ are also continuously differentiable.
Proposition 4.6 Let $f$ be a continuously differentiable SI function that can
be written as $\phi \circ p$ where $p$ is PH$_\alpha$, $\phi$ is a homeomorphism, and
$\phi$ and $\phi^{-1}$ are continuously differentiable (and thus $p$ is continuously
differentiable). Then for all $x \in \mathbb{R}^n$,
$$\nabla f(x) \cdot x = \alpha\, \phi'(p(x))\, p(x). \tag{11}$$
Proof Since $p = \phi^{-1} \circ f$, it is continuously differentiable. From the
chain rule, for all $x \in \mathbb{R}^n$:
$\nabla f(x) \cdot x = \phi'(p(x)) \nabla p(x) \cdot x = \alpha\, \phi'(p(x))\, p(x)$. The last
equality results from Euler's homogeneous function theorem applied to $p$. □
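Identity (11) can be illustrated numerically by using the fact that $\nabla f(x) \cdot x$ is the derivative of $t \mapsto f(tx)$ at $t = 1$. The Python sketch below relies on hypothetical choices (squared Euclidean norm for `p`, the map $t \mapsto t^3 + t$ for `phi`); these are assumptions made for the illustration only.

```python
ALPHA = 2.0

def p(x):
    # PH function of degree 2 (squared Euclidean norm).
    return sum(c * c for c in x)

def phi(t):
    # Hypothetical increasing homeomorphism with a continuously
    # differentiable inverse (its derivative never vanishes).
    return t ** 3 + t

def dphi(t):
    return 3 * t ** 2 + 1

def f(x):
    return phi(p(x))

def radial_derivative(fun, x, h=1e-6):
    # grad fun(x) . x equals the derivative of t -> fun(t*x) at t = 1,
    # approximated here by a central finite difference in t.
    up = fun([(1 + h) * c for c in x])
    down = fun([(1 - h) * c for c in x])
    return (up - down) / (2 * h)

x = [0.7, -1.2, 0.4]
lhs = radial_derivative(f, x)
rhs = ALPHA * dphi(p(x)) * p(x)
print(lhs, rhs)
```

Using the radial derivative avoids computing the full gradient, since only its scalar product with $x$ enters (11).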
Yet, the assumptions of the previous proposition are not necessarily satisfied
when $f$ is a continuously differentiable SI function. Indeed, we exhibit in the
next proposition an example of an SI and continuously differentiable function
$f$ such that $f = \phi \circ p$ but either $p$ or $\phi$ is non-differentiable.
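Proposition 4.7 Define $\phi : t \mapsto \int_0^t \frac{1}{1 + \log^2(u)}\, du$ on
$\mathbb{R}_+$ and $p : x \mapsto |x_1|$. Then $f = \phi \circ p$ is continuously
differentiable and SI. Yet, for any $\tilde{\phi}$ strictly increasing and
$\tilde{p}$ PH such that $f = \tilde{\phi} \circ \tilde{p}$ (including $\phi$ and $p$
above), either $\tilde{p}$ is not differentiable on any point of the set
$\{x ; x_1 = 0\}$ or $\tilde{\phi}$ is not differentiable at $0$.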
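Proof Let us prove that $f$ is continuously differentiable. For $x \neq 0$,
$\nabla f(x) = \frac{1}{1 + \log^2(|x_1|)} \frac{x_1}{|x_1|}$. Then
$\lim_{x \to 0} \nabla f(x)$ exists and is equal to $0$, hence $f$ is continuously
differentiable.
Assume that $(\tilde{\phi}, \tilde{p})$ is such that
$\phi \circ p = \tilde{\phi} \circ \tilde{p}$, with $\tilde{\phi}$ strictly increasing and
$\tilde{p}$ PH$_\alpha$. Denote $\psi = \tilde{\phi}^{-1} \circ \phi$. For all $\lambda > 0$
and $x \in \mathbb{R}^n$,
$\psi(\lambda p(x)) = \psi(p(\lambda x)) = \tilde{p}(\lambda x) = \lambda^{\alpha} \psi(p(x))$.
Therefore $\psi$ is PH$_\alpha$ on $\mathrm{Im}(p) = \mathbb{R}_+$, hence for all $t > 0$,
$\psi(t) = t^{\alpha} \psi(1)$. Then up to a positive constant multiplicative factor,
$\tilde{p}(x) = |x_1|^{\alpha}$ and $\tilde{\phi}(t) = \phi(t^{1/\alpha})$. And then if
$\tilde{p}$ is differentiable, we necessarily have that $\alpha > 1$.
In the case where $\alpha > 1$, for all $t > 0$,
$\tilde{\phi}'(t) = \frac{1}{\alpha} \frac{t^{\frac{1}{\alpha} - 1}}{1 + \log^2(t^{1/\alpha})}$
and then $\tilde{\phi}$ is not differentiable at $0$. □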
Yet we can prove that for all continuously differentiable SI functions, the
level set of $f$ going through $x$, i.e. $\mathcal{L}_{f,x}$, is included in the level
set of $z \mapsto \nabla f(z) \cdot z$ going through $x$.
Lemma 4.1 For a continuously differentiable SI function $f$ and for
$x \in \mathbb{R}^n$,
$$\mathcal{L}_{f,x} \subset \mathcal{L}_{z \mapsto \nabla f(z) \cdot z,\, x} = \{y \in \mathbb{R}^n, \nabla f(y) \cdot y = \nabla f(x) \cdot x\}. \tag{12}$$
That is, each level set of $f$ has a single value of $\nabla f(x) \cdot x$, while
different level sets of $f$ can have the same value of $\nabla f(x) \cdot x$.
Proof Let $y \in \mathcal{L}_{f,x}$. Since $f(y) = f(x)$, then for all $t \geq 0$,
$f(ty) = f(tx)$. We define the function $h$ on $\mathbb{R}_+$ such that for all
$t \geq 0$, $h(t) = f(tx) - f(ty)$. Then $h$ is the zero function, and so is its
derivative: $h'(t) = \nabla f(tx) \cdot x - \nabla f(ty) \cdot y = 0$ for all $t \geq 0$. In
particular we have the result for $t = 1$. □
We exhibit in the next proposition a continuously differentiable SI function
where the inclusion in the above lemma is strict (another example is
Lemma 4.2).
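Proposition 4.8 Let $p$ be the PH$_2$ function $x \in \mathbb{R}^n \mapsto \|x\|^2$ and
$\phi$ the strictly monotonic function $\phi(t) = \exp(-t)$ for all $t \geq 0$. Then
$f : x \mapsto \phi(p(x)) = \exp(-\|x\|^2)$ is continuously differentiable. For any
$0 < r < 1$, there is a unique $s > 1$ such that for any $x \in \mathcal{S}_r$:
$$\mathcal{L}_{z \mapsto \nabla f(z) \cdot z,\, x} = \mathcal{L}_{f,x} \cup \mathcal{L}_{f, \frac{s}{r} x} \tag{13}$$
where $\mathcal{L}_{f,x}$ and $\mathcal{L}_{f, \frac{s}{r} x}$ are disjoint.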
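Proof Remark that the transformation could be chosen to obtain a degree
equal to 1 as in Theorem 3.1, but the differentiability of $p$ would not be
guaranteed.
We notice that $t \mapsto t\phi'(t)$ is not injective on $\mathbb{R}_+$. It is injective
on $[0, 1)$ and on $[1, \infty)$ and for any $0 < r < 1$ there is a unique $s > 1$
such that
$$r^2 \phi'(r^2) = s^2 \phi'(s^2). \tag{14}$$
We will prove that for any such $r, s$ and any $x \in \mathcal{S}_r$,
$$\{y \in \mathbb{R}^n, \nabla f(y) \cdot y = \nabla f(x) \cdot x\} = \mathcal{L}_{f,x} \cup \mathcal{L}_{f, \frac{s}{r} x}. \tag{15}$$
Let $x \in \mathbb{R}^n$ such that $\|x\| = r$. By the chain rule, for all
$y \in \mathbb{R}^n$ we have
$\nabla f(y) \cdot y = \phi'(p(y)) \nabla p(y) \cdot y = 2 \phi'(p(y))\, p(y)$.
Therefore $y \in \{y \in \mathbb{R}^n, \nabla f(y) \cdot y = \nabla f(x) \cdot x\}$ if and only if
$\|y\|^2 \phi'(\|y\|^2) = r^2 \phi'(r^2)$. From (14), we know that this is possible
only if $\|y\| = r$ or $\|y\| = s$, i.e. only if $f(y) = f(x)$ or
$f(y) = f\left(\frac{s}{r} x\right)$. Hence the equality in (15).
We remark that $\mathcal{L}_{f,x}$ and $\mathcal{L}_{f, \frac{s}{r} x}$ are disjoint whenever
$f(x) \neq f\left(\frac{s}{r} x\right)$. If $x \in \mathcal{S}_r$,
$f(x) = e^{-r^2} \neq e^{-s^2} = f\left(\frac{s}{r} x\right)$, which implies that
$\mathcal{L}_{f,x}$ and $\mathcal{L}_{f, \frac{s}{r} x}$ are disjoint. □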
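The radius $s$ paired with a given $r$ in this example has no elementary closed form, but it is easy to approximate. The following Python sketch, whose helper names `g` and `partner_radius` are hypothetical, solves $s^2 \phi'(s^2) = r^2 \phi'(r^2)$ for $\phi(t) = \exp(-t)$ by bisection on $v = s^2$, using that $v \mapsto v e^{-v}$ is strictly decreasing on $[1, \infty)$.

```python
import math

def g(v):
    # g(v) = -v * phi'(v) = v * exp(-v) for phi(t) = exp(-t).
    return v * math.exp(-v)

def partner_radius(r, tol=1e-14):
    # For 0 < r < 1, find the s > 1 with s^2 * phi'(s^2) = r^2 * phi'(r^2)
    # by bisection on v = s^2 in (1, hi), where g is strictly decreasing.
    target = g(r * r)
    lo, hi = 1.0, 2.0
    while g(hi) > target:
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if g(mid) > target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(0.5 * (lo + hi))

r = 0.5
s = partner_radius(r)
# The two spheres share the value of grad f(z) . z but not the value of f.
print(s, g(r * r), g(s * s), math.exp(-r * r), math.exp(-s * s))
```

The last two printed numbers are the values of $f$ on the spheres of radius $r$ and $s$; they differ, which is why the two level sets in the union are disjoint.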
The non-injectivity of $t \mapsto t\phi'(t)$ is essential in the above example to
obtain a strict inclusion in (12) for some SI functions. We obtain a weak
formulation of Euler's homogeneous function theorem for some SI functions in
the following proposition.
Proposition 4.9 Let $f$ be a continuously differentiable SI function that can
be written as $\phi \circ p$ where $\phi$ is a homeomorphism, $p$ is PH$_1$ and $\phi$
and $\phi^{-1}$ are continuously differentiable. Assume that the function
$t \in \mathbb{R}_+ \mapsto t\phi'(t) \in \mathbb{R}$ is injective. Then for $x \in \mathbb{R}^n$,
$$\mathcal{L}_{f,x} = \mathcal{L}_{z \mapsto \nabla f(z) \cdot z,\, x}. \tag{16}$$
Proof It follows from Proposition 4.6 that for all $x \in \mathbb{R}^n$,
$\nabla f(x) \cdot x = \phi'(p(x))\, p(x)$. Thanks to the bijectivity of $\phi$ along with
the injectivity of $t \mapsto t\phi'(t)$, we have:
$\phi'(p(y))\, p(y) = \phi'(p(x))\, p(x) \iff p(x) = p(y) \iff f(x) = f(y)$.
In other words,
$\mathcal{L}_{f,x} = \{y \in \mathbb{R}^n, \nabla f(y) \cdot y = \nabla f(x) \cdot x\}$. □
4.6 Compact neighborhoods of level sets with non-vanishing gradient
We prove in this section that any continuously differentiable SI function $f$
with a unique global argmin has level sets, for example $\mathcal{L}_{f,z_0}$, such that
for some compact neighborhood of the level set, $\mathcal{N} \supset \mathcal{L}_{f,z_0}$, the
gradient does not vanish and $\nabla f(z) \cdot z > 0$ for all $z \in \mathcal{N}$.
For a continuously differentiable PH function $p$ such that $p(x) > 0$ for
all $x \neq 0$, i.e. such that $0$ is the unique global argmin of $p$, this result is
a consequence of Euler's homogeneous function theorem which implies that
$$\nabla p(x) \cdot x > 0 \text{ for all } x \neq 0. \tag{17}$$
In particular, (17) is true on any compact neighborhood of any level set of $p$,
if that compact does not contain $0$.
We now remark that the property that $\nabla f(x) \neq 0$ for all $x \neq 0$ is not
necessarily true if $f$ is a continuously differentiable SI function with a
unique global argmin. Namely, $f$ can have level sets that contain only saddle
points.
Lemma 4.2 Let $p(z) = \|z\|^2$ and $\phi(t) = \int_0^t \sin^2(u)\, du$ for $t \geq 0$.
Then $f = \phi \circ p$ is a continuously differentiable SI function with a unique
global argmin and an infinite number of $z$ belonging to different level sets of
$f$, such that $\nabla f(z) = 0$.
Proof The function $\phi$ is strictly increasing since $\sin^2$ is non-negative and
has zeros on isolated points. Also, for all $t \geq 0$,
$\phi(t) = \frac{t}{2} - \frac{\sin(2t)}{4}$, where we use that
$\cos(2t) = 1 - 2\sin^2(t)$.
For any natural integer $n$, $n\pi$ is a stationary point of inflection of $\phi$:
$\phi'(n\pi) = 0$ and $\phi''(t) = \sin(2t)$ has opposite signs in the neighborhood
of $n\pi$.
For all $z$ with $\|z\|^2 \in \pi\mathbb{Z}_+$,
$\nabla f(z) = \phi'(p(z)) \nabla p(z) = 2\phi'(\|z\|^2)\, z = 0$.
Hence there exists an infinite number of level sets $\mathcal{L}_{f,z}$ for which
$\nabla f(z) = 0$. □
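The vanishing gradients in this example can be observed numerically. The sketch below evaluates the gradient of $f$ on spheres of squared radius $n\pi$ in the plane; the helper `point_with_squared_norm` and the choice of dimension 2 are hypothetical and only serve the illustration.

```python
import math

def phi(t):
    # Closed form of the integral of sin^2 over [0, t].
    return t / 2 - math.sin(2 * t) / 4

def dphi(t):
    return math.sin(t) ** 2

def grad_f(z):
    # f = phi(||z||^2), so grad f(z) = 2 * phi'(||z||^2) * z.
    q = sum(c * c for c in z)
    return [2 * dphi(q) * c for c in z]

def point_with_squared_norm(q):
    # Hypothetical helper: a point of R^2 whose squared norm is q.
    r = math.sqrt(q)
    return [r * math.cos(1.0), r * math.sin(1.0)]

grad_sizes = []
levels = []
for n in range(1, 5):
    z = point_with_squared_norm(n * math.pi)
    grad_sizes.append(max(abs(c) for c in grad_f(z)))
    levels.append(phi(n * math.pi))
print(grad_sizes)
print(levels)
```

The gradient entries are zero up to rounding, while the values of $f$ on these spheres are pairwise different, so the stationary points indeed lie on different level sets.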
Yet, a consequence of Theorem 4.2 and Lemma 4.1 is the existence of a
level set of $f$ such that $\nabla f(z) \cdot z > 0$ for all $z$ in that level set, as
shown in the next proposition.
Proposition 4.10 Let $f$ be a continuously differentiable SI function with $0$
as unique global argmin. There exists $z_0 \in \mathcal{B}_1$ with
$\mathcal{L}_{f,z_0} \subset \mathcal{B}_1$, such that for all $z \in \mathcal{L}_{f,z_0}$,
$\nabla f(z) \cdot z > 0$.
Proof Since $f$ is a continuous SI function, we have all the equivalences in
Theorem 4.2.
Inside the proof of Theorem 4.2, we have shown that there exists
$s \in \mathcal{S}_1$ such that $\mathcal{L}^{\leq}_{f,s} \subset \mathcal{B}_1$, with
$f(s) = \min_{z \in \mathcal{S}_1} f(z)$. Since $f_s$ is strictly increasing and
differentiable, there exists $t \in (0, 1]$ such that $f_s'(t) > 0$. Let us denote
$z_0 = ts$. We have that $\mathcal{L}_{f,z_0} \subset \mathcal{L}^{\leq}_{f,s} \subset \mathcal{B}_1$.
And with the chain rule, $0 < f_s'(t) = \frac{\nabla f(z_0) \cdot z_0}{t}$. Therefore
along with Lemma 4.1, it follows that for all $z \in \mathcal{L}_{f,z_0}$,
$\nabla f(z) \cdot z = \nabla f(z_0) \cdot z_0 > 0$. □
From the uniform continuity of $z \mapsto \nabla f(z) \cdot z$ on a compact we deduce
the announced result.
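Proposition 4.11 Let $f$ be a continuously differentiable SI function with $0$
as unique global argmin. There exist $\delta > 0$ and $z_0 \in \mathcal{B}_1$ with
$\mathcal{L}_{f,z_0} \subset \mathcal{B}_1$ such that for all
$z \in \mathcal{L}_{f,z_0} + \mathcal{B}(0, \delta)$, $\nabla f(z) \cdot z > 0$.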
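Proof Since $\nabla f(z) \cdot z > 0$ for all $z$ in the compact $\mathcal{L}_{f,z_0}$, then
$z \mapsto \nabla f(z) \cdot z$ has a positive minimum (that is reached) denoted by
$\epsilon = \min_{z \in \mathcal{L}_{f,z_0}} \nabla f(z) \cdot z > 0$. The continuous function
$z \mapsto \nabla f(z) \cdot z$ is uniformly continuous on the compact
$\mathcal{L}_{f,z_0} + \mathcal{B}(0, 1)$, therefore there exists a positive number
$\delta < 1$ such that if $y, z \in \mathcal{L}_{f,z_0} + \mathcal{B}(0, 1)$ with
$\|y - z\| \leq \delta$ then $|\nabla f(z) \cdot z - \nabla f(y) \cdot y| < \frac{\epsilon}{2}$.
Then for all $z \in \mathcal{L}_{f,z_0} + \mathcal{B}(0, \delta)$, there exists
$y \in \mathcal{L}_{f,z_0}$ such that
$|\nabla f(z) \cdot z - \nabla f(y) \cdot y| < \frac{\epsilon}{2}$. Then
$\nabla f(z) \cdot z > \nabla f(y) \cdot y - \frac{\epsilon}{2} \geq \frac{\epsilon}{2} > 0$. Hence
$z \mapsto \nabla f(z) \cdot z$ is positive on the compact set
$\mathcal{L}_{f,z_0} + \mathcal{B}(0, \delta)$. □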
Acknowledgements
Part of this research has been conducted in the context of a research
collaboration between Storengy and Inria. We particularly thank F. Huguet and
A. Lange from Storengy for their strong support.
References
1. Joseph Aczél. Lectures on functional equations and their applications. Academic Press,
1966.
2. Anne Auger and Nikolaus Hansen. Linear convergence on positively homogeneous
functions of a comparison based step-size adaptive randomized search: the (1+1)-ES with
generalized one-fifth success rule. arXiv preprint arXiv:1310.8397, 2013.
3. Anne Auger and Nikolaus Hansen. Linear convergence of comparison-based step-size
adaptive randomized search via stability of Markov chains. SIAM Journal on
Optimization, 26(3):1589–1624, 2016.
4. Gerard Buskes and Arnoud van Rooij. Topological spaces. In Topological Spaces, pages
187–201. Springer, 1997.
5. Arnaud Denjoy. Sur les fonctions dérivées sommables. Bulletin de la Société
Mathématique de France, 43:161–248, 1915.
6. J Dutta, JE Martinez-Legaz, and AM Rubinov. Monotonic analysis over cones: I.
Optimization, 53(2):129–146, 2004.
7. Hervé Fournier and Olivier Teytaud. Lower bounds for comparison based evolution
strategies using VC-dimension and sign patterns. Algorithmica, 59(3):387–408, 2011.
8. Valentin V Gorokhovik and Marina Trafimovich. Positively homogeneous functions
revisited. Journal of Optimization Theory and Applications, 171(2):481–503, 2016.
9. Valentin V Gorokhovik and Marina Trafimovich. Saddle representations of positively
homogeneous functions by linear functions. Optimization Letters, 12(8):1971–1980, 2018.
10. Godfrey Harold Hardy. Weierstrass's non-differentiable function. Trans. Amer. Math.
Soc., 17(3):301–325, 1916.
11. Marek Kuczma. An introduction to the theory of functional equations and inequalities:
Cauchy's equation and Jensen's inequality. Springer Science & Business Media, 2009.
12. JB Lasserre and JB Hiriart-Urruty. Mathematical properties of optimization problems
defined by positively homogeneous functions. Journal of Optimization Theory and
Applications, 112(1):31–52, 2002.
13. Daiki Morinaga and Youhei Akimoto. Generalized drift analysis in continuous domain:
linear convergence of (1+1)-ES on strongly convex functions with Lipschitz continuous
gradients. In Proceedings of the 15th ACM/SIGEVO Conference on Foundations of
Genetic Algorithms, pages 13–24, 2019.
14. Marian Muresan. A concrete approach to classical analysis, volume 14. Springer, 2009.
15. John A Nelder and Roger Mead. A simplex method for function minimization. The
Computer Journal, 7(4):308–313, 1965.
16. AM Rubinov and RN Gasimov. Strictly increasing positively homogeneous functions
with application to exact penalization. Optimization, 52(1):1–28, 2003.
17. AM Rubinov and BM Glover. Duality for increasing positively homogeneous functions
and normal sets. RAIRO-Operations Research-Recherche Opérationnelle, 32(2):105–
123, 1998.
A Bijection Theorem
This standard theorem is recalled for the sake of completeness.
Theorem A.1 (Bijection theorem, [14, Theorem 2.20]) Let $I \subset \mathbb{R}$ be a
non-empty interval, $J \subset \mathbb{R}$ and $\phi : I \to J$ be a continuous bijection (and
therefore strictly monotonic). Then $J$ is an interval and $\phi$ is a
homeomorphism, i.e. $\phi^{-1} : J \to I$ is also a continuous bijection, and if $\phi$
is strictly increasing (respectively strictly decreasing), then $\phi^{-1}$ is
strictly increasing (respectively strictly decreasing).