All content in this area was uploaded by Cheikh Toure on Jan 10, 2021

JOTA manuscript No.

(will be inserted by the editor)

Scaling-invariant functions versus positively homogeneous functions

Cheikh Toure · Armand Gissler · Anne Auger · Nikolaus Hansen

Received: date / Accepted: date

Abstract Scaling-invariant functions preserve the order of points when the points are scaled by the same positive scalar (with respect to a unique reference point). Composites of strictly monotonic functions with positively homogeneous functions are scaling-invariant with respect to zero. We prove in this paper that the reverse is true for large classes of scaling-invariant functions. Specifically, we give necessary and sufficient conditions for scaling-invariant functions to be composites of a strictly monotonic function with a positively homogeneous function. We also study sublevel sets of scaling-invariant functions generalizing well-known properties of positively homogeneous functions.

Keywords scaling-invariant function · positively homogeneous function · compact level set

Mathematics Subject Classification (2000) 49J52 · 54C35

Contents

1 Introduction
2 Preliminaries
3 Scaling invariant functions as composite of strictly monotonic functions with positively homogeneous functions
   3.1 Continuous SI functions
   3.2 Sufficient and necessary condition to be the composite of a PH function
4 Level sets of SI functions
   4.1 Identical sublevel sets
   4.2 Compactness of the sublevel sets
   4.3 Sufficient Condition for Lebesgue Negligible Level Sets
   4.4 Balls containing and balls contained in sublevel sets
   4.5 A generalization of a weak formulation of Euler's homogeneous function theorem

Inria and CMAP, Ecole Polytechnique, IP Paris, France

firstname.lastname@inria.fr

cheikh.toure@polytechnique.edu

   4.6 Compact neighborhoods of level sets with non-vanishing gradient
A Bijection Theorem

1 Introduction

A function $f : \mathbb{R}^n \to \mathbb{R}$ is scaling-invariant (SI) with respect to a reference point $x^\star \in \mathbb{R}^n$ if for all $x, y \in \mathbb{R}^n$ and $\rho > 0$:

$$f(x^\star + x) \le f(x^\star + y) \iff f(x^\star + \rho x) \le f(x^\star + \rho y), \tag{1}$$

that is, the $f$-order of any two points is invariant under a multiplicative change of their distance to the reference point: the order only depends on their direction and their relative distance to the reference. Scaling-invariant functions appear naturally when studying the convergence of comparison-based optimization algorithms, where the update of the state of the algorithm uses $f$ only through comparisons of candidate solutions [3,7]. A famous example of a comparison-based optimization algorithm is the Nelder-Mead method [15].
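Definition (1) can be probed numerically on a finite sample. The following is a minimal sketch, assuming the reference point is $x^\star = 0$; the helper names `scaled` and `find_violation` and the two test functions are hypothetical and only serve as illustration. Finding no violation on a sample is evidence, not a proof, of scaling-invariance.

```python
import itertools
import math


def scaled(rho, x):
    """Return the point rho * x as a tuple."""
    return tuple(rho * xi for xi in x)


def find_violation(f, points, scales):
    """Search the sample for (x, y, rho) contradicting (1) with x* = 0.

    Returns the first witness found, or None if (1) holds on the sample.
    """
    for x, y in itertools.permutations(points, 2):
        for rho in scales:
            before = f(x) <= f(y)
            after = f(scaled(rho, x)) <= f(scaled(rho, y))
            if before != after:
                return x, y, rho
    return None


def transformed_norm(x):
    """Strictly increasing transformation of the Euclidean norm (SI)."""
    return math.log1p(math.sqrt(sum(xi * xi for xi in x)))


def mixed_degree(x):
    """x1 + x1**2 mixes two degrees of homogeneity and is not SI."""
    return x[0] + x[0] ** 2


points = [(1.0, 0.0), (0.0, 2.0), (-3.0, 1.0), (0.5, 0.5)]
scales = [0.25, 0.5, 2.0, 8.0]
print(find_violation(transformed_norm, points, scales))
print(find_violation(mixed_degree, [(-1.0,), (0.5,)], [4.0]))
```

The scales are powers of two so that scaling is exact in floating-point arithmetic; saturating transformations such as tanh are avoided here because rounding can create artificial ties.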

A function $p : \mathbb{R}^n \to \mathbb{R}$ is positively homogeneous (PH) with degree $\alpha > 0$ (PH$_\alpha$) if for all $x \in \mathbb{R}^n$ and $\rho > 0$:

$$p(\rho x) = \rho^\alpha p(x). \tag{2}$$

Positively homogeneous functions are scaling-invariant with respect to $x^\star = 0$. We also consider that $x \mapsto p(x - x^\star)$ is positively homogeneous w.r.t. $x^\star$ when $p$ is positively homogeneous. Linear functions, norms, and convex quadratic functions are positively homogeneous. We can define PH functions piecewise on cones or half-lines, because a function is PH$_\alpha$ if and only if (2) is satisfied within each cone or half-line (which is not the case with SI functions, where $x$ and $y$ in (1) can belong to different cones). For example, the function $p : \mathbb{R}^n \to \mathbb{R}$ defined as $p(x) = x_1$ if $x_1 x_2 > 0$ and $p(x) = 0$ otherwise is PH$_1$. Positively homogeneous functions, and in particular increasing positively homogeneous functions, are well studied in the context of Monotonic Analysis [6, 16, 17] or nonsmooth analysis and nonsmooth optimization [8]. Specifically, non-linear programming problems where the objective function and constraints are positively homogeneous are analyzed in [12], whereas saddle representations of continuous positively homogeneous functions by linear functions are established in [9]. The (left) composition of a PH function with a strictly monotonic function is SI, while this composite function is in general not PH. One of the questions we investigate in this paper is to what extent SI functions and composites of PH functions with strictly monotonic functions are the same. We prove that a continuous SI function is always the composite of a strictly monotonic function with a PH function. We give necessary and sufficient conditions for an SI function to be the composite of a strictly monotonic function with a PH function in the general case.
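To make (2) concrete, the sketch below checks positive homogeneity on a few sample points for the piecewise PH$_1$ example of the text (with $n = 2$), and shows that a strictly increasing transformation of it fails the PH$_1$ test. The helper `is_ph_on_samples`, the transformation `expm1` and the sample points are assumptions chosen for illustration.

```python
import math


def p(x):
    """PH_1 example from the text: p(x) = x1 if x1 * x2 > 0, else 0."""
    return x[0] if x[0] * x[1] > 0 else 0.0


def scaled(rho, x):
    """Return the point rho * x as a tuple."""
    return tuple(rho * xi for xi in x)


def is_ph_on_samples(g, alpha, points, scales, tol=1e-12):
    """Check g(rho * x) == rho**alpha * g(x) on the given samples only."""
    return all(
        math.isclose(g(scaled(rho, x)), rho ** alpha * g(x),
                     rel_tol=tol, abs_tol=tol)
        for x in points
        for rho in scales
    )


def composite(x):
    """Strictly increasing transformation exp(.) - 1 applied to p."""
    return math.expm1(p(x))


points = [(1.0, 2.0), (-1.0, -3.0), (2.0, -1.0), (0.0, 1.0)]
scales = [0.5, 2.0, 3.0]
print(is_ph_on_samples(p, 1.0, points, scales))          # p satisfies (2)
print(is_ph_on_samples(composite, 1.0, points, scales))  # composite does not
```

The composite keeps the ordering of $p$ (so it remains SI) but no longer scales as $\rho^1$, which is the distinction the paragraph above draws.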

Only level sets or sublevel sets matter to determine the difficulty of an SI problem optimized with a comparison-based algorithm. We investigate different properties of level sets, thereby generalizing properties that are known for PH functions, including a formulation of the Euler homogeneous function theorem that holds for PH functions.

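Notations: We denote by $\mathbb{R}_+$ the interval $[0, +\infty)$, $\mathbb{R}_- = (-\infty, 0]$, $\mathbb{Z}$ the set of all integers, $\mathbb{Z}_+$ the set of all non-negative integers and $\mathbb{Q}$ the set of rational numbers. The Euclidean norm is denoted by $\|\cdot\|$. For $x \in \mathbb{R}^n$ and $\rho > 0$, we denote by $\mathcal{B}(x, \rho) = \{y \in \mathbb{R}^n ; \|x - y\| < \rho\}$ the open ball centered at $x$ and of radius $\rho$, $\overline{\mathcal{B}}(x, \rho)$ its closure and $\mathcal{S}(x, \rho)$ its boundary. When they are centered at $0$, we denote $\mathcal{B}_\rho = \mathcal{B}(0, \rho)$, $\overline{\mathcal{B}}_\rho = \overline{\mathcal{B}}(0, \rho)$ and $\mathcal{S}_\rho = \mathcal{S}(0, \rho)$. For an interval $I \subset \mathbb{R}$ and a function $\varphi : I \to \mathbb{R}$, we use the terminology of strictly increasing (respectively strictly decreasing) if for all $a, b \in I$ with $a < b$, $\varphi(a) < \varphi(b)$ (respectively $\varphi(a) > \varphi(b)$). For a real number $\rho$ and a subset $A \subset \mathbb{R}^n$, we define $\rho A = \{\rho x ; x \in A\}$. For a function $f$, we denote by $\mathrm{Im}(f)$ the image of $f$.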

2 Preliminaries

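Given a function $f : \mathbb{R}^n \to \mathbb{R}$ and $x \in \mathbb{R}^n$, we denote the level set going through $x$ as $\mathcal{L}_{f,x} = \{y \in \mathbb{R}^n, f(y) = f(x)\}$ and the sublevel set as $\mathcal{L}^{\le}_{f,x} = \{y \in \mathbb{R}^n, f(y) \le f(x)\}$.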

If $f$ is SI with respect to $x^\star$, then the function $x \mapsto f(x + x^\star) - f(x^\star)$ is scaling-invariant with respect to $0$. Hence, if a function $f$ is SI, we assume in the following that $f$ is SI with respect to the reference point $0$ and that $f(0) = 0$, without loss of generality.

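It follows immediately from (1) that if $x$ and $y$ belong to the same level set, then $\rho x$ and $\rho y$ belong to the same level set. Hence the level sets of $x$ and $\rho x$ are scaled from one another, i.e. $\mathcal{L}_{f,\rho x} = \rho \mathcal{L}_{f,x}$.

Similarly, since for any $x, y \in \mathbb{R}^n$ and $\rho > 0$, $f(\frac{y}{\rho}) \le f(x)$ if and only if $f(y) \le f(\rho x)$, we have

$$\mathcal{L}^{\le}_{f,\rho x} = \rho \mathcal{L}^{\le}_{f,x}, \quad \text{and} \quad \mathcal{L}_{f,\rho x} = \rho \mathcal{L}_{f,x}. \tag{3}$$

These properties are visualized in Figure 1.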

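Given an SI function $f$, we define surjective restrictions of $f$ to half-lines along a vector $x \in \mathbb{R}^n$ as

$$f_x : t \in [0, \infty) \mapsto f(tx). \tag{4}$$

It is immediate to see that the $f_x$ are also SI (footnote 1). However, $f$ may not be SI even when all $f_x$ are (footnote 2).

Footnote 1: This directly follows because for $s, t \in \mathbb{R}_+$ and $\rho > 0$, $f_x(t) \le f_x(s) \iff f(tx) \le f(sx) \iff f(\rho t x) \le f(\rho s x) \iff f_x(\rho t) \le f_x(\rho s)$.

Footnote 2: For example, define $f : \mathbb{R} \to \mathbb{R}$ as $t \mapsto t$ on $\mathbb{R}_+$ and $t \mapsto t^2$ on $\mathbb{R}_-$. Then $f_1(t) = t$ and $f_{-1}(t) = t^2$, for $t \in \mathbb{R}_+$, are both SI and even PH with degree 1 and 2, respectively. But $f$ is not SI, and hence also not PH, because $f(\tfrac{1}{2}) = \tfrac{1}{2} > \tfrac{1}{4} = f(-\tfrac{1}{2})$ but $f(4 \times \tfrac{1}{2}) = 2 < 4 = f(4 \times (-\tfrac{1}{2}))$.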
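The counterexample of footnote 2 can be replayed directly. The sketch below is only an illustration with hypothetical variable names; it evaluates the function of the footnote at the points $\pm\tfrac{1}{2}$ before and after scaling by $4$, and checks the homogeneity of the two restrictions.

```python
def f(t):
    """Function of footnote 2: t on R_+ and t**2 on R_-."""
    return t if t >= 0 else t * t


def f_along(x, t):
    """Restriction f_x(t) = f(t * x) to the half-line spanned by x."""
    return f(t * x)


x, y, rho = 0.5, -0.5, 4.0

order_before = f(x) > f(y)              # 1/2 > 1/4
order_after = f(rho * x) > f(rho * y)   # 2 > 4 does not hold

# Each restriction is positively homogeneous (degree 1 and 2 respectively).
ph1_ok = all(f_along(1.0, r * t) == r * f_along(1.0, t)
             for r in (0.5, 2.0, 4.0) for t in (0.25, 1.0, 3.0))
ph2_ok = all(f_along(-1.0, r * t) == r ** 2 * f_along(-1.0, t)
             for r in (0.5, 2.0, 4.0) for t in (0.25, 1.0, 3.0))

print(order_before, order_after, ph1_ok, ph2_ok)
```

The order of the two points flips under scaling although both half-line restrictions are PH, which is exactly why (1) cannot be verified half-line by half-line.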

Fig. 1: Level sets of SI functions with respect to the red star $x^\star$. The four functions are strictly increasing transformations of $x \mapsto p(x - x^\star)$ where $p$ is a PH function. From left to right: $p(x) = \|x\|$; $p(x) = x^\top A x$ for $A$ symmetric positive definite; $p(x) = \left(\sum_i \sqrt{|x_i|}\right)^2$, the $\tfrac{1}{2}$-norm; a randomly generated SI function from a "smoothly" randomly perturbed sphere function. The first two functions from the left have convex sublevel sets, contrary to the last two.

Scaling-invariant functions have at most one isolated local optimum [3], where an isolated local optimum, say an isolated argmin, $x$, of a function $g : \mathbb{R}^n \to \mathbb{R}$ is defined by the existence of $\epsilon > 0$ such that for all $y \in \mathcal{B}(x, \epsilon) \setminus \{x\}$, $g(y) > g(x)$. This result is recalled in the following proposition.

Proposition 2.1 (see [3, Proposition 3.2]) Let $f$ be an SI function. Then $f$ can admit an isolated local optimum only at $0$ (where $f(0) = 0$), and this local optimum is also the global optimum. In addition, the functions $f_x$ cannot admit a local plateau, i.e., a ball where the function is locally constant, unless the function is equal to $0$ everywhere.

We characterize in the following the functions $f_x$ of an SI function $f$ under different conditions.

Proposition 2.2 If $f$ is a continuous SI function on $\mathbb{R}^n$, then for all $x \in \mathbb{R}^n$, $f_x$ is either constant equal to $0$ or strictly monotonic.

More specifically, if $\varphi : \mathbb{R}_+ \to \mathbb{R}$ is a 1-dimensional continuous SI function, then $\varphi$ is either constant equal to $0$ or strictly monotonic.

Proof Assume that $\varphi$ is not strictly monotonic on $[0, \infty)$. Then, by continuity at $0$, $\varphi$ is not strictly monotonic on $(0, \infty)$. Since continuous injective functions are strictly monotonic, $\varphi$ is not injective on $(0, \infty)$. Therefore there exist $0 < s < t$ such that $\varphi(s) = \varphi(t)$. By scaling-invariance, it follows that $\varphi(\frac{s}{t}) = \varphi(1)$. It follows iteratively that for all integers $k > 0$, $\varphi\left((\frac{s}{t})^k\right) = \varphi(1)$. Since $\frac{s}{t} < 1$, taking the limit for $k \to \infty$ and using the continuity of $\varphi$ at $0$, we obtain that $\varphi(0) = \varphi(1)$. By scaling-invariance again, it follows for all $\rho > 0$ that $\varphi(0) = \varphi(\rho)$. Hence we have shown that if $\varphi$ is not strictly monotonic, it is a constant function.

Now if $f$ is a continuous SI function on $\mathbb{R}^n$ and $x \in \mathbb{R}^n$, then $f_x$ is also scaling-invariant and continuous. Then it follows that $f_x$ is either constant or strictly monotonic. □

We deduce from Proposition 2.2 the next corollary.

Corollary 2.1 Let $f$ be a continuous SI function. If $f$ has a local optimum at $x$, then for all $t \ge 0$, $f(tx) = f(0)$. In particular, if $f$ has a global argmin (resp. argmax), then $0$ is a global argmin (resp. argmax).

Proof Assume that there exists a local optimum at $x$. Then $f_x$ has a local optimum at $1$. Therefore $f_x$ is not strictly monotonic, and thanks to Proposition 2.2, $f_x$ is necessarily a constant function. In other words, $f(tx) = f(0)$ for all $t \ge 0$. □

We derive another proposition with the same conclusions as Proposition 2.2 but under a different assumption. We start by showing the following lemma.

Lemma 2.1 Let $\varphi : \mathbb{R}_+ \to \mathbb{R}$ be an SI function continuous at $0$ and strictly monotonic on a non-empty interval $I \subset \mathbb{R}_+$. Then $\varphi$ is strictly monotonic.

Proof Assume without loss of generality that $\varphi$ is strictly increasing on $I$ and that $I = [a, b]$ with $0 < a < b$, up to replacing $I$ with a subset of $I$. Denote $\rho = \frac{b}{a}$. Then $\{[\rho^k, \rho^{k+1}]\}_{k \in \mathbb{Z}}$ covers $(0, \infty)$. Since consecutive intervals share an endpoint, to prove that $\varphi$ is strictly increasing on $(0, \infty)$, it is enough to prove that $\varphi$ is strictly increasing on $[\rho^k, \rho^{k+1}]$ for all integers $k$.

Let $k$ be an integer and $x, y$ two real numbers such that $\rho^k \le x < y \le \rho^{k+1}$. Then $a \le \frac{a x}{\rho^k} < \frac{a y}{\rho^k} \le a\rho = b$. Therefore $\varphi(\frac{a x}{\rho^k}) < \varphi(\frac{a y}{\rho^k})$. And by scaling-invariance, $\varphi(x) < \varphi(y)$.

With the continuity at $0$, it follows that $\varphi$ is strictly increasing on $\mathbb{R}_+$. □

We derive from Lemma 2.1 the following proposition.

Proposition 2.3 Let $f$ be an SI function continuous at $0$. Assume that each $f_x$ is either strictly monotonic or constant on some non-empty interval. Then for all $x \in \mathbb{R}^n$, $f_x$ is either constant equal to $0$ or strictly monotonic.

Note that the continuity of the function $f_x$ alone does not suffice to conclude that $f_x$ is either constant or strictly monotonic on some non-empty interval. Indeed there exist 1-D continuous functions (even differentiable functions) that are not monotonic on any non-empty interval [4, 5, 10].

For the sake of completeness, we construct SI functions on $\mathbb{R}_+$ that are not monotonic on any non-empty interval. The construction of such functions is based on the nonlinear solutions of the Cauchy functional equation: for all $x, y \in \mathbb{R}$, $g(x + y) = g(x) + g(y)$, called Hamel functions [11]. A Hamel function $g$ also satisfies $g(q_1 x + q_2 y) = q_1 g(x) + q_2 g(y)$ for all real numbers $x, y$ and rational numbers $q_1, q_2$ [1, Chapter 2]. Since $g$ is nonlinear, there exist real numbers $x$ and $y$ such that the vectors $\{(x, g(x)), (y, g(y))\}$ form a basis of $\mathbb{R}^2$ over the field $\mathbb{R}$. Then the graph of $g$, which is a vector subspace of $\mathbb{R}^2$ over the field $\mathbb{Q}$, contains $\{q_1 \cdot (x, g(x)) + q_2 \cdot (y, g(y)) ; (q_1, q_2) \in \mathbb{Q}^2\}$, which is dense in $\mathbb{R}^2$. Therefore a 1-D Hamel function is highly pathological, since its graph is dense in $\mathbb{R}^2$.

Lemma 2.2 There exist SI functions on $\mathbb{R}_+$ that are neither monotonic nor continuous on any non-empty interval.

Proof We start by choosing a nonlinear solution of Cauchy's functional equation, denoted by $g : \mathbb{R} \to \mathbb{R}$, knowing that there are uncountably many ways to pick such a $g$ [11]. Then for all real numbers $a$ and $b$, $g(a + b) = g(a) + g(b)$. And since $g$ is not linear, we also know that $g$ is neither continuous nor monotonic on any interval [11]. Let us define $f = \exp \circ g \circ \log$ on $(0, \infty)$ and $f(0) = 0$. Then $f(x) > 0$ for all $x > 0$ and $f$ is still not monotonic on any non-empty interval. We also have for all $\rho > 0$ and $x > 0$, $f(\rho x) = \exp(g(\log(x) + \log(\rho))) = \exp(g(\log(x))) \exp(g(\log(\rho))) = f(x) f(\rho)$. Since $f(\rho) > 0$, multiplying by $f(\rho)$ preserves the order of $f(x)$ and $f(y)$ (and $f(0) = 0 < f(x)$ for all $x > 0$), which gives the scaling-invariance property. □

Based on Lemma 2.2, we derive the next proposition.

Proposition 2.4 There exist SI functions $f$ on $\mathbb{R}^n$ such that for all non-zero $x$, $f_x$ is neither monotonic nor continuous on any non-empty interval.

Proof Based on Lemma 2.2, there exists $\varphi : \mathbb{R}_+ \to \mathbb{R}$ SI on $\mathbb{R}_+$ which is neither monotonic nor continuous on any non-empty interval. We construct $f$ as follows. For all $x \in \mathbb{R}^n$, $f(x) = \varphi(\|x\|)$. Then $f$ is SI because for $x, y \in \mathbb{R}^n$ and for $\rho > 0$, $f(x) \le f(y) \iff \varphi(\|x\|) \le \varphi(\|y\|) \iff \varphi(\rho\|x\|) \le \varphi(\rho\|y\|) \iff f(\rho x) \le f(\rho y)$. In addition, for a non-zero $x$ and $t \ge 0$, $f_x(t) = f(tx) = \varphi(t\|x\|)$ and then $f_x$ is neither monotonic nor continuous on any non-empty interval. □

Now assume that $f$ is a continuous scaling-invariant function and that we can write $f = \varphi \circ g$ where $\varphi$ is a continuous bijection and $g$ is a positively homogeneous function. As a direct consequence of Theorem A.1, $\varphi^{-1}$ is continuous if $\varphi$ is a continuous bijection defined on an interval. Therefore $g = \varphi^{-1} \circ f$ is also continuous. This result is stated in the following corollary.

Corollary 2.2 Let $f$ be a continuous SI function, $\varphi$ a continuous bijection defined on an interval in $\mathbb{R}$ and $p$ a positively homogeneous function such that $f = \varphi \circ p$. Then $p$ is also continuous.

3 Scaling invariant functions as composite of strictly monotonic functions with positively homogeneous functions

As underlined in the introduction, compositions of strictly monotonic functions with positively homogeneous functions are scaling-invariant (SI) functions. We investigate in this section under which conditions the converse is true, that is, when SI functions are compositions of strictly monotonic functions with PH functions. Section 3.1 shows that continuity is a sufficient condition, whereas Section 3.2 gives a necessary and sufficient condition on $f$ to be decomposable in this way.

3.1 Continuous SI functions

We prove in this section a main result of the paper: any continuous SI function $f$ can be written as $f = \varphi \circ p$ where $p$ is PH$_1$ and $\varphi$ is a homeomorphism (and in particular strictly monotonically increasing and continuous). The proof relies on the following proposition, where we do not yet assume that $f$ is continuous, but only that the restrictions of $f$ to the half-lines originating in $0$, the $f_x$ functions, are.

Proposition 3.1 Let $f$ be an SI function such that for any $x \in \mathbb{R}^n$, $f_x$ as defined in (4) is either constant or strictly monotonic and continuous. Then for all $\alpha > 0$, there exist a PH$_\alpha$ function $p$ and a strictly increasing, continuous bijection (thus a homeomorphism) $\varphi$ such that $f = \varphi \circ p$. For a non-zero $f$ and $\alpha > 0$, the choice of $(\varphi, p)$ is unique up to a left composition of $p$ with a piece-wise linear function.

(i) In addition, if all non-constant $f_x$ have the same monotonicity for all $x \in \mathbb{R}^n$, then for any $x_0 \in \mathbb{R}^n$ such that $f(x_0) \neq 0$, the homeomorphism $\varphi$ corresponding to a PH$_1$ function can be chosen as $f_{x_0}$ and is at least as smooth as $f$.
(ii) Otherwise, there exist $x_1, x_{-1} \in \mathbb{R}^n$ such that $f_{x_1}$ is strictly increasing and $f_{x_{-1}}$ is strictly decreasing. And for any such $x_1$ and $x_{-1}$ we can choose as $\varphi$ the function equal to $f_{x_1}$ on $\mathbb{R}_+$ and equal to $t \mapsto f_{x_{-1}}(-t)$ on $\mathbb{R}_-$.

Proof Let $f$ be an SI function such that for any $x \in \mathbb{R}^n$, $f_x$ is either a constant or a strictly monotonic continuous function.

In the case where the $f_x$ are constant for all $x \in \mathbb{R}^n$, then $f = 0$ and therefore we can take $p_\alpha = 0$ as a candidate for a continuous PH$_\alpha$ function and $\varphi_\alpha : t \mapsto t$ as the candidate for the corresponding homeomorphism.

From now on, at least one of the $\{f_x\}_{x \in \mathbb{R}^n}$ is non-constant. We now split the proof in two parts: the case where all the non-constant $f_x$ have the same monotonicity, and the case where there exist $x_1, x_{-1} \in \mathbb{R}^n$ such that $f_{x_1}$ is strictly increasing and $f_{x_{-1}}$ is strictly decreasing.

Part 1. Assume here that all the non-constant $f_x$ have the same monotonicity for all $x \in \mathbb{R}^n$. Up to a transformation $x \mapsto -f(x)$, we can assume without loss of generality that they are increasing. Therefore $0$ is a global argmin and since we have assumed $f(0) = 0$: $f(x) \ge 0$ for all $x \in \mathbb{R}^n$. Then there exists $x_0 \in \mathbb{R}^n$ such that $f(x_0) > 0$.

For any $x \in \mathcal{L}_{f,x_0} = \{y \in \mathbb{R}^n, f(y) = f(x_0)\}$, and any $\lambda > 0$ different from $1$, $\lambda x \notin \mathcal{L}_{f,x_0}$. Indeed, as $x \in \mathcal{L}_{f,x_0}$, we know from Proposition 2.2 that $f_x$ is strictly increasing on $\mathbb{R}_+$, since $f_x$ cannot be constant equal to $0$.

Moreover, for all $x \in \mathbb{R}^n$ such that $f(x) \neq 0$, there exists $\lambda > 0$ such that $\lambda x \in \mathcal{L}_{f,x_0}$. Indeed, if $f(x) < f(x_0)$, the intermediate value theorem applied to the continuous function $f_{x_0}$ shows that there exists $0 < t < 1$ such that $f(t x_0) = f_{x_0}(t) = f(x)$, and then $f(\frac{1}{t} x) = f(x_0)$. And by interchanging $x$ and $x_0$, the same argument holds if $f(x) > f(x_0)$.

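The last two paragraphs ensure that for all $x$ such that $f(x) \neq 0$, there exists a unique positive number $\lambda_x$ such that $\lambda_x x \in \mathcal{L}_{f,x_0}$. Let us define the function $p$ for all $x \in \mathbb{R}^n$ as follows: if $f(x) \neq 0$ then $p(x) = \frac{1}{\lambda_x}$, otherwise $p(x) = 0$. We prove in the following that $p$ is PH$_1$.

Let $x \in \mathbb{R}^n$ and $\rho > 0$. If $f(x) = 0$ (hence $f(\rho x) = 0$), then $p(\rho x) = 0 = \rho\, p(x)$. Otherwise $f(x) > 0$ (hence $f(\rho x) > 0$), and $p(\rho x) = \frac{\rho}{\lambda_x}$ since $\frac{\lambda_x}{\rho}$ is the (unique) positive number such that $\frac{\lambda_x}{\rho} \rho x = \lambda_x x \in \mathcal{L}_{f,x_0}$. And thereby $p(\rho x) = \rho\, p(x)$.

We prove that $f = f_{x_0} \circ p$, where $f_{x_0}$ is a continuous strictly increasing function and $p$ is PH$_1$. Let $x \in \mathbb{R}^n$. If $f(x) = 0$, then $p(x) = 0$, and then $f(x) = 0 = f(0) = f_{x_0}(0) = (f_{x_0} \circ p)(x)$. Otherwise, we have by construction that $\frac{x}{p(x)} \in \mathcal{L}_{f,x_0}$. Therefore $f(\frac{x}{p(x)}) = f(x_0)$ and then $f(x) = f(p(x) x_0) = f_{x_0}(p(x))$. By Theorem A.1, $\varphi = f_{x_0}$ is a homeomorphism. Let $\alpha > 0$, define $\tilde{\varphi} : t \mapsto \varphi(t^{1/\alpha})$ and $\tilde{p} = p^\alpha$. Then $\tilde{p}$ is PH$_\alpha$, $\tilde{\varphi}$ is a homeomorphism and $f = \tilde{\varphi} \circ \tilde{p}$.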
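The construction of Part 1 can be imitated numerically. Since $f(x) = f_{x_0}(p(x))$, the value $p(x) = 1/\lambda_x$ is the unique $t \ge 0$ with $f(t x_0) = f(x)$, which a bisection can locate. The following is a sketch under stated assumptions: the test function (a strictly increasing transformation of the 1-norm), the reference point `X0` and all helper names are hypothetical, and the bracketing loop relies on $f_{x_0}$ being strictly increasing and reaching the target value.

```python
import math

X0 = (1.0, 1.0)  # hypothetical reference point with f(X0) > 0


def f(x):
    """Hypothetical continuous SI function: log(1 + ||x||_1)."""
    return math.log1p(abs(x[0]) + abs(x[1]))


def f_along(x0, t):
    """Restriction f_{x0}(t) = f(t * x0)."""
    return f((t * x0[0], t * x0[1]))


def p(x, x0=X0, iters=200):
    """Return t >= 0 with f(t * x0) = f(x), i.e. 1/lambda_x in the text."""
    target = f(x)
    if target == 0.0:
        return 0.0
    lo, hi = 0.0, 1.0
    while f_along(x0, hi) < target:  # grow the bracket until it contains t
        lo, hi = hi, 2.0 * hi
    for _ in range(iters):           # plain bisection on the increasing f_{x0}
        mid = 0.5 * (lo + hi)
        if f_along(x0, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


x = (3.0, -1.0)
print(p(x), p((6.0, -2.0)))           # p is PH_1: doubling x doubles p
print(f(x), f_along(X0, p(x)))        # f = f_{x0} o p
```

For this particular $f$ the recovered $p$ is the 1-norm divided by $\|x_0\|_1$, which illustrates that $p$ depends on the choice of $x_0$ only through a positive factor.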

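Assume that we have two couples of solutions $(\varphi, p)$ and $(\bar{\varphi}, \bar{p})$ such that $f = \varphi \circ p = \bar{\varphi} \circ \bar{p}$ where $\varphi, \bar{\varphi}$ are homeomorphisms and $p, \bar{p}$ are PH$_\alpha$. For all $t > 0$ and $x \in \mathbb{R}^n$, we have for instance $p(tx) = t^\alpha p(x)$. Therefore $\mathrm{Im}(p) = \mathbb{R}_+$. Denote $\psi = \bar{\varphi}^{-1} \circ \varphi$, so that $\psi \circ p = \bar{p}$. For all $\lambda > 0$ and $x \in \mathbb{R}^n$, $\psi(\lambda^\alpha p(x)) = \psi(p(\lambda x)) = \bar{p}(\lambda x) = \lambda^\alpha \psi(p(x))$. Hence $\psi$ is PH$_1$ on $\mathbb{R}_+$. For all $t > 0$, $\psi(t) = t\psi(1)$. Therefore $\psi$ is linear.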

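Part 2. Assume now that there exist $x_1, x_{-1} \in \mathbb{R}^n$ such that $f_{x_1}$ is strictly increasing and $f_{x_{-1}}$ is strictly decreasing. Then $f(x_1) > 0$ and $f(x_{-1}) < 0$. Then thanks to the intermediate value theorem, if $f(x) > 0$, there exists a unique positive number $\lambda_x$ such that $\lambda_x x \in \mathcal{L}_{f,x_1}$, and if $f(x) < 0$, there exists a unique positive number $\lambda_x$ such that $\lambda_x x \in \mathcal{L}_{f,x_{-1}}$. We define now $p$ for all $x \in \mathbb{R}^n$ as follows: if $f(x) = 0$ then $p(x) = 0$, if $f(x) > 0$ then $p(x) = \frac{1}{\lambda_x}$, and finally if $f(x) < 0$ then $p(x) = -\frac{1}{\lambda_x}$. Let us show that $p$ is PH$_1$. Indeed for any $\rho > 0$ and $x \in \mathbb{R}^n$, if $f(x) = 0$ (hence $f(\rho x) = 0$), then $p(\rho x) = 0 = \rho\, p(x)$. If $f(x) > 0$ (hence $f(\rho x) > 0$), then $p(\rho x) = \frac{\rho}{\lambda_x} = \rho\, p(x)$ since $\frac{\lambda_x}{\rho}$ is the (unique) positive number such that $\frac{\lambda_x}{\rho} \rho x = \lambda_x x \in \mathcal{L}_{f,x_1}$. And finally if $f(x) < 0$ (hence $f(\rho x) < 0$), then $p(\rho x) = -\frac{\rho}{\lambda_x} = \rho\, p(x)$ since $\frac{\lambda_x}{\rho}$ is the (unique) positive number such that $\frac{\lambda_x}{\rho} \rho x = \lambda_x x \in \mathcal{L}_{f,x_{-1}}$. Hence $p$ is PH$_1$.

We define now the function $\varphi : \mathbb{R} \to \mathbb{R}$ such that if $t \ge 0$, $\varphi(t) = f_{x_1}(t)$ and if $t \le 0$, $\varphi(t) = f_{x_{-1}}(-t)$. Then $\varphi$ is well defined ($f_{x_1}(0) = 0 = f_{x_{-1}}(0)$), continuous and strictly increasing.

Let $x \in \mathbb{R}^n$. If $f(x) = 0$, then $p(x) = 0$, and then $f(x) = 0 = (\varphi \circ p)(x)$. If $f(x) > 0$, $\varphi(p(x)) = f_{x_1}(p(x)) = f(p(x) x_1) = f(x)$ since $\frac{x}{p(x)} \in \mathcal{L}_{f,x_1}$. And finally if $f(x) < 0$, $\varphi(p(x)) = f_{x_{-1}}(-p(x)) = f(-p(x) x_{-1}) = f(x)$ since $-\frac{x}{p(x)} = \lambda_x x \in \mathcal{L}_{f,x_{-1}}$. Thereby, $f = \varphi \circ p$. Theorem A.1 ensures that $\varphi$ is a homeomorphism. By defining for all $\alpha > 0$, $\tilde{\varphi}(t) = \varphi(t^{1/\alpha})$ if $t \ge 0$, $\tilde{\varphi}(t) = \varphi(-(-t)^{1/\alpha})$ if $t < 0$, $\tilde{p}(x) = p(x)^\alpha$ if $p(x) \ge 0$ and $\tilde{p}(x) = -(-p(x))^\alpha$ if $p(x) < 0$, it follows that $f = \tilde{\varphi} \circ \tilde{p}$.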

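Assume here again that we have two couples of solutions $(\varphi, p)$ and $(\bar{\varphi}, \bar{p})$ such that $f = \varphi \circ p = \bar{\varphi} \circ \bar{p}$ where $\varphi, \bar{\varphi}$ are homeomorphisms and $p, \bar{p}$ are PH$_\alpha$. For all $t > 0$ and $x \in \mathbb{R}^n$, we have $p(tx) = t^\alpha p(x)$. Therefore $\mathrm{Im}(p) = \mathbb{R}$ since $p(x_1)$ and $p(x_{-1})$ have opposite signs. Denote $\psi = \bar{\varphi}^{-1} \circ \varphi$. For all $\lambda > 0$ and $x \in \mathbb{R}^n$, $\psi(\lambda^\alpha p(x)) = \psi(p(\lambda x)) = \bar{p}(\lambda x) = \lambda^\alpha \psi(p(x))$. Hence $\psi$ is PH$_1$ on $\mathbb{R}$. For all $t > 0$, $\psi(t) = t\psi(1)$ and $\psi(-t) = t\psi(-1)$. Therefore, depending on the values of $\psi(1)$ and $\psi(-1)$, $\psi$ is either linear or piece-wise linear. □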
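The gluing of case (ii) can be illustrated on a function without global optimum. The sketch below assumes the hypothetical example $f(x) = \sinh(x_1)$ on $\mathbb{R}^2$ with reference points `X_POS` and `X_NEG` (playing the roles of $x_1$ and $x_{-1}$); for this particular $f$, $\lambda_x$ has a closed form, so no root finding is needed.

```python
import math

X_POS = (2.0, 5.0)    # hypothetical x_1:    f > 0, f_{x_1} strictly increasing
X_NEG = (-3.0, 1.0)   # hypothetical x_{-1}: f < 0, f_{x_{-1}} strictly decreasing


def f(x):
    """Hypothetical continuous SI function without global optimum."""
    return math.sinh(x[0])


def f_along(x, t):
    """Restriction f_x(t) = f(t * x)."""
    return f((t * x[0], t * x[1]))


def phi(t):
    """f_{x_1} on R_+ glued with t -> f_{x_{-1}}(-t) on R_-."""
    return f_along(X_POS, t) if t >= 0 else f_along(X_NEG, -t)


def p(x):
    """Closed form of +1/lambda_x or -1/lambda_x for this particular f."""
    if x[0] > 0:
        return x[0] / X_POS[0]     # lambda_x * x[0] = X_POS[0]
    if x[0] < 0:
        return -x[0] / X_NEG[0]    # lambda_x * x[0] = X_NEG[0], then negate
    return 0.0


samples = [(1.5, 0.0), (-0.75, 2.0), (0.0, 4.0), (3.0, -1.0), (-2.0, -2.0)]
print([math.isclose(phi(p(x)), f(x), rel_tol=1e-12, abs_tol=1e-12)
       for x in samples])
```

Here $p$ equals $x_1/2$ on one half-space and $x_1/3$ on the other: a different choice of reference points changes $p$ by a piece-wise linear map, in line with the uniqueness statement.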

We now use the previous proposition to prove that a continuous SI function is a homeomorphic transformation of a continuous PH$_1$ function. The proof relies on the result that for a continuous SI function, the $f_x$ are either constant or strictly monotonic and continuous (see Proposition 2.2). We distinguish the case where $f$ has a global optimum as in Proposition 3.1 (i) and the case where $f$ does not have a global optimum as in Proposition 3.1 (ii). Overall the following result holds.

Theorem 3.1 Let $f$ be a continuous SI function. Then for all $\alpha > 0$, there exist a continuous PH$_\alpha$ function $p$ and a strictly increasing and continuous bijection (thus a homeomorphism) $\varphi$ such that $f = \varphi \circ p$.

For a non-zero $f$ and $\alpha > 0$, the choice of $(\varphi, p)$ is unique up to a left composition of $p$ with a piece-wise linear function. If $f$ admits a global optimum, then $0$ is also a global optimum and for any $x_0 \in \mathbb{R}^n$ such that $f(x_0) \neq 0$, the homeomorphism $\varphi$ corresponding to a PH$_1$ function can be chosen as $f_{x_0}$ and is at least as smooth as $f$.

If $f$ does not admit a global optimum, then there exist $x_1, x_{-1} \in \mathbb{R}^n$ such that $f(x_1) > 0$ and $f(x_{-1}) < 0$, and for any such $x_1$ and $x_{-1}$, the homeomorphism $\varphi$ corresponding to a PH$_1$ function can be chosen as the function equal to $f_{x_1}$ on $\mathbb{R}_+$ and equal to $t \mapsto f_{x_{-1}}(-t)$ on $\mathbb{R}_-$.

Proof Let $f$ be a continuous SI function. Thanks to Proposition 2.2, for all $x \in \mathbb{R}^n$, $f_x$ is either constant equal to $0$ or strictly monotonic.

Part 1. Assume that $f$ has a global optimum. Corollary 2.1 shows that $0$ is also a global optimum. Then we can apply Proposition 3.1 in the case where the non-constant $f_x$ have the same monotonicity. Let $x_0 \in \mathbb{R}^n$ be such that $f(x_0) \neq 0$ and define $\varphi = f_{x_0}$. Then $f = \varphi \circ p$ and $\varphi$ is a homeomorphism. That settles the continuity of the PH$_1$ function $\varphi^{-1} \circ f$ thanks to Corollary 2.2.

Part 2. Assume in this part that $f$ has no global optimum. Since $0$ is not a global optimum, we can find $x_1$ and $x_{-1}$ such that $f(x_1) > 0$ and $f(x_{-1}) < 0$. Therefore $f_{x_1}$ is strictly increasing and $f_{x_{-1}}$ is strictly decreasing. We apply Proposition 3.1 in the case where the non-constant $f_x$ do not have the same monotonicity. If $\varphi$ is the function equal to $f_{x_1}$ on $\mathbb{R}_+$ and to $t \mapsto f_{x_{-1}}(-t)$ on $\mathbb{R}_-$, then $f = \varphi \circ p$ where $\varphi$ is a homeomorphism. That settles the continuity of the PH$_1$ function $\varphi^{-1} \circ f$ thanks to Corollary 2.2. For all $\alpha > 0$, the unique construction of $(\varphi, p)$ up to a piece-wise linear function in both parts is a consequence of Proposition 3.1. □

3.2 Sufficient and necessary condition to be the composite of a PH function

We have seen in the previous section that a continuous SI function can be written as $\varphi \circ p$ with $\varphi$ strictly monotonic and $p$ PH. Relaxing continuity, we prove in the next theorem a necessary and sufficient condition under which an SI function is the composite of a PH function with a strictly monotonic function.

Theorem 3.2 Let $f$ be an SI function. There exist a PH$_1$ function $p$ and a strictly increasing and continuous bijection (thus a homeomorphism) $\varphi$ such that $f = \varphi \circ p$ if and only if for all $x \in \mathbb{R}^n$, the function $f_x$ is either constant or strictly monotonic and the strictly increasing $f_x$ share the same image (i.e., if $\lambda \in \mathbb{R}$ is reached for one of these functions, then it is reached for all of them) and the strictly decreasing ones too.

For a non-zero $f$, up to a left composition of $p$ with a piece-wise linear function, the choice of $(\varphi, p)$ is unique.

Proof We prove first the forward implication. Suppose there is a PH$_1$ function $p$ and a strictly monotonic function $\varphi$ such that $f = \varphi \circ p$. Consider $x \in \mathbb{R}^n$. Either $p(x) = 0$ and then for any $t > 0$ we have that $p(tx) = 0$, so that $f_x(t) = f(tx) = \varphi(p(tx)) = \varphi(0)$ and $f_x$ is constant on $\mathbb{R}_+$, or $p(x) \neq 0$, and then $t \in \mathbb{R}_+ \mapsto p(tx) = t\,p(x)$ is strictly monotonic, and $f_x(t) = \varphi(p(tx))$ is strictly monotonic too on $\mathbb{R}_+$. Moreover, consider $x_1 \neq x_2$ such that $f_{x_1}$ and $f_{x_2}$ are increasing. Then $p(x_1)$ and $p(x_2)$ are of the same sign, so there is some $t^* > 0$ such that $p(x_1) = t^* p(x_2) = p(t^* x_2)$, so the functions $t \mapsto f(t x_1)$ and $t \mapsto f(t t^* x_2)$ are equal, so the functions $f_{x_1}$ and $f_{x_2}$ take the same values. The same applies to the strictly decreasing functions.

We now prove the backward implication. Suppose that the functions $f_x$ are either constant or strictly monotonic and the increasing ones share the same values and the decreasing ones too.

If all the $f_x$ are constant, then for all $x \in \mathbb{R}^n$, $f(x) = f(0) = 0$ and it is enough to write $f = \varphi \circ p$ with $\varphi : t \mapsto t$ on $\mathbb{R}_+$ and $p = 0$. We assume from now on that at least one $f_x$ is not constant.

Consider that all the non-constant $f_x$ have the same monotonicity. Let us choose $x_0$ such that $f(x_0) \neq 0$. Then for all $x$ such that $f_x$ is not constant, $f_x$ and $f_{x_0}$ have the same monotonicity. Since they have the same image and are injective, there exists a unique $\lambda_x > 0$ such that $\lambda_x x \in \mathcal{L}_{f,x_0}$. We then define $p$ and $\varphi$ as in Part 1 of Theorem 3.1 to ensure that $f = \varphi \circ p$ where $p$ is PH$_1$ and $\varphi$ is strictly monotonic.

Consider finally that the non-constant $f_x$ do not all have the same monotonicity. Then there exist $x_1$ and $x_{-1}$ such that $f(x_1) > 0$ and $f(x_{-1}) < 0$. Then, thanks to the assumption that all increasing $f_x$ share the same values and the strictly decreasing $f_x$ too, if $f(x) > 0$, then there exists a unique positive number $\lambda_x$ such that $\lambda_x x \in \mathcal{L}_{f,x_1} = \{y \in \mathbb{R}^n, f(y) = f(x_1)\}$, and if $f(x) < 0$, then there exists a unique (thanks to the assumption of strict monotonicity for the non-constant $f_x$) positive number $\lambda_x$ such that $\lambda_x x \in \mathcal{L}_{f,x_{-1}}$. Therefore, we can define $p$ as in Theorem 3.1. As before, $p$ is PH$_1$. Define also the function $\varphi : \mathbb{R} \to \mathbb{R}$ as in Theorem 3.1. It is still increasing, but not necessarily continuous. Then, as in Theorem 3.1, $f = \varphi \circ p$.

The proof of the uniqueness of $(\varphi, p)$ up to a piece-wise real linear function is similar to the proof in Proposition 3.1. □

Complementing Theorem 3.2, we construct an example of an SI function $f$ that cannot be decomposed as $f = \varphi \circ p$, because the strictly increasing $f_x$ do not share the same image. Define $f$ such that for all $x \in \mathbb{R}^n$, $f(x) = \tanh(x_1)$ if the first coordinate $x_1 \ge 0$ and $f(x) = 1 + \exp(-x_1)$ otherwise. Then $f$ is SI and if $x_1 \neq 0$, $f_x$ is strictly increasing. However, for all $x$ such that $x_1 > 0$, $\mathrm{Im}(f_x) = [0, 1)$, whereas for $x$ such that $x_1 < 0$, $\mathrm{Im}(f_x) = \{0\} \cup (2, \infty)$.
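The two images in this example can be inspected on a grid of values of $t$. The sketch below is a hypothetical illustration in dimension 2 with the directions $(1, 0)$ and $(-1, 0)$; sampling only shows where the values fall and does not replace the argument of the text.

```python
import math


def f(x):
    """Example of the text: tanh(x1) if x1 >= 0, else 1 + exp(-x1)."""
    return math.tanh(x[0]) if x[0] >= 0 else 1.0 + math.exp(-x[0])


def f_along(x, t):
    """Restriction f_x(t) = f(t * x)."""
    return f(tuple(t * xi for xi in x))


ts = [0.0, 0.1, 0.5, 1.0, 2.0, 5.0]
pos = [f_along((1.0, 0.0), t) for t in ts]    # direction with x1 > 0
neg = [f_along((-1.0, 0.0), t) for t in ts]   # direction with x1 < 0

print(pos)  # values stay in [0, 1)
print(neg)  # value 0 at t = 0, then values above 2
```

Both restrictions are strictly increasing on the grid, yet their sets of values are disjoint apart from $0$, so no rescaling of one half-line can match the level sets of the other.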

4 Level sets of SI functions

Scaling-invariant functions appear naturally when studying the convergence of comparison-based optimization algorithms [3]. In this specific context, the difficulty of a problem is entirely determined by its level sets, whose properties are studied in this section.

4.1 Identical sublevel sets

Level sets and sublevel sets of a function $f$ remain unchanged if we compose the function with a strictly increasing function $\varphi$ since

$$f(x) \le f(y) \iff \varphi(f(x)) \le \varphi(f(y)). \tag{5}$$

We prove in the next theorem that two arbitrary functions $f$ and $p$ have the same sublevel sets if and only if $f = \varphi \circ p$ where $\varphi$ is strictly increasing.

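Theorem 4.1 Two functions $f$ and $p$ have the same sublevel sets if and only if there exists a strictly increasing function $\varphi$ such that $f = \varphi \circ p$.

Proof If $f = \varphi \circ p$ with $\varphi$ strictly increasing, since sublevel sets are invariant by $\varphi$, $f$ and $p$ have the same sublevel sets. Now assume that $f$ and $p$ have the same sublevel sets. Then for all $x \in \mathbb{R}^n$, there exists $T(x) \in \mathbb{R}^n$ such that $\mathcal{L}^{\le}_{f,x} = \mathcal{L}^{\le}_{p,T(x)}$. In other words, for all $y \in \mathbb{R}^n$, $f(y) \le f(x) \iff p(y) \le p(T(x))$. We define the function

$$\phi : \mathrm{Im}(f) \to \mathrm{Im}(p), \quad f(x) \mapsto p(T(x)).$$

The function $\phi$ is well-defined because for $x, y \in \mathbb{R}^n$ such that $f(x) = f(y)$, $\mathcal{L}^{\le}_{f,x} = \mathcal{L}^{\le}_{f,y}$. And since $\mathcal{L}^{\le}_{f,x} = \mathcal{L}^{\le}_{p,T(x)}$ and $\mathcal{L}^{\le}_{f,y} = \mathcal{L}^{\le}_{p,T(y)}$, then $\mathcal{L}^{\le}_{p,T(x)} = \mathcal{L}^{\le}_{p,T(y)}$, and then $p(T(x)) = p(T(y))$. Therefore $\phi(f(x)) = \phi(f(y))$. By construction we have that $\phi \circ f = p \circ T$.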

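Let us show that $p \circ T = p$. We have $\mathcal{L}^{\le}_{p \circ T,x} = \mathcal{L}^{\le}_{f,x} = \mathcal{L}^{\le}_{p,T(x)}$. Then $x \in \mathcal{L}^{\le}_{p,T(x)}$ and then $p(x) \le p(T(x))$. Therefore $p \le p \circ T$. In addition, for all $y \in \mathbb{R}^n$, there exists $x$ such that $\mathcal{L}^{\le}_{p,y} = \mathcal{L}^{\le}_{f,x} = \mathcal{L}^{\le}_{p,T(x)} = \mathcal{L}^{\le}_{p \circ T,x}$. Then $y \in \mathcal{L}^{\le}_{p \circ T,x}$, which implies that $p(T(y)) \le p(T(x))$. Plus, $\mathcal{L}^{\le}_{p,y} = \mathcal{L}^{\le}_{p,T(x)}$, therefore $p(T(x)) = p(y)$. Thereby $p(T(y)) \le p(y)$, and then $p \circ T \le p$. Finally $p \circ T = p$. Hence $\phi \circ f = p$.

Let us prove now that $\phi$ is strictly increasing. Consider $x, y \in \mathbb{R}^n$ such that $f(x) < f(y)$. Then $\mathcal{L}^{\le}_{f,x} \subset \mathcal{L}^{\le}_{f,y}$ with a strict inclusion, which means that $\mathcal{L}^{\le}_{p,T(x)} \subset \mathcal{L}^{\le}_{p,T(y)}$ with a strict inclusion. Thereby $p(T(x)) < p(T(y))$, i.e. $\phi(f(x)) < \phi(f(y))$. Hence $\phi$ is strictly increasing. And up to restricting $\phi$ to its image, we can assume without loss of generality that $\phi$ is a strictly increasing bijection. We finally denote $\varphi = \phi^{-1}$ and it follows that $f = \varphi \circ p$. □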
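On a finite sample of points, having the same sublevel sets reduces to having the same ordering, and the map sending $p(x)$ to $f(x)$ can be tabulated and checked for strict monotonicity. The following is a minimal sketch of that finite-sample analogue; the helper names and the three test functions are hypothetical, and agreement on a sample is only a necessary condition for the statement of Theorem 4.1.

```python
import math


def same_order(f, p, points):
    """True if f and p induce the same sublevel-set ordering on the sample."""
    return all((f(x) <= f(y)) == (p(x) <= p(y))
               for x in points for y in points)


def induced_graph(f, p, points):
    """Sorted pairs (p(x), f(x)): the graph of the candidate map on the sample."""
    return sorted({(p(x), f(x)) for x in points})


def is_strictly_increasing(graph):
    """Check that both coordinates strictly increase along the sorted graph."""
    return all(a[0] < b[0] and a[1] < b[1] for a, b in zip(graph, graph[1:]))


def norm1(x):
    return abs(x[0]) + abs(x[1])


def exp_norm1(x):
    return math.exp(norm1(x))


def first_coordinate(x):
    return x[0]


points = [(1.0, 0.0), (0.0, -2.0), (3.0, 1.0), (-0.5, 0.5), (0.0, 0.0)]
print(same_order(exp_norm1, norm1, points))
print(is_strictly_increasing(induced_graph(exp_norm1, norm1, points)))
print(same_order(first_coordinate, norm1, points))
```

The first two checks succeed because `exp_norm1` is a strictly increasing transformation of `norm1`; the third fails because the first coordinate orders the sample differently from the 1-norm.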

Theorem 4.1 and Theorem 3.2 both give equivalent conditions for an SI function $f$ to be equal to $\varphi \circ p$ where $\varphi$ is strictly increasing and $p$ is positively homogeneous (footnote 3). One condition is that there exists a PH function with the same sublevel sets as $f$, while the other condition is that the $f_x$ are either constant or strictly monotonic, and the strictly increasing and decreasing ones have the same image, respectively.

4.2 Compactness of the sublevel sets

Compactness of sublevel sets is relevant for analyzing step-size adaptive randomized search algorithms [2,13]. We investigate here how compactness properties shown for positively homogeneous functions extend to scaling-invariant functions. For an SI function $f$, we have $\mathcal{L}^{\le}_{f,tx} = t\mathcal{L}^{\le}_{f,x}$. Since $\psi : y \mapsto ty$ is a homeomorphism, we have that $\psi(\mathcal{L}^{\le}_{f,x})$ equals $t\mathcal{L}^{\le}_{f,x}$ and is compact if and only if $\mathcal{L}^{\le}_{f,x}$ is compact. Therefore, for all $t > 0$:

$$\mathcal{L}^{\le}_{f,tx} \text{ is compact if and only if } \mathcal{L}^{\le}_{f,x} \text{ is compact.} \tag{6}$$

Furthermore, if $p$ is a lower semi-continuous positively homogeneous function such that $p(x) > 0$ for all nonzero $x$, then the sublevel sets of $p$ are compact [2, Lemma 2.7]. We recall it with all the details in the following proposition:

Proposition 4.1 ([2, Lemma 2.7]) Let $p$ be a positively homogeneous function with degree $\alpha > 0$ and $p(x) > 0$ for all $x \neq 0$ (or equivalently $0$ is the unique global argmin of $p$) and $p(x)$ finite for every $x \in \mathbb{R}^n$. Then for every $x \in \mathbb{R}^n$, the following holds:

Footnote 3: Note that in Theorem 3.2, we can assume without loss of generality that $\varphi$ is always strictly increasing by replacing if needed $\varphi$ by $t \mapsto \varphi(-t)$ and $p$ by $-p$.

(i) $\lim_{t \to 0} p(tx) = 0$ and for all $x \neq 0$ the function $p_x : t \in [0, \infty) \mapsto p(tx) \in \mathbb{R}_+$ is continuous, strictly increasing and converges to $\infty$ when $t$ goes to $\infty$.
(ii) If $p$ is lower semi-continuous, the sublevel set $\mathcal{L}^{\le}_{p,x}$ is compact.

We prove a similar theorem for lower semi-continuous SI functions $f$ with continuous $f_x$ functions, showing in particular that the uniqueness of the global argmin is equivalent to the above items. Note that we need to assume the continuity of the functions $f_x$, while this property is unconditionally satisfied for positively homogeneous functions, where $p_x(t) = t^{\alpha} p_x(1)$ for all $x \in \mathbb{R}^n$ and for all $t > 0$.

Theorem 4.2 Let $f$ be SI. Then the conditions

– $f(x) > 0$ for all $x \neq 0$ and

– $0$ is the unique global argmin

are equivalent. Let $f$ be additionally lower semi-continuous and for all $x \in \mathbb{R}^n$, $f_x$ is continuous on a neighborhood of $0$. Then the following are equivalent:

(i) $0$ is the unique global argmin.

(ii) For any $x \in \mathbb{R}^n \setminus \{0\}$, the function $f_x$ is strictly increasing.

(iii) The sublevel sets $\mathcal{L}^{\leq}_{f,x}$ for all $x$ are compact.

Proof Since, w.l.o.g., $f$ is given such that $f(0) = 0$, its unique global argmin is $0$ if and only if $f(x) > 0$ for all $x \neq 0$. We first prove that (i) ⇒ (ii). Let $x \in \mathbb{R}^n \setminus \{0\}$. Assume, for the sake of contradiction, that there exist $0 < t_1 < t_2$ such that $f_x(t_1) = f_x(t_2)$. Then by scaling-invariance, $f_x(1) = f_x\left(\frac{t_1}{t_2}\right)$. It follows by multiplying iteratively by $\frac{t_1}{t_2}$ that for all $k \in \mathbb{Z}_+$, $f_x(1) = f_x\left(\left(\frac{t_1}{t_2}\right)^k\right)$. Therefore, taking the limit when $k \to \infty$, it follows thanks to the continuity of $f_x$ at $0$ that $f(x) = f_x(1) = f(0)$, which contradicts assumption (i). Hence, $f_x$ is an injective function. Moreover, there exists $\epsilon > 0$ such that $f_x$ is continuous on $[0, \epsilon]$. Therefore $f_x$ is an injective continuous function on $[0, \epsilon]$, which implies that $f_x$ is strictly monotonic on $[0, \epsilon]$. Lemma 2.1 therefore implies that $f_x$ is strictly monotonic. Since $0$ is an argmin of $f_x$, $f_x$ is strictly increasing.

(ii) ⇒ (iii): Since $f$ is lower semi-continuous on the compact $\mathcal{S}_1$, it reaches its minimum on that compact: there exists $s \in \mathcal{S}_1$ such that $f(s) = \min_{z \in \mathcal{S}_1} f(z)$. Also, since sublevel sets of lower semi-continuous functions are closed, $\mathcal{L}^{\leq}_{f,s}$ is closed. Now let us show that it is also bounded.

If $y \in \mathcal{L}^{\leq}_{f,s} \setminus \{0\}$, then $f(y) \leq f(s) \leq f\left(\frac{y}{\|y\|}\right)$. Since $f_y$ is strictly increasing, we obtain that $1 \leq \frac{1}{\|y\|}$, and thereby $\|y\| \leq 1$. We have shown that $\mathcal{L}^{\leq}_{f,s} \subset \mathcal{B}_1$.

Then $\mathcal{L}^{\leq}_{f,s}$ is a compact set, as it is a closed and bounded subset of $\mathbb{R}^n$. By (6), it follows that $\mathcal{L}^{\leq}_{f,ts}$ is compact for all $t > 0$.


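For all $x \in \mathbb{R}^n \setminus \{0\}$, $f(x) > f(0)$ thanks to (ii). Then $\mathcal{L}^{\leq}_{f,0} = \{0\}$ and is compact.

Let $x \in \mathbb{R}^n \setminus \{0\}$. Then there exists $\epsilon > 0$ such that $f_{\frac{x}{\|x\|}}$ is continuous on $[0, \epsilon]$. We have that $f_{\frac{x}{\|x\|}}(0) = 0 < f(s) \leq f_{\frac{x}{\|x\|}}(1)$, then by scaling-invariance, $f_{\frac{x}{\|x\|}}(0) < f(\epsilon s) \leq f_{\frac{x}{\|x\|}}(\epsilon)$. Therefore by the intermediate value theorem applied to $f_{\frac{x}{\|x\|}}$, which is continuous on $[0, \epsilon]$, there exists $t \in (0, \epsilon]$ such that $f_{\frac{x}{\|x\|}}(t) = f(\epsilon s)$. Then $\mathcal{L}^{\leq}_{f, t\frac{x}{\|x\|}} = \mathcal{L}^{\leq}_{f,\epsilon s}$ and is compact. We apply again (6) to observe that $\mathcal{L}^{\leq}_{f,x}$ is compact.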

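(iii) ⇒ (i): Let $x \in \mathcal{L}^{\leq}_{f,0}$. Then $tx \in \mathcal{L}^{\leq}_{f,0}$ for all $t \geq 0$, and hence $\{tx\}_{t \in \mathbb{R}_+} \subset \mathcal{L}^{\leq}_{f,0}$, which is a compact set. This is only possible if $x = 0$, otherwise the set $\{tx\}_{t \in \mathbb{R}_+}$ would not be bounded. Hence $\mathcal{L}^{\leq}_{f,0} = \{0\}$, which implies that $0$ is the unique global argmin. □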

We derive from Theorem 4.2 the next corollary, stating that for a continuously differentiable SI function with a unique global argmin, the intersection of any half-line of origin $0$ and a level set is a singleton.

Corollary 4.1 Let $f$ be a continuously differentiable SI function with $0$ as unique global argmin. Then for all $x \in \mathbb{R}^n$, any half-line of origin $0$ intersects $\mathcal{L}_{f,x}$ at a unique point.

Proof For all non-zero $x$, Theorem 4.2 ensures that $f_x$ is strictly increasing. Therefore for a non-zero $x$, $f_x$ is injective, and the intersection of a level set and a half-line of origin $0$ contains at most one point. In addition the $f_x$ share the same image for all non-zero $x$. Then for two non-zero vectors $x, y$, there exists $t \geq 0$ such that $f_y(t) = f_x(1)$. In other words, there exists $t \geq 0$ such that $ty \in \mathcal{L}_{f,x}$. We end this proof by noticing that $\mathcal{L}_{f,0} = \{0\}$, which therefore intersects any half-line of origin $0$ only at $0$. □

4.3 Sufficient Condition for Lebesgue Negligible Level Sets

We assume that $f$ is a lower semi-continuous SI function admitting a unique global argmin and that all $f_x$ are continuous, and prove that $f$ has Lebesgue negligible level sets.

Proposition 4.2 Let $f$ be an SI function with $0$ as unique global argmin. Assume also that $f$ is lower semi-continuous and for all $x \in \mathbb{R}^n$, $f_x$ is continuous. Then the level sets of $f$ are Lebesgue negligible.

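Proof Let $x \in \mathbb{R}^n$. Let us denote by $\mu$ the Lebesgue measure. For all $t > 0$, $\mu(\mathcal{L}_{f,tx}) = \mu(t\,\mathcal{L}_{f,x}) = t^n \mu(\mathcal{L}_{f,x})$, thanks to (3). Therefore, if $t \geq 1$, $\mu(\mathcal{L}_{f,tx}) \geq \mu(\mathcal{L}_{f,x})$. In addition, for all $k \geq 1$, $\mathcal{L}_{f,(1+\frac{1}{k})x} \subset \{y \in \mathbb{R}^n, f(x) \leq f(y) \leq f(2x)\} \subset \mathcal{L}^{\leq}_{f,2x}$, because if $x \neq 0$, $f_x$ is strictly increasing thanks to Theorem 4.2. The same theorem implies that $\mathcal{L}^{\leq}_{f,x}$ is compact and hence $\mu\left(\mathcal{L}^{\leq}_{f,x}\right) < \infty$.

It follows that

$$\sum_{k=1}^{\infty} \mu(\mathcal{L}_{f,x}) \leq \sum_{k=1}^{\infty} \mu\left(\mathcal{L}_{f,(1+\frac{1}{k})x}\right) \leq \mu\left(\mathcal{L}^{\leq}_{f,2x}\right) < \infty.$$

Hence, $\mu(\mathcal{L}_{f,x}) = 0$. □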
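The chain of inequalities in this proof can be unpacked as follows for $x \neq 0$. This is a sketch of our reading of the argument, in which the pairwise disjointness of the level sets $\mathcal{L}_{f,(1+\frac{1}{k})x}$ is spelled out as the step that justifies bounding the series by the measure of a single sublevel set; measurability of the level sets is taken for granted here since $f$ is lower semi-continuous.

```latex
% Sketch for x \neq 0. Since f_x is strictly increasing, the values
% f((1+1/k)x), k \geq 1, are pairwise distinct, so the level sets
% \mathcal{L}_{f,(1+1/k)x} are pairwise disjoint.
\begin{align*}
\sum_{k=1}^{\infty} \mu(\mathcal{L}_{f,x})
  &\leq \sum_{k=1}^{\infty} \mu\bigl(\mathcal{L}_{f,(1+\frac{1}{k})x}\bigr)
  && \text{since } \mu(\mathcal{L}_{f,tx}) = t^{n}\mu(\mathcal{L}_{f,x}) \geq \mu(\mathcal{L}_{f,x}) \text{ for } t \geq 1 \\
  &= \mu\Bigl(\,\bigcup_{k \geq 1} \mathcal{L}_{f,(1+\frac{1}{k})x}\Bigr)
  && \text{by countable additivity on disjoint sets} \\
  &\leq \mu\bigl(\mathcal{L}^{\leq}_{f,2x}\bigr) < \infty
  && \text{by inclusion and compactness.}
\end{align*}
% A series whose terms all equal the constant \mu(\mathcal{L}_{f,x}) \geq 0
% can only converge if that constant is 0.
```

For $x = 0$ no such argument is needed, because $\mathcal{L}_{f,0} = \{0\}$ is a single point.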

4.4 Balls containing and balls contained in sublevel sets

The sublevel sets of continuous PH functions include and are embedded in balls whose construction is scaling-invariant. Given that continuous SI functions are monotonic transformations of PH functions, those properties are naturally transferred to SI functions. This is what we formalize in this section.

From the definition of a PH function with degree $\alpha$, we have $p(x) = \|x\|^{\alpha}\, p(x/\|x\|)$ for all $x \neq 0$. Therefore, $p$ is continuous on $\mathbb{R}^n \setminus \{0\}$ if and only if $p$ is continuous on $\mathcal{S}_1$. For such $p$, we denote $m_p = \min_{x \in \mathcal{S}_1} p(x)$ and $M_p = \max_{x \in \mathcal{S}_1} p(x)$. We have the following propositions:

Proposition 4.3 ([2, Lemma 2.8]) Let $p$ be a PH function with degree $\alpha$ such that $p(x) > 0$ for all $x \neq 0$. Assume that $p$ is continuous on $\mathcal{S}_1$. Then for all $x \neq 0$, the following holds:

$$\|x\|\, m_p^{1/\alpha} \leq p(x)^{1/\alpha} \leq \|x\|\, M_p^{1/\alpha}. \tag{7}$$

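Proposition 4.4 ([2, Lemma 2.9]) Let $p$ be a PH function with degree $\alpha$ such that $p(x) > 0$ for all $x \neq 0$. Assume that $p$ is continuous on $\mathcal{S}_1$. Then for all $\rho > 0$, the ball centered in $0$ and of radius $\rho$ is included in the sublevel set of degree $\rho^{\alpha} M_p$, i.e. $\mathcal{B}(0, \rho) \subset \mathcal{L}^{\leq}_{p,\rho x_{M_p}}$, with $p(x_{M_p}) = M_p$. For all $x \neq 0$, the sublevel set of degree $p(x)$ is included into the ball centered in $0$ and of radius $(p(x)/m_p)^{1/\alpha}$, as follows from the left inequality in (7), i.e.

$$\mathcal{L}^{\leq}_{p,x} \subset \mathcal{B}\left(0, \left(\frac{p(x)}{m_p}\right)^{1/\alpha}\right).$$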
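The two-sided bound (7) and the resulting ball inclusions can be checked on a concrete case. The sketch below assumes a hypothetical quadratic PH function of degree 2 in the plane, for which $m_p$ and $M_p$ are known in closed form; the coefficients `A`, `B` and all function names are illustrative. The outer radius used is $(p(x)/m_p)^{1/\alpha}$, obtained by rearranging the left inequality of (7).

```python
import math
import random

ALPHA = 2.0        # degree of positive homogeneity of p
A, B = 0.5, 3.0    # hypothetical positive coefficients


def p(x):
    # PH function of degree 2: p(t x) = t**2 * p(x) for all t > 0.
    return A * x[0] ** 2 + B * x[1] ** 2


# On the Euclidean unit circle this p takes values in [min(A, B), max(A, B)].
m_p, M_p = min(A, B), max(A, B)


def norm(x):
    return math.hypot(x[0], x[1])


def satisfies_bounds(x, tol=1e-12):
    # Two-sided bound (7): ||x|| m_p^(1/alpha) <= p(x)^(1/alpha) <= ||x|| M_p^(1/alpha).
    lower = norm(x) * m_p ** (1.0 / ALPHA)
    middle = p(x) ** (1.0 / ALPHA)
    upper = norm(x) * M_p ** (1.0 / ALPHA)
    return lower <= middle + tol and middle <= upper + tol


def outer_radius(x):
    # Radius of a ball containing the sublevel set of p through x.
    return (p(x) / m_p) ** (1.0 / ALPHA)


def sample_points(count, seed=0):
    rng = random.Random(seed)
    return [(rng.uniform(-5, 5), rng.uniform(-5, 5)) for _ in range(count)]
```

With these helpers, every sampled point `y` satisfying `p(y) <= p(x)` should have `norm(y) <= outer_radius(x)`, and every sampled point with `norm(y) <= rho` should satisfy `p(y) <= rho ** ALPHA * M_p`.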

We can generalize both propositions to continuous scaling-invariant functions

using Theorem 3.1.

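Proposition 4.5 Let $f$ be a continuous SI function such that $f(x) > 0$ for all $x \neq 0$. Then there exist an increasing homeomorphism $\varphi$ on $\mathbb{R}_+$ and two positive numbers $0 < m \leq M$ such that

(i) for all $x \neq 0$, $\varphi(m\|x\|) \leq f(x) \leq \varphi(M\|x\|)$,

(ii) for all $\rho > 0$, the ball centered in $0$ and of radius $\rho$ is included in the sublevel set of degree $\varphi(\rho\,\varphi^{-1}(M))$, i.e. $\mathcal{B}(0, \rho) \subset \mathcal{L}^{\leq}_{f,\rho x_M}$ with $f(\rho x_M) = \varphi\left(\rho\,\varphi^{-1}(M)\right)$,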


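(iii) for all $x \neq 0$, the sublevel set of degree $f(x)$ is included into the ball centered in $0$ and of radius $\frac{\varphi^{-1}(f(x))}{\varphi^{-1}(m)}$, i.e.

$$\mathcal{L}^{\leq}_{f,x} \subset \mathcal{B}\left(0, \frac{\varphi^{-1}(f(x))}{\varphi^{-1}(m)}\right). \tag{8}$$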

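Proof Thanks to Theorem 3.1, we can write $f = f_{x_0} \circ p$ where $x_0 \neq 0$, $p$ is PH$_1$, and $\varphi$, defined as $f_{x_0}$, is an increasing homeomorphism. Then $p = \varphi^{-1} \circ f$ and hence verifies: for all $x \neq 0$, $p(x) > 0$. Define $m$ and $M$ as $m = \varphi(m_p)$ and $M = \varphi(M_p)$, where $m_p = \min_{x \in \mathcal{S}_1} p(x)$ and $M_p = \max_{x \in \mathcal{S}_1} p(x)$.

For $x \neq 0$, Proposition 4.3 ensures that $m_p \|x\| \leq p(x) \leq M_p \|x\|$. Taking the image of this inequality by $\varphi$ proves (i).

For all $\rho > 0$, $\mathcal{B}(0, \rho) \subset \mathcal{L}^{\leq}_{p,\rho x_{M_p}}$, with $p(x_{M_p}) = M_p$. Since sublevel sets are invariant with respect to an increasing bijection, it follows that $\mathcal{L}^{\leq}_{p,\rho x_{M_p}} = \mathcal{L}^{\leq}_{f,\rho x_{M_p}}$. In addition, $f(\rho x_{M_p}) = \varphi(p(\rho x_{M_p})) = \varphi(\rho\, p(x_{M_p})) = \varphi(\rho\,\varphi^{-1}(M))$, so that we have proven (ii).

Let $x \neq 0$. Again by invariance of the sublevel set, $\mathcal{L}^{\leq}_{p,x} = \mathcal{L}^{\leq}_{f,x}$. And Proposition 4.4 says that $\mathcal{L}^{\leq}_{p,x} \subset \mathcal{B}\left(0, \frac{p(x)}{m_p}\right)$. We obtain the result with the facts that $p = \varphi^{-1} \circ f$ and $m_p = \varphi^{-1}(m)$. □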

4.5 A generalization of a weak formulation of Euler's homogeneous function theorem

For a function $p: \mathbb{R}^n \to \mathbb{R}$ continuously differentiable on $\mathbb{R}^n \setminus \{0\}$, Euler's homogeneous function theorem states that $p$ is PH with degree $\alpha$ if and only if for all $x \neq 0$

$$\alpha\, p(x) = \nabla p(x) \cdot x. \tag{9}$$

If in addition $p$ is continuously differentiable on $\mathbb{R}^n$, then $\alpha\, p(0) = 0 = \nabla p(0) \cdot 0$. Along with (9), this latter equation implies that at each point $y$ of a level set $\mathcal{L}_{p,x}$, the scalar product between $\nabla p(y)$ and $y$ is constant and equal to $\nabla p(x) \cdot x$, or that the level sets of $p$ and of the function $x \mapsto \nabla p(x) \cdot x$ are the same, that is, the level sets of a continuously differentiable PH function satisfy

$$\mathcal{L}_{p,x} = \mathcal{L}_{z \mapsto \nabla p(z) \cdot z,\, x} = \{y \in \mathbb{R}^n, \nabla p(y) \cdot y = \nabla p(x) \cdot x\}. \tag{10}$$

We call this a weak formulation of Euler's homogeneous function theorem.

If $f$ is a continuous SI function, we can write $f$ as $\varphi \circ p$ where $p$ is PH and $\varphi$ is a homeomorphism, according to Theorem 3.1. We have the following proposition in the case where $\varphi$ and $p$ are also continuously differentiable.


Proposition 4.6 Let $f$ be a continuously differentiable SI function that can be written as $\varphi \circ p$ where $p$ is PH$_\alpha$, $\varphi$ is a homeomorphism, and $\varphi$ and $\varphi^{-1}$ are continuously differentiable (and thus $p$ is continuously differentiable). Then for all $x \in \mathbb{R}^n$,

$$\nabla f(x) \cdot x = \alpha\, \varphi'(p(x))\, p(x). \tag{11}$$

Proof Since $p = \varphi^{-1} \circ f$, it is continuously differentiable. From the chain rule, for all $x \in \mathbb{R}^n$: $\nabla f(x) \cdot x = \varphi'(p(x)) \nabla p(x) \cdot x = \alpha\, \varphi'(p(x))\, p(x)$. The last equality results from Euler's homogeneous function theorem applied to $p$. □
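Identity (11) lends itself to a quick finite-difference check. The sketch below assumes a hypothetical pair $\varphi(t) = \log(1+t)$ on $\mathbb{R}_+$ and $p(x) = \|x\|^2$ (so $\alpha = 2$), chosen only because both are smooth and $\varphi^{-1}$ is continuously differentiable; the function names are illustrative. It uses the fact that $\nabla f(x) \cdot x$ is the derivative of $t \mapsto f(tx)$ at $t = 1$.

```python
import math

ALPHA = 2.0  # degree of the hypothetical PH function p below


def p(x):
    # p(x) = ||x||^2, positively homogeneous of degree 2.
    return sum(xi * xi for xi in x)


def phi(t):
    # Hypothetical increasing homeomorphism of R+ with a C^1 inverse.
    return math.log1p(t)


def dphi(t):
    return 1.0 / (1.0 + t)


def f(x):
    return phi(p(x))


def grad_dot_x(func, x, h=1e-6):
    # grad func(x) . x equals d/dt func(t x) at t = 1 (central difference).
    up = func([(1.0 + h) * xi for xi in x])
    down = func([(1.0 - h) * xi for xi in x])
    return (up - down) / (2.0 * h)


def euler_rhs(x):
    # Right-hand side of (11): alpha * phi'(p(x)) * p(x).
    return ALPHA * dphi(p(x)) * p(x)
```

For instance, comparing `grad_dot_x(f, [1.0, 2.0])` with `euler_rhs([1.0, 2.0])` should give two values that agree up to the finite-difference error.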

Yet, the assumptions of the previous proposition are not necessarily satisfied when $f$ is a continuously differentiable SI function. Indeed, we exhibit in the next proposition an example of an SI and continuously differentiable function $f$ such that $f = \varphi \circ p$ but either $p$ or $\varphi$ is non-differentiable.

Proposition 4.7 Define $\varphi: t \mapsto \int_0^t \frac{1}{1 + \log^2(u)}\, du$ on $\mathbb{R}_+$ and $p: x \mapsto |x_1|$. Then $f = \varphi \circ p$ is continuously differentiable and SI. Yet, for any $\tilde\varphi$ strictly increasing and $\tilde p$ PH such that $f = \tilde\varphi \circ \tilde p$ (including $\varphi$ and $p$ above), either $\tilde p$ is not differentiable on any point of the set $\{x;\, x_1 = 0\}$ or $\tilde\varphi$ is not differentiable at $0$.

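Proof Let us prove that $f$ is continuously differentiable. For $x \neq 0$, $\nabla f(x) = \frac{1}{1 + \log^2(|x_1|)} \frac{x_1}{|x_1|}$. Then $\lim_{x \to 0} \nabla f(x)$ exists and is equal to $0$, hence $f$ is continuously differentiable.

Assume that $(\tilde\varphi, \tilde p)$ is such that $\varphi \circ p = \tilde\varphi \circ \tilde p$, with $\tilde\varphi$ strictly increasing and $\tilde p$ PH$_\alpha$. Denote $\psi = \tilde\varphi^{-1} \circ \varphi$. For all $\lambda > 0$ and $x \in \mathbb{R}^n$, $\psi(\lambda p(x)) = \psi(p(\lambda x)) = \tilde p(\lambda x) = \lambda^{\alpha} \psi(p(x))$. Therefore $\psi$ is PH$_\alpha$ on $\mathrm{Im}(p) = \mathbb{R}_+$, hence for all $t > 0$, $\psi(t) = t^{\alpha} \psi(1)$. Then up to a positive constant multiplicative factor, $\tilde p(x) = |x_1|^{\alpha}$ and $\tilde\varphi(t) = \varphi(t^{1/\alpha})$. And then if $\tilde p$ is differentiable, we necessarily have that $\alpha > 1$.

In the case where $\alpha > 1$, for all $t > 0$, $\tilde\varphi'(t) = \frac{1}{\alpha} \frac{t^{\frac{1}{\alpha} - 1}}{1 + \log^2(t^{1/\alpha})}$, and then $\tilde\varphi$ is not differentiable at $0$. □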
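The last step of this proof can be made explicit with a change of variable. The following is a sketch of our reading, with the substitution $u = t^{1/\alpha}$ introduced here for illustration: the derivative of $\tilde\varphi$ blows up as $t \to 0^+$, which rules out differentiability of $\tilde\varphi$ at $0$ by the mean value theorem.

```latex
% Behaviour of \tilde\varphi'(t) as t \to 0^+ when \alpha > 1,
% using the substitution u = t^{1/\alpha}, i.e. t = u^{\alpha}.
\[
\tilde\varphi'(t)
  = \frac{1}{\alpha}\,\frac{t^{\frac{1}{\alpha}-1}}{1+\log^{2}\!\bigl(t^{1/\alpha}\bigr)}
  = \frac{1}{\alpha}\,\frac{u^{1-\alpha}}{1+\log^{2}(u)}
  \xrightarrow[u \to 0^{+}]{} +\infty ,
\]
% because u^{1-\alpha} = u^{-(\alpha-1)} grows like a positive power of 1/u,
% whereas 1 + \log^{2}(u) grows only logarithmically in 1/u.
% Since \tilde\varphi is continuous at 0 and \tilde\varphi'(t) \to +\infty,
% the difference quotient (\tilde\varphi(t) - \tilde\varphi(0))/t is unbounded
% as t \to 0^{+}, so \tilde\varphi has no finite derivative at 0.
```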

Yet we can prove that for all continuously differentiable SI functions, the level set of $f$ going through $x$, i.e. $\mathcal{L}_{f,x}$, is included in the level set of $z \mapsto \nabla f(z) \cdot z$ going through $x$.

Lemma 4.1 For a continuously differentiable SI function $f$ and for $x \in \mathbb{R}^n$,

$$\mathcal{L}_{f,x} \subset \mathcal{L}_{z \mapsto \nabla f(z) \cdot z,\, x} = \{y \in \mathbb{R}^n, \nabla f(y) \cdot y = \nabla f(x) \cdot x\}. \tag{12}$$

That is, each level set of $f$ has a single value of $\nabla f(x) \cdot x$, while different level sets of $f$ can also have the same value of $\nabla f(x) \cdot x$.


Proof Let $y \in \mathcal{L}_{f,x}$. Since $f(y) = f(x)$, then for all $t \geq 0$, $f(ty) = f(tx)$. We define the function $h$ on $\mathbb{R}_+$ such that for all $t \geq 0$, $h(t) = f(tx) - f(ty)$. Then $h$ is the zero function, and so is its derivative: $h'(t) = \nabla f(tx) \cdot x - \nabla f(ty) \cdot y = 0$ for all $t \geq 0$. In particular we have the result for $t = 1$. □

We exhibit in the next proposition a continuously differentiable SI function where the inclusion in the above lemma is strict (another example is Lemma 4.2).

Proposition 4.8 Let $p$ be the PH$_2$ function $x \in \mathbb{R}^n \mapsto \|x\|^2$ and $\varphi$ the strictly monotonic function $\varphi(t) = \exp(-t)$ for all $t \geq 0$. Then $f: x \mapsto \varphi(p(x)) = \exp(-\|x\|^2)$ is continuously differentiable. For any $0 < r < 1$, there is a unique $s > 1$ such that for any $x \in \mathcal{S}_r$:

$$\mathcal{L}_{z \mapsto \nabla f(z) \cdot z,\, x} = \mathcal{L}_{f,x} \cup \mathcal{L}_{f, \frac{s}{r}x} \tag{13}$$

where $\mathcal{L}_{f,x}$ and $\mathcal{L}_{f, \frac{s}{r}x}$ are disjoint.

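Proof Remark that the transformation could be chosen to obtain a degree equal to 1 as in Theorem 3.1, but the differentiability of $p$ would not be guaranteed.

We notice that $t \mapsto t\varphi'(t)$ is not injective on $\mathbb{R}_+$. It is injective on $[0,1)$ and on $[1,\infty)$, and for any $0 < r < 1$ there is a unique $s > 1$ such that

$$r^2 \varphi'(r^2) = s^2 \varphi'(s^2). \tag{14}$$

We will prove that for any such $r, s$ and any $x \in \mathcal{S}_r$,

$$\{y \in \mathbb{R}^n, \nabla f(y) \cdot y = \nabla f(x) \cdot x\} = \mathcal{L}_{f,x} \cup \mathcal{L}_{f, \frac{s}{r}x}. \tag{15}$$

Let $x \in \mathbb{R}^n$ such that $\|x\| = r$. By the chain rule, for all $y \in \mathbb{R}^n$ we have

$$\nabla f(y) \cdot y = \varphi'(p(y)) \nabla p(y) \cdot y = 2\varphi'(p(y))\, p(y).$$

Therefore $y \in \{y \in \mathbb{R}^n, \nabla f(y) \cdot y = \nabla f(x) \cdot x\}$ if and only if $\|y\|^2 \varphi'(\|y\|^2) = r^2 \varphi'(r^2)$. From (14), we know that this is possible only if $\|y\| = r$ or $\|y\| = s$, i.e. only if $f(y) = f(x)$ or $f(y) = f\left(\frac{s}{r}x\right)$. Hence the equality in (15).

We remark that $\mathcal{L}_{f,x}$ and $\mathcal{L}_{f, \frac{s}{r}x}$ are disjoint whenever $f(x) \neq f\left(\frac{s}{r}x\right)$. If $x \in \mathcal{S}_r$, $f(x) = e^{-r^2} \neq e^{-s^2} = f\left(\frac{s}{r}x\right)$, which implies that $\mathcal{L}_{f,x}$ and $\mathcal{L}_{f, \frac{s}{r}x}$ are disjoint. □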
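The pairing between $r$ and $s$ in (14) can be computed numerically for $\varphi(t) = \exp(-t)$. The sketch below is illustrative only: the helper `matching_radius` and the bisection scheme are our own choices, not part of the paper. It finds $s > 1$ with $s^2 e^{-s^2} = r^2 e^{-r^2}$ and allows one to verify that $\nabla f(z) \cdot z$ agrees on the spheres of radius $r$ and $s$ while $f$ does not.

```python
import math


def g(u):
    # u * exp(-u) equals -u * phi'(u) for phi(t) = exp(-t);
    # it increases on [0, 1] and decreases on [1, infinity).
    return u * math.exp(-u)


def matching_radius(r):
    # For 0 < r < 1, find s > 1 with g(s**2) = g(r**2), by bisection on u = s**2.
    target = g(r * r)
    lo, hi = 1.0, 2.0
    while g(hi) > target:   # enlarge the bracket until g(hi) <= target
        hi *= 2.0
    for _ in range(200):    # g is decreasing on [lo, hi]
        mid = 0.5 * (lo + hi)
        if g(mid) > target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(0.5 * (lo + hi))


def f(x):
    return math.exp(-sum(xi * xi for xi in x))


def grad_dot_x(x):
    # Closed form of grad f(x) . x = -2 ||x||^2 exp(-||x||^2).
    n2 = sum(xi * xi for xi in x)
    return -2.0 * n2 * math.exp(-n2)
```

For example, with `r = 0.5` and `s = matching_radius(r)`, the points `(r, 0.0)` and `(0.0, s)` give the same value of `grad_dot_x` but different values of `f`, which is the strict inclusion described in Proposition 4.8.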

The non-injectivity of $t \mapsto t\varphi'(t)$ is essential in the above example to obtain a strict inclusion in (12) for some SI functions. We obtain a weak formulation of Euler's homogeneous function theorem for some SI functions in the following proposition.

Proposition 4.9 Let $f$ be a continuously differentiable SI function that can be written as $\varphi \circ p$ where $\varphi$ is a homeomorphism, $p$ is PH$_1$ and $\varphi$ and $\varphi^{-1}$ are continuously differentiable. Assume that the function $t \in \mathbb{R}_+ \mapsto t\varphi'(t) \in \mathbb{R}$ is injective. Then for $x \in \mathbb{R}^n$,

$$\mathcal{L}_{f,x} = \mathcal{L}_{z \mapsto \nabla f(z) \cdot z,\, x}. \tag{16}$$


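Proof It follows from Proposition 4.6 that for all $x \in \mathbb{R}^n$, $\nabla f(x) \cdot x = \varphi'(p(x))\, p(x)$. Thanks to the bijectivity of $\varphi$ along with the injectivity of $t \mapsto t\varphi'(t)$, we have:

$$\varphi'(p(y))\, p(y) = \varphi'(p(x))\, p(x) \iff p(x) = p(y) \iff f(x) = f(y).$$

In other words, $\mathcal{L}_{f,x} = \{y \in \mathbb{R}^n, \nabla f(y) \cdot y = \nabla f(x) \cdot x\}$. □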

4.6 Compact neighborhoods of level sets with non-vanishing gradient

We prove in this section that any continuously differentiable SI function $f$ with a unique global argmin has level sets, for example $\mathcal{L}_{f,z_0}$, such that for some compact neighborhood of the level set, $\mathcal{N} \supset \mathcal{L}_{f,z_0}$, the gradient does not vanish and $\nabla f(z) \cdot z > 0$ for all $z \in \mathcal{N}$.

For a continuously differentiable PH function $p$ such that $p(x) > 0$ for all $x \neq 0$, i.e. such that $0$ is the unique global argmin of $p$, this result is a consequence of Euler's homogeneous function theorem, which implies that

$$\nabla p(x) \cdot x > 0 \text{ for all } x \neq 0. \tag{17}$$

In particular, (17) is true on any compact neighborhood of any level set of $p$, if that compact set does not contain $0$.

We now remark that the property that $\nabla f(x) \neq 0$ for all $x \neq 0$ is not necessarily true if $f$ is a continuously differentiable SI function with a unique global argmin. Namely, $f$ can have level sets that contain only saddle points.

Lemma 4.2 Let $p(z) = \|z\|^2$ and $\varphi(t) = \int_0^t \sin^2(u)\, du$ for $t \geq 0$. Then $f = \varphi \circ p$ is a continuously differentiable SI function with a unique global argmin and an infinite number of $z$ belonging to different level sets of $f$, such that $\nabla f(z) = 0$.

Proof The function $\varphi$ is strictly increasing since $\sin^2$ is non-negative and has zeros on isolated points. Also, for all $t \geq 0$, $\varphi(t) = \frac{t}{2} - \frac{\sin(2t)}{4}$, where we use that $\cos(2t) = 1 - 2\sin^2(t)$.

For any natural integer $n$, $n\pi$ is a stationary point of inflection of $\varphi$: $\varphi'(n\pi) = 0$ and $\varphi''(t) = \sin(2t)$ has opposite signs on either side of $n\pi$. For all $z$ with $\|z\|^2 \in \pi\mathbb{Z}_+$, $\nabla f(z) = \varphi'(p(z)) \nabla p(z) = 2\varphi'(\|z\|^2)\, z = 0$. Hence there exists an infinite number of level sets $\mathcal{L}_{f,z}$ for which $\nabla f(z) = 0$. □
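The example of Lemma 4.2 is easy to evaluate directly. The sketch below is a minimal illustration, with the helper `stationary_point` and the choice of the first coordinate axis being our own assumptions: it uses the closed form of $\varphi$ from the proof and evaluates the gradient on spheres with $\|z\|^2 = n\pi$.

```python
import math


def phi(t):
    # Closed form of the integral of sin^2 over [0, t].
    return t / 2.0 - math.sin(2.0 * t) / 4.0


def dphi(t):
    return math.sin(t) ** 2


def f(z):
    return phi(sum(zi * zi for zi in z))


def grad_f(z):
    # Chain rule: grad f(z) = 2 * phi'(||z||^2) * z.
    n2 = sum(zi * zi for zi in z)
    return [2.0 * dphi(n2) * zi for zi in z]


def stationary_point(n, dim=2):
    # A point with ||z||^2 = n * pi, placed on the first coordinate axis.
    return [math.sqrt(n * math.pi)] + [0.0] * (dim - 1)
```

Evaluating `grad_f(stationary_point(n))` for successive `n` should return vectors that vanish up to rounding, while `f(stationary_point(n))` keeps increasing with `n`, so these stationary points lie on different level sets.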

Yet, a consequence of Theorem 4.2 and Lemma 4.1 is the existence of a level set of $f$ such that $\nabla f(z) \cdot z > 0$ for all $z$ in that level set, as shown in the next proposition.

Proposition 4.10 Let $f$ be a continuously differentiable SI function with $0$ as unique global argmin. There exists $z_0 \in \mathcal{B}_1$ with $\mathcal{L}_{f,z_0} \subset \mathcal{B}_1$, such that for all $z \in \mathcal{L}_{f,z_0}$, $\nabla f(z) \cdot z > 0$.


Proof Since $f$ is a continuous SI function, we have all the equivalences in Theorem 4.2.

Inside the proof of Theorem 4.2, we have shown that there exists $s \in \mathcal{S}_1$ such that $\mathcal{L}^{\leq}_{f,s} \subset \mathcal{B}_1$, with $f(s) = \min_{z \in \mathcal{S}_1} f(z)$. Since $f_s$ is strictly increasing and differentiable, there exists $t \in (0,1]$ such that $f_s'(t) > 0$. Let us denote $z_0 = ts$. We have that $\mathcal{L}_{f,z_0} \subset \mathcal{L}^{\leq}_{f,s} \subset \mathcal{B}_1$. And with the chain rule, $0 < f_s'(t) = \frac{\nabla f(z_0) \cdot z_0}{t}$. Therefore along with Lemma 4.1, it follows that for all $z \in \mathcal{L}_{f,z_0}$, $\nabla f(z) \cdot z = \nabla f(z_0) \cdot z_0 > 0$. □

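From the uniform continuity of $z \mapsto \nabla f(z) \cdot z$ on a compact set we deduce the announced result.

Proposition 4.11 Let $f$ be a continuously differentiable SI function with $0$ as unique global argmin. There exist $\delta > 0$ and $z_0 \in \mathcal{B}_1$ with $\mathcal{L}_{f,z_0} \subset \mathcal{B}_1$ such that for all $z \in \mathcal{L}_{f,z_0} + \mathcal{B}(0, \delta)$, $\nabla f(z) \cdot z > 0$.

Proof Since $\nabla f(z) \cdot z > 0$ for all $z$ in the compact set $\mathcal{L}_{f,z_0}$, the function $z \mapsto \nabla f(z) \cdot z$ has a positive minimum (that is reached), denoted by $\epsilon = \min_{z \in \mathcal{L}_{f,z_0}} \nabla f(z) \cdot z > 0$. The continuous function $z \mapsto \nabla f(z) \cdot z$ is uniformly continuous on the compact set $\mathcal{L}_{f,z_0} + \mathcal{B}(0,1)$, therefore there exists a positive number $\delta < 1$ such that if $y, z \in \mathcal{L}_{f,z_0} + \mathcal{B}(0,1)$ with $\|y - z\| \leq \delta$ then $|\nabla f(z) \cdot z - \nabla f(y) \cdot y| < \frac{\epsilon}{2}$. Then for all $z \in \mathcal{L}_{f,z_0} + \mathcal{B}(0, \delta)$, there exists $y \in \mathcal{L}_{f,z_0}$ such that $|\nabla f(z) \cdot z - \nabla f(y) \cdot y| < \frac{\epsilon}{2}$. Then $\nabla f(z) \cdot z > \nabla f(y) \cdot y - \frac{\epsilon}{2} \geq \frac{\epsilon}{2} > 0$. Hence $z \mapsto \nabla f(z) \cdot z$ is positive on the compact set $\mathcal{L}_{f,z_0} + \mathcal{B}(0, \delta)$. □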

Acknowledgements

Part of this research has been conducted in the context of a research collaboration between Storengy and Inria. We particularly thank F. Huguet and A. Lange from Storengy for their strong support.

References

1. Joseph Aczél. Lectures on functional equations and their applications. Academic Press, 1966.

2. Anne Auger and Nikolaus Hansen. Linear convergence on positively homogeneous functions of a comparison based step-size adaptive randomized search: the (1+1)-ES with generalized one-fifth success rule. arXiv preprint arXiv:1310.8397, 2013.

3. Anne Auger and Nikolaus Hansen. Linear convergence of comparison-based step-size adaptive randomized search via stability of Markov chains. SIAM Journal on Optimization, 26(3):1589–1624, 2016.

4. Gerard Buskes and Arnoud van Rooij. Topological spaces. In Topological Spaces, pages 187–201. Springer, 1997.

5. Arnaud Denjoy. Sur les fonctions dérivées sommables. Bulletin de la Société Mathématique de France, 43:161–248, 1915.

6. J Dutta, JE Martinez-Legaz, and AM Rubinov. Monotonic analysis over cones: I. Optimization, 53(2):129–146, 2004.

7. Hervé Fournier and Olivier Teytaud. Lower bounds for comparison based evolution strategies using VC-dimension and sign patterns. Algorithmica, 59(3):387–408, 2011.

8. Valentin V Gorokhovik and Marina Trafimovich. Positively homogeneous functions revisited. Journal of Optimization Theory and Applications, 171(2):481–503, 2016.

9. Valentin V Gorokhovik and Marina Trafimovich. Saddle representations of positively homogeneous functions by linear functions. Optimization Letters, 12(8):1971–1980, 2018.

10. Godfrey Harold Hardy. Weierstrass's non-differentiable function. Trans. Amer. Math. Soc., 17(3):301–325, 1916.

11. Marek Kuczma. An introduction to the theory of functional equations and inequalities: Cauchy's equation and Jensen's inequality. Springer Science & Business Media, 2009.

12. JB Lasserre and JB Hiriart-Urruty. Mathematical properties of optimization problems defined by positively homogeneous functions. Journal of Optimization Theory and Applications, 112(1):31–52, 2002.

13. Daiki Morinaga and Youhei Akimoto. Generalized drift analysis in continuous domain: linear convergence of (1+1)-ES on strongly convex functions with Lipschitz continuous gradients. In Proceedings of the 15th ACM/SIGEVO Conference on Foundations of Genetic Algorithms, pages 13–24, 2019.

14. Marian Muresan. A concrete approach to classical analysis, volume 14. Springer, 2009.

15. John A Nelder and Roger Mead. A simplex method for function minimization. The Computer Journal, 7(4):308–313, 1965.

16. AM Rubinov and RN Gasimov. Strictly increasing positively homogeneous functions with application to exact penalization. Optimization, 52(1):1–28, 2003.

17. AM Rubinov and BM Glover. Duality for increasing positively homogeneous functions and normal sets. RAIRO-Operations Research-Recherche Opérationnelle, 32(2):105–123, 1998.

A Bijection Theorem

This standard theorem is recalled for the sake of completeness.

Theorem A.1 (Bijection theorem, [14, Theorem 2.20]) Let $I \subset \mathbb{R}$ be a non-empty interval, $J \subset \mathbb{R}$ and $\varphi: I \to J$ be a continuous bijection (and therefore strictly monotonic). Then $J$ is an interval and $\varphi$ is a homeomorphism, i.e. $\varphi^{-1}: J \to I$ is also a continuous bijection, and if $\varphi$ is strictly increasing (respectively strictly decreasing), then $\varphi^{-1}$ is strictly increasing (respectively strictly decreasing).