
Validated integration of differential equations with state-dependent delay



We present an implicit method of steps for differential equations with state-dependent delays and validated numerics to rigorously enclose solutions of initial-value problems. Our approach uses a combination of contraction mapping arguments based on a Newton-Kantorovich type theorem and piecewise polynomial interpolation. Completing multiple steps of integration is challenging, and we resolve it by smooth interpolation of the previous solution, resulting in an interval-valued polynomial initial condition for the subsequent step. A set of examples is provided.
Kevin E. M. Church
Centre de Recherches Mathématiques, Université de Montréal
April 14, 2022
1 Introduction
Functional differential equations (FDE) have been studied for over 200 years, with substantial developments since the 1950s. Broadly, an FDE of retarded type is a differential equation
$$\dot{x}(t) = f(x_t),$$
where $f$ is a functional acting on a function space, and $x_t$ denotes a history function that “windows” the solution until some time in the past. When this function space consists of the continuous functions on a compact interval, and $f$ is at least Lipschitz continuous, there is a well-developed theory popularized by the works of Diekmann, van Gils, Lunel and Walther [9], Hale & Lunel [12] and Krasovskii [19], among others. The situation is far less clear-cut for systems with so-called state-dependent delay, where the functional $f$ cannot be understood as Lipschitz continuous on the typical space of continuous functions. In such cases, there remain several open problems concerning the regularity of the semiflow and so-called solution manifolds and invariant manifolds [20].
Consider the differential equation with state-dependent delay (DE-SDD), written in the form
$$\dot{x}(t) = f(x(t), x(t - \tau(t, x(t)))). \tag{1}$$
We assume $f : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}^d$ and $\tau : \mathbb{R} \times \mathbb{R}^d \to \mathbb{R}$ are $C^k$ for some $k \geq 1$. More generally, one could instead allow $f$ and $\tau$ to only be defined on an open domain $\Omega \subset \mathbb{R}^d$. Equation (1) includes
Time-varying delays: $\tau(t, x) = r(t)$ for $r : \mathbb{R} \to \mathbb{R}$.
Discrete state-dependent delays: $\tau(t, x) = r(x)$ for $r : \mathbb{R}^d \to \mathbb{R}$.
An assumption of causality might be physically reasonable (i.e. $\tau(t, x) \geq 0$), but in many applications causality must be verified at the level of a particular solution a posteriori. A classical example of an equation with time-varying delay is the pantograph equation [22], $\dot{x}(t) = ax(t) + bx(\lambda t)$ for $\lambda \in (0,1)$. As for discrete state-dependent delay problems, these continue to see application in cell biology [2, 10], electrodynamics [8, 23] and other fields. In the present work, we will concentrate on the case of discrete state-dependent delays: that is, formally, $\tau(t, x) = \tau(x)$ for some function $\tau : \mathbb{R}^d \to \mathbb{R}$. However, results will be stated with as much generality as possible.
In many applications, the delay is an implicit function of the state and is not given explicitly. Examples
include the two-body problem in electrodynamics and position control models, among others, and we
refer the reader to the chapter [13] for several detailed examples. In many such situations, however,
it is possible to determine other differential equations that are satisfied by the delay variables. This
generally leads to a state-dependent delay differential equation of the form (1), perhaps of higher
dimension and with more than one delay. We consider here the case of a single delay, although the
ideas developed in this paper could be extended to the case of multiple delays.
Recently, there has been a surge of interest in computer-assisted proofs in delay differential equations. Prominent examples include the proofs of Wright’s conjecture [27] and Jones’ conjecture [18]. Several works have addressed approaches for validated computation of periodic orbits [24, 26], integration [21, 24] and parameterization of unstable manifolds [17]. Insofar as validated integration is concerned, the Taylor methods [24] and Chebyshev spectral methods [21] seem to be the most recent. They are appropriate for fixed, constant delays. Our objective with the present work is to develop rigorous numerics for an implicit method of steps, using polynomial interpolation.
1.1 Background on state-dependent delay equations
State-dependent delay equations (SDDE) are very different from those with constant delays. Superficially, the biggest change is that they are always nonlinear. More deeply, the semiflow of a state-dependent delay equation is generally only $C^1$ on its associated solution manifold. For these and related notions, the reader may consult [28, 29].
There is a large body of literature on numerical methods for SDDE, and we will make no effort to describe this here. We will, however, mention some recent work on the development of a posteriori validation approaches to invariant objects of SDDE. There is the work on periodic orbits in state-dependent delayed perturbations of ordinary differential equations of Yang, Gimeno and de la Llave [31, 30], and a recent extension of this work [11] has led to computer-assisted proofs of such perturbations of periodic orbits. Parameterization of quasi-periodic solutions under a small state-dependent delay and exponential dichotomy assumption was considered by He and de la Llave [14].
1.2 The method of steps
Recall the method of steps for delay differential equations. Without loss of generality, assume a constant unit delay and the task of solving the initial-value problem
$$\dot{x}(t) = f(x(t), x(t-1)), \quad t > 0,$$
$$x(\theta) = \phi(\theta), \quad \theta \in [-1, 0].$$
The method of steps exploits the fact that for $t \in [0,1]$, we have $x(t-1) = \phi(t-1)$. The task of solving the IVP above is equivalent to solving a sequence of ODE initial-value problems. Denote $\phi_0 = \phi$ and consider the sequence of IVPs for $n \in \mathbb{N}$:
$$\dot{\phi}_n(t) = f(\phi_n(t), \phi_{n-1}(t-1)), \quad t \in [n-1, n],$$
$$\phi_n(n-1) = \phi_{n-1}(n-1).$$
We have $x(t) = \phi_n(t)$ whenever $t \in [n-1, n]$, so the solution of the delay IVP can be written compactly as a sum of the segments $\phi_n$ weighted by indicator functions. Here, $\mathbf{1}_X$ is the indicator function on the set $X$, and by an abuse of notation we define $\phi_n(t) = 0$ for $t \notin [n-1, n]$ so that this sum is well-defined.
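The classical method of steps can be illustrated on a toy problem. The sketch below is not taken from the paper: it assumes the scalar equation $\dot{x}(t) = -x(t-1)$ with a polynomial initial function, and the helper names (`integrate`, `evaluate`, `method_of_steps`) are hypothetical. Each segment $\phi_n$ is stored as an exact polynomial in the local variable $s = t - (n-1)$, so every ODE step reduces to one exact integration.

```python
from fractions import Fraction

def integrate(poly):
    """Antiderivative (zero constant term) of a coefficient list [c0, c1, ...]."""
    return [Fraction(0)] + [c / (i + 1) for i, c in enumerate(poly)]

def evaluate(poly, s):
    """Evaluate a coefficient list at s by Horner's rule."""
    acc = Fraction(0)
    for c in reversed(poly):
        acc = acc * s + c
    return acc

def method_of_steps(phi0, n_steps):
    """Return [phi_0, ..., phi_n] for the toy equation x'(t) = -x(t - 1).

    Segment phi_n lives on [n-1, n] and is stored in the local variable
    s = t - (n - 1) in [0, 1], so phi_{n-1}(t - 1) has the same local argument.
    """
    segments = [[Fraction(c) for c in phi0]]
    for _ in range(n_steps):
        prev = segments[-1]
        start = evaluate(prev, Fraction(1))   # phi_n(n-1) = phi_{n-1}(n-1)
        seg = [-c for c in integrate(prev)]   # -int_0^s phi_{n-1}(sigma) d sigma
        seg[0] += start
        segments.append(seg)
    return segments

# Initial function phi = 1 on [-1, 0]; three steps cover t in [0, 3].
segments = method_of_steps([1], 3)
```

With the constant initial function, the first segment is $1 - s$ and the values at $t = 2$ and $t = 3$ are $-1/2$ and $-1/6$. Exact rational arithmetic is used only to keep the illustration free of rounding; it is not a substitute for the validated numerics developed later.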
The Chebyshev spectral approach of [21] makes explicit use of the method of steps by representing the functions $\phi_n$ at each step as Chebyshev series. The power of this spectral method comes from the excellent approximation properties of the Chebyshev polynomials and the Banach algebra associated to the sequence space used in the proofs. Polynomial nonlinearities in $f$ translate to cosine convolutions in the sequence space, and the representation of the derivative operator is diagonally dominant. To compare, the Taylor method of [24] represents solutions as piecewise-Taylor expansions, using a bootstrapping procedure to get high-order derivatives of the solution needed to step the procedure forward iteratively, and the step size is not necessarily equal to the delay.
Our initial goal was to develop a fully spectral integrator based on Chebyshev expansion. In moving from the constant delay case to the non-constant or state-dependent case, there are some technical problems that must be either resolved or circumvented. Chiefly, self-compositions (for example, $x(t) \mapsto x(t - r(x(t)))$) are highly nonlinear, and are difficult to characterize at the level of a sequence algebra. The obstructions were so severe that we abandoned the approach. In the next section, we will overview these obstructions in more detail.
1.3 Interaction of self-compositions with spectral methods
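Let $x(t) = a_0 + 2\sum_{n=1}^{\infty} a_n T_n(t)$ denote a Chebyshev series, uniformly convergent on $[-1,1]$ and such that $x(t) \in [-1,1]$. Let $y(t) = b_0 + 2\sum_{n=1}^{\infty} b_n T_n(t)$ denote another Chebyshev series, and let us consider the composition $y \circ x(t) = c_0 + 2\sum_{n \geq 1} c_n T_n(t)$. We might ask: how do we compute the coefficients $\{c_n\}$ given the coefficients $\{a_n\}$ and $\{b_n\}$, and how are the decay rates of these sets of coefficients related?
Write the Chebyshev polynomials in the form $T_n(t) = \sum_{k=0}^{n} s_{n,k} t^k$. Then we have
$$y(x(t)) = b_0 + 2\sum_{m \geq 1} b_m \sum_{j=0}^{m} s_{m,j}\Big(a_0 + 2\sum_{n \geq 1} a_n T_n(t)\Big)^{j} = b_0 + 2\sum_{m \geq 1} b_m \sum_{j=0}^{m} s_{m,j}\Big((a^{*j})_0 + 2\sum_{n \geq 1} (a^{*j})_n T_n(t)\Big),$$
where $a^{*m}$ denotes the $m$-fold cosine convolution of $a$ with itself: $a^{*m} \overset{\text{def}}{=} (a^{*(m-1)} * a)$ and $a^{*0} = (1, 0, 0, \ldots)$.
It follows that the operation $(y, x) \mapsto y \circ x$ is equivalent at the level of the Chebyshev series coefficients to the map $(b, a) \mapsto c$, where
$$c_n = \delta_{n,0}\, b_0 + 2\sum_{m \geq 1} b_m \sum_{j=0}^{m} s_{m,j} (a^{*j})_n, \tag{2}$$
with $\delta_{n,0}$ equal to one if $n = 0$ and zero otherwise.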
The previous derivation has a fairly direct consequence for numerical computations. If the coefficients $\{a_n\}$ are finitely supported with $a_n = 0$ for $n \geq N + 1$, then a self-composition $b = a \circ a$ will have support for $n \in \{0, \ldots, N^2\}$. Hence, rigorously evaluating a numerical defect can be expensive even for a low-order approximation, and the amount of information that is lost by taking a finite-mode projection gets quadratically worse for higher-order approximation.
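The quadratic growth of the support under self-composition can be checked on a small case. The following is a minimal sketch, assuming polynomials stored as monomial coefficient lists (the degree of a polynomial does not depend on the basis, so the count is the same for Chebyshev coefficients); the helper names are hypothetical.

```python
def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists [c0, c1, ...]."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_compose(p, q):
    """Coefficients of p(q(t)), computed by Horner's rule."""
    out = [0]
    for c in reversed(p):
        out = poly_mul(out, q)
        out[0] += c
    return out

def degree(p):
    """Index of the last nonzero coefficient."""
    return max((i for i, c in enumerate(p) if c != 0), default=0)

T3 = [0, -3, 0, 4]             # T_3(t) = 4 t^3 - 3 t, degree N = 3
T9 = poly_compose(T3, T3)      # T_3(T_3(t)) = T_9(t), degree N^2 = 9
```

Here a degree 3 input already produces a degree 9 output, in line with the support $\{0, \ldots, N^2\}$ stated above.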
Far more problematic from the point of view of rigorous Chebyshev (or Fourier) spectral methods: if one prescribes geometric decay of the coefficients (for example, $\|a\|_\nu = |a_0| + 2\sum_{n=1}^{\infty} \nu^n |a_n| < \infty$ for some $\nu > 1$), it is generally false that $\|a^{\circ 2}\|_\nu < \infty$. To see this, consider $x(t) = t(1 + t^2)^{-1}$. The poles of $x$ are at $\pm i$ and it is therefore analytic on the Bernstein ellipse $B_\rho$ for any $\rho < 1 + \sqrt{2}$, with the associated Chebyshev series expansion being uniformly convergent there. Let $\{a_n\}$ denote the Chebyshev coefficients of $x(t) = t(1 + t^2)^{-1}$. Then $\|a\|_\nu < \infty$ for $\nu < \sqrt{2}$. However, consider the self-composition $x \circ x$ and the associated Chebyshev coefficients $a^{\circ 2}$. The function $x \circ x$ has six poles, all imaginary, but for the present discussion it suffices to list only the two smallest ones: $t^* = \pm i\,(2^{-1}(3 - \sqrt{5}))^{1/2}$. Let $\rho^* \in (1, 1 + \sqrt{2})$ be such that $t^*$ is inside the Bernstein ellipse $B_{\rho^*}$. Then $\|a\|_\nu < \infty$ for $\nu = \rho^* - 1$, but $x \circ x$ has a pole in the interior of $B_{\rho^*}$. In particular, $\|a^{\circ 2}\|_\nu = \infty$. See Figure 1.
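The location of these poles relative to the Bernstein ellipses can be checked numerically. The sketch below rests on two facts that are not spelled out in the text and should be read as our own working: the ellipse $B_\rho$ crosses the positive imaginary axis at $i(\rho - \rho^{-1})/2$, and $x \circ x$ simplifies to $t(1+t^2)/(t^4 + 3t^2 + 1)$. The function name `bernstein_rho` is hypothetical.

```python
import math

def bernstein_rho(y):
    """Parameter rho of the Bernstein ellipse passing through the point i*y, y > 0.

    Solves (rho - 1/rho) / 2 = y for rho > 1.
    """
    return y + math.sqrt(y * y + 1.0)

# x(t) = t / (1 + t^2) has poles at t = +/- i.
rho_x = bernstein_rho(1.0)

# x(x(t)) = t (1 + t^2) / (t^4 + 3 t^2 + 1); the denominator roots closest to
# the real axis satisfy t^2 = -(3 - sqrt(5)) / 2.
y_star = math.sqrt((3.0 - math.sqrt(5.0)) / 2.0)
rho_xx = bernstein_rho(y_star)
```

This gives `rho_x` equal to $1 + \sqrt{2} \approx 2.4142$ and `rho_xx` roughly $1.79$, so the ellipse with parameter $2.1039$ used in Figure 1 contains the smallest poles of $x \circ x$ but none of the poles of $x$.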
Figure 1: The Bernstein ellipses $B_\rho$ for $\rho = 1 + \sqrt{2}$ (solid line) and $\rho = \rho^* \overset{\text{def}}{=} 2.1039$ (dashed line) are plotted together with the poles of $x(t)$ (circles) and $x \circ x(t)$ (stars). Since the interior of $B_{\rho^*}$ contains no poles of $x$, its sequence of Chebyshev coefficients has finite $\|\cdot\|_\nu$ norm, with $\nu = \rho^* - 1 = 1.1039$. However, $x \circ x$ has a pole in the interior of this ellipse, so its sequence of Chebyshev coefficients has infinite $\|\cdot\|_\nu$ norm.
For the purposes of solving an initial-value problem associated to a DE-SDD like (1), the self-composition barrier is not initially present. Indeed, the implicit method of steps introduced in Section 2.1 transforms an initial-value problem for a DE-SDD into a boundary-value problem for an ordinary differential equation. This ordinary differential equation depends on the initial condition. Its solution will then solve the initial-value problem on an interval of finite length. However, to solve the initial-value problem on a longer interval, the computed solution must be fed forward as the initial condition for a new initial-value problem. Viewing this process more globally, the feedforward structure is “lower triangular” but non-trivially coupled. Solving the initial-value problem
$$\dot{x}(t) = f(x(t), x(t - \tau(t, x(t)))), \qquad x(\theta) = \phi(\theta), \quad \theta \leq 0$$
on a suitably long time interval (such that two implicit steps occur; see later Section 2.1) is equivalent, after suitable transformations, to a fixed-point problem of the form
$$x_1(t) = \phi(0) + \int_0^t F_1(x_1, \phi)(s)\,ds,$$
$$x_2(t) = x_1(1) + \int_0^t F_2(x_2, x_1)(s)\,ds,$$
where $F_j : C^0([0,1], \mathbb{R}^d) \times C^0([0,1], \mathbb{R}^d) \to C^0([0,1], \mathbb{R}^d)$ for $j = 1, 2$ is a function that is related to the vector field $f$ but, importantly, features a compositional term. Specifically, $F_j(y, z)(s)$ contains a term of the form $y(\tau(s\delta, z(s)))$ for some constant $\delta$. At the level of a fixed-point problem, the pair of integral equations above define (almost; see Section 2.2) a map on $C^0([0,1], \mathbb{R}^d)^2$ that contains a self-composition, due to the presence of $F_2$. Therefore, moving to a purely spectral approach would still require understanding a composition formula such as (2). At present, this seems out of reach. For these reasons, we will avoid the use of purely spectral methods.
1.4 Structure of the paper
In the present work, we develop an implicit method of steps that is applicable to differential equations with state-dependent delay. This is done in Section 2. Specifically, Section 2.1 is more computational, while Section 2.2 is more theoretically minded, exploring how to translate the implicit method of steps to a zero-finding problem. In Section 3 we review some necessary background on interpolation and introduce the rigorous numerics framework for the zero-finding problem of the previous section. Section 4 contains the theoretical background for rigorous multiple integration steps, as well as extensive details concerning practical implementation. Several examples concerning a scalar state-dependent delay equation appear in Section 5.
1.5 Notation
Given a function $F : X_1 \times X_2 \times \cdots \times X_n \to Y$ for Banach spaces $X_1, \ldots, X_n$ and $Y$, we denote by $D_jF$ the partial Fréchet derivative of $F$ with respect to variables in $X_j$. For an interval $I$, we denote by $I^\circ$ its interior. If $I$ is closed we denote $I_+ = [\inf I, \sup I)$. $\mathbb{IR}$ denotes the set of real intervals, and we define $\mathbb{IR}^d \overset{\text{def}}{=} (\mathbb{IR})^d$ for any $d \geq 1$.
2 Solving DE-SDD
In this section we present the implicit method of steps and an associated zero-finding problem.
2.1 The implicit method of steps
Our modification to the method of steps will be based on the idea that, initially, the time lag $t - \tau(t, x)$ should be in the domain of the initial condition. In many situations the monotonicity of the time lag is assumed, so once the time lag leaves the domain of the initial condition, it can never return. To keep the method general, and allow for potential non-monotonicity of the time lag, we need to introduce a piece of machinery that glues solutions together.
Definition 1. Let $g_1 : [a, b] \to \mathbb{R}^d$ and $g_2 : [b, c] \to \mathbb{R}^d$ satisfy $g_1(b) = g_2(b)$. The annealing of $g_1$ and $g_2$ is the unique function $g_1 \cup g_2 : [a, c] \to \mathbb{R}^d$ such that $(g_1 \cup g_2)|_{[a,b]} = g_1$ and $(g_1 \cup g_2)|_{[b,c]} = g_2$.
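Consider the initial-value problem
$$\dot{x}(t) = f(x(t), x(h(t, x(t)))), \tag{3}$$
$$x(\theta) = \phi(\theta), \quad \theta \in \mathcal{J} \tag{4}$$
for $\phi : \mathcal{J} \to \mathbb{R}^d$, with $\mathcal{J}$ being a closed, non-degenerate interval. Here, the interpretation is that $h(t, x(t))$ represents the lagged time. Specifically, $h(t, x) = t - \tau(t, x)$, for the delay $\tau(t, x)$. We define the implicit method of steps to be the following algorithm.
0. Initialize solution: $x \leftarrow \phi$, $\mathcal{I} \leftarrow \mathcal{J}$, $\mathcal{T} \leftarrow \emptyset$. Define $t_0 = \sup \mathcal{J}$ and verify $h(t_0, \phi(t_0)) \in \mathcal{J}_+$.
1. Solve the following boundary-value problem for $\psi : [t_0, t_1] \to \mathbb{R}^d$, $t_1 > t_0$:
$$\dot{\psi}(t) = f(\psi(t), \phi(h(t, \psi(t)))), \quad t \in [t_0, t_1] \tag{5}$$
$$\psi(t_0) = \phi(t_0) \tag{6}$$
$$h(t_1, \psi(t_1)) = t_0. \tag{7}$$
2. Update the solution: $x \leftarrow x \cup \psi$, $\mathcal{I} \leftarrow \mathcal{J} \cup [t_0, t_1]$, $\mathcal{T} \leftarrow \mathcal{T} \cup \{t_0, t_1\}$.
3. Update the initial data: $\phi \leftarrow x$, $\mathcal{J} \leftarrow \mathcal{I}$. Define $t_0 = \sup \mathcal{J}$.
4. Return to step 1.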
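A single pass through step 1 can be sketched with ordinary, non-rigorous floating-point numerics. The code below is only an illustration under stated assumptions: it handles the scalar case, integrates (5) with the classical Runge-Kutta scheme, and locates the terminal time in (7) by bisection on the last step. The function names and the test problem $\dot{x}(t) = -x(t - 1 - x(t))$ with $\phi \equiv 1$ are hypothetical choices, not the paper's validated method.

```python
def rk4_step(rhs, t, x, dt):
    """One classical Runge-Kutta step for the scalar ODE x' = rhs(t, x)."""
    k1 = rhs(t, x)
    k2 = rhs(t + dt / 2, x + dt / 2 * k1)
    k3 = rhs(t + dt / 2, x + dt / 2 * k2)
    k4 = rhs(t + dt, x + dt * k3)
    return x + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

def implicit_step(f, h, phi, t0, dt=0.01, t_max=100.0, tol=1e-12):
    """Approximate (t1, psi(t1)) for the boundary-value problem (5)-(7).

    phi must accept arguments slightly above t0, because the last Runge-Kutta
    stages can probe lagged times beyond the domain of the initial data.
    """
    rhs = lambda t, x: f(x, phi(h(t, x)))
    gap = lambda t, x: h(t, x) - t0          # boundary condition (7): gap = 0
    t, x = t0, phi(t0)
    if gap(t, x) >= 0:
        raise ValueError("lagged time must start strictly below t0")
    while t < t_max:
        x_next = rk4_step(rhs, t, x, dt)
        if gap(t + dt, x_next) >= 0:
            lo, hi = 0.0, dt                 # crossing lies in (t + lo, t + hi]
            while hi - lo > tol:
                mid = 0.5 * (lo + hi)
                if gap(t + mid, rk4_step(rhs, t, x, mid)) >= 0:
                    hi = mid
                else:
                    lo = mid
            return t + hi, rk4_step(rhs, t, x, hi)
        t, x = t + dt, x_next
    raise RuntimeError("no crossing found before t_max")

# Toy problem: f(x, y) = -y, delay tau(x) = 1 + x, initial data phi = 1.
f = lambda x, y: -y
h = lambda t, x: t - (1.0 + x)
phi = lambda theta: 1.0
t1, psi_t1 = implicit_step(f, h, phi, t0=0.0)
```

For this toy problem the step solution is $\psi(t) = 1 - t$, the lagged time is $2t - 2$, and the boundary condition is met at $t_1 = 1$ with $\psi(t_1) = 0$. The need for `phi` to tolerate arguments beyond `t0` is the same phenomenon that Section 2.2 calls the out-of-bounds evaluation problem.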
Proposition 1. Suppose $f$, $\tau$ and $\phi$ are continuous. Every time step 4 of the implicit method of steps is reached, the function $x : \mathcal{I} \to \mathbb{R}^d$ is a solution of the initial-value problem (3)–(4). In particular, if $f$, $\tau$ and $\phi$ are Lipschitz continuous, it is the unique solution defined on that interval.
Proof. Let $x : [\inf \mathcal{J}, t_0 + \epsilon) \to \mathbb{R}^d$ denote any solution of (3)–(4). If $h(t_0, \phi(t_0)) \in \mathcal{J}_+$, then $h(t_0, \psi(t_0)) < t_0$. Since $x$ is continuous, $t_1^* = \inf\{t > t_0 : h(t, x(t)) = t_0\}$ exists (or $t_1^* = \infty$) and $t_1^* > t_0$. Without loss of generality, assume $t_1^* < t_0 + \epsilon$. Then $h(t, x(t)) \in \mathcal{J}$ for $t \in [t_0, t_1^*]$, so $x(h(t, x(t))) = \phi(h(t, x(t)))$. It follows that $x$ is a solution of the boundary-value problem (5)–(7), with $t_1 = t_1^*$. The result then proceeds by induction. Uniqueness is a consequence of the Picard-Lindelöf theorem for ordinary differential equations.
We refer to an implicit step as one cycle of 1–4 in the above algorithm. The set $\mathcal{T}$ should be understood as an ordered set. It is a construct that keeps track of the terminal time in the boundary condition (6)–(7). Aside from serving as a bookkeeping device, it will be used to define a special class of solution.
Definition 2. Write $\mathcal{T} = \{t_0, t_1, \ldots\}$ for ordered times $t_j < t_{j+1}$. We say that the solution $x : \mathcal{I} \to \mathbb{R}^d$ supplied by the method of steps is $m$-steps disjoint if the graph of $t \mapsto h(t, x(t))$ intersects $\{t_0, \ldots, t_{m-1}\} \times \mathbb{R}^d$ exactly $m$ times, for $t \in (t_0, t_m]$. In general, a solution of the IVP (3)–(4) is $m$-steps disjoint if it is equal on the interval $[t_0, t_m]$ to a solution supplied by the implicit method of steps, and that solution is $m$-steps disjoint.
Figure 2: For a disjoint-step solution $x(t)$ of the initial-value problem (plotted right), the times $t_j$ for $j = 0, \ldots, 5$ correspond to unique zeroes of the map $t \mapsto h(t, x(t)) - t_j$. For visualization, the time lag function $t \mapsto h(t, x(t))$ is plotted on the left, where stars indicate intersections with the level sets $h(t, x(t)) = t_j$. On the right plot, the pieces of curve between individual stars correspond to the segments $\phi_1, \ldots, \phi_5$. The horizontal length of each segment is uneven, which is typical of the implicit method of steps in the scope of DE-SDD. Plots are for illustrative purposes only.
The idea behind such $m$-steps disjoint solutions is that they can be computed by a simpler version of the implicit method of steps; see Figure 2. We will call it the implicit method of disjoint steps.
0. Initialize data: $\phi_0 \leftarrow \phi$, $j \leftarrow 1$. Define $t_0 = \sup \mathcal{J}$ and verify $h(t_0, \phi_0(t_0)) \in \mathcal{J}_+$.
1. Solve the following boundary-value problem for $\phi_j : [t_{j-1}, t_j] \to \mathbb{R}^d$:
$$\dot{\phi}_j(t) = f(\phi_j(t), \phi_{j-1}(h(t, \phi_j(t)))), \quad t \in [t_{j-1}, t_j] \tag{8}$$
$$\phi_j(t_{j-1}) = \phi_{j-1}(t_{j-1}) \tag{9}$$
$$h(t_j, \phi_j(t_j)) = t_{j-1}. \tag{10}$$
2. If $j = m$, proceed to step 3. Otherwise, set $j \leftarrow j + 1$ and return to step 1.
3. Build the solution: $x \leftarrow \phi_0 \cup \phi_1 \cup \cdots \cup \phi_m$.
In this version of the implicit method of steps, the solution $x$ is only computed at the end. Segment $\phi_{j+1}$ can be computed using only the previous segment, $\phi_j$, rather than the entire previously-computed history. One can then prove an analogue of Proposition 1. The proof is straightforward and is omitted.
Proposition 2. The implicit method of disjoint steps produces a solution $x : [t_0, t_m] \to \mathbb{R}^d$ of the initial-value problem (3)–(4), and this solution is $m$-steps disjoint.
Remark 1. If $\phi$ is not on the solution manifold, that is, $\dot{\phi}(t_0) \neq f(\phi(t_0), \phi(h(t_0, \phi(t_0))))$, then the solution of the initial-value problem will have a discontinuous derivative at time $t_0$. Therefore, regardless of the smoothness of $\phi$, the solution loses regularity in a neighbourhood of $t_0$. This means that in the “standard” implicit method of steps, the vector field (5) might only be (Lipschitz) continuous, even though the initial condition is smooth. This is related to the so-called splicing condition of integration of DE-SDD [3], and is one of the reasons solving such equations is numerically challenging. To compare, in the disjoint steps version of the method, the vector field maintains the regularity of the initial condition $\phi$ as long as $f$ and $h$ are also sufficiently smooth.
2.2 Out-of-bounds evaluation and a zero-finding problem
The next step is to transform (5)–(7) into a suitable zero-finding problem. Assume without loss of generality that $\mathrm{dom}(\phi) = \mathcal{J} = [-1, 0]$. Denote $\delta = t_1$ and perform a re-scaling of time by defining $\tilde{\psi}(t) = \psi(t\delta)$, substituting this into (5)–(7) and dropping the tildes. We get
$$\dot{\psi}(t) = \delta f(\psi(t), \phi(h(t\delta, \psi(t)))), \quad t \in [0, 1] \tag{11}$$
$$\psi(0) = \phi(0) \tag{12}$$
$$0 = h(\delta, \psi(1)). \tag{13}$$
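For readers who want the intermediate step, the rescaling can be written out as follows. This is a worked restatement under the normalization used here ($t_0 = \sup \mathcal{J} = 0$ and $t_1 = \delta$), with the tilde kept until the last line for clarity.

```latex
\begin{align*}
\tilde\psi(t) &= \psi(t\delta), \qquad t \in [0,1],\\
\frac{d}{dt}\tilde\psi(t) &= \delta\,\dot\psi(t\delta)
   = \delta f\big(\psi(t\delta),\, \phi(h(t\delta, \psi(t\delta)))\big)
   = \delta f\big(\tilde\psi(t),\, \phi(h(t\delta, \tilde\psi(t)))\big),\\
\tilde\psi(0) &= \psi(0) = \phi(0),\\
h(\delta, \tilde\psi(1)) &= h(t_1, \psi(t_1)) = t_0 = 0.
\end{align*}
```

The unknown terminal time $t_1$ thus becomes the scalar unknown $\delta$, and the problem is posed on the fixed interval $[0,1]$.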
We want to transform this modified boundary-value problem into a zero-finding problem on a suitable function space. A standard approach is to integrate the differential equation and use the boundary condition to obtain an integral equation. However, since the domain of $\phi$ is only the interval $[-1, 0]$, it is possible (in fact, likely) that for $\psi$ in a general vector space of functions, the composition $t \mapsto h(t\delta, \psi(t))$ will include points outside of the domain $[-1, 0]$ of $\phi$. We refer to this apparent issue as the out-of-bounds evaluation problem.
One way to resolve the out-of-bounds evaluation problem would be to restrict the range of any $\psi$, and pose the integral equation on the function space determined by such a restriction. However, this would result in a zero-finding problem on a manifold, which we would rather avoid. To circumvent this problem, let $E_k\phi$ denote a specific $C^k$ extension of $\phi$ to the real line, where $k$ does not exceed the order of differentiability of $\phi$. That is,
$$E_k\phi|_{[-1,0]} = \phi, \tag{14}$$
and $E_k\phi$ is $C^k$. Such an extension can always be constructed. For example, for $y$ defined on $[\alpha, \beta]$ we can define
$$E_k y(x) = \begin{cases} y(x), & x \in [\alpha, \beta] \\ \sum_{r=0}^{k} \frac{(x - \alpha)^r}{r!}\, y^{(r)}(\alpha), & x < \alpha \\ \sum_{r=0}^{k} \frac{(x - \beta)^r}{r!}\, y^{(r)}(\beta), & x > \beta. \end{cases}$$
In this case, $E_k y$ is simply a degree $k$ Taylor extension of $y$. In what follows we will drop the subscript on $E$, but the reader should keep in mind that the order of the extension will always need to be chosen large enough so that any subsequent derivatives can be given meaning.
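The Taylor extension is easy to realize when $y$ is itself a polynomial. The following is a minimal sketch under that assumption, with $y$ given by monomial coefficients; the helper names are hypothetical and ordinary floating-point arithmetic is used, so this is not a rigorous enclosure.

```python
from math import factorial

def poly_derivatives(coeffs, x, k):
    """Values y(x), y'(x), ..., y^(k)(x) for y(t) = sum_i coeffs[i] * t**i."""
    values = []
    c = list(coeffs)
    for _ in range(k + 1):
        values.append(sum(ci * x**i for i, ci in enumerate(c)))
        c = [i * ci for i, ci in enumerate(c)][1:]   # differentiate once
    return values

def taylor_extension(coeffs, alpha, beta, k):
    """Return E_k y: equal to y on [alpha, beta], degree-k Taylor polynomial outside."""
    def y(x):
        return sum(ci * x**i for i, ci in enumerate(coeffs))

    def extended(x):
        if alpha <= x <= beta:
            return y(x)
        base = alpha if x < alpha else beta
        derivs = poly_derivatives(coeffs, base, k)
        return sum(d * (x - base)**r / factorial(r) for r, d in enumerate(derivs))

    return extended

# y(t) = t^3 on [-1, 0], extended with k = 1 (tangent lines at the endpoints).
E1 = taylor_extension([0, 0, 0, 1], -1.0, 0.0, k=1)
```

With $k = 1$ the extension to the left of $-1$ is the tangent line $-1 + 3(x + 1)$, and to the right of $0$ it is identically zero; inside $[-1, 0]$ the original cubic is returned unchanged.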
With the extension $E\phi$ defined, we replace every nontrivial instance of $\phi$ in (11) with $E\phi$. This gives
$$\dot{\psi}(t) = \delta f(\psi(t), E\phi(h(t\delta, \psi(t)))), \quad t \in [0, 1] \tag{15}$$
$$\psi(0) = \phi(0) \tag{16}$$
$$0 = h(\delta, \psi(1)). \tag{17}$$
We then have the following lemma, whose proof is omitted.
Lemma 3. Let $\psi$ be a solution of (15)–(17), with $\phi : \mathcal{J} \to \mathbb{R}^d$. If $h(t\delta, \psi(t)) \in \mathcal{J}$ for $t \in [0, 1]$, then $\psi$ is a solution of (11)–(13).
As we will see in Section 3.6, verifying the inclusion condition of this lemma is simple once we have rigorously computed a solution of (15)–(17). We can now integrate (15) and use the initial condition (16) to derive a zero-finding problem. The following lemma can be proven in a straightforward manner.
Lemma 4. $\psi : [0, 1] \to \mathbb{R}^d$ is a solution of (15)–(17) if and only if $F(\psi, \delta) = 0$ and $t \mapsto h(t\delta, \psi(t))$ has range in $[-1, 0]$, where $F : C([0,1], \mathbb{R}^d) \times \mathbb{R} \to C([0,1], \mathbb{R}^d) \times \mathbb{R}$ is defined by
$$F(\psi, \delta) = \begin{pmatrix} -\psi(t) + \phi(0) + \int_0^t \delta f(\psi(s), E\phi(h(s\delta, \psi(s))))\,ds \\ h(\delta, \psi(1)) \end{pmatrix}. \tag{18}$$
With respect to the supremum norm on $C([0,1], \mathbb{R}^d)$, $F$ is $k$ times continuously differentiable provided the same is true of $f$, $h$ and $\phi$. In that case $\psi$ is $k + 1$ times continuously differentiable.
3 Interpolation and rigorous numerics for a single implicit step
With Lemma 4 in mind, our focus shifts to proving the existence of a zero of the map $F$ in (18). A secondary goal is to rigorously check the inclusion $h(t\delta, \psi(t)) \in \mathcal{J} = \mathrm{dom}(\phi)$ for $t \in [0, 1]$. To accomplish this, we will represent approximate solutions $\psi$ using piecewise polynomial interpolants of a given order and design a Newton-like operator for (18) that can be expected to contract in some closed ball around a candidate interpolant $\psi$ and approximate terminal integration time $\delta \in \mathbb{R}$.
3.1 Piecewise reformulation of F
We will use piecewise Chebyshev interpolants for our data. Since these interpolants will not be continuous unless additional constraints are specified, we reformulate the operator $F$ on a space of piecewise-continuous functions. Given $m \in \mathbb{N}$, let $\Delta_m = \{s_0, s_1, \ldots, s_m\}$ be a mesh of $[0, 1]$, with $0 = s_0 < s_1 < \cdots < s_m = 1$. For $k \in \mathbb{N}$, let $C^k_{\Delta_m}([0,1], \mathbb{R}^d)$ denote the space of functions that are $k$ times continuously differentiable on $[0, 1] \setminus \Delta_m$, with the derivatives up to order $k$ having only jump discontinuities on the mesh, and whose derivatives up to order $k$ are continuous from the left. The following lemma is simple to prove.
Lemma 5. Introduce a nonlinear map $g : C^0_{\Delta_m}([0,1], \mathbb{R}^d) \times \mathbb{R} \to C^1_{\Delta_m}([0,1], \mathbb{R}^d)$ by
$$g(u, \delta)(t) = u(s_j^-) + \int_{s_j}^{t} \delta f(u(s), E\phi(h(s\delta, u(s))))\,ds, \quad t \in (s_j, s_{j+1}] \tag{19}$$
and $g(u, \delta)(0) = \phi(0)$, while in the case $j = 0$, we replace $u(s_j^-)$ with $\phi(0)$. $g$ enjoys the following properties.
1. $g$ is well-defined.
2. If $\psi = g(\psi, \delta)$ and $h(\delta, \psi(1)) = 0$, then $\psi \in C^0([0,1], \mathbb{R}^d)$ and $F(\psi, \delta) = 0$, and vice-versa.
3. Define $\mu : C^1_{\Delta_m}([0,1], \mathbb{R}^d) \to C^0_{\Delta_m}([0,1], \mathbb{R}^d)$ to be the embedding $\mu(u) = u$. Then $\mu \circ g$ is $k$ times continuously differentiable, provided the same is true of $f$, $h$ and $\phi$.
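As remarked in [4], the piecewise reformulation of the bootstrapped Picard operator has a few advantages over a purely continuous one. We refer the reader to the discussion therein. Thanks to Lemma 5, we can define a modification of the map $F$.
Lemma 6. Define a map $G : C^0_{\Delta_m}([0,1], \mathbb{R}^d) \times \mathbb{R} \to C^0_{\Delta_m}([0,1], \mathbb{R}^d) \times \mathbb{R}$ by
$$G(\psi, \delta) = \begin{pmatrix} -\psi + g(\psi, \delta) \\ h(\delta, \psi(1)) \end{pmatrix}. \tag{20}$$
If $G(\psi, \delta) = 0$, then $\psi \in C([0,1], \mathbb{R}^d)$ and $F(\psi, \delta) = 0$. The regularity assertions for $G$ and $\psi$ in this case match those of Lemma 4.
In the sections that follow, it will be beneficial to have a compact representation of the integrand in (19). With $\phi$ fixed, define $\tilde{f} : [0, 1] \times \mathbb{R}^d \times \mathbb{R} \to \mathbb{R}^d$ by
$$\tilde{f}(t, u, \delta) = \delta f(u, E\phi(h(t\delta, u))). \tag{21}$$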
3.2 Piecewise polynomial interpolation
Here we review some necessary material on piecewise polynomial interpolation, specifically with the Chebyshev points of the second kind. Fix some $k \in \mathbb{N}$. Given a mesh $\Delta_m$, let $\Delta_{m,k}^0$ denote a refined mesh such that the $k - 1$ Chebyshev points of the second kind are inserted between $s_j$ and $s_{j+1}$:
$$s_{j,\ell} = s_j + \frac{x_\ell^k + 1}{2}(s_{j+1} - s_j), \quad \ell = 1, \ldots, k - 1,$$
$$x_\ell^k = \cos\left(\frac{k - \ell}{k}\pi\right), \quad \ell = 0, \ldots, k,$$
with the symbolic extension $s_{j,0} = s_j$ and $s_{j,k} = s_{j+1}$.
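Define $S^d_{m,k} \subset C^0_{\Delta_m}([0,1], \mathbb{R}^d)$, the space of piecewise polynomial functions on $\Delta_m$. Precisely, $u \in S^d_{m,k}$ if $u \in C^0_{\Delta_m}([0,1], \mathbb{R}^d)$ and $u|_{(s_j, s_{j+1})}$ is a ($d$-vector) polynomial of degree $k$ for $j = 0, \ldots, m - 1$. We then have a projection operator $\Pi^d_{m,k} : C^0_{\Delta_m}([0,1], \mathbb{R}^d) \to S^d_{m,k}$ defined by the following property: $\Pi^d_{m,k}u$ is the unique function in $S^d_{m,k}$ whose restriction to $(s_j, s_{j+1})$ coincides with that of a degree $k$ polynomial $p_j$, where
$$p_j(s_{j,\ell}) = \begin{cases} u(s_{j,\ell}), & \ell \neq 0, k \\ u(s_{j,\ell}^+), & \ell = 0 \\ u(s_{j,\ell}^-), & \ell = k. \end{cases}$$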
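The following interpolation error bounds are proven in [4].
Proposition 7. For all $u \in C^{k+1}_{\Delta_m}([0,1], \mathbb{R}^d)$,
$$\|u - \Pi^d_{m,k}u\|_\infty \leq C_k \max_{j=0,\ldots,m-1}\left((s_{j+1} - s_j)^{k+1} \sup_{t \in (s_j, s_{j+1})}\left\|\frac{d^{k+1}u}{dt^{k+1}}(t)\right\|\right) \tag{22}$$
where $C_k$ is an explicit constant depending only on $k$; see [4].
Proposition 8. Fix $\ell \in \mathbb{N}$ such that $1 \leq \ell \leq k$. For all $u \in C^{\ell}_{\Delta_m}([0,1], \mathbb{R}^d)$,
$$\|u - \Pi^d_{m,k}u\|_\infty \leq C_{k,\ell} \max_{j=0,\ldots,m-1}\left((s_{j+1} - s_j)^{\ell} \sup_{t \in (s_j, s_{j+1})}\left\|\frac{d^{\ell}u}{dt^{\ell}}(t)\right\|\right)$$
where $C_{k,\ell}$ is an explicit constant, given as the minimum of two expressions, one of which involves the Lebesgue constant $\Lambda_k$; see [4]. Specifically,
$$\Lambda_k = \sup_{x \in [-1,1]} \sum_{i=0}^{k} |L_i^k(x)|, \qquad L_i^k(x) = \prod_{n \neq i} \frac{x - x_n^k}{x_i^k - x_n^k},$$
which satisfies $\Lambda_k \leq 1 + \frac{2}{\pi}\ln(k + 1)$, and which admits a closed-form expression as a sum of cotangents for $k$ odd.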
$S^d_{m,k}$ is isomorphic to $\mathbb{R}^{md(k+1)}$. There are several ways this isomorphism can be realized, depending on how one would like to store the data associated to an element of $S^d_{m,k}$. We will take the most direct approach and encode $u \in S^d_{m,k}$ using its values at the points $s_{j,\ell}$, which we denote $u_{j,\ell}$. Then, we have the expansion
$$u(t) = \sum_{\ell=0}^{k} u_{j,\ell}\, L_\ell^k\left(\frac{t - s_j}{s_{j+1} - s_j} - \frac{s_{j+1} - t}{s_{j+1} - s_j}\right), \quad t \in (s_j, s_{j+1}), \quad j = 0, \ldots, m - 1, \tag{25}$$
for $L_n^k : [-1, 1] \to \mathbb{R}$ the Lagrange function
$$L_n^k(x) = \prod_{i \neq n} \frac{x - x_i^k}{x_n^k - x_i^k}.$$
When necessary, we will index the components of $u_{j,\ell}$ using a superscript, $u^{(i)}_{j,\ell}$. Similarly, we will index components of $u \in C^0_{\Delta_m}([0,1], \mathbb{R}^d)$ using superscripts, $u^{(i)}$ for $i = 1, \ldots, d$. The induced isomorphism between the function $t \mapsto u$ and the coefficients $\{u_{j,\ell}\}$ will not be explicitly written.
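The nodal encoding and the expansion (25) can be sketched in a few lines for the scalar case $d = 1$. This is a floating-point illustration only: the function names are hypothetical, the product formula for the Lagrange functions is evaluated naively, and no interval arithmetic is involved.

```python
import math

def cheb_points(k):
    """Chebyshev points of the second kind x_l = cos((k - l) pi / k), l = 0..k."""
    return [math.cos((k - l) * math.pi / k) for l in range(k + 1)]

def lagrange(n, x, nodes):
    """Lagrange function L_n evaluated at x for the given nodes."""
    out = 1.0
    for i, xi in enumerate(nodes):
        if i != n:
            out *= (x - xi) / (nodes[n] - xi)
    return out

def sample(u, mesh, k):
    """Nodal values u_{j,l} = u(s_{j,l}) of a continuous function u on each cell."""
    x = cheb_points(k)
    return [[u(sj + (xl + 1.0) / 2.0 * (sj1 - sj)) for xl in x]
            for sj, sj1 in zip(mesh[:-1], mesh[1:])]

def evaluate(values, mesh, k, t):
    """Evaluate the piecewise interpolant at t via the expansion (25)."""
    x = cheb_points(k)
    for j, (sj, sj1) in enumerate(zip(mesh[:-1], mesh[1:])):
        if sj <= t <= sj1:
            # Affine map of [s_j, s_{j+1}] onto [-1, 1], written as in (25).
            local = (t - sj) / (sj1 - sj) - (sj1 - t) / (sj1 - sj)
            return sum(values[j][l] * lagrange(l, local, x) for l in range(k + 1))
    raise ValueError("t lies outside the mesh")

mesh = [0.0, 0.5, 1.0]                         # m = 2 cells
values = sample(lambda t: t * t, mesh, k=2)    # u(t) = t^2, degree k = 2
```

Since $u(t) = t^2$ has degree $k = 2$, the interpolant reproduces it up to rounding, which gives a quick sanity check of the node ordering and of the affine change of variable.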
3.3 Newton-like operator
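From this point on, we will assume that $f$, $\tau$ and $\phi$ are continuously differentiable up to a fixed order. We will fix $m$ and the interpolation degree $k$ (generally distinct from this order of differentiability) so that we can drop the subscript indices on the projection operator $\Pi^d_{m,k}$ and interpolant space $S^d_{m,k}$, and write them simply as $\Pi^d$ and $S^d$. Define a projection
$$\Pi : C^0_{\Delta_m}([0,1], \mathbb{R}^d) \times \mathbb{R} \to S^d \times \mathbb{R}, \quad (u, \delta) \mapsto \Pi(u, \delta) = (\Pi^d u, \delta).$$
Let $\Pi_\infty = I - \Pi$ be the complementary projector, and decompose $X = C^0_{\Delta_m}([0,1], \mathbb{R}^d) \times \mathbb{R}$ as an internal direct sum
$$X = \Pi(X) \oplus \Pi_\infty(X) \overset{\text{def}}{=} X^d_{m,k} \oplus X^d_\infty.$$
Then $X^d_\infty = \Pi^d_\infty(C^0_{\Delta_m}([0,1], \mathbb{R}^d)) \times \{0\}$, so we will identify elements of $X^d_\infty$ with their associated elements of $C^0_{\Delta_m}([0,1], \mathbb{R}^d)$.
Let $r_0, r_\infty > 0$ be two positive parameters. Introduce norms on the factors of $X$,
$$\|(u, \delta)\|_{X^d_{m,k}} = \max\left\{\max_{i,j,\ell} |u^{(i)}_{j,\ell}|,\ \frac{1}{r_0}|\delta|\right\}, \qquad \|v\|_{X^d_\infty} = \max_{i=1,\ldots,d} \|v^{(i)}\|_\infty \tag{26}$$
and also define a norm on $X$:
$$\|(u, \delta)\| = \max\left\{\|\Pi(u, \delta)\|_{X^d_{m,k}},\ \frac{1}{r_\infty}\|\Pi_\infty(u, \delta)\|_{X^d_\infty}\right\}. \tag{27}$$
For all $r_0, r_\infty > 0$, $(X, \|\cdot\|)$ is a Banach space.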
Let $p \in \{1, \ldots, k + 1\}$ be fixed, and consider the map $G : X \to X$ defined in (20), and
$$\tilde{G}(u, \delta) = \Pi G(u, \delta).$$
By Lemma 6, each of $G$ and $\tilde{G}$ is continuously differentiable to the same order as $f$, $\tau$ and $\phi$. By Lemma 5 and preceding results, zeroes of $G$ are (almost) in one-to-one correspondence to solutions of the boundary-value problem (15)–(17).
Let $w = (u, \delta) \in X^d_{m,k}$ be a candidate for a numerical zero of the function $\tilde{G}$. Let $A_{m,k} : X^d_{m,k} \to X^d_{m,k}$ be an injective linear operator, and introduce a Newton-like operator $T : X \to X$,
$$Tw = (\Pi w - A_{m,k}\Pi G(w)) + \Pi_\infty(G(w) + w). \tag{28}$$
In practice, we will verify directly that the matrix representation of $A^\dagger_{m,k}$ on $\mathbb{R}^{md(k+1)+1}$ (recall, $S^d_{m,k} \cong \mathbb{R}^{md(k+1)}$) is invertible, and $A_{m,k}$ will be chosen to be its inverse (up to conjugation by coordinate maps).
Remark 2. $T$ can alternatively be written $Tw = w - AG(w)$, where $A : X \to X$ is the bounded linear operator defined by $A\Pi w = A_{m,k}\Pi w$ and $A\Pi_\infty w = -\Pi_\infty w$. In this context, $A$ is interpreted as an approximate inverse of $DG(w)$.
Introduce the ball
$$B_r(w) = \{z \in X : \|w - z\| < r\}$$
centered at $w \in X$ with radius $r > 0$. Let $\overline{B_r(w)}$ denote its closure. The following theorem provides sufficient conditions under which the operator $T$ is contractive on the closure of (and maps into the interior of) a ball centered at the numerical zero $w$ of $G$. It is similar to a theorem from [7] and is proven in the same way.
Theorem 9. Let $T : X \to X$ be continuously differentiable, $A_{m,k}$ injective, and $w \in X^d_{m,k}$. Suppose there exist bounds $Y$, $Y_\infty$, $Z$, $Z_\infty$ such that
$$\|\Pi(Tw - w)\|_{X^d_{m,k}} \leq Y \tag{29}$$
$$\|\Pi_\infty(Tw - w)\|_{X^d_\infty} \leq Y_\infty \tag{30}$$
$$\sup_{\xi_1 \in \overline{B_r(0)}} \|\Pi(DT(w + \xi_1)\xi_2)\|_{X^d_{m,k}} \leq Z(r) \tag{31}$$
$$\sup_{\xi_1 \in \overline{B_r(0)}} \|\Pi_\infty(DT(w + \xi_1)\xi_2)\|_{X^d_\infty} \leq Z_\infty(r) \tag{32}$$
for all $\xi_2 \in \overline{B_1(0)}$. If there exists $r > 0$ such that the radii polynomials satisfy
$$p(r) \overset{\text{def}}{=} Y + rZ(r) - r < 0 \tag{33}$$
then $T$ has a unique fixed point in $B_r(w)$. Hence, $G$ has a unique zero in $B_r(w)$.
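Once numerical values for the bounds are available, checking a condition of the form (33) is a one-dimensional search. The sketch below is purely illustrative: the values of `Y` and `Z` are made up, the function name `find_radius` is hypothetical, and plain floating-point arithmetic is used, whereas a computer-assisted proof would evaluate the radii polynomial with interval arithmetic.

```python
def find_radius(Y, Z, r_min=1e-12, r_max=1.0, factor=1.1):
    """Smallest r on a geometric grid with p(r) = Y + r * Z(r) - r < 0, or None."""
    r = r_min
    while r <= r_max:
        if Y + r * Z(r) - r < 0:
            return r
        r *= factor
    return None

# Hypothetical bounds, for illustration only: a small defect Y and an
# r-dependent contraction bound Z(r) = Z0 + Z1 + Z2 * r.
Y = 1e-6
Z = lambda r: 0.1 + 2.0 * r
r = find_radius(Y, Z)
```

For these sample values the radii polynomial is negative roughly between $1.1 \times 10^{-6}$ and $0.45$, and the search returns a radius just above the lower end. A small radius gives a tight enclosure, while a larger admissible radius gives a wider uniqueness region.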
Being based on piecewise polynomial interpolation, the $Y$ and $Z$ bounds can be computed using much the same machinery as in [4]. The differences are that our vector fields are non-autonomous (because of the delays) and we have a boundary condition. In the following two sections we derive these bounds in general, but the presentation will be rather terse since many of the computations are similar to those of the cited reference.
3.4 Technical bounds for computer-assisted proofs
In this section we provide explicit formulas for the bounds $Y$ and $Z$ from Theorem 9. We also discuss some details concerning the implementation of the operator $A^\dagger_{m,k} = D\tilde{G}(w)$ and the approximate inverse, $A_{m,k}$. Before deriving the bounds, we introduce a few pieces of notation. Given $w = (u, \delta) \in X^d_{m,k}$, we write $u = \pi_1 w$ and $\delta = \pi_2 w$. Partition $A_{m,k}$ as a block operator
$$A_{m,k} = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$$
such that
$$A_{m,k}(\pi_1 w + \pi_2 w) = (A_{11}\pi_1 w + A_{12}\pi_2 w,\ A_{21}\pi_1 w + A_{22}\pi_2 w). \tag{35}$$
We also use the same projection notation (i.e. using $\pi_1$, $\pi_2$) for an element of $X$. By way of (25), the isomorphism of $S^d_{m,k}$ with $\mathbb{R}^{md(k+1)}$ induces a matrix representation of $A_{11}$ as a map on $\mathbb{R}^{md(k+1)}$. We will use the same symbol, $A_{11}$, for this matrix operator. Similarly, $A_{12}$ can be identified with an element of $\mathbb{R}^{md(k+1)}$, $A_{21}$ can be identified with an element of the dual $(\mathbb{R}^{md(k+1)})^*$, hence a row vector, and $A_{22}$ is a real number. Finally, define the polynomial function $m_j : [-1, 1] \to [s_j, s_{j+1}]$,
$$m_j(t) = \frac{s_{j+1} - s_j}{2}\,t + \frac{s_{j+1} + s_j}{2}.$$
By construction, $m_j(x_\ell^k) = s_{j,\ell}$ and
$$u(m_j(t)) = \sum_{\ell=0}^{k} u_{j,\ell}\, L_\ell^k(t), \quad t \in (-1, 1).$$
3.4.1 $Y$ bound
We write $Y = \max\{Y^{(1)}, r_0^{-1}Y^{(2)}\}$, where each of $Y^{(1)}$ and $Y^{(2)}$ will be defined such that
$$\max_{i,j,\ell} \left|[\pi_1(A_{m,k}\tilde{G}(w))]^{(i)}_{j,\ell}\right| \leq Y^{(1)} \tag{36}$$
$$|\pi_2 A_{m,k}\tilde{G}(w)| \leq Y^{(2)}. \tag{37}$$
If $Y^{(1)}$ and $Y^{(2)}$ satisfy the above, then (29) holds. Implementing the left-hand side of (37) is straightforward, so we will not discuss the $Y^{(2)}$ bound. For the $Y^{(1)}$ bound, we have
$$\pi_1(A_{m,k}\tilde{G}(w)) = A_{11}(g(u, \delta) - u) + A_{12}h(\delta, u(1)). \tag{38}$$
$A_{12}h(\delta, u(1))$ can be computed directly, so we focus on computing $g(u, \delta)$. Recall
$$g(u, \delta)(t) = u(s_j^-) + \int_{s_j}^{t} \delta f(u(s), E\phi(h(s\delta, u(s))))\,ds, \quad t \in (s_j, s_{j+1}]. \tag{39}$$
Computing $[g(u, \delta)]_{j,\ell}$ requires evaluating the integral above at $t = s_{j,\ell}$. If all functions involved are polynomials, this can be done directly and exactly. If not, one must generally resort to numerical quadrature. The implementation of this integral is, arguably, the hardest part about completing proofs with multiple steps, the latter of which is discussed in Section 4.
Once the integral (39) is implemented (in the sense that we can rigorously enclose it), we can obtain a suitable bound $Y^{(1)}$ in (36). Combined with the previous discussion concerning $Y^{(2)}$, we can compute a $Y$ bound satisfying (29).
3.4.2 $Y_\infty$ bound
A suitable $Y_\infty$ bound as in (30) is straightforward to compute using Proposition 7. Namely, we can take
$$Y_\infty \geq C_k \max_{j=0,\ldots,m-1}\left((s_{j+1} - s_j)^{k+1} \sup_{t \in (s_j, s_{j+1})}\left\|\frac{d^k}{dt^k}\tilde{f}(t, u(t), \delta)\right\|\right) \tag{40}$$
where the norms on the right-hand side are the norm on $\mathbb{R}^d$, and $\tilde{f}$ is defined in (21). To implement this bound in practice, we subdivide $[s_j, s_{j+1}]$ with an interval mesh and compute coarse enclosures of the derivative using interval arithmetic. In the case of multiple steps, we incorporate the error from previous steps into the derivatives; see Section 4 for details.
Remark 3. The $Y_\infty$ bound is often the limiting bound in obtaining a small radius $r$ in a computer-assisted proof. Increasing $k$ is generally expensive if one is interested in doing proofs with multiple integration steps, while increasing $m$ requires ever more memory. See Section 3.7 for further discussion.
3.4.3 Zbound
To begin, we write the components of ΠDT (w+ξ1)ξ2in the following form:
ΠDT (w+ξ1)ξ2= Am,kΠDG(w+ξ1)) ξ2
= (IAm,kA
m,kξ2+Am,k (A
m,k DΠG(w))ξ2+Am,k (DΠG(w)DΠG(w+ξ1))) ξ2,(41)
taking into account that the projection operator $\Pi$, being bounded, commutes with the Fréchet
derivative. We will identify $Z_0$, $Z_1$ and $Z_2$ such that
Z0≥ ||(IAm,kA
m,k ,(42)
Z1≥ ||Am,k(A
m,k DΠG(w))ξ2||Xd
m,k ,(43)
Z2≥ ||Am,k(DΠG(w+ξ1)DΠG(w))ξ2||Xd
m,k (44)
for all $\xi_1 \in B_r(0)$ and $\xi_2 \in B_1(0)$. Then, we can take $Z \overset{\mathrm{def}}{=} Z_0 + Z_1 + Z_2$, and the bound (31) will be satisfied.
The $Z_0$ bound is trivial to compute. Indeed, since $\Pi\xi_2 \in X^d_{m,k} \cong \mathbb{R}^{md(k+1)+1}$, we can take $Z_0$ to be the operator norm of $I - A_{m,k}A^\dagger_{m,k}$ on $X^d_{m,k}$. Due to the isomorphism, this reduces to a finite-dimensional matrix norm computation, appropriately weighted so it is consistent with the norm on $X^d_{m,k}$. We omit the details.
Denote Πξ2=ξ
2. Before we begin, we remark that ξ
2(sj) = 0 for j= 0,...,m due to the
interpolation. Similarly, ξ
j) = 0. This will be used a few times.
Observe that DΠG(w)ξ2=A
m,kΠξ2+DΠG(wξ2. Therefore, we need only construct Z1such
that Z1≥ ||Am,kDΠG(wξ2||.Denote ξ2= (u2, δ2) and Πξ2=ξ
2. Then
2|=|D1h(δ, u(1))0 + D2h(δ, u(1))ξ
2(1)|= 0 (45)
since ξ
2(1) = 0, and the other zero being due to ξ
2having trivial component in R. As for the π1
component, since ξ
j) = 0 for j= 0,...,m,
2]j,ℓ =Zsj,ℓ
f(s, u(s), δ)ξ
Since |ξ
2(s)| ≤ rbecause ||ξ2||X1, we can majorize as follows:
j,ℓ| ≤ r|sj,ℓ sj|sup
f(t, u(t), δ)(i)|.
Combining the above with (35) and (45), it follows that if we define ρSd
m,k by
j,ℓ =|sj,ℓ sj|sup
f(t, u(t), δ)(i)|,(46)
then we can take any Z1such that
i,j,ℓ ||(abs(A11)ρ)(i)
for abs(A11) being the matrix whose entries are the absolute values of the entries of A11. To implement
ρ, we can use direct interval evaluations, further subdividing [sj, sj,ℓ] to get a tight enclosure.
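The effect of subdividing before taking a supremum can be illustrated with a minimal Python sketch. It is not the Julia implementation used for the proofs: the `Interval` class is a hypothetical toy without outward rounding (so it is not rigorous), and the polynomial $t - t^2$ is an arbitrary stand-in for the function whose supremum is being bounded. It only shows how splitting the domain tightens a naive interval bound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Toy closed interval [lo, hi]; no outward rounding, so NOT rigorous."""
    lo: float
    hi: float

    def __add__(self, other):
        other = _as_interval(other)
        return Interval(self.lo + other.lo, self.hi + other.hi)

    __radd__ = __add__

    def __mul__(self, other):
        other = _as_interval(other)
        products = (self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi)
        return Interval(min(products), max(products))

    __rmul__ = __mul__

    def mag(self):
        """Largest absolute value attained in the interval."""
        return max(abs(self.lo), abs(self.hi))


def _as_interval(x):
    return x if isinstance(x, Interval) else Interval(x, x)


def horner(coeffs, x):
    """Interval evaluation of sum(coeffs[i] * x**i) by Horner's rule."""
    acc = Interval(0.0, 0.0)
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc


def sup_bound(coeffs, a, b, pieces):
    """Upper bound for sup |p| on [a, b] using `pieces` equal subintervals."""
    width = (b - a) / pieces
    return max(
        horner(coeffs, Interval(a + i * width, a + (i + 1) * width)).mag()
        for i in range(pieces)
    )


coeffs = [0.0, 1.0, -1.0]            # p(t) = t - t^2, true sup on [0, 1] is 0.25
coarse = sup_bound(coeffs, 0.0, 1.0, 1)
fine = sup_bound(coeffs, 0.0, 1.0, 64)
print(coarse, fine)
```

A single evaluation over $[0,1]$ overestimates the supremum by a factor of four here, while 64 pieces bring the bound close to the true value; a validated implementation would use a rounded interval library instead of this toy class.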
Remark 4. If the mesh $\Delta_m$ is equally spaced, $\rho$ scales linearly with respect to $\frac{1}{m}$; see (46). However, in the $Z_1$ bound, this inverse scaling with $m$ is generally negated by the multiplication $\mathrm{abs}(A_{11})\rho$ in (47), since $A_{11}$ is a square matrix with $(k+2)m+1$ rows. Similarly, the $Z_1$ bound is generally linear in $k$. Therefore, in general, the only way to control $Z_1$ is to decrease $r_\infty$. The result is that there is an even greater need to keep $Y_\infty$ and $Z_\infty$ small, since decreasing $r_\infty$ has a penalty on the radii polynomial $p_\infty$; see (34). One way the bound $Z_1$ could be improved is to use the a priori bootstrap developed in [4], since there, the $Z_1$ bound truly can be controlled by tuning $m$ and the order of the bootstrap. To keep the presentation simple, we have avoided doing this in the present paper. See Section 3.7 for further discussion.
Here we will assume that $k \geq 2$, although a $Z_2$ bound can generally still be computed even if $k = 1$.
To begin, let $\xi_1 = (u_1, \delta_1)$ and $\xi_2 = (u_2, \delta_2)$. Let $H(x, y)$ denote the Hessian of $h$ at $(x, y) \in \mathbb{R} \times \mathbb{R}^d$.
[u2(1) δ2]H(δ+1, u(1) + tu1(1)) u1(1)
δ1dt (48)
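provided $k \geq 2$. Let $\mathbf{1} = [-1,1]^d \times [-r_0, r_0] \subset \mathbb{R}^d \times \mathbb{R}$ and $\mathbf{1}_{r_0} = (1, 1, \ldots, 1, r_0) \in \mathbb{R}^d \times \mathbb{R}$. Then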
|π2(DΠG(w+ξ1)DΠG(w))ξ2| ≤ sup
For the π1component, first define Θ(·,·;s) : R×RdRdby
Θ(x, y;s) = ˜
f(s, y, x).
Define 1Sd
m,k by
j,ℓ =|sj,ℓ sj|sup
rmax{Λk+r, r0}|D2Θ(i)(δ+x, u(t) + y;t)|12
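where $\mathbf{1}_{r_0,r_\infty} = (\Lambda_k + r_\infty, \ldots, \Lambda_k + r_\infty, r_0) \in \mathbb{R}^d \times \mathbb{R}$, and products with $\mathbf{1}_{r_0}^2 = (\mathbf{1}_{r_0}, \mathbf{1}_{r_0})$, and analogously those with $\mathbf{1}_{r_0,r_\infty}^2$, are interpreted as the action of the appropriate bilinear maps (second Fréchet derivative of $\Theta^{(i)}$) on these elements. Then, one can compute $Z_2 = Z_2(r)$ such that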
Z2(r)≥ ||abs(Am,k)||Xd
m,k (51)
with = (1, 2), then (44) holds. Remarks concerning the implementation of analogous to those
for the Z1bound can also be made.
Remark 5. A similar comment can be made here as we did for the Z1bound. The bound (51) can
generally only be controlled by decreasing r. For this to be feasible, it is necessary that the Ybound
is suitably small. Thankfully, the size of the latter bound is primarily determined by the precision of
the floating point number system and the wrapping effect of error from any previous computer-assisted
proofs, and can therefore be controlled. See Section 3.7 for further discussion.
3.4.4 $Z_\infty$ bound
Let 1Udenote the indicator function on the set U. Once again writing ξ1= (u1, δ1) and ξ2= (u2, δ2),
the Zbound requires us to obtain a bound (in the norm || · ||Xd
) for
Π(DG(u+δ1, δ +δ1)ζ2ζ2) = Π t7→
DΘ(δ+δ1, u(s) + u1;s)(δ2, u2(s))ds!.
Using Proposition 7 and Proposition 8, we can establish that if Zis chosen such that
j=0,...,m1|sj+1 sj|sup
t[sj,sj+1]|DΘ(i)(δ+x, u(t) + y;t)1r0,r|
then (32) holds.
3.5 A priori maximum radius
In practice, we select $r^* > 0$ and consider the polynomials
$$\tilde p(r) = Y + r\,(Z_0 + Z_1 + Z_2(r^*)) - r,$$
$$\tilde p_\infty(r) = Y_\infty + r\,Z_\infty(r^*) - r\,r_\infty.$$
If we can find $r \in (0, r^*)$ such that $\tilde p(r) < 0$ and $\tilde p_\infty(r) < 0$, then (33) and (34) will both be negative. This is a consequence of our bounds $Z_2(r)$ and $Z_\infty(r)$ being nondecreasing functions of $r$. We use this implementation for our examples, since it is less expensive than computing roots of the full radii polynomials $p$ and $p_\infty$ directly.
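Because the $Z$-type bounds are frozen at the a priori maximum radius, each polynomial is affine in $r$, so the admissible radii form an explicit window. The following Python sketch illustrates this under stated assumptions: `Y`, `Z` and `weight` are hypothetical stand-ins for a pair of computed bounds and the coefficient of the final $-r$ term, and the arithmetic is plain floating point rather than the interval arithmetic a validated proof would need.

```python
def p_tilde(r, Y, Z, weight=1.0):
    """Affine radii polynomial Y + r*Z - weight*r with Z frozen at r_star."""
    return Y + r * Z - weight * r


def radius_window(Y, Z, r_star, weight=1.0):
    """Return (r_min, r_star) such that p_tilde(r) < 0 for r_min < r < r_star.

    Returns None when no such radius exists. Floating point only; a
    validated proof would evaluate the same test with intervals.
    """
    if Y < 0 or Z < 0 or r_star <= 0:
        raise ValueError("expected Y >= 0, Z >= 0 and r_star > 0")
    if Z >= weight:
        return None                  # slope is nonnegative: never negative
    r_min = Y / (weight - Z)         # root of the affine polynomial
    return (r_min, r_star) if r_min < r_star else None


window = radius_window(Y=1e-10, Z=0.5, r_star=1e-6)
print(window)
```

The window is empty either when the slope `Z - weight` is nonnegative or when the root exceeds the chosen maximum radius, which matches the two failure modes discussed for the bounds above.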
3.6 Verifying the inclusion $h(t\delta, u(t)) \in [-1,0]$
Recall that the lag function is $h(t, u) = t - \tau(t, u)$. To begin, remark that by definition of the
norm on Xand the numerical enclosure afforded by the radii polynomial approach, the true
zero (u, δ) of Gmust satisfy
||uu||rk+r),|δδ| ≤ rr0.(53)
Introduce an interval-valued function ˆu: [0,1] IRd,
ˆu(t) = u(t) + rk+r)[1,1].
Denote ˆ
δ=δ+rr0[1,1] IR. It follows that
h(tδ, u(t)) = τ(t, u(t)) tˆ
δτ(tδ, ˆu(t)).(54)
This inclusion alone will not be sufficient to prove that $h(t\delta, u(t)) \in [-1,0]$ for $t \in [0,1]$, because when $t \approx 0$ or $t \approx 1$, the right-hand side of (54) will necessarily intersect the complement of $[-1,0]$. To remedy this, we remark that due to Lemma 5, $u$ is $k+1$ times
continuously differentiable. In particular,
dtu(t) = δf(u(t),Eϕ(h(tδ, u(t)))) ˆ
δf ( ˆu(t),Eϕ(tˆ
δ, ˆu(t)))) def
= ˆu(t).(55)
It follows that
dth(tδ, u(t)) ˆ
δ, ˆu(t))ˆ
δ, ˆu(t))ˆu(t).(56)
If the enclosures in (53) are tight enough and the derivative (56) is strictly positive at $t = 0$ and $t = 1$, one can use (54), (55), (56) and elementary calculus to prove the inclusion $h(t\delta, u(t)) \in [-1,0]$ for all but the most wildly oscillatory solutions $u$.
3.7 Parameter tuning for computer-assisted proofs
Selecting $m$, $k$ and the weights $r_0$ and $r_\infty$ is crucial for a successful proof. We will not discuss them exhaustively, but rather provide some general guidelines. In the examples we considered, tuning $r_0$ was not especially important in obtaining proofs, and we will not discuss it here.
The $Z_1$ bound can only be controlled reliably by adjusting $r_\infty$, given the discussion of Remark 4. As such, one should typically compute $Z_1$ first with $r_\infty = 1$, and then adjust $r_\infty$ accordingly to ensure that $Z_1 < 1$. The bound $Z_0$ will typically be machine-precision small (or wrapping effect small) and can be ignored. In practice, the $Z_2$ bound typically does not cause much of a problem, and its contribution to the radii polynomial is quadratic. Therefore, as long as $Z_0 + Z_1 < 1$ and $Y$ is sufficiently small (we expect machine precision or wrapping effect smallness), the “finite proof” will typically succeed, which is to say that there will exist $r$ such that $p(r) < 0$. In summary, for the finite proof to succeed, it should generally only be necessary to adjust the $r_\infty$ weight. This weight should be chosen as large as possible for the proof to succeed (with success being weighted by how large a value of $r$ we are willing to accept).
Once the $r_\infty$ weight has been suitably tuned, we need to work on the $Y_\infty$ and $Z_\infty$ bounds. In conditioning for the finite proof, it might have been necessary to take $r_\infty$ a bit small. In general, the smaller $r_\infty$ is, the smaller we will need to make $Z_\infty$. Unfortunately, the only way to tune $Z_\infty$ is to increase $m$ or $k$. The scaling of $Z_\infty$ is inverse linear in $m$, whereas for fixed $m$, we have $Z_\infty = O\left(\frac{\ln(k+1)}{k+1}\right)$. It is generally better, for this reason, to adjust $m$ if one needs to control $Z_\infty$. As for the $Y_\infty$ bound, this too benefits from inverse polynomial-order scaling with respect to $m$. However, $k$ has a much larger effect due to the $(s_{j+1} - s_j)^k$ term. The downside is that the derivatives of $\tilde f(t, u(t), \delta)$ can be quite large. This is largely system- and data-dependent, and it is nearly impossible to predict the impact of increasing $k$ on the $Y_\infty$ bound, and even less so when multiple steps of integration are involved. To conclude, controlling $m$ results in a more reliable control of $Y_\infty$ and $Z_\infty$, for fixed $k$, than does fixing $m$ and controlling $k$. However, depending on the properties of the numerical solution $u$ and the vector field, $Y_\infty$ might scale incredibly well with respect to $k$.
4 Multiple implicit steps
The Ybound of Section 3.4, needed to obtain computer-assisted proofs of solutions of DE-
SDD, is not directly implementable. Also, we have not explained how exactly the error from
one proof should be propagated forward rigorously, if one is proving a long solution using
multiple integration steps. The latter step ends up being less than trivial, and there are
several non-equivalent ways it can be done depending on how the numerical data from each
step is stored and how this interacts with the quadrature rule used to evaluate (39), for
instance. We emphasize the storage of the numerical data, since the out-of-bounds evaluation
problem makes this more complicated than merely storing a representative of each solution
on the canonical domain [0,1].
To provide a constructive answer, we will make a few simplifying assumptions. We will
assume that the function fand the delay τare polynomial. This might seem rather strong,
but it can sometimes be avoided; see Section 4.4. As for the initial condition ϕ, we will explore
in this section how it (as well as the extension Ekϕ) can be rigorously enclosed using various
polynomial-like functions. This will allow for effective evaluation of the bounds needed for
computer-assisted proofs, as well as a systematic way to complete multiple steps of integration.
Before we begin, let us make a disclaimer. This section provides one way by which error
from one computer-assisted proof can be rigorously propagated to the next, while allowing
for effective evaluation of subsequent Yand Zbounds. However, ours is not the only way.
Broadly, the steps we take in this section are designed to move from piecewise-polynomial
representations of numerical solutions u, into polynomial representations of subsequent initial
conditions. The advantage of this set-up is that it makes the Ybounds fairly explicit, even
for multiple steps of integration, since it reduces the calculations to polynomial integration.
See Appendix A for some self-contained discussion. The disadvantage is that extremely high-
degree polynomials can be necessary. We are quite certain that there are more efficient ways of
storing the data, computing the bounds Yand Z, and rigorously propagating error. Whether
or not all of this can be done simultaneously is the big question.
4.1 ϕ-mutations and interpolation of bootstrapped vector fields
How to generate an enclosure of ϕwill depend on the scope in which it is considered. If ϕis
the initial condition for the first implicit step, our view is that it could be a fixed (sufficiently
smooth) function, or perhaps it could be a theoretical object in an unstable manifold of a fixed
point or periodic orbit. In both cases, ϕshould be (sufficiently) smooth, so its derivatives and
E-extensions up to order kcan be interpolated by polynomials on arbitrarily large domains.
The error bounds enare then consequences of theoretical interpolation error. Keeping these
errors small might require a high-order interpolant and access to even higher derivatives of ϕ
on [1,0]; see [25] for results concerning interpolation at the Chebyshev nodes.
In the second case, ϕcould actually be u, with its domain appropriately scaled to [1,0],
and ubeing an output function from the implicit method of steps. In this case, ushould be
identified with uplus a C0enclosure of width rdef
=rk+r). Our goal is therefore to obtain
an enclosure of u(or Eku) and its derivatives.
We will now overview a few ways in which an enclosure of a solution u, validated by the
radii polynomials, can be generated. First, we need a definition.
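Definition 3. A $C^n$ mutation of a $C^k$ function $\phi$, $n \leq k$, is a set of functions $\{\xi_0, \ldots, \xi_n\}$, with $\xi_j : \mathbb{R} \to \mathbb{IR}^d$, such that $\phi^{(j)} \in \xi_j|_{\mathrm{dom}(\phi)}$ for $j = 0, \ldots, n$. If $\xi$ and $\zeta$ are two mutations, we write $\xi \subseteq \zeta$ if the range of $\xi_n$ is a subset of that of $\zeta_n$ for $n = 0, \ldots, k$.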
Generally, if we say that a $C^n$ mutation $\xi = \{\xi_0, \ldots, \xi_n\}$ has some property (e.g. is smooth, is an interval-valued polynomial), this should be understood to mean that each $\xi_j$ for $j = 0, \ldots, n$ has this property. The Taylor extension $E_k\phi$ defines a single-valued $C^k$ mutation of $\phi$ in a straightforward way. Mutations will be useful later in making statements and proofs
concerning enclosures more transparent.
Lemma 10. Suppose G(u, δ) = 0, and h(tδ, u(t)) [1,0] for t[0,1]. Let ξbe a Cn
mutation of ϕ,nk, and define
fξ(t, u, δ) = δf(u, ξ0(h(tδ, u))).(57)
That is, ˜
fξis the same as ˜
fexcept that we have replaced Eϕwith ξ0. Then, for j= 1, . . . , n+1,
fξ,j (t, u(t), δ), t [0,1],(58)
where the sequence of vector fields ˜
fξ,j are symbolically defined according to
fξ,j+1 =D1˜
fξ,j + (D2˜
fξ,j )˜
with the addendum that each (symbolic) derivative ξ(j)
0for 0jnis replaced with ξj. If ξ
is a single-valued mutation, then the inclusion in (58) becomes an equality.
We refer to the $\tilde f_{\xi,j}$ as bootstrapped vector fields. If a mutation $\xi$ has been selected, we will sometimes suppress the subscript. Recall that $f$, the delay functions and $\phi$ are all assumed to be $C^k$, so that these bootstrapped vector fields are indeed well-defined. In what follows,
(u, δ)Xwill always satisfy the conditions of Lemma 10. They will admit the enclosures
uˆuand δˆ
δ, where ˆuand ˆ
δare defined in Section 3.6.
The basic idea of this section is that Lemma 10 allows us to enclose derivatives of uusing
the bootstrapped vector fields. The choice of mutation can make certain calculations easier.
For example, if ξis an interval-valued polynomial mutation, then the bootstrapped vector
fields are interval-valued polynomials.
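As a hedged illustration of the bootstrapping recursion (not an example taken from the paper), consider the delay-free scalar case $\tilde f_1(t, u, \delta) = \delta g(u)$, where $g$ is a hypothetical smooth function. Assuming the recursion reads $\tilde f_{j+1} = D_1 \tilde f_j + (D_2 \tilde f_j)\tilde f_1$, which is the chain rule along a solution with $\dot u = \tilde f_1$, the first few fields are

$$
\begin{aligned}
\tilde f_2 &= 0 + \big(\delta g'(u)\big)\,\delta g(u) = \delta^2 g'(u) g(u), \\
\tilde f_3 &= 0 + \delta^2\big(g''(u) g(u) + g'(u)^2\big)\,\delta g(u) = \delta^3\big(g''(u) g(u)^2 + g'(u)^2 g(u)\big).
\end{aligned}
$$

If $g$ is a polynomial, every $\tilde f_j$ is a polynomial in $(u, \delta)$, so evaluating it on an interval enclosure of $u$ and $\delta$ yields an enclosure of $u^{(j)}$. With a delay present, the same computation additionally produces derivatives of the initial data, which is where the components $\xi_j$ of a mutation are substituted.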
4.1.1 Part 1: Bootstrapped enclosure of u
As we will see soon, having a polynomial enclosure with padding ϵ= 0 is incredibly useful.
We will show how to generate one in this section.
The following discussions will be easier if we shift the domain of $u$ to the interval $[-1,1]$. Formally, this can be done by defining $y(t) = u((t-1)/2)$. If one substitutes $y$ into (58), the result is that
$$y^{(j)}(t) = \frac{1}{2^j}\,\tilde f_{\xi,j}\left(\frac{t-1}{2},\, y(t),\, \delta\right).$$
However, rather than propagate this change forward, we will make a slight abuse of notation and identify $y$ with $u$, so now we think of $u$ as being defined on $[-1,1]$. Inverting the transformation $t \mapsto \frac{t-1}{2}$ has the effect of inverting the derivative scaling that appears in the displayed equation above.
Definition 4. A bootstrapped enclosure of u: [1,1] Rdat order k∈ {0,...,k+1}is any
interval-valued function vk:RIRdsuch that
fξ,k (t, u(t), δ)vk(t)
for at least one mutation ξof ϕ, and |t| ≤ 1,where we define ˜
fξ,0(t, u, δ) = u.
A first step is to obtain a bootstrapped enclosure of $u$ itself, not its derivatives, on $[-1,1]$. We will do this with Chebyshev interpolation. To get a theoretical interpolation error, we compute an upper bound for $\sup_{t \in [-1,1]} |u^{(k+1)}(t)|$.
Lemma 11. Suppose a real V > 0is computed such that
t[1,1] |˜
fξ,k+1(t, ˆu(t),ˆ
for some Ckmutation ξof ϕ. Then Vsupt[1,1] |u(k+1)(t)|.
Proof. Let ξbe the mutation of ϕsuch that ξnis single-valued on [1,0] and coincides with
ξnotherwise. Then ξξ.
fξ,k+1(t, u(t), δ )| ∈ {|˜
fξ,k+1(t, x, y )|: (x, y)ˆu(t)׈
δ, h(ty, x)dom(ϕ)=[1,0]}
⊆ {| ˜
fξ,k+1(t, x, y )|: (x, y)ˆu(t)׈
fξ,k+1(t, ˆu(t),ˆ
⊆ | ˜
fξ,k+1(t, ˆu(t),ˆ
Every element of this set is bounded by V.
The bound $V$ is indeed implementable once an implementable mutation of $\phi$ has been selected. This can be with a Taylor extension, or something completely different. Once it has been computed, consider the following well-known [25] error bound associated to the Chebyshev interpolant $y_N = \sum_{n=0}^{N} c_n T_n$ of a function $y$ on the domain $[-1,1]$:
$$\sup_{t \in [-1,1]} |y(t) - y_N(t)| \leq \frac{4W}{\pi k (N-k)^k}, \qquad (60)$$
where $W$ is a uniform bound for $y^{(k+1)}$. On the other hand, recall the Lebesgue constant
ΛNassociated to the degree NChebyshev interpolation on [1,1], which satisfies ΛN
1 + 1
2ln(N+ 1). We generally advocate the following approach to interpolating uon [1,1].
Lemma 12. Let uNdenote the degree NChebyshev interpolant of (the piecewise-polynomial
function) u. Then
t[1,1] |u(t)uN(t)| ≤ 4V
πk(Nk)k+ ΛNrdef
provided $N > k$, for $V$ being the constant from Lemma 11. As a consequence, $u(t) \in u_N(t) + e_0[-1,1]$.
Proof. Let PNdenote the degree NChebyshev interpolation operator on C0([1,1],Rd). Since
PNis linear and ||PN|| ≤ ΛN, with ||·|| being the operator norm associated to the supremum
norm || · || on C0([1,1],Rd), we have
||uPNu|| ≤ ||uPNu|| +||PN(uu)||,
which admits the bound stated in the lemma.
Remark 6. One could instead directly interpolate the enclosure ˆuof u. However, this results
in wrapping effect of intervals for which the interpolant has C0radius Nr. To compare, ΛN
is sub-linear in N. Consequently, Lemma 12 gives a tighter bound than naively interpolating
the enclosure ˆu.
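To make the comparison between linear growth in $N$ and the growth of $\Lambda_N$ concrete, the sketch below estimates the Lebesgue constant numerically. It rests on assumptions not fixed by the text: the nodes are taken to be the Chebyshev extreme points $\cos(j\pi/N)$, the estimate is a maximum over a finite grid (hence a lower bound, not a rigorous value), and the comparison uses the classical bound $1 + \frac{2}{\pi}\ln(N+1)$ for these nodes.

```python
import math


def chebyshev_points(n):
    """Chebyshev extreme points x_j = cos(j*pi/n), j = 0, ..., n."""
    return [math.cos(j * math.pi / n) for j in range(n + 1)]


def lebesgue_function(x, nodes):
    """Sum of |Lagrange basis functions| at x, via barycentric weights."""
    n = len(nodes) - 1
    num = 0.0
    den = 0.0
    for j, xj in enumerate(nodes):
        if x == xj:
            return 1.0               # at a node exactly one basis function is 1
        weight = (-1.0) ** j * (0.5 if j in (0, n) else 1.0)
        term = weight / (x - xj)
        num += abs(term)
        den += term
    return num / abs(den)


def lebesgue_constant_estimate(n, samples=4001):
    """Grid maximum of the Lebesgue function on [-1, 1] (a lower estimate)."""
    nodes = chebyshev_points(n)
    grid = [-1.0 + 2.0 * i / (samples - 1) for i in range(samples)]
    return max(lebesgue_function(x, nodes) for x in grid)


for n in (4, 8, 16, 32):
    classical = 1.0 + (2.0 / math.pi) * math.log(n + 1)
    print(n, lebesgue_constant_estimate(n), classical)
```

For these degrees the estimate stays far below $N$ itself, which is the reason interpolating the numerical solution and adding a Lebesgue-weighted radius is preferable to interpolating the interval enclosure directly.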
Now that we have a polynomial enclosure of uon [1,1], we can directly generate a poly-
nomial enclosure of u(j), for j= 1,...,k+ 1. Indeed, the following is a direct consequence of
Lemma 12 and Lemma 10.
Corollary 13. For nkand any Cnmutation ξof ϕ, we have u(j)(t)˜
fξ,j (t, uN(t) +
δ)for j= 1, . . . , n + 1. Consequently,
=uN(t) + e0[1,1], j = 0
fξ,j (t, uN(t) + e0[1,1],ˆ
δ), j 1.(61)
Each pj(t)is a bootstrapped enclosure of uat order n. In particular, if ξis a (interval-valued)
polynomial mutation, then pjis a composition of (interval-valued) polynomials.
The primary utility of Corollary 13 will be in the computation of the bound (59), since this
bound can be computed with respect to any mutation of the initial data ϕ. It is not useful for
obtaining tight enclosures of derivatives because of the wrapping effect, and it does not handle
out-of-bounds evaluations. In the next section, we will resolve this issue by interpolating the
bootstrapped vector fields themselves.
4.1.2 Part 2: Bootstrapped enclosures for derivatives of Eku
Consider the CkTaylor extension Ekuassociated to u. Once again, we interpret uas being
defined on [1,1]. What we will do is interpolate dn
dtnEkuon [12ϵ, 1 + 2ϵ], for some ϵ0.
Note that after inverting the transformation to return uto the domain [1,0], this will result
in a polynomial enclosure on [1ϵ, ϵ]. See the first paragraph of Section 4.1.1. In this
section, a given Ckmutation ξof ϕwill be fixed, and we will suppress the subscript ξon the
bootstrapped vector fields.
To begin, let us construct a few polynomials. Define
pL(t) =
fn(1, u(1), δ)(t+ 1)n,
pR(t) =
fn(+1, u(+1), δ)(t1)n,
where ˜
f0(t, y, δ) = y. Define also the interval polynomials
pL(t) =
fn(1, u(1) + r[1,1],ˆ
δ)(t+ 1)n,
pR(t) =
fn(+1, u(+1) + r[1,1],ˆ
By construction, the inclusions
dtn(Eku)|(−∞,1] dn
are satisfied for n= 0,...,k. We also have the bound
dtkEkuV, (63)
where Vis any number satisfying (59). This is a direct consequence of the definition of
Ek, Lemma 10 and Lemma 11. We can finally prove our enclosure result which, due to its
importance in rigorously proving multiple-step integration of DE-SDD, we give the title of
Theorem 14. For n= 1,...,k, suppose
fn(t, u(t) + η1, δ +η2)˜
fn(t, u(t), δ)| ≤ rn(64)
is satisfied for some rn>0, whenever |η1| ≤ rk+r) = rand |η2| ≤ rr0. Denote r0=r.
Let pndenote the degree NChebyshev interpolant on the interval [12ϵ, 1+2ϵ]of the
wn(t) =
fn(t, u(t), δ), t [1,1]
L(t), t < 1
R(t), t > 1.
If 0n < k< N, then, for |t| ≤ 1+2ϵ,
|(Eku)(n)(t)pn(t)| ≤ en(t)def
=4V(1 + 2ϵ)k+1n
π(kn)(Nk+n)kn+ ΛN(rn,|t| ≤ 1
Proof. Define y(t) = Eku((1 + 2ϵ)t) for t[1,1]. Then
t[1,1] |y(k+1)(t)|= (1 + 2ϵ)k+1 sup
|t|≤1+2ϵ|(Eku)k+1(t)| ≤ (1 + 2ϵ)k+1 V.
Also, it follows that pn(t) = ˜pn((1 + 2ϵ)1t), where ˜pnis the Chebyshev interpolant of t7→
˜wn(t) = wn((1 + 2ϵ)t), for t[1,1]. Then, using (Theorem 7.2, [25]), we have
|y(n)(t)(1 + 2ϵ)n˜pn(t)|≤|y(n)(t)PNy(n)(t)|+|PN(y(n)(1 + 2ϵ)n˜wn)(t)|
π(kn)(Nk+n)kn+ ΛNsup
|t|≤1|y(n)(t)(1 + 2ϵ)n˜wn(t)|.
We have ||y(k+1)|| ≤ (1 + 2ϵ)k+1 V. As for the other difference, let s= (1 + 2ϵ)tfor t[1,1].
|y(n)(t)(1 + 2ϵ)n˜wn(t)|= (1 + 2ϵ)n|(Eku)(n)(s)wn(s)| ≤ (1 + 2ϵ)n
for all |s| ≤ 1+2ϵ. This follows from the Lipschitz-like bound (64), the inclusions (62), and
the definition of polynomials pL, pRand pL, pR. We get the desired result by combining these
two bounds with the previous one for |y(n)(t)(1 + 2ϵ)n˜pn(t)|and dividing both sides by
(1 + 2ϵ)n.
Remark 7. Similar to Remark 6, the reason for introducing the Lipschitz-like constant rn
in (64) is to allow for some amount of control of the wrapping effect. We want to avoid
interpolating an interval-valued function as much as possible. Further discussion will appear
in Section 4.2.
By (65), we have the enclosure
dtnEku(t)pn(t) + en(t),|t| ≤ 1+2ϵ, n = 0,...,k1
Remark 8. When n= 0 and ϵ= 0 in (65), we see that Theorem 14 directly generalizes Lemma
12. Indeed, in that case, e0(t) = e0for |t| ≤ 1. Also, we see that only the Lipschitz-like bounds
r0,...,rnare needed to enclose the derivative u(n)(t)for |t| ≤ 1.
4.1.3 Part 3: Hybrid enclosures
It can be challenging to compute tight Lipschitz-like constants rnneeded for Theorem 14.
Indeed, directly evaluating the left-hand side of (64) generally results in a poor bound, while
using the mean-value inequality requires explicitly computing derivatives. This can quickly
become technical; see Section 5 for an example. It is therefore of interest to avoid these
calculations when possible. To this end, we can recycle our inclusions (62) from the previous
section to define hybrid enclosures.
Let a Ckmutation ξof ϕbe fixed. Let p0and e0be computed according to Theorem 14,
with ϵ= 0, so that Eku(t)p0(t) + e0[1,1] for |t| ≤ 1. See Remark 8. Then, define the
sequence of hybrid enclosures
hn(t) =
p0(t) + e0[1,1], n = 0,|t| ≤ 1
fn(t, p0(t) + e0[1,1],ˆ
δ), n = 1,...,k,|t| ≤ 1
j=0 1
fj+n(1, p0(1) + e0[1,1],ˆ
δ)(t+ 1)j, n = 1,...,k, t < 1
j=0 1
fj+n(+1, p0(+1) + e0[1,1],ˆ
δ)(t1)j, n = 1,...,k, t > 1
It is then straightforward to prove that dn
dtn(Eku)(t)hn(t) for tRand n= 0,...,k.
4.2 Discussion on the interpolation of piecewise polynomials
In Section 4.1, we introduced three strategies to compute enclosures of a solution uof the
implicit method of steps, together with its derivatives. These all require one to make a choice
of mutation for the initial condition ϕ. It might be unclear at this point which strategy should
be used. We find it is much more instructive to explain this with an example, so we will
postpone this to Section 5. However, we can make some general comments.
The polynomial interpolants pnof (61), and pnof Theorem 14, are not themselves inter-
polants of differentiable functions. This can result in a slow decay of the coefficients of the
interpolant, since the function being interpolated could even fail to be continuous at some
of the mesh points in ∆m. Despite this fact, the theoretical enclosures afforded by (61) and
Theorem 14 can be rather tight. However, the slow decay of the coefficients can result in bad interval wrapping when only 64-bit floating point arithmetic is used. This is the primary reason we have opted to use Julia for the implementation, since the interval arithmetic package IntervalArithmetic.jl, a package of JuliaIntervals [1], supports the BigFloat type for extended precision arithmetic.
4.3 Mutations, multiple steps, and the out-of-bounds problem
The point of introducing mutations is that they can be used to connect one round of the implicit
method of steps to another, with rigorous error propagation. They also provide a formal
mechanism for handling the out-of-bounds evaluation problem. We will outline one way this
can be done here. In what follows, Ekdenotes the Taylor extension.
Suppose without loss of generality that at the first implicit step, the initial condition ϕ
is an interval-valued polynomial defined on the real line, or can be theoretically extended to
an arbitrary compact subset of the real line. We formally write ˜
ϕ=Ekϕ. The first solution,
denoted u, is then proven using the radii polynomials, and we check the monotonicity of the
associated lag argument. We now store three representations of u.
A polynomial interpolant of the solution and its first derivative on [0,1], with rigorous
error bounds for both. Denote these by ˜u(j)for j= 0,1.
Hybrid enclosures of its derivatives up to order k. Denote these by (Hu)(j)for j=
The restrictions of the hybrid enclosures (Hu)(j)to [0,1], extended to the real line by
polynomial evaluation. Denote these by (Hu)(j).
Note that the latter representation exists because ϕand ˜u(0) are (interval) polynomial. To
compute ˜u(j), we use Theorem 14 with ϵ= 0, computing Vusing the fact that ˜
ϕis a mutation
of $\phi$. Being (interval) polynomials, $\tilde u^{(j)}$ can be stored on the computer. As for the hybrid enclosures, they can be stored using a copy of $\tilde u^{(0)}$ and evaluations of the bootstrapped derivatives up to order $k$ at the time arguments $t \in \{0, 1\}$. The third class, $Hu$, are equivalent to (interval) polynomials and are therefore representable as interval vectors.
For the second (and subsequent) implicit step, we now think of $\phi = u_{\text{previous step}}$. Therefore, $E_k\phi$ admits two “computable” mutations.
A $C^1$ polynomial mutation, defined by affinely shifting the domain of $\tilde u^{(j)}$ (from the previous step) for $j = 0, 1$ to the interval $[-1,0]$, and extending to the real line by polynomial evaluation.
A $C^k$ mutation, defined by affinely shifting the domain of $Hu$ (from the previous step). Let it be denoted $H\phi$.
A $C^k$ mutation, defined by affinely shifting the domain of $H\phi$ (from the previous step)
The first of these mutations will have far tighter error bounds (for most interval-based cal-
culations) than the second, and is therefore well-suited to computations of the Y,Z0and Z1
bounds, since these only require access to the derivatives of Ekϕon the domain [1,0]. See
Section 4.5 for some related discussion. The second mutation in fact encloses $E_k\phi$ and its derivatives up to order $k$, and can therefore be used in computations of the $Y_\infty$, $Z_2$ and $Z_\infty$ bounds.
Having proven the new solution, which we also denote by $u$, we proceed to the computable
representations of u. We generate the same three classes as in the previous step, with some
changes. To compute ˜u(j), we use the fact that Hϕdefines a mutation of ϕto compute V,
and use Theorem 14 with ϵ= 0. For the hybrid enclosures Hu, we store a copy of ˜u(0) and
compute the bootstrapped derivatives as before.
The setup we have presented above results in a lot of recursion. Indeed, each hybrid enclo-
sure depends on all previously computed hybrid enclosures. This results in heavy computation
overhead, and at present, we do not know if there is a way to avoid it. Also, although Hϕis
a polynomial, it is itself a nested composition of every previous iterate, and can therefore be
extremely high-degree.
4.4 On the generality of polynomial vector fields
At the beginning of Section 4, we introduced the assumption that the vector field fand lag
function $\tau$ are polynomial. This was done primarily to allow us to compute any integrals
reliably without the use of an approximate quadrature rule. When non-polynomial terms
in the vector field and delay are themselves solutions of ordinary differential equations, it is
sometimes possible to construct an equivalent polynomial system [16]. To illustrate the point, we will briefly present two
such examples now, and how the machinery of the present paper can be adapted to handle
non-polynomial nonlinearities. However, we have not implemented the rigorous integrator for
these examples.
4.4.1 Ikeda equation with state-dependent delay
Consider an initial-value problem for a state-dependent delayed perturbation of the Ikeda equation:
˙x(t) = ax(t) + bsin(x(tcpx(t))), x0=ϕ
for constants a, b, c, p. Define y(t) = sin(x(t)) and z(t) = cos(x(t)). Then the triple (x, y, z)
solves the initial-value problem
˙x(t) = ax(t) + by(tcpx(t)), x0=ϕ
˙y(t) = z(t)(ax(t) + by(tcpx(t))), y0= sin(ϕ)
˙z(t) = y(t)(ax(t) + by(tcpx(t))), z0= cos(ϕ),
and vice-versa. The vector field and delay function are now polynomials.
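The mechanism behind this embedding is only the chain rule: if $y = \sin(x)$ and $z = \cos(x)$, then $\dot y = z\dot x$ and $\dot z = -y\dot x$. The sketch below checks this numerically on a deliberately simplified toy problem. The delay is dropped, the coefficients $-1$ and $2$ are arbitrary choices, and a non-rigorous Runge-Kutta scheme is used, so this is an illustration of the substitution and not of the validated integrator.

```python
import math


def rk4(f, state, h, steps):
    """Classical fourth-order Runge-Kutta for an autonomous system."""
    for _ in range(steps):
        k1 = f(state)
        k2 = f([s + 0.5 * h * k for s, k in zip(state, k1)])
        k3 = f([s + 0.5 * h * k for s, k in zip(state, k2)])
        k4 = f([s + h * k for s, k in zip(state, k3)])
        state = [s + h * (a + 2 * b + 2 * c + d) / 6
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4)]
    return state


def original(state):
    """Toy scalar equation x' = -x + 2 sin(x), with a non-polynomial term."""
    (x,) = state
    return [-x + 2.0 * math.sin(x)]


def embedded(state):
    """Polynomial system obtained from y = sin(x), z = cos(x)."""
    x, y, z = state
    xdot = -x + 2.0 * y
    return [xdot, z * xdot, -y * xdot]


x0 = 0.5
x_direct = rk4(original, [x0], 1e-3, 2000)[0]
x_emb, y_emb, z_emb = rk4(embedded, [x0, math.sin(x0), math.cos(x0)], 1e-3, 2000)
print(x_direct, x_emb, y_emb - math.sin(x_emb))
```

Both integrations agree, and the auxiliary variables stay on the curve $y = \sin(x)$, $z = \cos(x)$ provided they start on it, which is why the initial data for $y$ and $z$ must be consistent with $\phi$.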
4.4.2 Cubic Ikeda equation with state-dependent delay
Consider a state-dependent delay perturbation of the cubic Ikeda equation:
˙x(t) = x(tcepx(t)) + x(tcepx(t))3, x0=ϕ,
for constants c, p. Define y(t) = epx(t). Then (x, y) solves the initial-value problem
˙x(t) = x(tcy(t)) + x(tcy(t))3, x0=ϕ
˙y(t) = py(t)(x(tcy(t)) + x(tcy(t))3), y0=e ,
and vice-versa. The vector field and delay function are now polynomials.
4.5 Centering of numerical data for the finite bounds
In Section 4.3, we discussed using a bootstrapped enclosure with ϵ= 0 for the initial data ϕ
to compute the Y,Z0and Z1bounds. This, however, can only be done if t7→ h(tδ, u(t)) is
contained in [1,0] for t[0,1]. For a typical candidate numerical solution (u, δ), this will
not be the case. However, if $(u, \delta)$ is a good numerical approximation, it might be possible to perform near-trivial modifications to the numerical data $u$ that result in the inclusion being satisfied.
Let us try to find small $\epsilon_1, \epsilon_2$ such that the modified data $(v, \Delta) = ((1 + \epsilon_1)u, (1 + \epsilon_2)\delta)$ satisfies $h(t\Delta, v(t)) \in [-1,0]$ for $t \in [0,1]$. If $(u, \delta)$ is a good numerical approximation, we should have $h(0, u(0)) \approx -1$ and $h(\delta, u(1)) \approx 0$, but they may be outside of $[-1,0]$. Therefore,
let $\eta_1, \eta_2 > 0$ be small, and solve the equations
$$h(0, (1 + \epsilon_1)u(0)) = -1 + \eta_1 \qquad (66)$$
$$h((1 + \epsilon_2)\delta, (1 + \epsilon_1)u(1)) = -\eta_2 \qquad (67)$$
for $(\epsilon_1, \epsilon_2)$ using Newton’s method, initialized at $(0,0)$. If $t \mapsto h(t\delta, u(t))$ is strictly monotone and $(\epsilon_1, \epsilon_2)$ is small, then the time lag associated to the modified data should have the same property, and therefore be contained in the interval $[-1 + \eta_1, -\eta_2] \subset [-1,0]$.
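A minimal Python sketch of this centering step follows. Everything concrete in it is an assumption made for illustration: the lag is taken to be $h(t, u) = t - \alpha - cu$ with $\alpha = 1/2$ and $c = 1$, the endpoint data `u0`, `u1` and `delta` are invented, and the Jacobian is approximated by finite differences rather than computed exactly.

```python
def lag(t, u, alpha=0.5, c=1.0):
    """Hypothetical lag function h(t, u) = t - alpha - c*u."""
    return t - alpha - c * u


def residual(eps, u0, u1, delta, eta1, eta2):
    """Left minus right-hand sides of the two centering equations."""
    e1, e2 = eps
    return [lag(0.0, (1 + e1) * u0) - (-1.0 + eta1),
            lag((1 + e2) * delta, (1 + e1) * u1) - (-eta2)]


def newton_2d(F, x, tol=1e-12, max_iter=20, step=1e-7):
    """Newton's method in two variables with a finite-difference Jacobian."""
    x = list(x)
    for _ in range(max_iter):
        f = F(x)
        if max(abs(f[0]), abs(f[1])) < tol:
            return x
        cols = []
        for i in range(2):
            xp = list(x)
            xp[i] += step
            fp = F(xp)
            cols.append([(fp[0] - f[0]) / step, (fp[1] - f[1]) / step])
        (a, c_), (b, d) = cols           # Jacobian [[a, b], [c_, d]]
        det = a * d - b * c_
        x[0] += (-d * f[0] + b * f[1]) / det
        x[1] += (c_ * f[0] - a * f[1]) / det
    raise RuntimeError("Newton iteration did not converge")


u0, u1, delta = 0.52, 0.49, 1.0          # invented data: lags are -1.02 and 0.01
eta1 = eta2 = 1e-3
eps = newton_2d(lambda e: residual(e, u0, u1, delta, eta1, eta2), [0.0, 0.0])
lag_start = lag(0.0, (1 + eps[0]) * u0)
lag_end = lag((1 + eps[1]) * delta, (1 + eps[0]) * u1)
print(eps, lag_start, lag_end)
```

For this affine lag the system is linear and triangular, so Newton converges almost immediately; the scalings come out a few percent below one, and the endpoint lags land at $-1 + \eta_1$ and $-\eta_2$.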
It is possible that (66)–(67) does not have a solution for any $\eta_1, \eta_2 > 0$. As a contrived example, take $u$ scalar-valued, $h(t, u) = t - 1 - u$, and numerical data such that $u(0) = 0$. Then (66) has a solution if and only if $\eta_1 = 0$. In such a case, and in general, it might be necessary
to consider more sophisticated modifications to u. For example, if uis stored as a vector
in Rmd(k+1)+1 (see Section 3.3), then one could try near-unit scaling of suitable components
5 Example
This section concerns an application of our validated integrator to the following initial-value problem:
$$\dot u(t) = -\gamma u(t) - \kappa u(t - \alpha - c u(t)), \quad u_0 = \phi \qquad (68)$$
with γ, κ, α, c all being non-negative. In particular, we will require κ, c positive so as not to
trivialize the state-dependent delay. A similar equation with two delays was studied in [5],
where Hopf bifurcations and periodic orbits were analyzed using delay expansions and normal
forms. In particular, (68) is a restricted version of that equation. We have implemented our
rigorous integrator for this equation using a few initial functions ϕand sets of parameters.
Despite being similar to a linear delay differential equation, the composition of $u(t)$ with the lag $t - \alpha - cu(t)$ can result in nonlinear phenomena including attracting periodic orbits.
5.1 Set-up
For (68), we have ˜
f(t, u, δ) = (γu κϕ(αcu))δ.
We computed the bootstrapped vector fields ˜
fkfor k= 1,...,8 using symbolic algebra. We will
not list all of them here, since the higher derivatives are quite lengthy. They can be viewed
in our Julia implementation; see the functions Dx,D2x,...,D8xwithin vector fields.jl.
However, to illustrate the computation, we will at least calculate ˜
f2explicitly. Recall that
f2(t, u, δ) = D1˜
f1+ (D2˜
= (κ˙
ϕ(αcu)δ)δ+ (γκ˙
ϕ(αcu))(γu κϕ(αcu))δ2
We follow Section 4.3 to handle out-of-bounds evaluations and multiple steps of integration.
This requires the Lipschitz-like bound r1of Theorem 14 to be computed analytically. Using
the mean-value inequality, one can choose any
η1(|δ|+η2)| − γ+κc ˙
ϕ|+η2|γ(u(t) + η1) + κϕ|+|(δ+η2)κt ˙
where the supremum is over t[0,1], |η1| ≤ rk+r) and |η2| ≤ rr0, and with an abuse of
notation (to keep the expression compact), we write ϕ=Ekϕ(t(δ+η2)αc(u(t) + η1)) and
similarly, ˙
ϕ= (Ekϕ)(1)(t(δ+η2)αc(u(t) + η1)).
5.2 Model parameters and initial conditions
We have elected to present proofs for three sets of parameters.
P.1 (γ, κ, α, c) = (1, 2, 1/2, 1): The equilibrium at x = 0 is locally attracting.
P.2 (γ, κ, α, c) = (1/3, 2, 1, 1): There is a nontrivial periodic orbit, and the equilibrium at
x = 0 is unstable.
P.3 (γ, κ, α, c) = (1/3, 2, 1, 2): Same as the previous case, with a larger value of c.
The first set of parameters results in the easiest proofs, and the final set the hardest.
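We consider two classes of initial conditions. For the simplest example, we have the constant
initial condition $\phi_0(\theta) = \frac{1}{4}$. For our second initial condition, we select a degree two polynomial
$\phi_1(\theta) = \eta_0 + \eta_1\theta + \eta_2\theta^2$ with the following properties.
$\phi_1$ is on the solution manifold; see [29].
The lag t − α − cu(t) at time t = 0 is equal to −1.
The derivative of the time lag t − α − cu(t) is equal to 2 at time t = 0.
In order, these three conditions result in a system of three algebraic equations:
$$\begin{aligned}
\eta_1 &= -\gamma\eta_0 - \kappa\big(\eta_0 + \eta_1(-\alpha - c\eta_0) + \eta_2(-\alpha - c\eta_0)^2\big) \\
-1 &= -\alpha - c\eta_0 \\
2 &= 1 - c\eta_1
\end{aligned}$$
which can be solved directly, yielding the unique polynomial
$$\phi_1(\theta) = \frac{1 - \alpha}{c} - \frac{1}{c}\theta + \frac{1 + (\alpha - 1)\gamma + (\alpha - 2)\kappa}{\kappa c}\theta^2. \tag{69}$$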
Because $\phi_1$ lies on the solution manifold, the solution of the initial-value problem from $\phi_1$
is globally $C^1$-smooth.
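The coefficients of $\phi_1$ can be checked in exact rational arithmetic. The sketch below assumes the reading used above, namely that the lag is t − α − cu(t), so that the second and third conditions are −α − cη0 = −1 and 1 − cη1 = 2; the function names `phi1_coefficients` and `check_conditions` are hypothetical and do not come from the paper's implementation.

```python
from fractions import Fraction


def phi1_coefficients(gamma, kappa, alpha, c):
    """Coefficients (eta0, eta1, eta2) of phi_1(theta) = eta0 + eta1*theta + eta2*theta^2."""
    gamma, kappa, alpha, c = (Fraction(p) for p in (gamma, kappa, alpha, c))
    eta0 = (1 - alpha) / c
    eta1 = -1 / c
    eta2 = (1 + (alpha - 1) * gamma + (alpha - 2) * kappa) / (kappa * c)
    return eta0, eta1, eta2


def check_conditions(gamma, kappa, alpha, c):
    """Return three booleans: solution manifold, lag value, lag derivative."""
    gamma, kappa, alpha, c = (Fraction(p) for p in (gamma, kappa, alpha, c))
    eta0, eta1, eta2 = phi1_coefficients(gamma, kappa, alpha, c)
    lag = -alpha - c * eta0  # delayed argument at t = 0
    delayed_value = eta0 + eta1 * lag + eta2 * lag ** 2  # phi_1 evaluated at the lag
    on_manifold = eta1 == -gamma * eta0 - kappa * delayed_value
    return on_manifold, lag == -1, 1 - c * eta1 == 2


if __name__ == "__main__":
    parameter_sets = {
        "P.1": (1, 2, Fraction(1, 2), 1),
        "P.2": (Fraction(1, 3), 2, 1, 1),
        "P.3": (Fraction(1, 3), 2, 1, 2),
    }
    for name, params in parameter_sets.items():
        print(name, phi1_coefficients(*params), check_conditions(*params))
```

Using `Fraction` avoids rounding, so each condition is tested as an exact equality rather than up to a tolerance.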
5.3 Proof parameters and implementation details
Our initial conditions will be polynomials, but we formally treat them as elements of $C^k([-1,0],\mathbb{R})$
for k = 8. The piecewise polynomial interpolation order k, number of subdivisions m, and
the weight r were chosen at each step to ensure proofs that resulted in near-optimal $C^0$
enclosures, but we did not make a serious, systematic effort to also optimize them for memory
usage. In all cases, the subdivision mesh was equally-spaced, and we used $r = 10^{-4}$ and
$r_0 = 1$ unless explicitly noted. After each proof, the solution u was interpolated at an order
N that ensured the contribution of $\Lambda_N r_0$ in (65) was no smaller than the other term of that
equation. Succinctly, we chose N such that the interpolation error was no worse than the
wrapping effect.
In all cases, we completed four steps of integration. The time required to complete a
proof depends on the interpolation parameters k and m, but is also exponential
in the number of proofs due to the nested calls to previously computed solutions. This is
the main reason we have not attempted to prove more than four implicit steps. To illustrate the
point, consider that on our machine, the first three implicit steps of the proof with parameters
P.3 and initial condition $\phi_0 = \frac{1}{4}$ took a total of fifteen minutes, while the proof of the final
implicit step required almost an hour. Figure 3 includes plots of the solutions from the two
initial conditions $\phi_0$ and $\phi_1$, for the three parameter sets. The technical results are presented
in Section 5.4 and Section 5.5.
Figure 3: Solutions of the initial-value problem (68) plotted for four implicit steps. Top row:
parameter sets P.1 and P.2. Bottom row: parameter set P.3. The line styles indicate the initial
condition of the associated plot, be it $\phi_0$ or $\phi_1$. Dots indicate terminal integration points for
each of the four implicit steps. Note that solutions from $\phi_1$ are globally $C^1$, since $\phi_1$ is on
the solution manifold.
All proofs except for the first step required 512-bit floating point precision. Anything less
resulted in massive error propagation during the computation of the Y bound in the second
and subsequent steps. We suspect this is partly due to our implementation of the Y bound,
which requires computing compositions of high-order polynomials and reliably computing their
coefficients in the Chebyshev basis. As discussed at the beginning of Section 4, we suspect some
improvements can be made in this direction. Our implementation computes several objects
(e.g. $A_{m,k}$) with lower precision when possible, always with appropriate directed rounding
when we move from one level of precision to another.
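To illustrate what directed rounding between precision levels involves, here is a minimal Python sketch, unrelated to the Julia code, in which an exact rational stands in for a high-precision value and is enclosed between two double-precision numbers. The function name `enclose_in_doubles` is an assumption, and `math.nextafter` requires Python 3.9 or later.

```python
import math
from fractions import Fraction


def enclose_in_doubles(x):
    """Return doubles (lo, hi) with lo <= x <= hi for an exact rational x."""
    x = Fraction(x)
    nearest = float(x)                 # conversion rounds to a nearby double
    exact_nearest = Fraction(nearest)  # every finite double is an exact rational
    # Round the lower endpoint down and the upper endpoint up when needed.
    lo = nearest if exact_nearest <= x else math.nextafter(nearest, -math.inf)
    hi = nearest if exact_nearest >= x else math.nextafter(nearest, math.inf)
    return lo, hi


if __name__ == "__main__":
    lo, hi = enclose_in_doubles(Fraction(1, 3))
    print(lo, hi, Fraction(lo) <= Fraction(1, 3) <= Fraction(hi))
```

The enclosure is at most one unit in the last place wide, and it collapses to a single point when the value is exactly representable in the lower precision.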
The function main from our Julia implementation [6] takes as its input a data file (all of
which are included) consisting of pre-set proof parameters and numerically computed solutions,
and does the following for each implicit step:
refine the numerical solution with Newton’s method;
center the numerical solution (see Section 4.5);