
Splines in Higher Order TV Regularization

Gabriele Steidl∗

Stephan Didas†

Julia Neumann‡

March 27, 2006

Abstract

Splines play an important role as solutions of various interpolation and approximation problems that minimize special functionals in some smoothness spaces. In this paper, we show in a strictly discrete setting that splines of degree m−1 also solve a minimization problem with quadratic data term and m-th order total variation (TV) regularization term. In contrast to problems with quadratic regularization terms involving m-th order derivatives, the spline knots are not known in advance but depend on the input data and the regularization parameter λ. More precisely, the spline knots are determined by the contact points of the m-th discrete antiderivative of the solution with the tube of width 2λ around the m-th discrete antiderivative of the input data. We point out that the dual formulation of our minimization problem can be considered as a support vector regression problem in the discrete counterpart of the Sobolev space W^m_{2,0}. From this point of view, the solution of our minimization problem has a sparse representation in terms of discrete fundamental splines.

1 Introduction

In this paper, we are interested in the solution of the minimization problem

(1/2) ∫_0^1 (u(x) − f(x))^2 + λ|u^{(m)}(x)| dx → min    (1)

and some of its 2D versions involving first and second order partial deriva-

tives. More precisely, we work in a strictly discrete setting which is ap-

propriate for tasks in digital signal processing. For a discrete signal u =

∗ steidl@math.uni-mannheim.de, University of Mannheim, Faculty of Mathematics and Computer Science, 68131 Mannheim, Germany
† didas@mia.uni-saarland.de, Mathematical Image Analysis Group, Faculty of Mathematics and Computer Science, Saarland University, 66123 Saarbrücken, Germany
‡ jneumann@uni-mannheim.de, University of Mannheim, Faculty of Mathematics and Computer Science, 68131 Mannheim, Germany


(u(1),...,u(n))^T, we use the m-th forward difference

△^m u(j) := Σ_{k=0}^m (−1)^{k+m} \binom{m}{k} u(j + k),   j = 1,...,n−m    (2)

as discretization of the m-th derivative. Then, for given input data f ∈ Rn,

we are looking for the solution of the minimization problem

(1/2) Σ_{j=1}^n (u(j) − f(j))^2 + λ Σ_{j=1}^{n−m} |△^m u(j)| → min,    (3)

where we refer to the penalty term as m-th order TV regularization. Of course, other discretizations of (1) are possible. In contrast to the solution of the well examined version of (3) with quadratic penalty term |△^m u(j)|^2, the solution of (3) does not depend linearly on the input data. This results in some advantages over the linear solution, such as better edge preservation. For two

dimensions and first order derivatives in the penalizer, problem (3) becomes

the classical approach of Rudin, Osher and Fatemi (ROF) [23] which has

many applications in digital image processing. Meanwhile there exist various

solution methods for this problem, see [30] and the references therein. Most

of these methods introduce a small additional smoothing parameter to cope

with the non differentiability of | · |. There are two approaches which avoid

such an additional parameter, namely a wavelet inspired technique [32] and

the Legendre–Fenchel dualization technique, see, e.g., [1, 4] which is also

relevant in the present considerations. We further mention that other cost

functionals than the quadratic one come into play when dealing, e.g., with denoising of images corrupted by other than white Gaussian

noise. In this context we only refer to recent papers of Nikolova et al. [21, 3]

and the references therein.

In this paper, we are interested in the structure of the solution u even

for m > 1. We show that u is a discrete spline of degree m − 1, where the

spline knots, in contrast to the linear problem with quadratic regularization

term, depend on the input data f and on the regularization parameter λ.

More precisely, the spline knots are determined by the contact points of the

m–th discrete antiderivative of u with the tube of width 2λ around the m–th

discrete antiderivative of f. We will see that the dual formulation of our

minimization problem can be considered as a support vector regression (SVR) problem in the discrete counterpart of the Sobolev space W^m_{2,0}. The SVR problem can be solved by standard quadratic programming methods. This provides us with a sparse representation of u in terms of discrete fundamental splines. We formally extend the approach to two dimensions. Here further research is required to clarify the relation, e.g., to classical radial basis functions.

This paper is organized as follows: since discrete approaches can be best described in matrix–vector notation, the next section introduces the basic


difference operators as matrices. Section 3 shows that our minimization

problem (3) is equivalent to a spline contact problem. To this end, we have

to define discrete splines. Based on the dual formulation of our problem,

Section 4 treats the spline contact problem as support vector regression prob-

lem and presents some denoising results. Section 5 gives future prospects to

two-dimensional problems. The paper is concluded with Section 6.

2 Difference Matrices

The discrete setting can be best handled using matrix-vector notation. To

this end, we introduce the lower triangular n × n Toeplitz matrix

D_n := \begin{pmatrix} 1 & & & 0 \\ -1 & 1 & & \\ & \ddots & \ddots & \\ 0 & & -1 & 1 \end{pmatrix},

i.e., the lower bidiagonal Toeplitz matrix with entries 1 on the main diagonal and −1 on the first subdiagonal.

By straightforward computation we see that the inverse of D_n is the addition matrix

A_n := D_n^{−1} = \begin{pmatrix} 1 & & & 0 \\ 1 & 1 & & \\ \vdots & & \ddots & \\ 1 & 1 & \cdots & 1 \end{pmatrix}.    (4)

Remark 2.1 While application of D_n^m is a discrete version of m times differentiation, A_n^m realizes m-fold integration, i.e., A_n^m f is a discrete version of the m-th antiderivative of f. For example, the components of A_n^m f are given for m = 1, 2 by

m = 1:  f(1),  f(1) + f(2),  f(1) + f(2) + f(3),  ...,  f(1) + f(2) + ... + f(n),
m = 2:  f(1),  2f(1) + f(2),  3f(1) + 2f(2) + f(3),  ...,  nf(1) + (n−1)f(2) + ... + f(n),

and may be considered as discrete versions of A^1 f(x) = ∫_0^x f(t) dt and A^2 f(x) = ∫_0^x ∫_0^{t_1} f(t) dt dt_1, respectively. For general m, the j-th component of A_n^m f is

Σ_{k=1}^j (j+1−k)^{(m−1)}/(m−1)! · f(k).

Here k^{(m)} := 1 for m = 0 and k^{(m)} := k(k+1)···(k+m−1) for m ≥ 1 is a discrete equivalent of the m-th power function.
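These matrices are easy to experiment with. The following NumPy sketch (variable names are ours) builds D_n and A_n, checks that A_n = D_n^{−1}, and verifies that A_n^m acts as m-fold cumulative summation:

```python
import numpy as np

n, m = 6, 2
# D_n: lower bidiagonal Toeplitz matrix, 1 on the diagonal, -1 below it
D = np.eye(n) - np.eye(n, k=-1)
# A_n: lower triangular matrix of ones (the addition matrix)
A = np.tril(np.ones((n, n)))

assert np.allclose(A, np.linalg.inv(D))      # A_n = D_n^{-1}

f = np.arange(1.0, n + 1)
# A_n^m f = m-fold cumulative sum = discrete m-th antiderivative of f
Amf = np.linalg.matrix_power(A, m) @ f
iterated = f.copy()
for _ in range(m):
    iterated = np.cumsum(iterated)
assert np.allclose(Amf, iterated)
```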


Let 0_{n,m} denote the matrix consisting of n × m zeros, 1_{n,m} the matrix consisting of n × m ones and I_n the n × n identity matrix. Then the m-th forward difference (2) can be realized by applying the m-th forward difference matrix

D_{n,m} := (0_{n−m,m} | I_{n−m}) D_n^m

and our minimization problem (3) can be rewritten as

(1/2) ‖f − u‖_2^2 + λ ‖D_{n,m} u‖_1 → min.    (5)

The functional in (5) is strictly convex and therefore has a unique minimizer. The matrix D_{n,m} has full rank n − m, i.e., R(D_{n,m}) = R^{n−m}. Moreover, the range R(D^T_{n,m}) of D^T_{n,m} and the kernel N(D_{n,m}) of D_{n,m} are given by

R(D^T_{n,m}) = {f ∈ R^n : Σ_{j=1}^n j^r f(j) = 0, r = 0,...,m−1},
N(D_{n,m}) = span{(j^r)_{j=1}^n : r = 0,...,m−1} = Π_{m−1},

see, e.g., [7]. The space Π_m collects just the discrete polynomials of degree ≤ m. Then we have the orthogonal decomposition

R^n = R(D^T_{n,m}) ⊕ N(D_{n,m}).    (6)

Obviously, D_{n,m} is obtained by cutting off the first m rows of D_n^m. The following relations between D_n^m and D_{n,m} are proved in the appendix.

Proposition 2.2 The difference matrices fulfill the properties

i) D^T_{n,m} = (−1)^m D_n^m \begin{pmatrix} I_{n−m} \\ 0_{m,n−m} \end{pmatrix},

ii) D_{n,m} D_n^m = D_{n+m,2m} \begin{pmatrix} 0_{m,n} \\ I_n \end{pmatrix},

iii) D_{n+m,m} \begin{pmatrix} 0_{m,n} \\ I_n \end{pmatrix} = D_n^m.

Proof.

i) Since D_{n,m} f = (△^m f(1),...,△^m f(n−m))^T we can rewrite D_{n,m} as

D_{n,m} = D_{n−(m−1),1} · … · D_{n,1} = (0_{n−m,1}|I_{n−m}) D_{n−(m−1)} · … · (0_{n−1,1}|I_{n−1}) D_n.

Using that by definition

D^T_{n,1} = D_n^T \begin{pmatrix} 0_{1,n−1} \\ I_{n−1} \end{pmatrix} = −D_n \begin{pmatrix} I_{n−1} \\ 0_{1,n−1} \end{pmatrix},

we obtain for the transposed matrix

D^T_{n,m} = D^T_{n,1} · … · D^T_{n−(m−1),1} = (−1)^m D_n \begin{pmatrix} I_{n−1} \\ 0_{1,n−1} \end{pmatrix} · … · D_{n−(m−1)} \begin{pmatrix} I_{n−m} \\ 0_{1,n−m} \end{pmatrix}.

Multiplication by f^T from the left is again successive application of first order differences. Equivalently, we can apply m-th order finite differences and cut off all additional components, which results in assertion i).

ii) By definition of D_{n,m} we have

D_{n+m,2m} \begin{pmatrix} 0_{m,n} \\ I_n \end{pmatrix} = (0_{n−m,2m}|I_{n−m}) D^{2m}_{n+m} \begin{pmatrix} 0_{m,n} \\ I_n \end{pmatrix} = (0_{n−m,m}|I_{n−m}) (0_{n,m}|I_n) D^{2m}_{n+m} \begin{pmatrix} 0_{m,n} \\ I_n \end{pmatrix}.

Since cutting off the first m rows and columns of a Toeplitz matrix results in the same Toeplitz matrix with order reduced by m, the last equation can be rewritten as

D_{n+m,2m} \begin{pmatrix} 0_{m,n} \\ I_n \end{pmatrix} = (0_{n−m,m}|I_{n−m}) D^{2m}_n

and finally, by applying again the definition of D_{n,m}, as

D_{n+m,2m} \begin{pmatrix} 0_{m,n} \\ I_n \end{pmatrix} = D_{n,m} D^m_n.

iii) Using the definition of D_{n,m}, we obtain

D_{n+m,m} \begin{pmatrix} 0_{m,n} \\ I_n \end{pmatrix} = (0_{n,m}|I_n) D^m_{m+n} \begin{pmatrix} 0_{m,n} \\ I_n \end{pmatrix} = D^m_n.

This completes the proof. □
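The three identities can also be verified numerically; a small NumPy sketch (helper names are ours, the matrix definitions are those from Section 2):

```python
import numpy as np

def Dmat(n):                     # D_n: 1 on diagonal, -1 on subdiagonal
    return np.eye(n) - np.eye(n, k=-1)

def Dnm(n, m):                   # D_{n,m} = (0_{n-m,m} | I_{n-m}) D_n^m
    cut = np.hstack([np.zeros((n - m, m)), np.eye(n - m)])
    return cut @ np.linalg.matrix_power(Dmat(n), m)

n, m = 7, 2
Dm_n = np.linalg.matrix_power(Dmat(n), m)
top = np.vstack([np.eye(n - m), np.zeros((m, n - m))])   # (I_{n-m}; 0_{m,n-m})
pad = np.vstack([np.zeros((m, n)), np.eye(n)])           # (0_{m,n}; I_n)

assert np.allclose(Dnm(n, m).T, (-1) ** m * Dm_n @ top)          # i)
assert np.allclose(Dnm(n, m) @ Dm_n, Dnm(n + m, 2 * m) @ pad)    # ii)
assert np.allclose(Dnm(n + m, m) @ pad, Dm_n)                    # iii)
```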

3 Spline Contact Problem

In this section, we will see that our higher order TV problem (5) is equivalent to a discrete spline interpolation problem, where the spline knots are not known in advance but depend on the input data f and λ. For m = 1, the resulting spline contact problem is well examined and can be solved by the so-called taut string algorithm, see, e.g., [10].


A necessary and sufficient condition for u to be the minimizer of (5) is that the zero vector is an element of the functional's subgradient

0_{n,1} ∈ u − f + λ ∂‖D_{n,m}u‖_1.

By [22, Theorem 23.9] and since the subgradient of |x| is given by

x/|x| := \begin{cases} 1 & \text{if } x > 0, \\ −1 & \text{if } x < 0, \\ [−1,1] & \text{if } x = 0, \end{cases}

this can be rewritten as

u ∈ f − λ D^T_{n,m} \frac{D_{n,m}u}{|D_{n,m}u|},

where ·/| · | is taken componentwise. These inclusions in their present form

are not very convenient for the computation of u. However, multiplying with A^m_n and applying Proposition 2.2 i) leads to

A^m_n u ∈ A^m_n f − (−1)^m λ \begin{pmatrix} I_{n−m} \\ 0_{m,n−m} \end{pmatrix} \frac{D_{n,m}u}{|D_{n,m}u|}.

Setting

\begin{pmatrix} F_I \\ F_R \end{pmatrix} := A^m_n f,   \begin{pmatrix} U_I \\ U_R \end{pmatrix} := A^m_n u    (7)

with the splitting into the inner vector F_I ∈ R^{n−m} and the right boundary vector F_R ∈ R^m, the inclusions can be rewritten as

U_I ∈ F_I − (−1)^m λ \frac{D_{n,m}u}{|D_{n,m}u|},   U_R = F_R.

It remains to replace D_{n,m}u. By (7) and (4), we see that

f = D^m_n \begin{pmatrix} F_I \\ F_R \end{pmatrix},   u = D^m_n \begin{pmatrix} U_I \\ U_R \end{pmatrix}    (8)

and further by Proposition 2.2 ii) that

D_{n,m}u = D_{n+m,2m} \begin{pmatrix} 0_{m,n} \\ I_n \end{pmatrix} \begin{pmatrix} U_I \\ U_R \end{pmatrix}.

Introducing an artificial left boundary U_L := 0_{m,1} and extending our vector by

U := (U_L^T, U_I^T, U_R^T)^T,

our inclusions finally become

U_I ∈ F_I − (−1)^m λ \frac{D_{n+m,2m}U}{|D_{n+m,2m}U|},   U_R = F_R.

Consequently, U is the unique solution of the following spline contact problem, where we have to explain the spline notation later.

Spline Contact Problem

(C1) Boundary conditions: U_L = 0_{m,1} and U_R = F_R.

(C2) Tube condition: ‖F_I − U_I‖_∞ ≤ λ, i.e., U_I lies in a tube around F_I of width 2λ.

(C3) Contact condition: Let Λ_I := {j ∈ {m+1,...,n−m} : △^{2m}U(j−m) ≠ 0}. If j ∈ Λ_I, then U(j) contacts the boundary of the tube, where

(−1)^m △^{2m}U(j−m) > 0 ⟹ U(j) = F(j) − λ (lower contact),
(−1)^m △^{2m}U(j−m) < 0 ⟹ U(j) = F(j) + λ (upper contact).

Remark 3.1 (Continuous and Discrete Natural Splines)
We recall that a natural polynomial spline of degree 2m−1 with knots x_1 < ... < x_r is a function s ∈ C^{2m−2} such that

s^{(2m)}(x) = 0 for x ∈ (x_j, x_{j+1}), j = 1,...,r−1,
s^{(m)}(x) = 0 for x < x_1 and x > x_r.

These splines are the solutions in W^m, the Sobolev space of (m−1) times continuously differentiable functions with m-th weak derivative in L_2, of

(1/2)‖f^{(m)}‖_2^2 → min   s.t.   f(x_j) = γ_j, j = 1,...,r.

Mangasarian and Schumaker [17, 18] have introduced the discrete natural polynomial spline of degree 2m−1 with knots Ξ = {i_1,...,i_r}, i_j < i_k for j < k, as a vector s = (s(1),...,s(N))^T which satisfies for j ∉ Ξ the relations

△^{2m}s(j−m) = 0, j = m+1,...,N−m;
△^m s(j) = 0, j = 1,...,i_1−1; i_r+1,...,N−m.

As its continuous analogue, the discrete natural polynomial spline of degree 2m−1 solves the minimization problem

(1/2) Σ_{j=1}^{N−m} (△^m y(j))^2 → min   s.t.   y(i_j) = γ_j, j = 1,...,r.    (9)

For relations between continuous and discrete natural splines in the limiting process N → ∞ see also [17, 18].

Setting N := n+m and using the spline knots Ξ = {1,...,m} ∪ Λ_I ∪ {n−m+1,...,n}, we can interpret U defined by (C1)–(C3) as a discrete natural polynomial spline of degree 2m−1. In contrast to (9), the inner spline knots Λ_I are determined only by (C3) and are not known in advance. This reflects the nonlinear character of our problem solution.

We extend the discrete spline concept to splines of even degree as follows: we call s = (s(1),...,s(n))^T a discrete spline of degree m−1 with inner knots Ξ = {i_1,...,i_r} ⊆ {⌊m/2⌋+1,...,n−⌊(m+1)/2⌋} if

△^m s(j − ⌊m/2⌋) = 0,   j = ⌊m/2⌋+1,...,n−⌊(m+1)/2⌋;  j ∉ Ξ.

Then the discrete interpolation problem

s(i_j) = γ_j,   i_j ∈ Ξ ∪ {1,...,⌊m/2⌋} ∪ {n−⌊(m+1)/2⌋+1,...,n}

has a unique solution. Thus, for given spline knots Λ_I, we could solve a spline interpolation problem. Unfortunately, the spline knots depend on the input data f and λ. Therefore, the solution of the spline contact problem in its present form is only convenient for m = 1, see Remark 3.2. For larger m and the continuous setting, an attempt to solve the contact problem is contained in [16]. For our discrete setting, we will see in the following section that the contact problem can be treated by simply solving a constrained quadratic minimization problem.

Remark 3.2 (Taut String Algorithm for m = 1)
For m = 1, condition (C3) means that the polygon through U is convex at upper contact points and concave at lower contact points. Thus, the construction of U satisfying (C1)–(C3) is equivalent to the construction of the uniquely determined taut string within the tube around F of width 2λ fixed at (0,0) and (n,F(n)). In other words, the polygon through U has minimal length within the tube, i.e., it minimizes

Σ_{j=0}^{n−1} (1 + (U(j+1) − U(j))^2)^{1/2}

subject to the tube and boundary conditions. An example of a taut string is shown in Figure 1. For solving this problem there exists a very efficient algorithm of complexity O(n), the so-called taut string algorithm, which is based on a convex hull algorithm, see, e.g., [6, 16].

Figure 1: Solution of the spline contact problem (C1)–(C3) for a signal F of length n+m with n = 40 and m = 1 (top: U; bottom: sgn(△^2 U)).

Interestingly, it was shown in [27, 33] that for m = 1 the spline knots fulfill a so-called tree property.

Remark 3.3 (Tree Property of Spline Knots for m = 1)
Let λ_max be the smallest regularization parameter such that Λ_I = ∅. It is not hard to show that λ_max = ‖Pf‖_{W^1(D_{n,1})′}, where P denotes the orthogonal projection of f onto R(D^T_{n,1}) and W^1(D_{n,1})′ is the dual space of W^1(D_{n,1}) := R(D^T_{n,1}) equipped with the norm ‖u‖_{W^1(D_{n,1})} := ‖D_{n,1}u‖_1. If λ moves from λ_max to 0 and Λ_I(λ) denotes the corresponding set of inner spline knots, then, for λ_j > λ_k,

∅ = Λ_I(λ_max) ⊆ Λ_I(λ_j) ⊆ Λ_I(λ_k) ⊆ Λ_I(0) = {m+1,...,n−m}.

Figure 2 shows a tree of inner spline knots. The tree property does not hold for m ≥ 2.

4 Support Vector Regression with Spline Kernels

In this section we want to show the relation of the discrete spline contact

problem with discrete SVR. We start by a brief introduction to SVR in the

continuous setting, where we emphasize the role of splines in the solution of

the SVR problem in Sobolev spaces. Then we switch to the discrete context

to explain the solution of (5) from the SVR point of view.

Figure 2: Original signal f (left), tree of spline knots with increasing regularization parameter λ from leaves to root (right).

4.1 Support Vector Regression - Continuous Approach

The SVR method searches for approximations of functions in reproducing kernel Hilbert spaces (RKHS) and plays an important role, e.g., in Learning Theory. Among the large amount of literature on SVR we refer to [29, Chapter 11]. SVR can be briefly explained as follows: Let H ⊂ L_2(R^d) be a Hilbert space with inner product (·,·)_H having the property that the point evaluation functional is continuous. Then H possesses a so-called reproducing kernel K ∈ L_2(R^d × R^d) with reproducing property (F, K(·,x_j))_H = F(x_j) for all F ∈ H and is called a reproducing kernel Hilbert space (RKHS). Given some function values F(x_j), j = 1,...,p, the soft margin SVR problem consists in finding a function U ∈ H which minimizes

μ Σ_{j=1}^p V_λ(F(x_j) − U(x_j)) + (1/2)‖U‖_H^2,

where V_λ(x) := max{0, |x| − λ} denotes Vapnik's λ-insensitive loss function. In other words, Vapnik's cost functional penalizes those U(x_j) not lying in a λ-neighbourhood of F(x_j). If μ tends to infinity, then the cost functional must become zero and we obtain the hard margin SVR problem

(1/2)‖U‖_H^2 → min   s.t.   |F(x_j) − U(x_j)| ≤ λ,  j = 1,...,p.    (10)

By the Representer Theorem of Kimeldorf and Wahba [14], the solution of (10) has the form

U(x) = Σ_{k=1}^p c(k) K(x_k, x),

i.e., only the given knots x_k are involved in the representation. Then (10) can be rewritten as

(1/2) c^T K c → min   s.t.   ‖F − Kc‖_∞ ≤ λ    (11)

with F := (F(x_j))_{j=1}^p, c := (c(k))_{k=1}^p and K := (K(x_j,x_k))_{j,k=1}^p. This is the usual hard margin SVR formulation.

Based on the Karush–Kuhn–Tucker conditions it follows that c(k) ≠ 0 implies |F(x_k) − U(x_k)| = λ. Let

Λ := {k ∈ {1,...,p} : c(k) ≠ 0}.

Then the solution U can be rewritten as

U(x) = Σ_{k∈Λ} c(k) K(x_k, x).    (12)

The functions K(x_k, x) with k ∈ Λ are called support vectors. Obviously, U depends only on these support vectors and has a sparse representation in terms of the support vectors if |Λ| is small compared to p. In the image processing context, SVR is mainly applied in high dimensional function spaces (d ≫ 1), where often the Gaussian is involved as reproducing kernel.

For our purposes we will consider other well-known reproducing kernel Hilbert spaces, namely the Sobolev spaces H = W^m_{2,0} of real-valued functions having a weak m-th derivative in L_2[0,1] and fulfilling F^{(r)}(0) = 0 for r = 0,...,m−1, with inner product

⟨F,G⟩_{W^m_{2,0}} := ∫_0^1 F^{(m)}(x) G^{(m)}(x) dx.

These RKHS were for example considered in [31, p. 5–14]. The reproducing kernel in W^m_{2,0} is

K(x,y) := ∫_0^1 (x−t)_+^{m−1} (y−t)_+^{m−1} / ((m−1)!)^2 dt,    (13)

where (x)_+ := max{0,x}. For fixed y, the functions K(·,y) are splines fulfilling K(·,y) ∈ C^{2m−2}, K(·,y) ∈ Π_{2m−1} in [0,y] and K(·,y) ∈ Π_{m−1} in [y,1].

In this context we mention that another minimization problem having so-called smoothing splines as solutions was considered in the literature, see, e.g., [31, 28]: find U ∈ W^m_{2,0} such that

(1/2) Σ_{j=1}^p (F(x_j) − U(x_j))^2 + λ‖U‖^2_{W^m_{2,0}} → min.

Again by the Representer Theorem, this problem has a solution of the form U = Σ_{k=1}^p c(k) K(·,x_k). Consequently, U is a continuous spline of degree 2m−1 with knots x_k, k = 1,...,p. However, in contrast to the solution (12) of (10), all coefficients c(k) are in general ≠ 0 and we obtain no sparse representation.

4.2 Support Vector Regression - Discrete Approach

To see the relation between our spline contact problem and SVR methods,

we consider the dual formulation of problem (5).

Proposition 4.1 The solution u of (5) is given by u = f − D^T_{n,m}V_I, where V_I is the unique solution of the minimization problem

(1/2)‖f − D^T_{n,m}V_I‖_2^2 → min   s.t.   ‖V_I‖_∞ ≤ λ.    (14)

For a proof see, e.g., [25].

By (8) and Proposition 2.2 i) and iii) we obtain that

‖f − D^T_{n,m}V_I‖_2 = ‖D^m_n \begin{pmatrix} F_I \\ F_R \end{pmatrix} − (−1)^m D^m_n \begin{pmatrix} I_{n−m} \\ 0_{m,n−m} \end{pmatrix} V_I‖_2 = ‖D_{n+m,m}(F − (−1)^m V)‖_2,

where V := (0^T_{m,1}, V_I^T, 0^T_{m,1})^T. Setting U := F − (−1)^m V, problem (14) can be rewritten as

(1/2)‖D_{n+m,m}U‖_2^2 → min   s.t.   ‖F_I − U_I‖_∞ ≤ λ,  U_R = F_R.    (15)

The unique solution U of this problem, which can be computed by standard quadratic programming (QP) methods, is also the unique solution of our spline contact problem. Figure 3 illustrates the solution for m = 3.

Remark 4.2 Regarding Remark 3.2, we see that for m = 1 the minimization problems

Σ_{j=1}^n (1 + (U(j+1) − U(j))^2)^{1/2} → min

and

‖D_{n+1,1}U‖_2^2 = Σ_{j=1}^n (U(j+1) − U(j))^2 → min,

subject to the tube and boundary constraints, lead to the same solution.
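Since the constraints in (15) are plain bounds on the inner vector U_I (the boundary entries of U are fixed), the problem is a bound-constrained least squares problem. The following Python sketch solves it with SciPy's generic bounded solver; this is our own illustration with our own names, not the QP routine used in the paper:

```python
import numpy as np
from scipy.optimize import lsq_linear

def solve_tube_problem(f, m, lam):
    """Sketch of (15): minimize 0.5*||D_{n+m,m} U||_2^2 over U = (0_m, U_I, F_R)^T
    subject to the tube condition |F_I - U_I| <= lam, then recover u via (8)."""
    n = len(f)
    A = np.tril(np.ones((n, n)))                 # addition matrix A_n
    F = np.linalg.matrix_power(A, m) @ f         # m-th discrete antiderivative A_n^m f
    FI, FR = F[:n - m], F[n - m:]
    # D_{n+m,m}: rows are the m-th forward differences of a length-(n+m) vector
    Dfull = np.eye(n + m)
    for _ in range(m):
        Dfull = Dfull[1:] - Dfull[:-1]
    U_fixed = np.concatenate([np.zeros(n), FR])  # U with U_L = 0 and U_I = 0
    DI = Dfull[:, m:n]                           # columns acting on U_I
    b = Dfull @ U_fixed                          # contribution of the fixed entries
    UI = lsq_linear(DI, -b, bounds=(FI - lam, FI + lam)).x
    D = np.linalg.inv(A)                         # difference matrix D_n
    return np.linalg.matrix_power(D, m) @ np.concatenate([UI, FR])   # u by (8)
```

For λ → 0 the tube collapses and u returns f; for large λ the solution approaches a discrete spline without inner knots.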


Figure 3: Solution of the spline contact problem (C1)–(C3) for a signal F of length n+m with n = 40 and m = 3 (top: U; bottom: sgn(△^6 U)).

We will see that problem (15) can be considered as a hard margin SVR problem. To this end, we only have to define the appropriate RKHS. Let W^m_{2,0} := {F ∈ R^{n+m} : F(j) = 0, j = 1,...,m} be equipped with the inner product

⟨F,G⟩_{W^m_{2,0}} := Σ_{j=1}^n △^m F(j) △^m G(j) = ⟨D_{n+m,m}F, D_{n+m,m}G⟩ = ⟨D^m_n \begin{pmatrix} F_I \\ F_R \end{pmatrix}, D^m_n \begin{pmatrix} G_I \\ G_R \end{pmatrix}⟩.

Then the minimization term in (15) is just the norm of U in W^m_{2,0}. Now we can straightforwardly determine the reproducing kernel in W^m_{2,0}. Setting

K := ((D^m_n)^T D^m_n)^{−1} = A^m_n (A^m_n)^T,    (16)

we see that the columns K_{0,k} of

K_0 := (0_{n,m} | K)^T ∈ R^{n+m,n}

form a special basis of W^m_{2,0}, namely with reproducing property ⟨F, K_{0,j}⟩_{W^m_{2,0}} = F(j+m). Let us have a closer look at the structure of K. Straightforward computation shows that the components of our discrete kernel are given by the discrete counterpart of (13), namely

K(j,k) = Σ_{r=0}^{min(j,k)−1} (j−r)^{(m−1)} (k−r)^{(m−1)} / ((m−1)!)^2,

Figure 4: Discrete splines K_{0,k}, k = 1, 5, 10, 20, for n = 32 and m = 1 (left), m = 2 (right).

with k^{(m)} defined as in Remark 2.1. By Proposition 2.2 ii) and i) we obtain that

D_{n+m,2m} K_0 = D_{n+m,2m} \begin{pmatrix} 0_{m,n} \\ I_n \end{pmatrix} A^m_n (A^m_n)^T = D_{n,m} D^m_n A^m_n (A^m_n)^T = (−1)^m (I_{n−m}, 0_{n−m,m}).

In other words, for j = m+1,...,n we have that

△^{2m}K_{0,k}(j−m) = 0,   k = 1,...,n−m, j ≠ k+m,
△^{2m}K_{0,k}(k) = (−1)^m,   k = 1,...,n−m,
△^{2m}K_{0,k}(j−m) = 0,   k = n−m+1,...,n,    (17)

i.e., K_{0,k} is a discrete spline of degree 2m−1 with one inner knot k+m for k = 1,...,n−m and a discrete polynomial in Π_{2m−1} for k = n−m+1,...,n. For n = 32 and m = 1,2, some columns of K_0 are depicted in Figure 4.
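The identity between (16) and the closed-form sum above can be checked numerically (a NumPy sketch; `rising` implements the factorial power k^{(m)} of Remark 2.1, and all names are ours):

```python
import numpy as np
from math import factorial

def rising(k, m):                       # k^{(m)} = k(k+1)...(k+m-1), with k^{(0)} = 1
    out = 1
    for i in range(m):
        out *= k + i
    return out

n, m = 8, 2
A = np.tril(np.ones((n, n)))
Am = np.linalg.matrix_power(A, m)
K = Am @ Am.T                           # kernel (16): A_n^m (A_n^m)^T

# closed form: K(j,k) = sum_{r=0}^{min(j,k)-1} (j-r)^{(m-1)}(k-r)^{(m-1)} / ((m-1)!)^2
K2 = np.empty((n, n))
for j in range(1, n + 1):
    for k in range(1, n + 1):
        s = sum(rising(j - r, m - 1) * rising(k - r, m - 1) for r in range(min(j, k)))
        K2[j - 1, k - 1] = s / factorial(m - 1) ** 2

assert np.allclose(K, K2)
```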

For every U ∈ W^m_{2,0}, there exists a uniquely determined c ∈ R^n such that U = K_0 c, and by the reproducing property of K_0, problem (15) can be rewritten as

(1/2) c^T K c → min   s.t.   ‖F_I − (Kc)_I‖_∞ ≤ λ,  (Kc)_R = F_R.    (18)

This is the usual form (11) of a hard margin SVR problem. Let c be the solution of (18) and let

Λ̃_I := {j ∈ {m+1,...,n} : c(j−m) ≠ 0},

so that

U = Σ_{j∈Λ̃_I} c(j−m) K_{0,j−m} + Σ_{j=n−m+1}^n c(j) K_{0,j}.    (19)

The vectors K_{0,j−m}, j ∈ Λ̃_I, are called (inner) support vectors. By (19) and property (17) of K_0 they are related to the spline knots as follows:

Figure 5: Discrete splines (A^m_n)^T_k, k = 1, 5, 10, 20, for n = 32 and m = 1 (left), m = 2 (right). For m = 1, we have added 0.1, 0.2 and 0.3 to the last columns to better visualize the discrete step functions.

Proposition 4.3 The support vector indices Λ̃_I of the solution U in (19) of the SVR problem are exactly the spline knots Λ_I, i.e.,

△^{2m}U(j−m) ≠ 0  ⟺  j ∈ Λ̃_I.

If the number of contact points |Λ_I| is small compared to n, then c has only a small number of nonzero coefficients and (19) provides us with a sparse representation of U. This can also be seen by noting that our SVR problem (18) means to find U = K_0 c such that the equality constraints are fulfilled and

(1/2)‖F − U‖^2_{W^m_{2,0}} + λ‖c‖_1 → min.

Compare with [9] in a general SVR context. In contrast to the 2-norm, the 1-norm of c in the penalty term implies for sufficiently large λ that some of the coefficients c(j) are 0. This implies a sparse representation of U from another point of view.

Finally, we see by (16) and (8) that

u = D^m_n A^m_n (A^m_n)^T c = (A^m_n)^T c    (20)

is the corresponding sparse representation of our original solution u. By Proposition 2.2 i) we have that D_{n,m}(A^m_n)^T = (−1)^m (I_{n−m} | 0_{n−m,m}), so that the first n−m columns of (A^m_n)^T are splines of degree m−1 with one inner knot and the last m columns are polynomials in Π_{m−1}. For m = 1 and 2 some columns of (A^m_n)^T are illustrated in Figure 5. In the context of sparse representation, the following observation is interesting: by (20), (8) and Proposition 2.2 i) and iii), our original problem (5) can be rewritten as

(1/2)‖f − (A^m_n)^T c‖_2^2 + λ‖(I_{n−m} | 0_{n−m,m}) c‖_1 → min.    (21)
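Formulation (21) is an l_1-penalized least squares problem, so besides QP one can also apply standard sparse solvers. A minimal sketch using proximal gradient descent (ISTA) — our own choice of algorithm and names, not the paper's method:

```python
import numpy as np

def ista_spline(f, m, lam, iters=5000):
    """Minimize (21): 0.5*||f - (A_n^m)^T c||_2^2 + lam*||(I_{n-m}|0) c||_1
    by ISTA (proximal gradient); only the first n-m coefficients are penalized."""
    n = len(f)
    A = np.tril(np.ones((n, n)))
    B = np.linalg.matrix_power(A, m).T          # B = (A_n^m)^T
    t = 1.0 / np.linalg.norm(B, 2) ** 2         # step size 1/L, L = ||B||^2
    c = np.zeros(n)
    for _ in range(iters):
        c = c - t * (B.T @ (B @ c - f))         # gradient step on the data term
        head = c[:n - m]                        # penalized coefficients
        c[:n - m] = np.sign(head) * np.maximum(np.abs(head) - t * lam, 0.0)
    return c
```

For λ = 0 the iteration reduces to plain gradient descent and recovers c with (A^m_n)^T c = f; growing λ drives the penalized coefficients to exact zeros, which is the sparsity mechanism described above.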


Remark 4.4 Finally, let us mention that a continuous version of our considerations reads as follows: For a function u := Φ_u^{(2m)} we have that Φ_u = k ∗ u, where k is the causal fundamental solution of the 2m-th derivative operator, i.e., the spline k(x) = x_+^{2m−1}. If u plays the discrete role of u, then our discrete function (U_I^T, U_R^T)^T = A^m_n u = K D^m_n u plays the role of U := Φ_u^{(m)} = k ∗ u^{(m)}.

4.3 Denoising Example

In this section, we show the performance of our approach (5) and (15) by a denoising example. We are mainly interested in the behaviour for various differentiation orders m. Our aim is to demonstrate the spline interpolation with variable knots for various m and not to create an optimal denoising method. To this end, we have used the signal shown in Figure 6 (top, left) and have added white Gaussian noise. First, we have determined the optimal parameters λ with respect to the maximal signal-to-noise ratio (SNR) defined by SNR(g,u) := 10 log_{10}(‖g‖_2^2 / ‖g − u‖_2^2) with original signal g. For the solution of the quadratic problem (15) we have applied the Matlab quadratic programming routine, which is based on an active set method. Then we compared the quality of the results obtained for various m. The following table contains the results for λ, the SNR and the peak signal-to-noise ratio (PSNR) defined by PSNR(g,u) := 10 log_{10}(n ‖g‖_∞^2 / ‖g − u‖_2^2), where n denotes the number of pixels. The noisy signal in Figure 6 (top, right) has SNR 6.94 and PSNR 10.72.

m        λ       SNR     PSNR
1      20.2     16.00    19.78
2      57.8     18.41    22.18
3     275.0     17.97    21.69
4    1453.1     17.22    20.99
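The two quality measures follow directly from the definitions above (a small NumPy sketch; function names are ours):

```python
import numpy as np

def snr(g, u):
    """SNR(g,u) = 10 log10(||g||_2^2 / ||g-u||_2^2), g the original signal."""
    return 10 * np.log10(np.sum(g ** 2) / np.sum((g - u) ** 2))

def psnr(g, u):
    """PSNR(g,u) = 10 log10(n ||g||_inf^2 / ||g-u||_2^2), n the number of pixels."""
    n = g.size
    return 10 * np.log10(n * np.max(np.abs(g)) ** 2 / np.sum((g - u) ** 2))
```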

The corresponding signal plots are given in Figure 6. For this signal the methods with orders m ≥ 2 perform better than the usual method with m = 1, where the piecewise linear approximation (m = 2) achieves the best restoration. In general, higher order methods with l_1 regularization term avoid the staircasing effect appearing in the piecewise constant approximation with m = 1 and, on the other hand, preserve local singularities better than linear methods with quadratic regularization term. Various other examples for the denoising of signals by solving (5) were presented in [26].

Figure 6: Denoising results with (5). Top left: original signal. Top right: noisy signal. Middle left: denoised signal for m = 1. Middle right: denoised signal for m = 2. Bottom left: denoised signal for m = 3. Bottom right: denoised signal for m = 4.

5 Generalization to Two Dimensions

In this section, we briefly consider a possible generalization of our concept to two dimensions. This may be considered as a starting point for future research.

Concerning first order derivatives, we consider the ROF model

(1/2) ∫_Ω (u(x) − f(x))^2 + λ|∇u| dx → min    (22)

and the model

(1/2) ∫_Ω (u(x) − f(x))^2 + λ(|u_x| + |u_y|) dx → min    (23)

treated, e.g., in [12]. Of course, the second model is not rotationally invariant.

In the following, we restrict our attention for simplicity to square n × n images and reshape them columnwise into a vector of length N = n^2. We discretize the first order derivatives as proposed by Chambolle in [1]. To this end, we introduce the gradient matrix

D := \begin{pmatrix} I_n ⊗ D^0_n \\ D^0_n ⊗ I_n \end{pmatrix} ∈ R^{2N,N}   with   D^0_n := \begin{pmatrix} D_{n,1} \\ 0_{1,n} \end{pmatrix}

and the Kronecker product ⊗. The matrix D has rank N−1 and D^T plays the role of −div = ∇^*. Further, we have that △_N := D^T D is the finite difference discretization of the Laplace operator with the five point scheme and Neumann boundary conditions and that

R(D^T) = R(△_N) = {f ∈ R^N : Σ_{j=1}^N f(j) = 0},    (24)
N(D) = N(△_N) = {μ 1_{N,1} : μ ∈ R} = Π_0.
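The gradient matrix and the stated properties can be reproduced with a few lines of NumPy (variable names are ours):

```python
import numpy as np

n = 8
N = n * n
Dn1 = np.diff(np.eye(n), axis=0)              # D_{n,1}: first forward differences
D0 = np.vstack([Dn1, np.zeros((1, n))])       # D_n^0: D_{n,1} with a zero row appended
D = np.vstack([np.kron(np.eye(n), D0),        # derivatives in one coordinate direction
               np.kron(D0, np.eye(n))])       # and in the other
assert D.shape == (2 * N, N)
assert np.linalg.matrix_rank(D) == N - 1      # kernel = constant images
L = D.T @ D                                   # five-point Laplacian, Neumann b.c.
assert np.allclose(L @ np.ones(N), 0)         # constants lie in the kernel
```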

Finally, the discrete version of |∇u| = (u_x^2 + u_y^2)^{1/2} reads |Du|, where

|\begin{pmatrix} F_1 \\ F_2 \end{pmatrix}| := ((F_1)^2 + (F_2)^2)^{1/2} = (F_1 ∘ F_1 + F_2 ∘ F_2)^{1/2} ∈ R^N

and ∘ denotes the componentwise vector product. Now we can discretize (22) and (23) by

(1/2)‖f − u‖_2^2 + λ‖ |Du| ‖_1 → min    (25)

and

(1/2)‖f − u‖_2^2 + λ‖Du‖_1 → min,    (26)

respectively. Then, by the dual approach, see, e.g., [1, 25], we obtain that u = f − D^T V, where V is the solution of

(1/2)‖f − D^T V‖_2^2 → min   s.t.   ‖ |V| ‖_∞ ≤ λ in case (25),    (27)
                                        ‖V‖_∞ ≤ λ in case (26).    (28)


The first minimization problem can be solved, for example, by using Chambolle's semi-implicit gradient descent algorithm [1], while the second problem can be solved by standard QP methods. An example for the solution of both problems is presented at the bottom of Figure 8. Due to the absence of rotational invariance, the solution of the second problem shows harder segmentation effects in the x and y directions.

In the following, we assume that f ∈ R(D^T), i.e., f = D^T F for some F ∈ R^{2N}. Otherwise we consider f − mean(f) 1_{N,1}. Then, since Du = Du_R and

(1/2)‖f − u‖_2^2 = (1/2)‖f − u_R‖_2^2 + (1/2)‖u_N‖_2^2,

where u_R is the orthogonal projection of u onto R(D^T) and u_N the orthogonal projection onto N(D), it follows that the minimizer u of (25) and (26) also lies in R(D^T). Now U = F − V solves the problem

(1/2)‖D^T U‖_2^2 → min   s.t.   ‖ |F − U| ‖_∞ ≤ λ in case (25),   ‖F − U‖_∞ ≤ λ in case (26).

With respect to Remark 3.3 we note that the discrete G-norm, defined for v ∈ R(D^T) by ‖v‖_G := inf_{v = D^T V} ‖ |V| ‖_∞, plays the role of the W^1(D_{n,1})′ norm.

For higher order derivatives, even the choice of an appropriate discretization which preserves the basic integral identities satisfied by the continuous differential operators is a nontrivial question, see, e.g., [13]. However, operators of higher order were considered in image processing, e.g., in [5, 2, 11, 15, 20, 24, 34, 25]. Here we restrict our attention to

(1/2) ∫_Ω (u(x) − f(x))^2 + λ|△u| dx → min.

As discretization we choose

(1/2)‖f − u‖_2^2 + λ‖△_D u‖_1 → min,    (29)

where △_D denotes the finite difference discretization of the Laplace operator with the five point scheme and Dirichlet boundary conditions. Then △_D is invertible. The dual approach to (29) leads with f = △_D F and u = △_D U to the contact problem

(1/2)‖△_D U‖_2^2 → min   s.t.   ‖F − U‖_∞ ≤ λ,    (30)

which can be solved by standard QP methods. An example for the solution of this problem is shown at the top of Figure 8. The solution contains


Figure 7: Column 528 of △_D^{−2} (left) and of △_D^{−1} (right) for n = 32.

some artefacts in the form of white points which were also mentioned in [34]. Therefore the approach (29) seems not to be suited for applications in image processing. Obviously, △_D^{−2} is a reproducing kernel in R^N equipped with the norm given by the minimization term, and U = △_D^{−2} c and u = △_D^{−1} c are in general sparse representations. The images corresponding to a central row of △_D^{−2} and △_D^{−1} are depicted in Figure 7.

With respect to the kernel △_D^{−2}, let us finally note the following remark.

Remark 5.1 (Thin Plate Splines)
The so–called thin plate spline [8] K(x) := (1/(8π)) |x|² ln|x| is the
fundamental solution of the biharmonic operator △². For appropriately chosen
x_j the solution of

(1/2) Σ_{j=1}^N (f(x_j) − u(x_j))² + λ ∫_Ω u_xx² + 2u_xy² + u_yy² dx → min

has the form u(x) = Σ_{j=1}^N c_j K(x − x_j) + a₀ + a₁x + a₂y.
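The form stated in Remark 5.1 can be sketched directly. The following example (an assumption-laden illustration, not the paper's computation: it takes the interpolation limit λ → 0 and synthetic data sites) assembles the classical linear system with the polynomial side conditions and evaluates the resulting spline.

```python
import numpy as np

# Sketch of the thin plate spline form u(x) = sum_j c_j K(x - x_j)
# + a0 + a1 x + a2 y with K(x) = (1/(8 pi)) |x|^2 ln|x|.
# Interpolation limit (lambda -> 0); data sites are synthetic.

def K(r):
    """Thin plate spline kernel; K(0) = 0 by continuity."""
    with np.errstate(divide="ignore", invalid="ignore"):
        v = (r ** 2) * np.log(r) / (8.0 * np.pi)
    return np.where(r > 0, v, 0.0)

rng = np.random.default_rng(2)
pts = rng.random((10, 2))                            # sample sites x_j
f = np.sin(2 * np.pi * pts[:, 0]) * pts[:, 1]        # synthetic values

r = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
A = K(r)
P = np.column_stack([np.ones(len(pts)), pts])        # polynomial part [1, x, y]

# Classical interpolation system with polynomial side conditions P^T c = 0.
M = np.block([[A, P], [P.T, np.zeros((3, 3))]])
rhs = np.concatenate([f, np.zeros(3)])
sol = np.linalg.solve(M, rhs)
c, a = sol[:len(pts)], sol[len(pts):]

def u(x):
    """Evaluate the fitted thin plate spline at a 2D point x."""
    return K(np.linalg.norm(x - pts, axis=-1)) @ c + a[0] + a[1] * x[0] + a[2] * x[1]

# The spline reproduces the data at the sites x_j.
print(max(abs(u(p) - fj) for p, fj in zip(pts, f)))
```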



Figure 8: Top: Original 256×256 image (left). Solution of (30) (right). The
image involves artefacts (white points). Bottom: Solution of (27) (left).
Solution of (28) (right). The right-hand image shows a stronger segmentation
in x and y direction. All problems were solved with λ = 10. For problem
(27) we have used the semi–implicit gradient descent algorithm [1]. Problems
(30) and (28) were computed by the ILOG CPLEX Barrier Optimizer version 7.5.
This routine uses a modification of the primal–dual predictor–corrector
interior point algorithm described in [19].
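The paper solves (30) with a commercial interior point code, but the structure of the constraint makes a much simpler sketch possible: ‖F − U‖_∞ ≤ λ is just the elementwise box F − λ ≤ U ≤ F + λ, so any bound-constrained least-squares solver applies. The following illustration (grid size, data, and solver choice are assumptions, not the paper's setup) uses SciPy.

```python
import numpy as np
from scipy.optimize import lsq_linear

# Sketch of the contact problem (30):
#   minimize (1/2) ||Lap_D U||_2^2   s.t.  ||F - U||_inf <= lam,
# where the sup-norm constraint is the elementwise box
# F - lam <= U <= F + lam.

def laplacian_2d(n):
    """Five-point Laplacian with Dirichlet boundary conditions (dense)."""
    T = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
    return np.kron(np.eye(n), T) + np.kron(T, np.eye(n))

n, lam = 8, 0.5
L = laplacian_2d(n)                   # invertible under Dirichlet BC
rng = np.random.default_rng(1)
F = rng.standard_normal(n * n)        # illustrative data

# minimize (1/2)||L U - 0||^2 within the tube of width 2*lam around F
res = lsq_linear(L, np.zeros(n * n), bounds=(F - lam, F + lam))
U = res.x
u = L @ U                             # primal solution u = Lap_D U

print(np.max(np.abs(F - U)))          # stays within lam by construction
```

The bounded solution U is the discrete analogue of the contact (tube) formulation: wherever the constraint is active, U touches the tube boundary around F.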


6 Conclusions

We have shown the equivalence of the following problems in a discrete 1D

setting:

i) minimization of a functional with quadratic data term and TV regularization term with higher order derivatives,

ii) spline interpolation with variable knots depending on the input data

and the regularization parameter,

iii) hard margin SVR in the discrete counterpart of the Sobolev space W^m_{2,0},

iv) sparse representation in terms of fundamental splines with penalization of the l₁ norm of the coefficients.

Based on (6), a slightly different approach which handles the boundary conditions in advance (as done in 2D) is possible. Moreover, more general spline concepts such as exponential splines, see, e.g., [28], and other data terms, incorporating only few knots or related to noise other than Gaussian white noise, can be considered in a similar way. Finally, the 2D setting deserves further investigation.

References

[1] A. Chambolle. An algorithm for total variation minimization and ap-

plications. Journal of Mathematical Imaging and Vision, (20):89–97,

2004.

[2] A. Chambolle and P.-L. Lions. Image recovery via total variation min-

imization and related problems. Numerische Mathematik, 76:167–188,

1997.

[3] R. H. Chan, C. W. Ho, and M. Nikolova. Salt-and-pepper noise removal
by median noise detectors and detail preserving regularization. IEEE
Transactions on Image Processing, to appear.

[4] T. F. Chan, G. H. Golub, and P. Mulet. A nonlinear primal–dual
method for total-variation based image restoration. SIAM Journal on
Scientific Computing, 20(6):1964–1977, 1999.

[5] T. F. Chan, A. Marquina, and P. Mulet. High-order total variation-

based image restoration.

SIAM Journal on Scientific Computing,

22(2):503–516, 2000.

[6] P. L. Davies and A. Kovac. Local extremes, runs, strings and multires-

olution. Annals of Statistics, 29:1–65, 2001.


[7] S. Didas. Higher order variational methods for noise removal in signals
and images. Diplomarbeit, Universität des Saarlandes, 2004.

[8] J. Duchon. Splines minimizing rotation-invariant seminorms in Sobolev
spaces. In Constructive Theory of Functions of Several Variables, pages
85–100, Berlin, 1977. Springer–Verlag.

[9] F. Girosi. An equivalence between sparse approximation and support

vector machines. Neural computation, 10(6):1455–1480, 1998.

[10] W. Hinterberger, M. Hintermüller, K. Kunisch, M. von Oehsen, and
O. Scherzer. Tube methods for BV regularization. Journal of Mathe-
matical Imaging and Vision, 19:223–238, 2003.

[11] W. Hinterberger and O. Scherzer. Variational methods on the space of

functions of bounded Hessian for convexification and denoising. Tech-

nical report, University of Innsbruck, Austria, 2003.

[12] M. Hintermüller and K. Kunisch. Total bounded variation regulariza-
tion as a bilaterally constrained optimization problem. SIAM Journal
on Applied Mathematics, 64(4):1311–1333, May 2004.

[13] J. M. Hyman and M. J. Shashkov. Natural discretizations for the diver-

gence, gradient, and curl on logically rectangular grids. Comput. Math.

Appl., 33(4):81–104, 1997.

[14] G. S. Kimeldorf and G. Wahba. Some results on Tchebycheffian spline
functions. J. Math. Anal. Appl., 33:82–95, 1971.

[15] M. Lysaker, A. Lundervold, and X. Tai. Noise removal using fourth-

order partial differential equations with applications to medical mag-

netic resonance images in space and time. IEEE Transactions on Image

Processing, 12(12):1579 – 1590, 2003.

[16] E. Mammen and S. van de Geer. Locally adaptive regression splines.

Annals of Statistics, 25(1):387–413, 1997.

[17] O. L. Mangasarian and L. L. Schumaker. Discrete splines via mathe-

matical programming. SIAM Journal on Control, 9(2):174–183, 1971.

[18] O. L. Mangasarian and L. L. Schumaker. Best summation formulae

and discrete splines via mathematical programming. SIAM Journal on

Numerical Analysis, 10(3):448–459, 1973.

[19] S. Mehrotra. On the implementation of a primal-dual interior point

method. SIAM Journal on Optimization, 2(4):575–601, 1992.


[20] M. Nielsen, L. Florack, and R. Deriche. Regularization, scale-space and

edge detection filters. Journal of Mathematical Imaging and Vision,

7:291–307, 1997.

[21] M. Nikolova. A variational approach to remove outliers and impulse

noise. Journal of Mathematical Imaging and Vision, 20:99–120, 2004.

[22] R. T. Rockafellar. Convex Analysis. Princeton University Press, Prince-

ton, 1970.

[23] L. I. Rudin, S. Osher, and E. Fatemi. Nonlinear total variation based

noise removal algorithms. Physica D, 60:259–268, 1992.

[24] C. Schnörr. A study of a convex variational diffusion approach for image
segmentation and feature extraction. Journal of Mathematical Imaging
and Vision, 8(3):271–292, 1998.

[25] G. Steidl. A note on the dual treatment of higher order regularization

functionals. Computing, 2005, to appear.

[26] G. Steidl, S. Didas, and J. Neumann. Relations between higher or-

der TV regularization and support vector regression. In R. Kimmel,

N. Sochen, and J. Weickert, editors, Scale-Space and PDE Methods in

Computer Vision, volume 3459 of Lecture Notes in Computer Science,

pages 515–527. Springer, Berlin, 2005.

[27] G. Steidl, J. Weickert, T. Brox, P. Mrázek, and M. Welk. On the equiv-
alence of soft wavelet shrinkage, total variation diffusion, total varia-
tion regularization, and SIDEs. SIAM Journal on Numerical Analysis,
42(2):686–713, 2004.

[28] M. Unser and T. Blu. Generalized smoothing splines and the opti-

mal discretization of the Wiener filter. IEEE Transactions on Signal

Processing, 53(6):2146–2159, 2005.

[29] V. N. Vapnik. Statistical Learning Theory. John Wiley and Sons, Inc.,

1998.

[30] C. R. Vogel. Computational Methods for Inverse Problems. SIAM,

Philadelphia, 2002.

[31] G. Wahba. Spline Models for Observational Data. SIAM, Philadelphia,

1990.

[32] M. Welk, J. Weickert, and G. Steidl. A four-pixel scheme for singular

differential equations. In R. Kimmel, N. Sochen, and J. Weickert, edi-

tors, Scale-Space and PDE Methods in Computer Vision, Lecture Notes

in Computer Science. Springer, Berlin, 2005, to appear.


[33] A. M. Yip and F. Park. Solution dynamics, causality, and critical
behaviour of the regularization parameter in total variation denoising
problems. UCLA Report, 2003.

[34] Y.-L. You and M. Kaveh. Fourth-order partial differential equations for

noise removal. IEEE Transactions on Image Processing, 9(10):1723–

1730, 2000.
