
IMAGE INPAINTING CONSIDERING BRIGHTNESS CHANGE AND SPATIAL LOCALITY OF TEXTURES

Norihiko Kawai, Tomokazu Sato, Naokazu Yokoya

Graduate School of Information Science, Nara Institute of Science and Technology

8916-5 Takayama, Ikoma, Nara 630-0192, Japan

norihi-k@is.naist.jp, tomoka-s@is.naist.jp, yokoya@is.naist.jp

Keywords:

image inpainting, image completion, energy minimization

Abstract:

Image inpainting is a technique for removing undesired visual objects in images and filling the missing regions

with plausible textures. Conventionally, the missing parts of an image are completed by optimizing the

objective function, which is defined based on pattern similarity between the missing region and the rest of

the image (data region). However, unnatural textures are easily generated due to two factors: (1) available

samples in the data region are quite limited, and (2) pattern similarity is one of the required conditions but is

not sufficient for reproducing natural textures. In this paper, in order to improve the image quality of com-

pleted texture, the objective function is extended by allowing brightness changes of sample textures (for (1))

and introducing spatial locality as an additional constraint (for (2)). The effectiveness of these extensions is

successfully demonstrated by applying the proposed method to one hundred images and comparing the results

with those obtained by the conventional methods.

1 INTRODUCTION

Image inpainting is a technique for removing undesired

visual objects in images and filling the missing

regions with plausible textures. This research can be

classified into two categories. One is a non-exemplar-

based method and the other is an exemplar-based

method. The non-exemplar-based methods (A. Levin

et al., 2003; C. Ballester et al., 2001a; C. Ballester

et al., 2001b; D. Tschumperlé, 2006; E. Villéger

et al., 2004; M. Bertalmio et al., 2001; M. Bertalmio

et al., 2000; S. Esedoglu and J. Shen, 2003; S. Mas-

nou and J.M. Morel, 1998; T. Chan and J. Shen,

2001; T. Chan et al., 2002) are based on pixel inter-

polation considering the continuity of pixel intensity.

These methods are effective for small image gaps like

scratches in a photograph. However, the resultant im-

age easily becomes unclear when the missing region

is large. Therefore, recently many exemplar-based

inpainting methods have been intensively developed

because they can synthesize complex textures in the

missing region.

Exemplar-based methods basically synthesize textures for the missing region based on pattern similarity that is defined between the missing region and the rest of the image. Some of the exemplar-based methods use the distance in the feature space as a similarity measure. As the feature space, Fourier space, wavelet domain and eigenspace have been used (A.N. Hirani and T. Totsuka, 1996; S.D. Rane et al., 1996; T. Amano, 2004). Most of the other exemplar-based methods simply employ SSD (sum of squared differences)-based pattern similarity measures (A. Criminisi et al., 2004; A.A. Efros and T.K. Leung, 1999; B. Li et al., 2005; C. Allène and N. Paragios, 2006; I. Drori et al., 2003; J. Jia and C. Tang, 2003; J. Sun et al., 2005; N. Komodakis and G. Tziritas, 2006; R. Bornard et al., 2002; Y. Wexler et al., 2007). Efros et al. (A.A. Efros and T.K. Leung, 1999) have proposed a method that successively copies the most similar pattern from the data region to the missing region. Although this method can generate complex textures, the quality of resultant images is severely affected by the order of texture copy. To obtain good results with the successive texture copy, confidence maps such as the number of fixed pixels in a window, strength of isophotes around the missing regions and pattern similarity have been used to determine the order of texture copy (A. Criminisi et al., 2004; B. Li et al., 2005; R. Bornard et al., 2002). Although the duplication of similar textures preserves the local texture continuity in these methods, discontinuous textures are easily synthesized in the completed image. To avoid the ordering problem, recent inpainting methods employ the iterative global optimization approach (C. Allène and N. Paragios, 2006; Y. Wexler et al., 2007; N. Komodakis and G. Tziritas, 2006). In these methods, the objective functions that evaluate the pattern similarity are defined and optimized by using the EM algorithm, Belief Propagation approach and graph cut approach.

Although the global optimization methods have

obtained good results for many images, unnatural im-

ages are still generated due to two factors: (1) avail-

able samples in the data region are quite limited, and

(2) pattern similarity is one of the required condi-

tions but is not sufficient for reproducing natural tex-

tures. Thus, in order to improve the image quality,

these two factors should be considered. There have

already been some attempts at this. For (1), the scale

and orientation of textures have been considered to

obtain effective samples (I. Drori et al., 2003). For

(2), Sun et al. (J. Sun et al., 2005) and Jia et al. (J.

Jia and C. Tang, 2003) have proposed techniques that

use explicit constraints for texture boundaries. These

methods synthesize textures preserving the edges or

boundaries of the texture. However, automatic and

effective determination of these explicit constraints is

still difficult.

In this paper, in order to obtain good results for

many images, we employ a new approach different

from conventional ones. For (1), brightness change

of sample textures that has not been considered in the

literature is allowed to obtain effective samples. For

(2), the spatial locality of texture pattern is considered

as an implicit constraint that is usually satisfied in a

lot of real scenes. In this study, these ideas are im-

plemented with the framework of energy minimiza-

tion based on Wexler’s objective function (Y. Wexler

et al., 2007). The effectiveness of our extensions is

demonstrated with comparison of completed images

and subjective evaluation by a questionnaire.

2 IMAGE INPAINTING BY ENERGY MINIMIZATION

Figure 1 shows the flow of the proposed method. First, a user manually selects regions to be repaired such as physically damaged regions and undesired object regions in an image (a). Next, initial values are given to the missing regions (b). Finally, selected regions are completed by minimizing the energy function (c). In the following sections, first, we describe the conventional objective function which is defined based on pattern similarity SSD in Section 2.1. The new energy function considering brightness change and spatial locality is then defined in Section 2.2, and finally Section 2.3 describes the inpainting procedure that minimizes the energy function.

Figure 1: Procedure of the proposed method. [Flowchart boxes: Start; (a) Input target region; (b) Initial values are given; (c) Update of pixels by minimizing energy; Is energy converged? (Yes / No); End.]

Figure 2: Missing and data regions in an image. [Labels: missing region Ω, expanded region Ω′, data region Φ, windows W centered at x and x̂.]

2.1 Energy function based on pattern similarity

This section briefly describes the SSD-based objec-

tive function for image inpainting originally proposed

by Wexler et al. (Y. Wexler et al., 2007). Here, al-

though the original objective function is defined as a

probability density function, we redefine this objec-

tive function as an equivalent energy function.

As illustrated in Figure 2, first, an image is divided into region $\Omega'$ including the missing region $\Omega$ selected by a user and the data region $\Phi$, which is the rest of the image. The plausibility in the missing region $\Omega$ is defined by using image patterns in the data region $\Phi$. Here, $\Omega'$ is the expanded area of the missing region $\Omega$ in which there is a central pixel of a square window $W$ of size $N_W$ (where $N_W$ is a constant) overlapping the region $\Omega$. The energy function that represents the plausibility in the missing region is defined as the weighted sum of SSD between the pixels around the pixel $x$ in region $\Omega'$ and those around the pixel $\hat{x}_{org}$ in region $\Phi$ as follows:

$$E_{org} = \sum_{x \in \Omega'} w_x \, SSD(x, \hat{x}_{org}), \tag{1}$$

where $\hat{x}_{org}$ in the data region $\Phi$ denotes the pixel around which the pattern is the most similar to that around $x$ in the region $\Omega'$, and $SSD(x, \hat{x}_{org})$ is defined as follows.

$$SSD(x, \hat{x}_{org}) = \sum_{p \in W} \{ I(x+p) - I(\hat{x}_{org}+p) \}^2. \tag{2}$$

Here, $I(x)$ represents the intensity of pixel $x$. The pixel $\hat{x}_{org}$ for minimizing $E_{org}$ is decided as follows.

$$\hat{x}_{org} = f_{org}(x) = \mathop{\mathrm{argmin}}_{x' \in \Phi} SSD(x, x'). \tag{3}$$

Note that the weight $w_x$ is set as 1 if $x$ is inside of the region $\Omega' \cap \overline{\Omega}$ because pixel values in this region are fixed: otherwise $w_x$ is set as $c^{-d}$ ($d$ is the distance from the boundary of $\Omega$ and $c$ is a constant) because pixel values around the boundary have higher confidence than those in the center of the missing region.

In Wexler’s work (Y. Wexler et al., 2007), the missing region is completed by calculating the pixel value $I(x)$ in the missing region and the position of the pixel $\hat{x}_{org}$ that minimizes the energy function $E_{org}$.
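As an illustration of Eqs. (2) and (3), the following minimal Python sketch computes the SSD between two windows and selects the best match by exhaustive search. It is not the authors' implementation. The list-of-lists image representation, the function names `ssd`, `most_similar` and `weight`, the `half` parameter (window side is 2*half + 1) and the explicit candidate list are assumptions made for this example, and the weight assumes the form c^(-d) inside the missing region.

```python
from typing import List, Sequence, Tuple

Image = List[List[float]]  # grayscale image stored as rows of intensities
Pixel = Tuple[int, int]    # (row, col)


def ssd(img: Image, x: Pixel, x_ref: Pixel, half: int) -> float:
    """Eq. (2): sum of squared differences between the windows centred at x and x_ref.

    Both windows are assumed to lie fully inside the image; negative
    indices would silently wrap around in Python.
    """
    total = 0.0
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            diff = img[x[0] + dr][x[1] + dc] - img[x_ref[0] + dr][x_ref[1] + dc]
            total += diff * diff
    return total


def most_similar(img: Image, x: Pixel, candidates: Sequence[Pixel], half: int) -> Pixel:
    """Eq. (3): the candidate window centre in the data region with minimum SSD to x."""
    return min(candidates, key=lambda c: ssd(img, x, c, half))


def weight(d: float, c: float, inside_missing: bool) -> float:
    """Weight w_x: 1 for fixed pixels, c**(-d) at distance d inside the missing region."""
    return c ** (-d) if inside_missing else 1.0
```

A practical implementation would replace the exhaustive search with a faster nearest-neighbour scheme; the brute-force form is kept here only because it mirrors Eq. (3) directly.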

2.2 Energy function extended by considering brightness change and spatial locality

In this study, we extend the original energy function

$E_{org}$ defined in Eq. (1) considering brightness change

and spatial locality of texture patterns. Concretely,

we introduce a modification coefficient to allow linear

brightness change of the texture pattern. For consid-

ering spatial pattern locality, the cost function based

on distance between the pixel in the missing region

and the corresponding pixel in the data region is also

added to the original energy function. The extended

energy function is defined as follows:

$$E = \sum_{x \in \Omega'} w_x \{ SSD'(x, \hat{x}) + w_{dis} \, SD(x, \hat{x}) \}, \tag{4}$$

where $SSD'(x, \hat{x})$ represents the pattern similarity considering brightness change, and $SD(x, \hat{x})$ means the cost term for the spatial locality. $w_{dis}$ is the weight representing the strength of spatial locality. $\hat{x}$ is determined as the pixel position that minimizes the extended energy function $E$:

$$\hat{x} = f(x) = \mathop{\mathrm{argmin}}_{x' \in \Phi} \left( SSD'(x, x') + w_{dis} \, SD(x, x') \right). \tag{5}$$

In the following sections, definitions of $SSD'(x, \hat{x})$ and $SD(x, \hat{x})$ are detailed.

2.2.1 Pattern similarity considering brightness change

The similarity measure $SSD'$ is defined as:

$$SSD'(x, \hat{x}) = \sum_{p \in W} \{ I(x+p) - \alpha_{x\hat{x}} I(\hat{x}+p) \}^2. \tag{6}$$

To allow the brightness change, we introduce an intensity modification coefficient $\alpha_{x\hat{x}}$. In this paper, we employ the ratio of average pixel values around the pixels $x$ and $\hat{x}$ as the modification coefficient $\alpha_{x\hat{x}}$. However, an unnatural image is easily generated if large brightness change is approximated by linear transformation. Therefore, we limit the range of the value $\alpha_{x\hat{x}}$ where brightness change can be approximated linearly as given in Eq. (7).

$$\alpha_{x\hat{x}} = \begin{cases} 1 - D & (\beta_{x\hat{x}} < 1 - D) \\ \beta_{x\hat{x}} & (1 - D \le \beta_{x\hat{x}} \le 1 + D) \\ 1 + D & (\beta_{x\hat{x}} > 1 + D), \end{cases} \tag{7}$$

where $D$ is a constant ($0 < D < 1$) and $\beta_{x\hat{x}}$ is defined as follows:

$$\beta_{x\hat{x}} = \frac{\sqrt{\sum_{q \in W} I(x+q)^2}}{\sqrt{\sum_{q \in W} I(\hat{x}+q)^2}}. \tag{8}$$

The modification coefficient $\alpha_{x\hat{x}}$ makes brightness of generated textures smooth while preserving texture patterns and enables utilization of the texture that has different brightness but the same pattern.
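The brightness-tolerant similarity of Eqs. (6) to (8) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the helper names `window`, `alpha` and `ssd_prime` are hypothetical, β is taken as the ratio of the root-sum-of-squares intensities of the two windows, and the fallback β = 1 for an all-zero reference window is an added guard that the paper does not discuss.

```python
import math
from typing import List, Tuple

Image = List[List[float]]  # grayscale image stored as rows of intensities
Pixel = Tuple[int, int]    # (row, col)


def window(img: Image, x: Pixel, half: int) -> List[float]:
    """Intensities of the (2*half+1)^2 window centred at x, in row-major order."""
    return [img[x[0] + dr][x[1] + dc]
            for dr in range(-half, half + 1)
            for dc in range(-half, half + 1)]


def alpha(img: Image, x: Pixel, x_hat: Pixel, half: int, D: float) -> float:
    """Eqs. (7)-(8): brightness ratio beta, clamped to the range [1 - D, 1 + D]."""
    num = math.sqrt(sum(v * v for v in window(img, x, half)))
    den = math.sqrt(sum(v * v for v in window(img, x_hat, half)))
    beta = num / den if den > 0.0 else 1.0  # assumption: neutral ratio for a black window
    return min(max(beta, 1.0 - D), 1.0 + D)


def ssd_prime(img: Image, x: Pixel, x_hat: Pixel, half: int, D: float) -> float:
    """Eq. (6): SSD after scaling the reference window by the clamped coefficient."""
    a = alpha(img, x, x_hat, half, D)
    return sum((u - a * v) ** 2
               for u, v in zip(window(img, x, half), window(img, x_hat, half)))
```

With D = 0.1 (the value listed in Table 1), a reference window that is 10% brighter than the target but otherwise identical yields a cost near zero, whereas a window twice as bright is only corrected by the clamped factor 0.9 and keeps a large cost.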

2.2.2 Spatial locality of texture

Spatial locality of a texture pattern is defined by using a sigmoid function:

$$SD(x, \hat{x}) = \frac{|W|}{1 + e^{-K(\|x - \hat{x}\| - X_0)}}, \tag{9}$$

where $K$ and $X_0$ are constant and $|W|$ is the number of pixels in a window. This cost function is defined based on the assumption that the probability of similar texture existence for a certain pixel is uniformly high for the object region where the pixel exists. On the other hand, outside the object region, the probability can be assumed to be uniformly low. Note that a constant-sized object region is currently assumed in Eq. (9) because we could not know the range of the object in the missing region. By adding the constraint of spatial locality, even when the deformation of texture pattern exists around the target region, appropriate textures that exist near the target region are preferentially selected. Thus, textures are less likely to blur by not selecting blurry textures of low frequency far from the missing region.
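To make the shape of the locality cost concrete, here is a small Python sketch of Eq. (9). The function name `sd` and its argument names are hypothetical. The cost is small when the matched pixel is closer than X0, equals half the window size at distance X0, and saturates at the number of window pixels far away.

```python
import math
from typing import Tuple

Pixel = Tuple[int, int]  # (row, col)


def sd(x: Pixel, x_hat: Pixel, n_pixels: int, K: float, X0: float) -> float:
    """Eq. (9): sigmoid cost in the Euclidean distance between x and x_hat.

    n_pixels is the number of pixels in a window (|W| in the text).
    """
    dist = math.hypot(x[0] - x_hat[0], x[1] - x_hat[1])
    return n_pixels / (1.0 + math.exp(-K * (dist - X0)))
```

For example, with a 9×9 window (81 pixels), K = 0.4 and X0 = 20 as in Table 1, a match at distance 20 costs 40.5, a match one pixel away costs well under 0.05, and a match 40 pixels away costs almost the full 81.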


2.3 Update of pixel values and window pairs for energy minimization

The energy function $E$ defined in Eq. (4) is minimized by using a framework of greedy algorithm similar to Wexler’s EM approach (Y. Wexler et al., 2007). In our method, we pay attention to the fact that the energy function $E$ for each pixel can be treated independently if similar pattern pairs $(x, \hat{x})$ calculated by Eq. (5) can be fixed and the change of coefficient $\alpha_{x\hat{x}}$ in the iteration is very small. Thus, we repeat the following two processes until the energy converges: (I) update of pairs of windows for fixed pixel values, and (II) parallel update of all the pixel values in the missing region for fixed similar pairs of windows in the missing region and the data region.

In process (I), we update all the similar pattern pairs of windows in the missing region and in the data region fixing the pixel values calculated in the process (II). Concretely, the update of the pair of windows can be performed by calculating $SSD'$ and $SD$ that satisfy Eq. (5) and determining the position $\hat{x}$.

In process (II), we update all the pixel values $I(x)$ in the missing regions in parallel by minimizing the energy defined by Eq. (4). In the following, we describe the method for calculating the pixel values $I(x)$ for fixed pairs of windows. First, the energy $E$ is resolved into the element energy $E(x)$ for each pixel in the missing region. As shown in Figure 3, the target pixel to be updated is $x$, and the pixel position inside a window can be expressed as $x+p$ ($p \in W$) and is corresponded to $f(x+p)$ by Eq. (5). Thus, the position of the pixel corresponding to the pixel $x$ is $f(x+p)-p$. Now, the element energy $E(x)$ can be defined in terms of the pixel values of $x$ and $f(x+p)-p$, the coefficient $\alpha_{x\hat{x}}$ and the Euclidean distance between $x$ and $f(x)$ as follows:

$$E(x) = \sum_{p \in W} w_{x+p} \{ I(x) - \alpha_{(x+p)f(x+p)} I(f(x+p)-p) \}^2 + w_{dis} \frac{|W|}{1 + e^{-K(\|x - \hat{x}\| - X_0)}}. \tag{10}$$

The relationship between the energy $E$ for all of the missing region and the element energy $E(x)$ for each pixel can be written as follows:

$$E = \sum_{x \in \Omega} E(x) + C. \tag{11}$$

$C$ is the energy of pixels in the region $\Omega' \cap \overline{\Omega}$, and is treated as a constant because pixel intensities in the region and all pairs of windows are fixed here. By differentiating $E$ with respect to $I(x)$ in the missing region, the requirement for minimizing the energy $E$ can be obtained as follows:

$$\frac{\partial E}{\partial I(x_k)} = \sum_{x \in \Omega} \frac{\partial E(x)}{\partial I(x_k)} = 0. \tag{12}$$

Figure 3: Relationship between pixels in energy calculation. [Labels: data region Φ, expanded region Ω′, pixels x, x+p, f(x+p) and f(x+p)−p, most similar windows.]

Here, if it is assumed that the change of intensity modification coefficient $\alpha_{x\hat{x}}$ is smaller than that of the pixel intensity $I(x_k)$, we can obtain the following equation.

$$\frac{\partial \alpha_{x_i x'}}{\partial I(x_j)} = 0 \quad (\forall x_i, x_j \in \Omega, \ \forall x' \in \Phi). \tag{13}$$

From this equation, the equation $\partial E(x) / \partial I(x_k) = 0$ ($x \neq x_k$) is formed. Thus, we can minimize the energy $E$ by calculating $I(x_k)$, which satisfies the following equation.

$$\frac{\partial E}{\partial I(x_k)} = \frac{\partial E(x_k)}{\partial I(x_k)} = 0. \tag{14}$$

By generalizing Eq. (14) to all the pixels in the missing region, each pixel value $I(x)$ in the missing region can be calculated as follows:

$$I(x) = \frac{\sum_{p \in W} w_{x+p} \, \alpha_{(x+p)f(x+p)} \, I(f(x+p)-p)}{\sum_{p \in W} w_{x+p}}. \tag{15}$$

Eq. (15) gives an approximate solution when Eq. (13) is satisfied. We can obtain a good solution as the energy converges because the value of intensity modification coefficient $\alpha_{x\hat{x}}$ converges as $I(x)$ converges.

Additionally, in order to avoid local minima efficiently, a coarse-to-fine approach is also employed. Specifically, an image pyramid is generated and the energy minimization processes (I) and (II) are repeated from higher-level to lower-level layers successively using a certain size of window. Good initial values are given to the lower layer by projecting results from the higher layer. This makes it possible to decrease computational cost and avoid local minima. In the lowest layer (original size), the energy minimization process is repeated while reducing the size of the window, and it enables reproduction of more detailed textures.
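The pixel update of process (II), Eq. (15), can be sketched in Python as below. This is only an illustration under assumptions, not the authors' implementation: the correspondences f, the coefficients α and the weights w are assumed to be precomputed and stored in dictionaries keyed by window centre, and the names `update_pixel`, `f`, `alpha`, `w` and `half` are hypothetical.

```python
from typing import Dict, List, Tuple

Image = List[List[float]]  # grayscale image stored as rows of intensities
Pixel = Tuple[int, int]    # (row, col)


def update_pixel(img: Image, x: Pixel, f: Dict[Pixel, Pixel],
                 alpha: Dict[Pixel, float], w: Dict[Pixel, float],
                 half: int) -> float:
    """Eq. (15): weighted mean of the brightness-corrected votes for pixel x.

    Every window centred at x + p that covers x votes with the pixel
    f(x + p) - p of its matched window in the data region.
    """
    num = 0.0
    den = 0.0
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            centre = (x[0] + dr, x[1] + dc)       # x + p
            match = f[centre]                     # f(x + p)
            src = (match[0] - dr, match[1] - dc)  # f(x + p) - p
            num += w[centre] * alpha[centre] * img[src[0]][src[1]]
            den += w[centre]
    return num / den
```

Because each pixel's new value depends only on the fixed correspondences and on data-region intensities, all missing pixels can be updated independently, which is what allows the parallel update described for process (II).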


3 EXPERIMENTS

To demonstrate the effectiveness of our extensions of the objective function, we have applied five kinds of image inpainting methods including the conventional methods and the proposed method to one hundred images (200 × 200 pixels). First, the effectiveness of the proposed method is demonstrated by comparing the characteristic results of the proposed method and our implemented Wexler’s method (Y. Wexler et al., 2007). Next, by the subjective evaluation based on a questionnaire using our implemented Wexler’s method, Criminisi’s method (A. Criminisi et al., 2004) and the proposed method, the effectiveness of our method is objectively assessed.

In these experiments, we used a standard PC (CPU: Xeon 3.2 GHz, Memory: 8 GB) and each parameter in the energy function was set as shown in Table 1. Here, the missing region was manually specified, and the average pixel value of the boundary of the missing region is given as an initial value in the missing region.

3.1 Comparison of inpainted images

In this section, four images that have different char-

acteristics are selected from one hundred images as

shown in Figure 4(a). The missing regions for these

images are given in Figure 4(b). Figure 4(c) indicates

the resultant images by our implemented Wexler’s

method described in Section 2.1. Figure 4(d) shows

the images completed by the proposed method.

Image (I) includes little brightness change under

constant illumination and little pattern change in the

same object region around the missing region. It can

be confirmed that images generated by the conven-

tional and proposed methods are natural. The subjec-

tive difference is very small.

Image (II) includes little pattern change without complex textures but large brightness changes under nonconstant illumination around the missing region. By the conventional method, the resultant image looks unnatural because sudden intensity changes appear at the seat and the seat back. By allowing brightness changes of sample textures, the sudden intensity change is suppressed by the proposed method.

Table 1: Parameters in experiment.

Parameter                        Symbol   Value
Window size                      N_W      max 9×9, min 3×3
Weight for distance              w_dis    120
Parameter in sigmoid function    K        0.4
Parameter in sigmoid function    X_0      20
Range of coefficient α           D        0.1

Image (III) includes little brightness change un-

der constant illumination but large pattern change due

to the various sizes and shapes of objects around the

missing regions. In this image, although the same

kinds of textures apparently exist around the miss-

ing regions, texture pattern greatly changes due to the

different sizes of stones. A part of the missing re-

gions is blurred in white by the conventional method

because the SSD-based similarity is sensitive to the

pattern changes especially for high-frequency components,

and thus inappropriate textures are selected for

the missing regions. It should be noted that there ex-

ists spatial locality of texture pattern such as water

color and stones around the missing region in image

(III). By considering spatial locality of texture pat-

tern, neighboring textures are preferentially selected

and thus the missing region is completed successfully

by the proposed method.

Image (IV) includes large brightness change under

nonconstant illumination conditions and texture pat-

tern continuously changes due to the perspective pro-

jection effect. In the resultant image of the conven-

tional method, an unnatural image is generated due

to the blurs on the textured area with black squares

and the discontinuous brightness changes at the wall

and floor. On the other hand, in the proposed method,

by using the constraint of the spatial locality of tex-

tures, neighboring textures are selected for comple-

tion of the missing region and windows of the post

are reproduced in Figure 4(d). In addition, by allow-

ing brightness changes of sample textures, brightness

change inside the missing region becomes more natu-

ral than that by the conventional method.

Next, we have compared the conventional and

proposed methods with respect to computational cost.

Table 2 shows the processing time of the conven-

tional and proposed methods. The proposed method

requires about three to five times as much time as the

conventional method. This is because the computa-

tional cost for calculating intensity modification coef-

ficients and cost function considering spatial locality

is increased.

Table 2: Processing time.

              Conventional method   Proposed method
Image (I)     2’17”                 8’45”
Image (II)    3’17”                 12’15”
Image (III)   3’28”                 18’59”
Image (IV)    5’25”                 18’32”
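As a quick check of the stated slowdown, the short Python sketch below converts the times of Table 2 to seconds and computes the ratio of proposed to conventional processing time per image. It assumes that the times are listed in the order Image (I) to Image (IV) in both columns; the helper `seconds` and the variable names are hypothetical.

```python
def seconds(minutes: int, secs: int) -> int:
    """Convert a minutes/seconds time into seconds."""
    return 60 * minutes + secs


# (conventional, proposed) processing times, assuming both columns follow image order
times = {
    "Image (I)": (seconds(2, 17), seconds(8, 45)),
    "Image (II)": (seconds(3, 17), seconds(12, 15)),
    "Image (III)": (seconds(3, 28), seconds(18, 59)),
    "Image (IV)": (seconds(5, 25), seconds(18, 32)),
}

ratios = {name: proposed / conventional
          for name, (conventional, proposed) in times.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
```

Under this pairing the ratios are about 3.83, 3.73, 5.48 and 3.42, which is in line with the rough "three to five times" figure quoted in the text, with Image (III) slightly above it.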