IMAGE INPAINTING CONSIDERING BRIGHTNESS CHANGE
AND SPATIAL LOCALITY OF TEXTURES
Norihiko Kawai, Tomokazu Sato, Naokazu Yokoya
Graduate School of Information Science, Nara Institute of Science and Technology
8916-5 Takayama, Ikoma, Nara 630-0192, Japan
firstname.lastname@example.org, email@example.com, firstname.lastname@example.org
Keywords: image inpainting, image completion, energy minimization
Abstract: Image inpainting is a technique for removing undesired visual objects in images and filling the missing
regions with plausible textures. Conventionally, the missing parts of an image are completed by optimizing the
objective function, which is defined based on pattern similarity between the missing region and the rest of
the image (data region). However, unnatural textures are easily generated due to two factors: (1) available
samples in the data region are quite limited, and (2) pattern similarity is one of the required conditions but is
not sufficient for reproducing natural textures. In this paper, in order to improve the image quality of com-
pleted texture, the objective function is extended by allowing brightness changes of sample textures (for (1))
and introducing spatial locality as an additional constraint (for (2)). The effectiveness of these extensions is
successfully demonstrated by applying the proposed method to one hundred images and comparing the results
with those obtained by the conventional methods.
1 INTRODUCTION
Image inpainting is a technique for removing undesired
visual objects in images and filling the missing
regions with plausible textures. This research can be
classified into two categories. One is a non-exemplar-
based method and the other is an exemplar-based
method. The non-exemplar-based methods (A. Levin
et al., 2003; C. Ballester et al., 2001a; C. Ballester
et al., 2001b; D. Tschumperlé, 2006; E. Villéger
et al., 2004; M. Bertalmio et al., 2001; M. Bertalmio
et al., 2000; S. Esedoglu and J. Shen, 2003; S. Masnou
and J.M. Morel, 1998; T. Chan and J. Shen,
2001; T. Chan et al., 2002) are based on pixel
interpolation considering the continuity of pixel intensity.
These methods are effective for small image gaps like
scratches in a photograph. However, the resultant im-
age easily becomes unclear when the missing region
is large. Therefore, many exemplar-based
inpainting methods have recently been developed
because they can synthesize complex textures in the
missing region.
Exemplar-based methods basically synthesize textures
for the missing region based on pattern similarity
that is defined between the missing region and
the rest of the image. Some of the exemplar-based
methods use the distance in the feature space as a
similarity measure. As the feature space, Fourier
space, wavelet domain and eigenspace have been
used (A.N. Hirani and T. Totsuka, 1996; S.D. Rane
et al., 1996; T. Amano, 2004). Most of the other
exemplar-based methods simply employ SSD (sum
of squared differences)-based pattern similarity mea-
sures (A. Criminisi et al., 2004; A.A. Efros and T.K.
Leung, 1999; B. Li et al., 2005; C. Allène and N.
Paragios, 2006; I. Drori et al., 2003; J. Jia and C.
Tang, 2003; J. Sun et al., 2005; N. Komodakis and
G. Tziritas, 2006; R. Bornard et al., 2002; Y. Wexler
et al., 2007). Efros et al. (A.A. Efros and T.K. Le-
ung, 1999) have proposed a method that successively
copies the most similar pattern from the data region to
the missing region. Although this method can gener-
ate complex textures, the quality of resultant images
is severely affected by the order of texture copy. To
obtain good results with the successive texture copy,
confidence maps such as the number of fixed pixels
in a window, strength of isophotes around the missing
regions and pattern similarity have been used to de-
termine the order of texture copy (A. Criminisi et al.,
2004; B. Li et al., 2005; R. Bornard et al., 2002). Al-
though the duplication of similar textures preserves
the local texture continuity in these methods, discon-
tinuous textures are easily synthesized in the com-
pleted image. To avoid the ordering problem, recent
inpainting methods employ the iterative global
optimization approach (C. Allène and N. Paragios, 2006;
Y. Wexler et al., 2007; N. Komodakis and G. Tziritas,
2006). In these methods, the objective functions that
evaluate the pattern similarity are defined and opti-
mized by using the EM algorithm, Belief Propagation
approach and graph cut approach.
Although the global optimization methods have
obtained good results for many images, unnatural im-
ages are still generated due to two factors: (1) avail-
able samples in the data region are quite limited, and
(2) pattern similarity is one of the required condi-
tions but is not sufficient for reproducing natural tex-
tures. Thus, in order to improve the image quality,
these two factors should be considered. There have
already been some attempts at this. For (1), the scale
and orientation of textures have been considered to
obtain effective samples (I. Drori et al., 2003). For
(2), Sun et al. (J. Sun et al., 2005) and Jia et al. (J.
Jia and C. Tang, 2003) have proposed techniques that
use explicit constraints for texture boundaries. These
methods synthesize textures preserving the edges or
boundaries of the texture. However, automatic and
effective determination of these explicit constraints is
difficult.
In this paper, in order to obtain good results for
many images, we employ a new approach different
from conventional ones. For (1), brightness change
of sample textures that has not been considered in the
literature is allowed to obtain effective samples. For
(2), the spatial locality of texture pattern is considered
as an implicit constraint that is usually satisfied in a
lot of real scenes. In this study, these ideas are im-
plemented with the framework of energy minimiza-
tion based on Wexler’s objective function (Y. Wexler
et al., 2007). The effectiveness of our extensions is
demonstrated with comparison of completed images
and subjective evaluation by a questionnaire.
2 IMAGE INPAINTING BY ENERGY MINIMIZATION
Figure 1 shows the flow of the proposed method. First, a user manually selects the regions to be repaired, such as undesired object regions, in an image (a). Next, initial values are given to the missing regions (b). Finally, the selected regions are completed by minimizing the energy function (c). In the following sections, we first describe the conventional objective function, which is defined based on SSD pattern similarity, in Section 2.1. The new energy function considering brightness change and spatial locality is then defined in Section 2.2, and finally Section 2.3 describes the inpainting procedure that minimizes the energy function.

Figure 1: Procedure of the proposed method: (a) input target region; (b) initial values are given; (c) update of pixels by minimizing energy, repeated until the energy converges.

Figure 2: Missing and data regions in an image.
2.1 Energy function based on pattern similarity
This section briefly describes the SSD-based objec-
tive function for image inpainting originally proposed
by Wexler et al. (Y. Wexler et al., 2007). Here, al-
though the original objective function is defined as a
probability density function, we redefine this objec-
tive function as an equivalent energy function.
As illustrated in Figure 2, an image is first divided into the region $\Omega'$, which includes the missing region $\Omega$ selected by a user, and the data region $\Phi$, which is the rest of the image. The plausibility in the missing region $\Omega$ is defined by using image patterns in the data region $\Phi$. Here, $\Omega'$ is the expanded area of the missing region $\Omega$ in which there is a central pixel of a square window $W$ of size $N_W$ (where $N_W$ is a constant) overlapping the region $\Omega$. In this case, the plausibility in the missing region is defined as the weighted sum of SSD between the pixels around the pixel $x$ in region $\Omega'$ and those around the pixel $\hat{x}_{org}$ in region $\Phi$ as follows:

$$E_{org} = \sum_{x \in \Omega'} w_x \, SSD(x, \hat{x}_{org}), \qquad (1)$$

where $\hat{x}_{org}$ in the data region $\Phi$ denotes the pixel around which the pattern is the most similar to that around $x$ in the region $\Omega'$, and $SSD(x, \hat{x}_{org})$ is defined as:

$$SSD(x, \hat{x}_{org}) = \sum_{p \in W} \{ I(x + p) - I(\hat{x}_{org} + p) \}^2. \qquad (2)$$

$I(x)$ represents the intensity of pixel $x$. The pixel $\hat{x}_{org}$ for minimizing $E_{org}$ is decided as follows:

$$\hat{x}_{org} = \arg\min_{x' \in \Phi} SSD(x, x'). \qquad (3)$$

Note that the weight $w_x$ is set as 1 if $x$ is inside of the region $\Omega' \cap \overline{\Omega}$ because pixel values in this region are fixed; otherwise $w_x$ is set as $c^{-d}$ ($d$ is the distance from the boundary of $\Omega$ and $c$ is a constant) because pixel values around the boundary have higher confidence than those in the center of the missing region.
In Wexler's work (Y. Wexler et al., 2007), the missing region is completed by calculating the pixel values $I(x)$ in the missing region and the pixel $\hat{x}_{org}$ that minimize the energy function $E_{org}$.
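As a rough illustration of the SSD-based pattern similarity and the search for the most similar window described in this section, the following minimal Python sketch may help. It is not the authors' implementation: the function names `ssd` and `best_match`, the toy image and the candidate list are all assumptions made for illustration, and the exhaustive search ignores the weight $w_x$.

```python
def ssd(image, x, x_hat, half):
    """Sum of squared differences between the square windows centred at x and x_hat."""
    (xr, xc), (hr, hc) = x, x_hat
    total = 0.0
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            diff = image[xr + dr][xc + dc] - image[hr + dr][hc + dc]
            total += diff * diff
    return total


def best_match(image, x, candidates, half):
    """Return the candidate centre (assumed to lie in the data region) with the smallest SSD."""
    return min(candidates, key=lambda c: ssd(image, x, c, half))


# Hypothetical 5x7 grey-level image whose columns repeat with period 3
image = [[(c % 3) * 10 for c in range(7)] for _ in range(5)]
x = (2, 4)                      # window centre to be matched
candidates = [(2, 1), (2, 2)]   # hypothetical window centres in the data region
match = best_match(image, x, candidates, half=1)
print(match)
```

In this toy image the window around `(2, 1)` has exactly the same pattern as the one around `(2, 4)`, so it is selected over `(2, 2)`.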
2.2 Energy function extended by considering brightness change and spatial locality
In this study, we extend the original energy function $E_{org}$ defined in Eq. (1) by considering brightness change and spatial locality of texture patterns. Concretely, we introduce a modification coefficient to allow linear brightness change of the texture pattern. To account for spatial pattern locality, a cost function based on the distance between the pixel in the missing region and the corresponding pixel in the data region is also added to the original energy function. The extended energy function is defined as follows:

$$E = \sum_{x \in \Omega'} w_x \{ SSD'(x, \hat{x}) + w_{dis} \, SD(x, \hat{x}) \}, \qquad (4)$$

where $SSD'(x, \hat{x})$ represents the pattern similarity considering brightness change, and $SD(x, \hat{x})$ is the cost term for the spatial locality. $w_{dis}$ is the weight representing the strength of spatial locality. $\hat{x}$ is determined as the pixel position that minimizes the extended energy function $E$:

$$\hat{x} = f(x) = \arg\min_{x' \in \Phi} \{ SSD'(x, x') + w_{dis} \, SD(x, x') \}. \qquad (5)$$

In the following sections, the definitions of $SSD'(x, \hat{x})$ and $SD(x, \hat{x})$ are detailed.
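2.2.1 Pattern similarity considering brightness change
The similarity measure $SSD'(x, \hat{x})$ is defined as:

$$SSD'(x, \hat{x}) = \sum_{p \in W} \{ I(x + p) - \alpha_{x\hat{x}} \, I(\hat{x} + p) \}^2. \qquad (6)$$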
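To allow the brightness change, we introduce an intensity modification coefficient $\alpha_{x\hat{x}}$. In this paper, we employ the ratio of average pixel values around the pixels $x$ and $\hat{x}$ as the modification coefficient $\alpha_{x\hat{x}}$. However, an unnatural image is easily generated if a large brightness change is approximated by a linear transformation. Therefore, we limit the value of $\alpha_{x\hat{x}}$ to the range where the brightness change can be approximated linearly, as given in Eq. (7):

$$\alpha_{x\hat{x}} = \begin{cases} 1 - D & (\beta_{x\hat{x}} < 1 - D) \\ \beta_{x\hat{x}} & (1 - D \le \beta_{x\hat{x}} \le 1 + D) \\ 1 + D & (\beta_{x\hat{x}} > 1 + D) \end{cases} \qquad (7)$$

where $D$ is a constant ($0 < D < 1$) and $\beta_{x\hat{x}}$ is defined as the ratio of the average pixel values in the two windows:

$$\beta_{x\hat{x}} = \frac{\sum_{q \in W} I(x + q)}{\sum_{q \in W} I(\hat{x} + q)}. \qquad (8)$$

The modification coefficient $\alpha_{x\hat{x}}$ makes the brightness of generated textures smooth while preserving texture patterns, and enables utilization of textures that have different brightness but the same pattern.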
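The effect of the limited modification coefficient can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the flattened window lists and the values of `D` are assumptions, and the coefficient is taken literally as the ratio of average pixel values stated in the text, limited to the range $[1 - D, 1 + D]$.

```python
def modification_coefficient(win_x, win_hat, D):
    """Ratio of average intensities of two windows, limited to [1 - D, 1 + D].

    win_x and win_hat are flat lists of intensities; win_hat must not be all zero.
    """
    beta = (sum(win_x) / len(win_x)) / (sum(win_hat) / len(win_hat))
    return min(max(beta, 1.0 - D), 1.0 + D)


def ssd_with_brightness(win_x, win_hat, D):
    """SSD after scaling the sample window by the modification coefficient."""
    alpha = modification_coefficient(win_x, win_hat, D)
    return sum((a - alpha * b) ** 2 for a, b in zip(win_x, win_hat))


win_x = [50.0, 60.0, 70.0]        # pattern around a pixel in the missing region
win_hat = [100.0, 120.0, 140.0]   # same pattern, twice as bright (hypothetical sample)

loose = ssd_with_brightness(win_x, win_hat, D=0.6)  # ratio 0.5 lies inside [0.4, 1.6]
tight = ssd_with_brightness(win_x, win_hat, D=0.2)  # ratio is limited to 0.8
print(loose, tight)
```

With the wider range the sample matches exactly after scaling, whereas with the narrower range the coefficient is limited and a residual remains, which is how a large brightness change is kept from being approximated linearly.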
2.2.2 Spatial locality of texture
Spatial locality of a texture pattern is defined by using a sigmoid function:

$$SD(x, \hat{x}) = \frac{|W|}{1 + e^{-K(\|x - \hat{x}\| - X_0)}}, \qquad (9)$$

where $K$ and $X_0$ are constants and $|W|$ is the number of pixels in a window. This cost function is defined based on the assumption that the probability of similar texture existence for a certain pixel is uniformly high for the object region where the pixel exists. On the other hand, outside the object region, the probability can be assumed to be uniformly low. Note that a constant-sized object region is currently assumed in Eq. (9) because we cannot know the range of the object in the missing region. By adding the constraint of spatial locality, even when the deformation of texture pattern exists around the target region, appropriate textures that exist near the target region are preferentially selected. Thus, textures are less likely to blur because blurry low-frequency textures far from the missing region are not selected.
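The behaviour of a sigmoid-shaped locality cost can be illustrated with the short Python sketch below. The exact form of the sigmoid is an assumption here (a standard logistic curve of the Euclidean distance with steepness `K` and midpoint `X0`, scaled by the number of pixels in a window), and the parameter values are hypothetical rather than those of Table 1.

```python
import math


def spatial_cost(x, x_hat, K, X0, n_pixels):
    """Assumed logistic cost: low for nearby samples, close to n_pixels for distant ones."""
    dist = math.hypot(x[0] - x_hat[0], x[1] - x_hat[1])
    return n_pixels / (1.0 + math.exp(-K * (dist - X0)))


# Hypothetical parameters: 9x9 window (81 pixels), midpoint at a distance of 5 pixels
K, X0, N_PIXELS = 0.4, 5.0, 81
near = spatial_cost((0, 0), (0, 1), K, X0, N_PIXELS)
mid = spatial_cost((0, 0), (3, 4), K, X0, N_PIXELS)   # distance exactly 5
far = spatial_cost((0, 0), (0, 20), K, X0, N_PIXELS)
print(near, mid, far)
```

Under these assumptions the cost is half of its maximum at the midpoint distance and saturates on both sides, which matches the description of a probability that is uniformly high inside the object region and uniformly low outside it.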
2.3 Update of pixel values and window
pairs for energy minimization
The energy function E defined in Eq. (4) is minimized
by using a framework of greedy algorithm similar to
Wexler’s EM approach (Y. Wexler et al., 2007). In
our method, we pay attention to the fact that the en-
ergy function E for each pixel can be treated inde-
pendently if similar pattern pairs (x,ˆ x) calculated by
Eq. (5) can be fixed and the change of coefficient αxˆ x
in the iteration is very small. Thus, we repeat the fol-
lowing two processes until the energy converges: (I)
update of pairs of windows for fixed pixel values, and
(II) parallel update of all the pixel values in the miss-
ing region for fixed similar pairs of windows in the
missing region and the data region.
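The alternation of processes (I) and (II) can be sketched structurally as follows. This is only a skeleton under stated assumptions: `update_pairs`, `update_pixels` and `energy` are hypothetical callbacks standing in for the window-pair search of Eq. (5), the pixel update and the energy of Eq. (4), and the one-number toy problem at the bottom exists solely to make the loop runnable.

```python
def minimize_energy(state, update_pairs, update_pixels, energy,
                    tol=1e-12, max_iter=1000):
    """Alternate processes (I) and (II) until the energy stops changing."""
    previous = energy(state)
    for _ in range(max_iter):
        pairs = update_pairs(state)           # process (I): pixel values are fixed
        state = update_pixels(state, pairs)   # process (II): window pairs are fixed
        current = energy(state)
        if abs(previous - current) <= tol:    # convergence test on the energy
            break
        previous = current
    return state


# Toy stand-in (not an image): a single value that is pulled towards 3.0
result = minimize_energy(
    state=0.0,
    update_pairs=lambda s: 3.0,
    update_pixels=lambda s, target: 0.5 * (s + target),
    energy=lambda s: (s - 3.0) ** 2,
)
print(result)
```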
In process (I), we update all the similar pattern pairs of windows in the missing region and in the data region while fixing the pixel values calculated in process (II). Concretely, the update of the pair of windows can be performed by calculating $SSD'$ and $SD$ that satisfy Eq. (5) and determining the position $\hat{x}$.
In process (II), we update all the pixel values $I(x)$ in the missing regions in parallel by minimizing the energy defined by Eq. (4). In the following, we describe the method for calculating the pixel values $I(x)$ for fixed pairs of windows. First, the energy $E$ is resolved into the element energy $E(x)$ for each pixel in the missing region. As shown in Figure 3, the target pixel to be updated is $x$, and the center of a window that contains $x$ can be expressed as $x + p$ ($p \in W$), which corresponds to $f(x + p)$ by Eq. (5). Thus, the position of the pixel corresponding to the pixel $x$ is $f(x + p) - p$. Now, the element energy $E(x)$ can be defined in terms of the pixel values of $x$ and $f(x + p) - p$, the coefficient $\alpha$ and the Euclidean distance between $x$ and $f(x)$ as follows:

$$E(x) = \sum_{p \in W} w_{x+p} \{ I(x) - \alpha_{(x+p) f(x+p)} \, I(f(x + p) - p) \}^2 + w_x w_{dis} \, SD(x, f(x)). \qquad (10)$$

The relationship between the energy $E$ for all of the missing region and the element energy $E(x)$ for each pixel can be written as follows:

$$E = \sum_{x \in \Omega} E(x) + C. \qquad (11)$$

$C$ is the energy of pixels in the region $\Omega' \cap \overline{\Omega}$, and is treated as a constant because pixel intensities in the region and all pairs of windows are fixed here. By differentiating $E$ with respect to $I(x)$ in the missing region, the requirement for minimizing the energy $E$ can be obtained as follows:

$$\frac{\partial E}{\partial I(x_k)} = \sum_{x \in \Omega} \frac{\partial E(x)}{\partial I(x_k)} = 0. \qquad (12)$$

Here, if it is assumed that the change of the intensity modification coefficient $\alpha_{x\hat{x}}$ is smaller than that of the pixel intensity $I(x)$, we can obtain the following:

$$\frac{\partial \alpha_{x\hat{x}}}{\partial I(x_k)} = 0 \quad (\forall x \in \Omega', \ \forall \hat{x} \in \Phi). \qquad (13)$$

From this equation, the equation $\partial E / \partial I(x_k) = \partial E(x_k) / \partial I(x_k)$ is formed. Thus, we can minimize the energy $E$ by calculating $I(x_k)$, which satisfies the following:

$$I(x_k) = \frac{\sum_{p \in W} w_{x_k+p} \, \alpha_{(x_k+p) f(x_k+p)} \, I(f(x_k + p) - p)}{\sum_{p \in W} w_{x_k+p}}. \qquad (14)$$

By generalizing Eq. (14) to all the pixels in the missing region, each pixel value $I(x)$ in the missing region can be calculated as follows:

$$I(x) = \frac{\sum_{p \in W} w_{x+p} \, \alpha_{(x+p) f(x+p)} \, I(f(x + p) - p)}{\sum_{p \in W} w_{x+p}}. \qquad (15)$$

Eq. (15) gives an approximate solution when Eq. (13) is satisfied. We can obtain a good solution as the energy converges because the value of the intensity modification coefficient $\alpha_{x\hat{x}}$ converges as $I(x)$ converges.

Figure 3: Relationship between pixels in energy calculation.
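The pixel update of process (II) amounts to a weighted average, which the following Python sketch illustrates for a single pixel. It assumes that, for fixed window pairs and coefficients, each window covering the pixel contributes one triple (weight, modification coefficient, intensity of the corresponding data-region pixel); the names and the numbers are hypothetical, and the spatial-locality term is omitted because it does not depend on the pixel value.

```python
def element_energy(value, contributions):
    """Intensity-dependent part of the element energy for one pixel."""
    return sum(w * (value - alpha * v) ** 2 for w, alpha, v in contributions)


def update_pixel(contributions):
    """Weighted average of the brightness-corrected corresponding pixels."""
    numerator = sum(w * alpha * v for w, alpha, v in contributions)
    denominator = sum(w for w, _, _ in contributions)
    return numerator / denominator


# Hypothetical contributions from three windows that cover the target pixel:
# (weight of the window, modification coefficient, intensity of the corresponding pixel)
contributions = [(1.0, 1.0, 100.0), (0.5, 1.2, 50.0), (0.5, 1.0, 80.0)]
new_value = update_pixel(contributions)
print(new_value)
```

Because the intensity-dependent energy is quadratic in the pixel value, the weighted average is its minimizer; moving the value in either direction increases `element_energy`.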
Additionally, in order to avoid local minima efficiently, a coarse-to-fine approach is also employed. Specifically, an image pyramid is generated and the energy minimization processes (I) and (II) are repeated from higher-level to lower-level layers successively using a certain size of window. Good initial values are given to the lower layer by projecting results from the higher layer. This makes it possible to decrease computational cost and avoid local minima. In the lowest layer (original size), the energy minimization process is repeated while reducing the size of the window, which enables reproduction of more detailed textures.
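A coarse-to-fine scheme of this kind can be sketched as below. The text does not state how the pyramid is built or how results are projected between layers, so the 2x2 block averaging in `downsample` and the pixel replication in `upsample` are assumptions chosen for simplicity, and all function names are hypothetical.

```python
def downsample(image):
    """Halve the resolution by averaging 2x2 blocks (assumes even height and width)."""
    height, width = len(image), len(image[0])
    return [[(image[r][c] + image[r][c + 1]
              + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
             for c in range(0, width, 2)]
            for r in range(0, height, 2)]


def upsample(image):
    """Project a coarse result onto the next finer layer by pixel replication."""
    fine = []
    for row in image:
        wide = [value for value in row for _ in range(2)]
        fine.append(wide)
        fine.append(list(wide))
    return fine


def build_pyramid(image, levels):
    """pyramid[0] is the original image and pyramid[-1] the coarsest layer."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
pyramid = build_pyramid(image, levels=2)
initial_fine = upsample(pyramid[-1])   # initial values for the finer layer
print(pyramid[-1])
```

In a full pipeline the energy minimization would run on `pyramid[-1]` first, and its result (rather than the untouched coarse layer used here) would be projected to initialize the next layer.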
3 EXPERIMENTS
To demonstrate the effectiveness of our extensions of the objective function, we have applied five kinds of image inpainting methods, including the conventional methods and the proposed method, to one hundred images (200 × 200 pixels). First, the effectiveness of the proposed method is demonstrated by comparing the characteristic results of the proposed method and our implementation of Wexler's method (Y. Wexler et al., 2007). Next, the effectiveness of our method is objectively assessed through a subjective evaluation based on a questionnaire using our implementation of Wexler's method, Criminisi's method (A. Criminisi et al., 2004) and the proposed method.
In these experiments, we used a standard PC (CPU: Xeon 3.2 GHz, Memory: 8 GB) and each parameter in the energy function was set as shown in Table 1. Here, the missing region was manually specified, and the average pixel value of the boundary of the missing region was given as an initial value in the missing region.
3.1 Comparison of inpainted images
In this section, four images that have different char-
acteristics are selected from one hundred images as
shown in Figure 4(a). The missing regions for these
images are given in Figure 4(b). Figure 4(c) indicates
the resultant images by our implemented Wexler’s
method described in Section 2.1. Figure 4(d) shows
the images completed by the proposed method.
Image (I) includes little brightness change under
constant illumination and little pattern change in the
same object region around the missing region. It can
be confirmed that images generated by the conven-
tional and proposed methods are natural. The subjec-
tive difference is very small.
Image (II) includes little pattern change without complex textures but large brightness changes under nonconstant illumination around the missing region. With the conventional method, the resultant image looks unnatural because sudden intensity changes appear at the seat and the seat back. By allowing brightness changes of sample textures, the sudden intensity change is suppressed by the proposed method.

Table 1: Parameters in experiment (weight for distance; parameter in sigmoid function; range of coefficient α).
Image (III) includes little brightness change un-
der constant illumination but large pattern change due
to the various sizes and shapes of objects around the
missing regions. In this image, although the same
kinds of textures apparently exist around the miss-
ing regions, texture pattern greatly changes due to the
different sizes of stones. A part of the missing re-
gions is blurred in white by the conventional method
because the SSD-based similarity is sensitive to the
nents, and thus inappropriate textures are selected for
the missing regions. It should be noted that there ex-
ists spatial locality of texture pattern such as water
color and stones around the missing region in image
(III). By considering spatial locality of texture pat-
tern, neighboring textures are preferentially selected
and thus the missing region is completed successfully
by the proposed method.
Image (IV) includes large brightness changes under
nonconstant illumination conditions, and the texture
pattern continuously changes due to the perspective
projection effect. In the resultant image of the
conventional method, an unnatural image is generated due
to the blurs on the textured area with black squares
and the discontinuous brightness changes at the wall
and floor. On the other hand, in the proposed method,
by using the constraint of the spatial locality of tex-
tures, neighboring textures are selected for comple-
tion of the missing region and windows of the post
are reproduced in Figure 4(d). In addition, by allow-
ing brightness changes of sample textures, brightness
change inside the missing region becomes more natu-
ral than that by the conventional method.
Next, we have compared the conventional and
proposed methods with respect to computational cost.
Table 2 shows the processing time of the conven-
tional and proposed methods. The proposed method
requires about three to five times as much time as the
conventional method. This is because of the additional
computational cost for calculating the intensity modification
coefficients and the cost function considering spatial locality.

Table 2: Processing time.
Figure 4: Image inpainting for four representative images: (a) original image, (b) missing region specified manually, (c) resultant image by conventional method, and (d) resultant image by our method. Image (I): little change in brightness and pattern. Image (II): large change in brightness. Image (III): change in texture pattern. Image (IV): large change in brightness and continuous deformation of texture pattern.
3.2 Evaluation by a questionnaire
In this section, completed images for one hundred
images by five kinds of inpainting methods are
subjectively evaluated by 37 subjects. The subjects are men
and women in their twenties and all of them often
use computers. This experiment aims to illustrate the
effectiveness of the proposed method objectively by
evaluating the resultant images by subjects.
3.2.1 Evaluation method
The subjects were requested to access the web page
for questionnaire evaluation and evaluate the 500 re-
sultant images for 100 input images by giving a score
of 1 to 5. In this experiment, images were com-
pleted by five methods: our implemented Criminisi’s
method (method A) that is the representative of suc-
cessive synthesis methods, our implemented Wexler’s
method (method B), proposed method allowing only
brightness change (method C), proposed method
considering only spatial locality (method D), and proposed
method considering both brightness change and
spatial locality (method E). On the evaluation web
page, the resultant images generated by the five methods
were arranged in random order so that subjects could
not know the relationship between image and method.
The evaluation criteria were that the lowest score 1
was for an image that could not be used and the high-
est score 5 was for an image that was natural enough
to use for personal homepages or magazines.
3.2.2 Results and discussion
The average score for the 100 resultant images and the number of images that obtained the highest score are shown in Table 3 for each method.¹ Table 3 shows that the average score of the resultant images by the proposed method (method E) is higher than those by the conventional methods (methods A and B), and the inpainted images by the proposed method are scored as the best most frequently. In this experiment, the scores of the proposed method (method E) and the conventional methods (methods A and B) were also compared by using the t-test with a 5% significance level. As a result, a significant difference was observed between these scores. Therefore, the proposed method can be verified to be better than the conventional methods A and B. In addition, both the method considering only brightness change (method C) and the method considering only spatial locality (method D) obtained higher scores than methods A and B, and a significant difference was also observed by using the t-test with a 5% significance level. This means that the introduction of each factor to the energy function is clearly effective.

¹ 100 input images and resultant images are shown on the web page [http://yokoya.naist.jp/research/visapp/]

Figure 5: Example images for which both the proposed and conventional methods have a problem.
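The kind of significance check described above can be illustrated with a small Python sketch. The variant of the t-test used in the experiment is not stated, so a paired t-test on per-image scores is assumed here; the scores below are made-up toy numbers, not the questionnaire results, and the function name is hypothetical.

```python
import math


def paired_t_statistic(scores_a, scores_b):
    """t statistic of the paired differences between two score lists of equal length."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean = sum(diffs) / n
    variance = sum((d - mean) ** 2 for d in diffs) / (n - 1)   # sample variance
    return mean / math.sqrt(variance / n)


# Toy scores for five images (hypothetical, for illustration only)
scores_e = [4, 5, 3, 4, 4]   # e.g. method E
scores_b = [3, 3, 2, 4, 2]   # e.g. method B
t = paired_t_statistic(scores_e, scores_b)

T_CRITICAL = 2.776           # two-sided 5% critical value for 4 degrees of freedom
significant = abs(t) > T_CRITICAL
print(t, significant)
```

For these toy numbers the statistic exceeds the critical value, so the difference would be judged significant at the 5% level; with real data the degrees of freedom and critical value follow from the number of images compared.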
It should be noted here that there exist some images for which both the proposed (method E) and conventional (method B) methods have a problem, as shown in Figure 5. The average scores for both methods are less than 2 points. In this image, due to the fairly large change in texture patterns, the texture in the missing region blurs despite the consideration of spatial locality. Here, when the weight for spatial locality was increased ($w_{dis} = 2700$), a more natural image was generated as shown in Figure 5 (d). The completion with a single weight coefficient does not always work well, and thus it is necessary to determine the parameter adaptively considering the characteristics of the image in order to obtain good results for many images containing complex textures.

Table 3: Average score and the number of best-scoring images from 100 images.
4 CONCLUSION
In this paper, the objective function for image inpainting is extended to acquire natural images. To obtain good results, two factors were considered: (1) brightness change of sample textures was allowed, and (2) spatial locality was introduced as a new constraint. By considering these two factors, the missing region was completed successfully for many images. In experiments, we have demonstrated the effectiveness of our method by comparing the resultant images of the conventional and proposed methods. In addition, by a questionnaire evaluation using 37 subjects, we have verified that the proposed method could obtain good results.
In experiments, parameters such as the size of window and the weight in the energy function were decided empirically. In future work, we should establish a method to decide optimum parameters.
REFERENCES
A. Criminisi, P. Pérez, and K. Toyama (2004). Region Filling and Object Removal by Exemplar-Based Image Inpainting. In Trans. on Image Processing, volume 13, No. 9, pages 1200–1212.
A. Levin, A. Zomet, and Y. Weiss (2003). Learning How to Inpaint from Global Image Statistics. In Proc. ICCV, volume 1, pages 305–312.
A.A. Efros and T.K. Leung (1999). Texture Synthesis by Non-parametric Sampling. In Proc. ICCV, pages
A.N. Hirani and T. Totsuka (1996). Combining Frequency and Spatial Domain Information for Fast Interactive Image Noise Removal. In Proc. SIGGRAPH1996,
B. Li, Y. Qi, and X. Shen (2005). An Image Inpainting Method. In Proc. IEEE Int. Conf. on Computer Aided Design and Computer Graphics, pages 531–536.
C. Allène and N. Paragios (2006). Image Renaissance Using Discrete Optimization. In Proc. ICPR, pages 631–
C. Ballester, M. Bertalmio, V. Sapiro, and J. Verdera (2001a). Filling-In by Joint Interpolation of Vector Fields and Gray Levels. In Trans. on Image Processing, volume 10, No. 8, pages 1200–1211.
C. Ballester, V. Caselles, J. Verdera, M. Bertalmio, and G. Sapiro (2001b). A Variational Model for Filling-In Gray Level and Color Images. In Proc. ICCV, pages
D. Tschumperlé (2006). Curvature-Preserving Regularization of Multi-valued Images Using PDE's. In Proc. ECCV, volume 2, pages 295–307.
E. Villéger, G. Aubert, and L. Blanc-Féraud (2004). Image Disocclusion Using a Probabilistic Gradient Orientation. In Proc. ICPR, volume 2, pages 52–55.
I. Drori, D. Cohen-Or, and H. Yeshurun (2003). Fragment-Based Image Completion. In Proc. SIGGRAPH2003,
J. Jia and C. Tang (2003). Image Repairing: Robust Image Synthesis by Adaptive ND Tensor Voting. In Proc. CVPR, pages 643–650.
J. Sun, L. Yuan, J. Jia, and H. Shum (2005). Image Completion with Structure Propagation. In Proc. SIGGRAPH2005, pages 861–868.
M. Bertalmio, A.L. Bertozzi, and G. Sapiro (2001). Navier-Stokes, Fluid Dynamics, and Image and Video Inpainting. In Proc. CVPR, pages 355–362.
M. Bertalmio, G. Sapiro, V. Caselles, and C. Ballester (2000). Image Inpainting. In Proc. SIGGRAPH2000,
N. Komodakis and G. Tziritas (2006). Image Completion Using Global Optimization. In Proc. CVPR, pages
R. Bornard, E. Lecan, L. Laborelli, and J. Chenot (2002). Missing Data Correction in Still Images and Image Sequences. In Proc. ACM Int. Conf. on Multimedia,
S. Esedoglu and J. Shen (2003). Digital Inpainting Based on the Mumford-Shah-Euler Image Model. In European J. of Applied Mathematics, volume 13, pages
S. Masnou and J.M. Morel (1998). Level Lines Based Disocclusion. In Proc. ICIP, volume 3, pages 259–263.
S.D. Rane, J. Remus, and G. Sapiro (1996). Wavelet-Domain Reconstruction of Lost Blocks in Wireless Image Transmission and Packet-Switched. In Proc. ICIP, volume 1, pages 309–312.
T. Amano (2004). Image Interpolation by High Dimensional Projection Based on Subspace Method. In Proc. ICPR, volume 4, pages 665–668.
T. Chan and J. Shen (2001). Non-texture Inpainting by Curvature-Driven Diffusions (CDD). In J. of Visual Communication and Image Representation, volume 12, No. 4, pages 436–449.
T. Chan, S. Kang, J. Shen, and S. Osher (2002). Euler's Elastica and Curvature Based Inpaintings. In SIAM J. of Applied Mathematics, volume 63, No. 2, pages
Y. Wexler, E. Shechtman, and M. Irani (2007). Space-Time Completion of Video. In Trans. on Pattern Analysis and Machine Intelligence, volume 29, No. 3, pages