Robust Photometric Stereo via Low-Rank Matrix Completion and Recovery*

Lun Wu∗, Arvind Ganesh†, Boxin Shi‡, Yasuyuki Matsushita§, Yongtian Wang∗

and Yi Ma†,§

∗School of Optics and Electronics, Beijing Institute of Technology, Beijing

†Coordinated Science Lab, University of Illinois at Urbana-Champaign

§Visual Computing Group, Microsoft Research Asia, Beijing

‡Key Laboratory of Machine Perception, Peking University, Beijing

lun.wu@hotmail.com, abalasu2@illinois.edu, shiboxin@cis.pku.edu.cn,

yasumat@microsoft.com, wyt@bit.edu.cn, mayi@microsoft.com

Abstract. We present a new approach to robustly solve photometric

stereo problems. We cast the problem of recovering surface normals from

multiple lighting conditions as a problem of recovering a low-rank matrix

with both missing entries and corrupted entries, which model all types of

non-Lambertian effects such as shadows and specularities. Unlike previ-

ous approaches that use Least-Squares or heuristic robust techniques, our

method uses advanced convex optimization techniques that are guaranteed

to find the correct low-rank matrix by simultaneously fixing its missing

and erroneous entries. Extensive experimental results demonstrate that

our method achieves unprecedentedly accurate estimates of surface normals

in the presence of a significant amount of shadows and specularities.

The new technique can be used to improve virtually any photometric stereo

method including uncalibrated photometric stereo.

1 Introduction

Photometric stereo [1,2] estimates surface orientations from photographs taken

from a fixed viewpoint under different lighting conditions. Since photometric stereo

can produce a dense normal field at a level of detail that cannot be achieved

by triangulation-based approaches, it has generated a lot of interest for

accurate shape reconstruction.

It is well understood that when a Lambertian surface is illuminated by at least

three known lighting directions, the surface orientation at each visible point can be

uniquely determined from its intensities. From different perspectives, it has long

been shown that, if there are no shadows, the appearance of a convex Lambertian

scene illuminated from different lighting directions spans a three-dimensional

subspace [3] or an illumination cone [4]. Basri and Jacobs [5] and Georghiades et

al. [6] have further shown that the images of a convex-shaped object with cast

shadows can also be well-approximated by a low-dimensional linear subspace. The

aforementioned works indicate that there exists a degenerate structure in the ap-

pearance of Lambertian surfaces under variation in illumination. This is the key

*This work was supported by grants ONR N00014-09-1-0230, NSF CCF 09-64215, and

NSF ECCS 07-01676.

property that all photometric stereo methods harness to determine the surface

normals.

Previously, photometric stereo algorithms for Lambertian surfaces generally

find surface normals as the Least Squares solution to a set of linear equations

that relate the observations and known lighting directions, or equivalently, try

to identify the low-dimensional subspace using conventional Principal Component

Analysis (PCA) [7]. Such a solution is known to be optimal if the measurements

are corrupted by only i.i.d. Gaussian noise of small magnitude. Unfortunately,

in reality, photometric measurements rarely obey such a simplistic noisy linear

model: the intensity values at some pixels can be severely affected by specular

reflections (deviation from the basic Lambertian assumption), sensor saturations,

or shadowing effects. As a result, the Least Squares solution normally ends up

with incorrect estimates of surface orientations in practice.

To overcome this problem, researchers have explored various approaches to

eliminate such deviations by treating the corrupted measurements as outliers,

e.g., using a RANSAC scheme [8,9], or a median-based approach [10]. To identify

the different types of corruptions in images more carefully, Mukaigawa et al. [11]

have proposed a method for classifying diffuse, specular, attached, and cast shadow

pixels based on RANSAC and outlier elimination.

Contributions: In this paper, we propose a simple but principled solution to pho-

tometric stereo that can deal with any kind of deviation from the basic Lambertian

assumption in a unified framework. We cast the photometric stereo problem as a

problem of recovering and completing a low-rank matrix subject to sparse, gross

errors like corrupted and missing pixels. Unlike previous heuristic methods, un-

der fairly broad conditions, the new method is guaranteed to correctly recover

the low-rank Lambertian diffuse component from the highly corrupted and in-

complete observations. Based on advanced convex optimization tools for nuclear

norm and $\ell_1$-norm minimization, the new method can efficiently obtain highly

accurate estimates of surface orientations. Our method can be used to improve

virtually any existing photometric stereo method, including uncalibrated photometric

stereo [12], where corruption in the data (e.g., by specularity) has traditionally

been either neglected or handled ineffectively by conventional heuristic robust

estimation methods.

In contrast to previous robust approaches, our method is computationally more

efficient and provides theoretical guarantees for robustness to large errors. More

importantly, our method is able to use all the available information simultaneously

for obtaining the optimal result, instead of discarding informative measurements,

e.g., by either selecting the best set of illumination directions [9] or using the

median estimator [10].

2 Photometric Stereo as Low-Rank Matrix Recovery with

Sparse Errors

In this section, we formulate the problem of estimating the normal map as a

rank minimization problem. We first review the basic Lambertian image forma-

tion model, and then discuss how to model large deviations like shadows and

specularities. In the following discussion, we make a few assumptions:

– The relative position of the camera and object is fixed across all images.

– The object is illuminated by a point light source at infinity.

– The sensor response is linear.

Lambertian Image Formation Model. The appearance I of a Lambertian scene

observed under a lighting direction $\mathbf{l} \in \mathbb{R}^3$ is described as the inner product:

$$I = \rho\, \mathbf{n} \cdot \mathbf{l}, \qquad (1)$$

where $\rho$ is the diffuse albedo, and $\mathbf{n} \in \mathbb{R}^3$ is the surface normal. Suppose that we

are given n images $I_1, \ldots, I_n$ of a scene under different lighting conditions. Let

the region of interest be composed of m pixels in each image.¹ We order the pixel

locations with a single index k, and let $I_j(k)$ denote the observed intensity at pixel

location k in image $I_j$. With this notation, we have the following relation about

the observation $I_j(k)$:

$$I_j(k) = \rho_k\, \mathbf{n}_k \cdot \mathbf{l}_j, \qquad (2)$$

where $\rho_k$ is the albedo of the scene at pixel location k, $\mathbf{n}_k \in \mathbb{R}^3$ is the (unit)

surface normal of the scene at pixel location k, and $\mathbf{l}_j \in \mathbb{R}^3$ represents the normalized

lighting direction vector corresponding to image $I_j$.² We assume that the

light intensity is constant across images to simplify the discussion, although the

proposed method is not limited to such a condition.

Low-rank Matrix Structure. Consider the matrix $D \in \mathbb{R}^{m \times n}$ constructed by stacking

all the vectorized images vec(I) as

$$D = [\,\mathrm{vec}(I_1) \mid \cdots \mid \mathrm{vec}(I_n)\,], \qquad (3)$$

where $\mathrm{vec}(I_j) = [I_j(1), \ldots, I_j(m)]^T$ for $j = 1, \ldots, n$. It follows from Eq. (2) that

D can be factorized as follows:

$$D = NL, \qquad (4)$$

where $N \doteq [\rho_1 \mathbf{n}_1 \mid \cdots \mid \rho_m \mathbf{n}_m]^T \in \mathbb{R}^{m \times 3}$, and $L \doteq [\mathbf{l}_1 \mid \cdots \mid \mathbf{l}_n] \in \mathbb{R}^{3 \times n}$. Suppose

that the number of images $n \geq 3$. Clearly, irrespective of the number of pixels m

and the number of images n, the rank of the matrix D is at most 3.
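The rank bound implied by the factorization D = NL is easy to check numerically. The following is a minimal NumPy sketch, assuming synthetic random normals, albedos, and lighting directions; all sizes, values, and variable names are illustrative and are not taken from the experiments in this paper. It builds D without any shadows or specularities and confirms that only three singular values are non-negligible.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 500, 12  # hypothetical sizes: m pixels, n images

# Unit surface normals facing the camera (z > 0) and per-pixel albedos.
normals = rng.normal(size=(m, 3))
normals[:, 2] = np.abs(normals[:, 2])
normals /= np.linalg.norm(normals, axis=1, keepdims=True)
albedo = rng.uniform(0.2, 1.0, size=(m, 1))

# Unit lighting directions, one column per image.
L = rng.normal(size=(3, n))
L /= np.linalg.norm(L, axis=0, keepdims=True)

N = albedo * normals   # m x 3, row k is rho_k * n_k
D = N @ L              # m x n, ideal Lambertian model (no clamping at zero)

rank_D = np.linalg.matrix_rank(D)
singular_values = np.linalg.svd(D, compute_uv=False)
print("rank of D:", rank_D)
print("leading singular values:", singular_values[:5])
```

Note that this idealized D contains negative entries wherever a pixel faces away from the light; real images clamp those to zero, which is exactly the deviation the error model is meant to absorb.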

Modeling Corruptions as Sparse Errors. The low-rank structure of the observation

matrix D described above is seldom observed with real images. This is due to the

presence of shadows and specularities in real images.

– Shadows arise in real images in two possible ways. Some pixels are not visible

in the image because they face away from the light source. Such dark pixels are

referred to as attached shadows [13]. In deriving Eq. (4) from Eq. (2), we have

¹ Typically, m is much larger than the number of images n.

² The convention here is that the lighting direction vectors point from the surface of the

object to the light source.

implicitly assumed that all pixels of the object are illuminated by the light

source in each image. However, if the pixel faces away from the light source,

then the relation no longer holds. Mathematically, this implies that Eq. (2)

must be rewritten as follows:

$$I_j(k) = \max\{\rho_k\, \mathbf{n}_k \cdot \mathbf{l}_j,\; 0\}. \qquad (5)$$

Shadows can also occur in images when the shape of the object’s surface is not

convex: parts of the surface can be occluded from the light source by other

parts. Even though the normal vectors at such occluded pixels may form an

acute angle with the lighting direction, these pixels appear entirely dark. We

refer to such dark pixels as cast shadows. Irrespective of the type, all shadows

occur in images as dark pixels with very small, if not zero, intensity values.

– Specularities. Specular reflection arises when the object of interest is not

perfectly diffusive, i.e., when the surface luminance is not purely isotropic.

Thus, the intensity of reflected light depends on the viewing angle, and light is

reflected in a mirror-like fashion accompanied by a specular lobe when viewed

from certain angles. This gives rise to some bright spots or shiny patches

on the surface of the object that significantly deviate from the Lambertian

assumption.

Suppose we represent all these deviations from the ideal low-rank diffusive

model Eq. (4) by an error matrix $E \in \mathbb{R}^{m \times n}$. Thus, instead of Eq. (4), the image

measurements should be modeled as

$$D = NL + E, \qquad (6)$$

where the matrix E accounts for corruption by shadows or specularities. Now

suppose that only a small fraction of the pixels in each image exhibit strong

specular reflectance and that a large majority of the pixels are illuminated by the

light source. Then, most pixels in the input images obey the low-rank diffusive

model given by Eq. (4), and hence, most entries in the error matrix E will be zero,

i.e., E is a sparse matrix. If the matrix L of lighting directions is known, then we

can compute the surface normals, provided that we can decompose D as the sum

of a low-rank matrix and a sparse error matrix. Thus, the problem can be stated

more formally as follows:

Let $I_1, \ldots, I_n$ be n images of an object under different illumination conditions.

If $D \in \mathbb{R}^{m \times n}$ is defined as given in Eq. (3), then find a sparse

matrix E such that the matrix $A \doteq D - E$ has the lowest possible rank.
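As a numerical illustration of this decomposition, the sketch below applies the clamping of Eq. (5) to a synthetic rank-3 Lambertian matrix and brightens a few entries as a crude stand-in for specular highlights. The scene, the 2% highlight rate, the highlight magnitude, and the clustering of the lights around the viewing direction are all assumptions chosen for illustration; cast shadows are not simulated. The error matrix is non-zero on a minority of entries only, yet it is enough to push the rank of the observations above 3.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 400, 10  # hypothetical sizes: m pixels, n images

# Albedo-scaled unit normals facing the camera (z > 0).
normals = rng.normal(size=(m, 3))
normals[:, 2] = np.abs(normals[:, 2])
normals /= np.linalg.norm(normals, axis=1, keepdims=True)
N = rng.uniform(0.2, 1.0, size=(m, 1)) * normals

# Lights clustered around the viewing direction so that most pixels are lit.
L = np.vstack([0.4 * rng.normal(size=(2, n)), np.ones((1, n))])
L /= np.linalg.norm(L, axis=0, keepdims=True)

A = N @ L                 # ideal rank-3 Lambertian term
D = np.maximum(A, 0.0)    # attached shadows: negative values clamp to zero

# Crude highlight model (an assumption): brighten 2% of the lit entries.
highlights = (rng.random((m, n)) < 0.02) & (A > 0)
D[highlights] += 1.0

E = D - A                 # error matrix, so that D = A + E
frac_nonzero = np.count_nonzero(E) / E.size
rank_A = np.linalg.matrix_rank(A)
rank_D = np.linalg.matrix_rank(D)
print("fraction of corrupted entries:", frac_nonzero)
print("rank of A:", rank_A, " rank of D:", rank_D)
```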

Using a Lagrangian formulation, we can write the above problem as the fol-

lowing optimization problem:

$$\min_{A,E}\ \operatorname{rank}(A) + \gamma \|E\|_0 \quad \text{s.t.} \quad D = A + E, \qquad (7)$$

where $\|\cdot\|_0$ denotes the $\ell_0$-norm (number of non-zero entries in the matrix), and

γ > 0 is a parameter that trades off the rank of the solution A versus the sparsity of

the error E. Let $(\hat{A}, \hat{E})$ be the optimal solution to Eq. (7). If the lighting directions

L are given, we can easily recover the matrix N of surface normals from $\hat{A}$ as:

$$N = \hat{A} L^{\dagger}, \qquad (8)$$

where $L^{\dagger}$ denotes the Moore-Penrose pseudo-inverse of L. The surface normals

$\mathbf{n}_1, \ldots, \mathbf{n}_m$ can be estimated by normalizing each row of N to have unit norm.
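As an illustration of this recovery step, the following sketch computes unit normals and albedos from a low-rank matrix and known lighting directions. The helper name `normals_from_low_rank`, the small `eps` guard against division by zero, and the synthetic test data are hypothetical; the input is assumed to be an already recovered, corruption-free Lambertian matrix.

```python
import numpy as np

def normals_from_low_rank(A_hat, L, eps=1e-12):
    """Recover unit normals and albedos from A_hat (m x n) and L (3 x n)."""
    N = A_hat @ np.linalg.pinv(L)        # m x 3, row k is rho_k * n_k
    albedo = np.linalg.norm(N, axis=1)   # row norm gives rho_k
    unit_normals = N / np.maximum(albedo, eps)[:, None]
    return unit_normals, albedo

# Synthetic check: build A_hat = N L from known normals and albedos.
rng = np.random.default_rng(2)
m, n = 300, 8
true_normals = rng.normal(size=(m, 3))
true_normals /= np.linalg.norm(true_normals, axis=1, keepdims=True)
true_albedo = rng.uniform(0.2, 1.0, size=m)
L = rng.normal(size=(3, n))
L /= np.linalg.norm(L, axis=0, keepdims=True)
A_hat = (true_albedo[:, None] * true_normals) @ L

est_normals, est_albedo = normals_from_low_rank(A_hat, L)
max_normal_error = np.abs(est_normals - true_normals).max()
print("largest normal component error:", max_normal_error)
```

The row norm of N doubles as an albedo estimate because each row is the unit normal scaled by the albedo at that pixel.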

While Eq. (7) follows from our formulation, it is not tractable since both rank

and $\ell_0$-norm are non-convex and discontinuous functions. Solving this optimization

problem efficiently will be the topic of discussion in the next section.

3 Efficient Solution via Convex Programming

As discussed above, the optimization problem given in Eq. (7) is extremely difficult

(NP-hard in general) to solve. In this section, we propose to solve it efficiently

based on recent advances in algorithms for matrix rank minimization [14–16].
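To give a concrete feel for this family of algorithms, the sketch below separates a matrix into a low-rank part and a sparse part by minimizing the nuclear norm (sum of singular values) of the first plus a weighted $\ell_1$-norm of the second, using an inexact augmented Lagrangian iteration. It is offered as an illustration only and is not claimed to be the solver used in this paper. The function names, the penalty schedule (`mu`, `rho`), the stopping rule, and the synthetic test matrix are assumptions; the default weight 1/sqrt(max(m, n)) is a common choice for this kind of problem.

```python
import numpy as np

def soft_threshold(X, tau):
    """Entrywise shrinkage: the proximal operator of tau * l1-norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def rpca_ialm(D, lam=None, tol=1e-7, max_iter=500):
    """Split D into low-rank A and sparse E with D ~= A + E (illustrative)."""
    m, n = D.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))
    norm_D = np.linalg.norm(D, "fro")
    spectral = np.linalg.norm(D, 2)
    Y = D / max(spectral, np.abs(D).max() / lam)  # dual variable
    mu, rho = 1.25 / spectral, 1.5                # assumed penalty schedule
    mu_max = mu * 1e7
    A = np.zeros_like(D)
    E = np.zeros_like(D)
    for _ in range(max_iter):
        # A-step: singular value thresholding of D - E + Y / mu.
        U, s, Vt = np.linalg.svd(D - E + Y / mu, full_matrices=False)
        A = (U * soft_threshold(s, 1.0 / mu)) @ Vt
        # E-step: entrywise shrinkage of the remaining residual.
        E = soft_threshold(D - A + Y / mu, lam / mu)
        R = D - A - E
        Y = Y + mu * R
        mu = min(mu * rho, mu_max)
        if np.linalg.norm(R, "fro") <= tol * norm_D:
            break
    return A, E

# Synthetic usage: rank-3 matrix with 5% of its entries grossly corrupted.
rng = np.random.default_rng(3)
m, n, r = 200, 60, 3
A_true = rng.normal(size=(m, r)) @ rng.normal(size=(r, n))
E_true = np.zeros((m, n))
mask = rng.random((m, n)) < 0.05
E_true[mask] = rng.uniform(-10.0, 10.0, size=mask.sum())

A_hat, E_hat = rpca_ialm(A_true + E_true)
rel_err = np.linalg.norm(A_hat - A_true) / np.linalg.norm(A_true)
print("relative error of the low-rank estimate:", rel_err)
```

Each iteration costs one SVD of an m x n matrix, so with m much larger than n the work per iteration is dominated by the number of images rather than the number of pixels.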

3.1 Convex relaxation and modification

Recently, Wright et al. [14] and Chandrasekaran et al. [15] have proposed that the

problem in Eq. (7) can be solved by replacing the cost function with its convex

surrogate, provided that the rank of the matrix A is not too high and the number of

non-zero entries in the matrix E is not too large. This convex relaxation, dubbed

Principal Component Pursuit (PCP) in [14], replaces rank(·) with the nuclear

norm (sum of the singular values of the matrix) and the $\ell_0$-norm with the matrix

$\ell_1$-norm (sum of the absolute values of all entries of the matrix). Under quite

general conditions, it has been proved in [14,15] that the following optimization

problem has the same optimal solution as Eq. (7):

$$\min_{A,E}\ \|A\|_* + \lambda \|E\|_1 \quad \text{s.t.} \quad D = A + E, \qquad (9)$$

where $\|\cdot\|_*$ and $\|\cdot\|_1$ represent the nuclear norm and $\ell_1$-norm, respectively, and

$\lambda > 0$ is a weighting parameter. Theoretical considerations in [14] suggest that

$\lambda$ must be of the form $C/\sqrt{\max\{m,n\}}$, where C is a constant, typically set to

unity. It is interesting to note that the equivalence between Eq. (7) and Eq. (9) is

not affected by the magnitude of the singular values of the solution A or by the

magnitude of the non-zero entries of the error matrix E.

In the framework of PCP, the locations of the non-zero entries in the sparse

matrix E are assumed to be unknown a priori. But if the locations of some of

the corrupted entries are known, then we can incorporate that information into

the recovery procedure and hence make the problem somewhat easier to solve.

This is similar in spirit to the matrix completion problem [17–19]. Notice that

although both shadows and specularities corrupt the low-rank matrix, they have

different characteristics. While the locations of the specular pixels are hard to

detect, especially those of pixels in specular lobes, it is relatively easy to detect the

locations of shadows in an image (e.g., by a simple thresholding of the pixel values).

Thus, we have more information about the shadows than about specularities, and such