Reflection Detection in Image Sequences
Mohamed Abdelaziz Ahmed, François Pitié, Anil Kokaram
Sigmedia, Electronic and Electrical Engineering Department, Trinity College Dublin {www.sigmedia.tv/People}
Abstract
Reflections in image sequences consist of several layers superimposed over each other. This phenomenon causes many image processing techniques, e.g. motion estimation and object recognition, to fail, as they assume the presence of only one layer at each examined site. This work presents an automated technique for detecting reflections in image sequences by analyzing the motion trajectories of feature points. It models reflection as regions containing two different layers moving over each other. We present a strong detector based on combining a set of weak detectors. We use novel priors, generate sparse and dense detection maps, and our results show a high detection rate while rejecting pathological motion and occlusion.
1. Introduction
Reflections are often the result of superimposing different layers over each other (see Figs. 1, 2, 4, 5). They mainly occur when photographing objects situated behind a semi-reflective medium (e.g. a glass window). As a result, the captured image is a mixture of the reflecting surface (background layer) and the reflected image (foreground). When viewed from a moving camera, two different layers moving over each other in different directions are observed. This phenomenon violates many of the existing models for video sequences and hence causes many consumer video applications to fail, e.g. slow-motion effects, motion-based sports summarization and so on. This calls for an automated technique that detects reflections and assigns them a different treatment.
Detecting reflections requires analyzing data for specific reflection characteristics. However, as reflections can arise from mixing any two images, they come in many shapes and colors (Figs. 1, 2, 4, 5). This makes it difficult to extract characteristics specific to reflections. Furthermore, one should be careful when using motion information in regions of reflection, as there is a high probability of motion estimation failure. For these reasons the problem of reflection detection is hard, and it has not been examined before.
Reflection can be detected by examining the possibility of decomposing an image into two different layers. A large body of work exists on separating mixtures of semi-transparent layers [17, 11, 12, 7, 4, 1, 13, 3, 2]. Nevertheless, most still-image techniques [11, 4, 1, 3, 2] require two mixtures of the same layers under two different mixing conditions, while video techniques [17, 12, 13] assume a simple rigid motion for the background [17, 13] or a repetitive one [12]. These assumptions are hardly valid for reflections in moving image sequences.
This paper presents an automated technique for detecting reflections in image sequences. It is based on analyzing the spatio-temporal profiles of feature point trajectories. This work focuses on analyzing three main features of reflections: 1) the ability to decompose an image into two independent layers; 2) image sharpness; 3) the temporal behavior of image patches. Several weak detectors, based on analyzing these features through different measures, are proposed. A final strong detector is generated by combining the weak detectors. The problem is formulated within a Bayesian framework, and priors are defined so as to reject false alarms. Several sequences are processed, and the results show a high detection rate while rejecting complicated motion patterns, e.g. blur, occlusion, and fast motion.
Aspects of novelty in this paper include: 1) A technique for decomposing a color still image containing reflection into two images containing the structures of the source layers. We do not claim that this technique can fully remove reflections from videos; what we claim is that the extracted layers are useful for reflection detection since, on a block basis, reflection is reduced. This technique cannot compete with state-of-the-art separation techniques. However, we use it because it works on single frames and thus does not require motion, unlike all existing separation techniques. 2) Diagnostic tools for reflection detection based on analyzing feature point trajectories. 3) A scheme for combining weak detectors into one strong reflection detector using Adaboost. 4) Priors which reject spatially and temporally impulsive detections. 5) The generation of dense detection maps from sparse detections, using hysteresis thresholding to avoid selecting particular thresholds for the system parameters. 6) Using the generated maps to perform better frame rate conversion in regions of reflection. Frame rate conversion is a computer vision application that is widely used in the post-production industry.

Figure 1. Examples of different reflections (shown in green). Reflection is the result of superimposing different layers over each other. As a result, reflections have a wide range of colors and shapes.

In the next section we present a review of the relevant techniques for layer separation. In Section 3 we propose our layer separation technique. We then propose our Bayesian framework, followed by the results section.
2. Review of Layer Separation Techniques
A mixed image M is modeled as a linear combination of the source layers L1 and L2 according to the mixing parameters (a, b) as follows:

M = a L1 + b L2    (1)

Layer separation techniques attempt to decompose the reflection M into two independent layers. They do so by exchanging information between the source layers (L1 and L2) until their mutual independence is maximized. This, however, requires the presence of two mixtures of the same layers under two different mixing proportions [11, 4, 1, 3, 2]. Different separation techniques use different forms of expressing the mutual layer independence. Current forms include minimizing the number of corners in the separated layers [7] and minimizing the grayscale correlation between the layers [11].
Other techniques [17, 12, 13] avoid the requirement of having two mixtures of the same layers by using temporal information. However, they typically require a static background throughout the whole image sequence [17], constrain both layers to have non-varying content through time [13], or require the presence of repetitive dynamic motion in one of the layers [12]. Weiss [17] developed a technique which estimates the intrinsic image (static background) of an image sequence. Gradients of the intrinsic layer are calculated by temporally filtering the gradient field of the sequence. Filtering is performed in the horizontal and vertical directions, and the generated gradients are used to reconstruct the background image.
3. Layer Separation Using Color Independence
The source layers of a reflection M are usually color independent. We noticed that the red and blue channels of M are the two most uncorrelated RGB channels. Each of these channels is usually dominated by one layer. Hence the source layers (L1, L2) can be estimated by exchanging information between the red and blue channels until the mutual independence between both channels is maximized. Information exchange for layer separation was first introduced by Sarel and Irani [12], and it is reformulated for our problem as follows:

L1 = MR − αMB
L2 = MB − βMR    (2)

Here (MR, MB) are the red and blue channels of the mixture M, while (α, β) are separation parameters to be calculated. An exhaustive search for (α, β) is performed. Motivated by the work of Levin et al. on layer separation [7], the best separated layer is selected as the one with the lowest cornerness value; the Harris cornerness operator is used here. A minimum texture is imposed on the separated layers by discarding layers with a variance less than Tx. For an 8-bit image, Tx is set to 2. Removing this constraint can generate empty, meaningless layers. The novelty of this layer separation technique is that, unlike previous techniques [11, 4, 1, 3, 2], it only requires one image.
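To make the search concrete, here is a minimal Python sketch of this separation, assuming a summed positive Harris response as the cornerness value and a uniform grid for the exhaustive search over (α, β); the paper does not specify the search grid or the exact selection rule, so these are illustrative choices:

```python
import numpy as np
from skimage.feature import corner_harris

def cornerness(L):
    # Total positive Harris response: fewer residual corners from the
    # other layer indicates a cleaner separation (cf. Levin et al. [7]).
    return np.clip(corner_harris(L), 0, None).sum()

def separate_layers(M, alphas=np.linspace(0.0, 1.0, 21), tx=2.0):
    """Single-image separation sketch (Sec. 3) for a float RGB patch M.

    Exhaustively searches the mixing parameters and keeps, per layer,
    the candidate with the lowest cornerness, discarding candidates
    whose variance falls below the texture threshold Tx (= 2).
    """
    MR, MB = M[..., 0].astype(float), M[..., 2].astype(float)

    def best(channel, other):
        cands = [channel - a * other for a in alphas]
        cands = [L for L in cands if L.var() >= tx]  # minimum texture
        return min(cands, key=cornerness) if cands else channel

    return best(MR, MB), best(MB, MR)  # (L1, L2)
```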
Fig. 2 shows separation results generated by the proposed technique for different images. The results show that our technique reduces reflections and shadows. They are displayed only to illustrate a preprocessing step used for one of our reflection measures, not to illustrate full reflection removal. Blocky artifacts are due to processing images in 50 × 50 blocks; these artifacts are irrelevant to reflection detection.
Figure 2. Reducing reflections/shadows using the proposed layer separation technique. Color images are the original images with reflections/shadows (shown in green); the uncolored images show one source layer (calculated by our technique) with reflections/shadows reduced. In (e) the reflection remains apparent; however, the person in the car is fully removed.

4. Bayesian Inference for Reflection Detection (BIRD)

The goal of the algorithm is to find regions in image sequences containing reflections. This is achieved by analyzing trajectories of feature points. Trajectories are generated using the KLT feature point tracker [9, 14]. Denote P^i_n as the feature point of the i-th track in frame n, and F^i_n as the 50 × 50 image patch centered on P^i_n. Trajectories are analyzed by examining all feature points along tracks of length greater than 4 frames. For each point, the analysis is carried out over the three image patches (F^i_{n−1}, F^i_n, F^i_{n+1}). Based on the analysis outcome, a binary label l^i_n is assigned to each F^i_n; l^i_n is set to 1 for reflection and 0 otherwise.
4.1. Bayesian Framework
The system derives an estimate for l^i_n from the posterior P(l|F) (where (i, n) are dropped for clarity). The posterior is factorized in a Bayesian fashion as follows:

P(l|F) = P(F|l) P(l|l_N)    (3)

The likelihood term P(F|l) consists of 9 detectors D1-D9, each performing a different analysis on F and operating at thresholds T1-T9 (see Sec. 4.5.1). The prior P(l|l_N) enforces various smoothness constraints in space and time to reject spatially and temporally impulsive detections and to generate dense detection masks. Here N denotes the spatio-temporal neighborhood of the examined site.
4.2. Layer Separation Likelihood
This likelihood measures the ability to decompose an image patch F^i_n into two independent layers. Three detectors are proposed. Two of them attempt to perform layer separation before analyzing the data, while the third measures the possibility of layer separation via the independence of the color channels.
Layer Separation via Color Independence D1: Our technique (presented in Sec. 3) is used to decompose the image patch F^i_n into two layers L1^i_n and L2^i_n. This is applied for every point along every track. Reflection is detected by comparing the temporal behavior of the observed image patches F with the temporal behavior of the extracted layers. Patches containing reflection are defined as ones with higher temporal discontinuity before separation than after separation. Temporal discontinuity is measured using the structural similarity index SSIM [16] as follows:

D1^i_n = max(SS(G^i_n, G^i_{n−1}), SS(G^i_n, G^i_{n+1})) − max(SS(L^i_n, L^i_{n−1}), SS(L^i_n, L^i_{n+1}))

SS(L^i_n, L^i_{n−1}) = max(SS(L1^i_n, L1^i_{n−1}), SS(L2^i_n, L2^i_{n−1}))
SS(L^i_n, L^i_{n+1}) = max(SS(L1^i_n, L1^i_{n+1}), SS(L2^i_n, L2^i_{n+1}))

Here G = 0.1 F_R + 0.7 F_G + 0.2 F_B, where (F_R, F_G, F_B) are the red, green and blue components of F respectively. SS(G^i_n, G^i_{n−1}) denotes the structural similarity between the two images G^i_n and G^i_{n−1}. We only compare the structures of (G^i_n, G^i_{n−1}) by turning off the luminance component of SSIM [16]. SS(·, ·) returns a value between 0 and 1, where 1 denotes identical similarity. Reflection is detected if D1^i_n is less than T1.
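A sketch of D1 under stated assumptions: skimage's full SSIM stands in for the structure-only SSIM term (the paper turns off the luminance component, which skimage does not expose directly), and separate_layers is any single-image separation routine such as the sketch in Sec. 3:

```python
import numpy as np
from skimage.metrics import structural_similarity as ssim

def gray(F):
    # Weighted grayscale used by the paper: G = 0.1 R + 0.7 G + 0.2 B.
    return 0.1 * F[..., 0] + 0.7 * F[..., 1] + 0.2 * F[..., 2]

def d1_score(F_prev, F_cur, F_next, separate_layers, rng=255.0):
    """D1: temporal discontinuity before separation minus after.

    Reflection is flagged when this score falls below T1, i.e. the
    patch is temporally more stable after layer separation.
    """
    G = [gray(F) for F in (F_prev, F_cur, F_next)]
    before = max(ssim(G[1], G[0], data_range=rng),
                 ssim(G[1], G[2], data_range=rng))
    L = [separate_layers(F) for F in (F_prev, F_cur, F_next)]  # (L1, L2)
    # max over both layers (j) and both neighboring frames (k)
    after = max(ssim(L[1][j], L[k][j], data_range=rng)
                for j in (0, 1) for k in (0, 2))
    return before - after
```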
Intrinsic Layer Extraction D2: Let INTR^i denote the intrinsic (reflectance) image extracted by processing the 50 × 50 patches of the i-th track using Weiss's technique [17]. In case of reflection, the structural similarity between the observed mixture F^i_n and INTR^i should be low. Therefore, F^i_n is flagged as containing reflection if SS(F^i_n, INTR^i) is less than T2.
Color Channels Independence D3: This approach measures the Generalized Normalized Grayscale Correlation (GNGC) [11] between the red and blue channels of the examined patch F^i_n to infer whether the patch is a mixture of two different layers or not. GNGC takes values between 0 and 1, where 1 denotes a perfect match between the red and blue channels (MR and MB respectively). This analysis is applied to every image patch F^i_n, and reflection is detected if GNGC(MR, MB) < T3.
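GNGC [11] generalizes normalized correlation by aggregating local window statistics, so that a single layer seen under spatially varying gains still scores highly. The sketch below is a hedged stand-in that aggregates squared local covariances normalized by local variances; see [11] for the exact definition:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def gngc(A, B, win=5, eps=1e-6):
    """Hedged stand-in for GNGC [11] between two channels A and B.

    Values near 1 mean the channels agree up to a local linear factor
    (one layer); low values suggest two independent layers, so
    reflection is flagged when the score is below T3.
    """
    A, B = A.astype(float), B.astype(float)
    mA, mB = uniform_filter(A, win), uniform_filter(B, win)
    cov = uniform_filter(A * B, win) - mA * mB
    vA = uniform_filter(A * A, win) - mA * mA
    vB = uniform_filter(B * B, win) - mB * mB
    return float((cov ** 2).sum() / ((vA * vB).sum() + eps))
```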
4.3. Image Sharpness Likelihood: D4, D5

Two approaches to analyzing image sharpness are used. The first, D4, estimates the first-order derivatives of the examined patch F^i_n and flags it as containing reflection if the mean gradient magnitude within the patch is smaller than a threshold T4. The second approach, D5, uses the sharpness metric of Ferzli and Karam [5] and flags a patch as reflection if its sharpness value is less than T5.
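D4 reduces to a few lines; a sketch on a grayscale patch:

```python
import numpy as np

def d4_score(G):
    # D4: mean gradient magnitude of a grayscale patch; values below
    # T4 indicate the soft, low-contrast appearance typical of mixtures.
    gy, gx = np.gradient(G.astype(float))
    return np.hypot(gx, gy).mean()
```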
4.4. Temporal Discontinuity Likelihood
SIFT Temporal Profile D6: This detector flags the examined patch F^i_n as reflection if its SIFT features [8] are undergoing a high temporal mismatch. A vector p = [x s g] is assigned to every interest point in F^i_n. The vector contains the position of the point x = (x, y), the scale and dominant orientation from the SIFT descriptor, s = (δ, o), and the 128-point SIFT descriptor g. Interest points are matched with neighboring frames using [8]. F^i_n is flagged as reflection if the average distance between the matched vectors p is larger than T6.
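A sketch of D6 assuming OpenCV's SIFT implementation and Lowe's ratio test for matching; the naive concatenation of position, scale, orientation and descriptor into one equally weighted vector is an assumption, as the paper does not specify how the components of p are weighted:

```python
import cv2
import numpy as np

def d6_score(G_cur, G_next):
    """D6 sketch: average mismatch of matched SIFT features between
    neighboring patches (uint8 grayscale). Each feature is the vector
    p = [x, y, scale, orientation, 128-d descriptor]."""
    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(G_cur, None)
    k2, d2 = sift.detectAndCompute(G_next, None)
    if d1 is None or d2 is None or len(d2) < 2:
        return np.inf                      # too few features to assess
    dists = []
    for m, n in cv2.BFMatcher().knnMatch(d1, d2, k=2):
        if m.distance < 0.8 * n.distance:  # Lowe's ratio test [8]
            p1, p2 = k1[m.queryIdx], k2[m.trainIdx]
            v1 = np.r_[p1.pt, p1.size, p1.angle, d1[m.queryIdx]]
            v2 = np.r_[p2.pt, p2.size, p2.angle, d2[m.trainIdx]]
            dists.append(np.linalg.norm(v1 - v2))
    return float(np.mean(dists)) if dists else np.inf  # reflection if > T6
```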
Color Temporal Profile D7: This detector flags the image patch F^i_n as reflection if its grayscale profile does not change smoothly through time. The temporal change in color is defined as follows:

D7^i_n = min(|C^i_n − C^i_{n−1}|, |C^i_n − C^i_{n+1}|)    (4)

Here C^i_n is the mean value of G^i_n, the grayscale representation of F^i_n. F^i_n is flagged as reflection if D7^i_n > T7.
AutoCorrelation Temporal Profile D8: This detector flags the image patch F^i_n as reflection if its autocorrelation is undergoing a large temporal change. The temporal change in the autocorrelation is defined as follows:

D8^i_n = sqrt( min( (1/N) ||A^i_n − A^i_{n−1}||^2, (1/N) ||A^i_n − A^i_{n+1}||^2 ) )    (5)

A^i_n is a vector containing the autocorrelation of G^i_n, while N is the number of pels in A^i_n. F^i_n is flagged as reflection if D8^i_n is bigger than T8.
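Sketches of D7 and D8 on grayscale patches; an FFT-based autocorrelation of the zero-mean patch is assumed, as the paper does not specify how A^i_n is computed:

```python
import numpy as np

def d7_score(G_prev, G_cur, G_next):
    # D7 (Eq. 4): temporal change of the patch's mean gray level.
    c = [float(g.mean()) for g in (G_prev, G_cur, G_next)]
    return min(abs(c[1] - c[0]), abs(c[1] - c[2]))  # reflection if > T7

def autocorr(G):
    # 2D autocorrelation via the Wiener-Khinchin theorem, computed on
    # the zero-mean patch so flat intensity offsets do not dominate.
    g = G.astype(float) - G.mean()
    F = np.fft.fft2(g)
    return np.real(np.fft.ifft2(F * np.conj(F))).ravel()

def d8_score(G_prev, G_cur, G_next):
    # D8 (Eq. 5): RMS temporal change of the patch autocorrelation.
    a = [autocorr(g) for g in (G_prev, G_cur, G_next)]
    n = a[1].size
    return np.sqrt(min(((a[1] - a[0]) ** 2).sum() / n,
                       ((a[1] - a[2]) ** 2).sum() / n))  # reflection if > T8
```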
Motion Field Divergence D9: D9 for the examined patch F^i_n is defined as follows:

D9^i_n = DFD (||div(d(n))|| + ||div(d(n+1))||) / 2    (6)

DFD and div(d(n)) are the Displaced Frame Difference and the motion field divergence for F^i_n; d(n) is the 2D motion vector field calculated using block matching. DFD is set to the minimum of the forward and backward DFDs, and div(d(n)) is set to the minimum of the forward and backward divergences. The divergence is averaged over blocks of two frames to reduce the effect of possible motion blur generated by unsteady camera motion. F^i_n is flagged as reflection if D9 > T9.
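A sketch of the per-frame-pair part of D9, given a block-matching motion field (dx, dy) and the (already minimized) DFD for the patch; the paper additionally averages the divergence term over two frames:

```python
import numpy as np

def d9_frame_term(dx, dy, dfd):
    """One frame-pair term of D9 (Eq. 6): the displaced frame
    difference weighted by the mean divergence magnitude of the
    block-matching motion field (dx, dy) over the patch."""
    div = (np.gradient(dx.astype(float), axis=1)
           + np.gradient(dy.astype(float), axis=0))
    return dfd * np.abs(div).mean()
```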
4.5. Solving for l^i_n
4.5.1 Maximum Likelihood (ML) Solution
The likelihood is factorized as follows:

P(F|l) = P(l|D1) P(l|D2−8) P(l|D9)    (7)

The first and last terms are solved by thresholding D1 < T1 and D9 > T9 respectively. D2-D8 are used to form one strong detector Ds, and P(l|D2−8) is solved by Ds > Ts. We found that excluding (D1, D9) from Ds generates better detection results than including them. The feature analysis of each detector is averaged over a block of three frames to generate temporally consistent detections. T9 is fixed to 10 in all experiments. In Sec. 4.5.2 we avoid selecting particular thresholds for (T1, Ts) by imposing spatial and temporal priors on the generated maps.
Calculating Ds: The strong detector Ds is expressed as a linear combination of weak detectors operating at different thresholds T as follows:

P(l|D2−8) = Σ_{k=1}^{M} W(V(k), T) P(D_{V(k)} | T)    (8)
Here M is the number of weak detectors (fixed to 20) used in forming Ds, and V(k) is a function which returns a value between 2 and 8 to indicate which of the detectors D2−D8 is used. k indexes the weak detectors in order of their importance, as defined by the weights W. W and T are learned through Adaboost [15] (see Tab. 1). Our training set consists of 89393 images of size 50 × 50 pels. Reflection is modeled in 35966 images, each being a synthetic mixture of two different images.

Fig. 3 shows the Receiver Operating Characteristic (ROC) of applying D1−D9 and Ds to the training samples. Ds outperforms all the other detectors due to its higher correct detection rate and lower false alarm rate.

Figure 3. ROC curves for D1−D9 and Ds. The Adaboost detector Ds outperforms all other detectors, and D1 is the second best in the false alarm range < 0.1.
Detector  D6    D8       D5    D3    D2    D4    D7
W         1.31  0.96     0.48  0.52  0.33  0.32  0.26
T         0.29  6.76e−6  0.04  0.95  0.61  7     2.17

Table 1. Weights W and operating thresholds T for the best seven detectors selected by Adaboost.
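With Table 1 in hand, evaluating Ds is a weighted vote of thresholded weak detectors. The comparison direction per detector below follows Secs. 4.2-4.4 (low D2-D5 and high D6-D8 indicate reflection); the paper leaves the sign convention implicit, so treat it as an assumption:

```python
# Weights/thresholds from Table 1; 'greater' marks detectors whose high
# values indicate reflection (D6-D8); the rest fire on low values.
WEAK = [("D6", 1.31, 0.29, True), ("D8", 0.96, 6.76e-6, True),
        ("D5", 0.48, 0.04, False), ("D3", 0.52, 0.95, False),
        ("D2", 0.33, 0.61, False), ("D4", 0.32, 7.0, False),
        ("D7", 0.26, 2.17, True)]

def strong_score(scores):
    """Adaboost-style strong detector: weighted vote of the thresholded
    weak detectors; reflection is declared when the sum exceeds Ts."""
    return sum(w * ((scores[name] > t) if greater else (scores[name] < t))
               for name, w, t, greater in WEAK)
```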
4.5.2 Successive Refinement for Maximum A Posteriori (MAP)

The prior P(l|l_N) of Eq. 3 imposes spatial and temporal smoothness on the detection masks. We create a MAP estimate by refining the sparse maps from the previous ML steps. We first refine the labeling of all the existing feature points P in each image, and then use the overlapping 50 × 50 patches around the refined labeled points as a dense pixel map.

ML Refinement: First we reject false detections from ML which are spatially inconsistent. Every feature point with l = 1 is considered, and the sum of the geodesic distances from that site to the two closest neighbors labeled l = 1 is measured. When that distance is more than 0.005, the decision is rejected, i.e. we set l = 0. Geodesic distances allow the nature of the image material between points to be taken into account more effectively and have been in use for some time [10]. To reduce the computational load of this step, we downsample the image massively, by 50 in both directions. This retains only the gross image topology.
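A hedged sketch of this refinement using skimage's minimum-cost-path routing as the geodesic distance; the cost surface (gradient magnitude plus a small constant) and its normalization are assumptions, as the paper only specifies the 0.005 threshold and the 50× downsampling:

```python
import numpy as np
from skimage.graph import route_through_array
from skimage.transform import resize

def refine_labels(gray, points, labels, max_dist=0.005, factor=50):
    """Reject l=1 feature points whose summed geodesic distance to
    their two nearest l=1 neighbors exceeds max_dist (Sec. 4.5.2)."""
    h, w = gray.shape[0] // factor, gray.shape[1] // factor
    small = resize(gray.astype(float), (h, w))   # gross topology only
    gy, gx = np.gradient(small)
    cost = np.abs(gx) + np.abs(gy) + 1e-3        # positive cost surface
    pos = [(min(int(y) // factor, h - 1), min(int(x) // factor, w - 1))
           for (y, x) in points]
    keep = np.array(labels).copy()
    ones = [j for j, l in enumerate(labels) if l == 1]
    for i in ones:
        d = sorted(route_through_array(cost, pos[i], pos[j])[1]
                   for j in ones if j != i)
        if len(d) < 2 or d[0] + d[1] > max_dist:
            keep[i] = 0                          # spatially inconsistent
    return keep
```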
Spatio-Temporal Dilation: Labels are extended in space and time to other feature points along their trajectories. If l^i_n = 1, all feature points lying along track i are set to l = 1. In addition, l is extended to all image patches (F_n) overlapping spatially with the examined patch. This generates a denser representation of the detection masks. We call this step ML-Denser.
Hysteresis: We can avoid selecting particular thresholds [T1, Ts] for BIRD by applying hysteresis using a set of different thresholds. Let T_H = [−0.4, 5] and T_L = [0, 3] denote a high and a low configuration for [T1, Ts]. Detection starts by examining ML-Denser at the high thresholds. High thresholds generate detected points P_h with high confidence. Points within a small geodesic distance (< D_geo) and a small Euclidean distance (< D_euc) of each other are grouped together. Here we use (D_geo, D_euc) = (0.0025, 4) and resize the examined frames as mentioned previously. The centroid of each group is then calculated. Thresholds are lowered, and a new detection point is added to an existing group if it is within D_geo and D_euc of the centroid of that group. This is the hysteresis idea. If, however, the examined point has a large Euclidean distance (> D_euc) but a small geodesic distance (< D_geo) to the centroids of all existing groups, a new group is formed. Points whose distances exceed both D_geo and D_euc are regarded as outliers and discarded. Group centroids are updated and the whole process is repeated iteratively until the examined threshold reaches T_L. The detection map generated at T_L is made denser by performing the Spatio-Temporal Dilation above.
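A compact sketch of the grouping loop; detect_at and dist are hypothetical callables standing in for ML-Denser evaluated at a threshold configuration and for the geodesic/Euclidean distance pair described above:

```python
import numpy as np

def hysteresis_group(detect_at, dist, thresholds,
                     d_geo=0.0025, d_euc=4.0):
    """Two-threshold grouping: seed groups at the high configuration,
    then lower the thresholds, attaching points to nearby centroids,
    spawning new groups for geodesically close outliers, and
    discarding the rest."""
    groups = [[p] for p in detect_at(thresholds[0])]  # high-confidence seeds
    for T in thresholds[1:]:                          # progressively lower
        for p in detect_at(T):
            cents = [np.mean(g, axis=0) for g in groups]
            near = [i for i, c in enumerate(cents)
                    if dist(p, c)[0] < d_geo and dist(p, c)[1] < d_euc]
            if near:
                groups[near[0]].append(p)             # join existing group
            elif all(dist(p, c)[0] < d_geo for c in cents):
                groups.append([p])                    # new group
            # otherwise: outlier, discarded
    return groups
```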
Spatio-Temporal ‘Opening’: False alarms of the previous step are removed by propagating the patches detected in the first frame to the rest of the sequence along the feature point trajectories. A detection sample at frame n is kept if it agrees with the propagated detections from the previous frame. Correct detections missed by this step are recovered by running Spatio-Temporal Dilation on the ‘temporally eroded’ solution. This does mean that trajectories which do not start in the first frame are unlikely to be considered; however, this does not affect the performance on the real examples shown here. The selection of an optimal frame from which to perform this opening operation is the subject of future work.
Figure 4. From top: ML (calculated at (T1, Ts) = (−0.13, 3.15)), Hysteresis, and Spatio-Temporal ‘Opening’ for three consecutive frames from the SelimH sequence. Reflection is shown in red, and detected reflection using our technique is shown in green. Spatio-Temporal ‘Opening’ rejects false alarms generated by ML and by Hysteresis (shown in yellow and blue respectively).
5. Results
5.1. Reflection Detection
15 sequences containing 932 frames of size 576 × 720 are processed with BIRD. Full sequences with reflection detection can be found at www.sigmedia.tv/Misc/CVPR2011. Fig. 4 compares the ML, Hysteresis and Spatio-Temporal ‘Opening’ solutions for three consecutive frames from the SelimH sequence. This sequence contains occlusion, motion blur and strong edges in the reflection (shown in red). The ML solution (first line) generates good sparse reflection detections (shown in green); however, it generates some errors (shown in yellow). Hysteresis rejects these errors and generates dense masks with some false alarms (shown in blue). These false alarms are rejected by Spatio-Temporal ‘Opening’.
Fig. 5 shows the result of processing four sequences using BIRD. In the first two sequences, BIRD detected regions of reflection correctly and discarded regions of occlusion (shown in purple) and motion blur (shown in blue). In GirlRef most of the sequence is correctly classified as reflection. In SelimK1 the portrait on the right is correctly classified as containing reflection even in the presence of motion blur (shown in blue). Nevertheless, BIRD failed to detect the reflection on the left portrait, as it does not contain strong distinctive feature points.
Fig. 6 shows the ROC plot for 50 frames from SelimH. Here we compare our technique BIRD against DFD and Image Sharpness [5]. DFD flags a region as reflection if it has a high displaced frame difference; Image Sharpness flags a region as reflection if it has low sharpness. Frames are processed in 50 × 50 blocks. Ground truth reflection masks are generated manually, and detection rates are calculated on a per-pel basis. The ROC shows that BIRD outperforms the other techniques by achieving a very high correct detection rate of 0.9 for a false detection rate of 0.1. This is a major improvement over correct detection rates of 0.2 and 0.1 for DFD and Sharpness respectively.
5.2. Frame Rate Conversion: An application
One application for reflection detection is improving
frame rate conversion in regions of reflection. Frame rate
conversion is the process of creating new frames from ex-
isting ones. This is done by using motion vectors to inter-
polate objects in the new frames. This process usually fails
in regions of reflection due to motion estimation failure.
Figure 5. Detection results of BIRD (shown in green) on, from top: BuilOnWind [10,35,49], PHouse 9-11, GirlRef [45,55,65], SelimK1 32-35. Reflections are shown in red. Good detections are generated despite occlusion (shown in purple) and motion blur (shown in blue). For GirlRef we replace Hysteresis and Spatio-Temporal ‘Opening’ with a manual parameter configuration of (T1, Ts) = (−0.01, 3.15) followed by a Spatio-Temporal Dilation step. This setting generates good detections for all examined sequences with static backgrounds.

Fig. 7 illustrates the generation of a slow motion effect for the person's leg in GirlRef (see Fig. 5, third line). This is done by doubling the frame rate using the Foundry's Kronos plugin [6]. Kronos has an input which defines the density of the motion vector field: the larger the density, the more detailed the vectors and hence the better the interpolation. However, using highly detailed vectors generates artifacts in regions of reflection, as shown in Fig. 7 (second line). We reduce these artifacts by lowering the motion vector density in regions of reflection indicated by BIRD (see Fig. 7, third line). Image sequence results and more examples are available at www.sigmedia.tv/Misc/CVPR2011.
6. Conclusion
This paper has presented a technique for detecting reflections in image sequences, a problem that has not been addressed before. Our technique performs several analyses on feature point trajectories and generates a strong detector by combining them. Results show a major improvement over techniques which measure image sharpness and temporal discontinuity. Our technique generates a high correct detection rate while rejecting regions containing complicated motion, e.g. motion blur and occlusion. The technique was fully automated in generating most results. As an application, we showed how the generated detections can be used to improve frame rate conversion. A limiting factor of our technique is that it requires source layers with strong, distinctive feature points; this can lead to incomplete detections.
Figure 7. Slow motion effect for the person's leg in GirlRef (see Fig. 5, third line). Top: original frames 59-61; Middle: generated frames using the Foundry's Kronos plugin [6] with one motion vector calculated for every 4 pels; Bottom: with one motion vector calculated for every 64 pels in regions of reflection.
Figure 6. ROC plots for our technique BIRD, DFD and Sharpness on SelimH. BIRD outperforms DFD and Sharpness with a massive increase in the correct detection rate.

Acknowledgment: This work is funded by the Irish Research Council for Science, Engineering and Technology (IRCSET) and Science Foundation Ireland (SFI).
References
[1] A. M. Bronstein, M. M. Bronstein, M. Zibulevsky, and Y. Y. Zeevi. Sparse ICA for blind separation of transmitted and reflected images. International Journal of Imaging Systems and Technology, 15(1):84–91, 2005.
[2] N. Chen and P. De Leon. Blind image separation through kurtosis maximization. In Asilomar Conference on Signals, Systems and Computers, volume 1, pages 318–322, 2001.
[3] K. Diamantaras and T. Papadimitriou. Blind separation of reflections using the image mixtures ratio. In ICIP, pages 1034–1037, 2005.
[4] H. Farid and E. Adelson. Separating reflections from images by use of independent components analysis. Journal of the Optical Society of America, 16(9):2136–2145, 1999.
[5] R. Ferzli and L. J. Karam. A no-reference objective image sharpness metric based on the notion of just noticeable blur (JNB). IEEE Transactions on Image Processing, 18(4):717–728, 2009.
[6] The Foundry. Nuke, Furnace suite. www.thefoundry.co.uk.
[7] A. Levin, A. Zomet, and Y. Weiss. Separating reflections from a single image using local features. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 306–313, 2004.
[8] D. G. Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2):91–110, 2004.
[9] B. D. Lucas and T. Kanade. An iterative image registration technique with an application to stereo vision. In DARPA Image Understanding Workshop, pages 121–130, 1981.
[10] D. Ring and F. Pitié. Feature-assisted sparse to dense motion estimation using geodesic distances. In International Machine Vision and Image Processing Conference, pages 7–12, 2009.
[11] B. Sarel and M. Irani. Separating transparent layers through layer information exchange. In European Conference on Computer Vision (ECCV), pages 328–341, 2004.
[12] B. Sarel and M. Irani. Separating transparent layers of repetitive dynamic behaviors. In ICCV, pages 26–32, 2005.
[13] R. Szeliski, S. Avidan, and P. Anandan. Layer extraction from multiple images containing reflections and transparency. In CVPR, volume 1, pages 246–253, 2000.
[14] C. Tomasi and T. Kanade. Detection and tracking of point features. Carnegie Mellon University Technical Report CMU-CS-91-132, 1991.
[15] P. Viola and M. Jones. Robust real-time object detection. International Journal of Computer Vision, 2001.
[16] Z. Wang, A. Bovik, H. Sheikh, and E. Simoncelli. Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4):600–612, April 2004.
[17] Y. Weiss. Deriving intrinsic images from image sequences. In ICCV, pages 68–75, 2001.