A theory of active object localization
ABSTRACT We present some theoretical results related to the problem of actively searching for a target in a 3D environment, under the constraint of a maximum search time. We define the object localization problem as the maximization over the search region of the Lebesgue integral of the scene structure probabilities. We study variants of the problem as they relate to actively selecting a finite set of optimal viewpoints of the scene for detecting and localizing an object. We do a complexity-level analysis and show that the problem variants are NP-Complete or NP-Hard. We study the tradeoffs of localizing vs. detecting a target object, using single-view and multiple-view recognition, under imperfect dead-reckoning and an imperfect recognition algorithm. These results motivate a set of properties that efficient and reliable active object localization algorithms should satisfy.
A Theory of Active Object Localization
Alexander Andreopoulos
John K. Tsotsos
Technical Report CSE-2009-01
March 10 2009
Department of Computer Science and Engineering
4700 Keele Street Toronto, Ontario M3J 1P3 Canada
Alexander Andreopoulos, John K. Tsotsos
Dept. of Computer Science and Engineering
Centre for Vision Research, York University
Toronto, Ontario, Canada
{alekos, tsotsos}@cse.yorku.ca
Abstract
We present some theoretical results related to the problem of actively searching for a target in a 3D environment, under the constraint of a maximum search time. We define the object localization problem as the maximization over the search region of the Lebesgue integral of the scene structure probabilities. We study variants of the problem as they relate to actively selecting a finite set of optimal viewpoints of the scene for detecting and localizing an object. We do a complexity-level analysis on the problems, by showing that in the best case scenario, the problems have high order pseudo-polynomial running times or are NP-Complete. We study the tradeoffs of localizing vs. detecting a target object, using single-view and multiple-view recognition, under imperfect dead-reckoning and an imperfect recognition algorithm. We use these results to propose a set of sufficient properties that efficient and reliable active object localization algorithms should satisfy.
1. Introduction
In one of the earliest known treatises on vision [1], Aristotle describes vision as a passive process that is mediated by what he refers to as the “transparent” (διαφανές), an invisible property that allows the sense organ to become like the actual form of the visible object. Much has been learned since then and today, a popular definition is that vision is the process of discovering from images what is present in the world and where it is [10]. Within this context, four levels of tasks in the vision problem are discernible [17]:
• Detection: is a particular item present in the stimulus?
• Localization: detection plus accurate location of item.
• Recognition: localization of the items present in the
stimulus plus their accurate description through their
association with linguistic labels.
• Understanding: recognition plus role of stimulus in the
context of the scene.
The concept of active perception or active vision was first introduced by Bajcsy [2], as “a problem of intelligent control strategies applied to the data acquisition process”. Active control of a vision based sensor offers a number of benefits [19]. It allows us to: (i) Bring into the sensor’s field of view regions that are hidden due to occlusion and self-occlusion. (ii) Foveate and compensate for spatial non-uniformity of the sensor. (iii) Increase spatial resolution through sensor zoom and observer motion that brings the region of interest in the depth of field of the camera. (iv) Disambiguate degenerate views due to finite camera resolution, lighting changes and induced motion [5]. (v) Deal with incomplete information and complete a task.
An active vision system’s benefits must outweigh the associated execution costs [19]. The associated costs in an active vision system include: (i) Deciding the actions to perform and their execution order. (ii) The time to execute the commands and bring the actuators to their desired state. (iii) Adapting the system to the new viewpoint, finding the correspondences between the old and new viewpoint and dealing with the inevitable ambiguities due to sensor noise.
A number of active object detection, localization and recognition algorithms have been proposed over the years [3, 4, 6, 8, 11, 12, 13, 15, 16, 20, 21]. A smaller number of papers have dealt with issues related to the complexity and reliability of such systems [5, 9, 18, 19, 21]. Limited work exists on the complexity of search tasks and the effect that imperfect recognition and imperfect dead-reckoning have on object localization. In this paper, we argue that the problem is likely intractable, by proving that the active object localization problem is NP-Hard and by showing that the problem remains difficult at best, even under certain simplifying variants of the main problem. We study the tradeoffs of localizing vs. detecting a target object under single-view and multiple-view recognition schemes and show that there are a number of bias/variance/entropy relationships and tradeoffs between the reliability of target localization and target detection, that depend on the quality of the recognition algorithm used and the magnitudes of the correspondence or dead-reckoning errors. We exemplify the relevance of these results in practical computer vision applications, as first-principles based motivators for a set of properties that active object localization algorithms should satisfy.
2. Problem Formulation
Assumption 1. We assume that exactly one instance of the
target object exists in the scene.
Definition 1. (Search Space) The search space consists of
a 3D region whose coordinates are expressed with respect
to an inertial coordinate frame.
Definition 2. (Target Map) The target map is a discretiza-
tion of the inertial coordinate frame into non-overlapping
3D cells coinciding with the search space. Each cell is as-
signed the probability of containing the target centroid.
We use a set of positive integers, $C \triangleq \{1,2,\ldots,|C|\}$, to index each cell in the target map. Notice that, since we assume a single target object exists in the scene, the target map cell values sum to one.
Definition 3. (Scene Sample Function) A scene sample function $\mu_v(\vec{x})$ denotes the sensor output, where $v$ represents the values assigned to the controllable sensor parameters (e.g., coordinate frame, zoom, focus) and $\vec{x}$ is an index into the scene sample (e.g., in the case of greyscale images $\vec{x} = (i,j)$ can denote a pixel index).
We define a probability space $\Upsilon = (X_1,\Sigma_1,p_1)$ for the sensor parameter states, where $v \in X_1$ denotes a sensor parameter state, $\Sigma_1$ is a σ-algebra of $X_1$ and $p_1$ is a probability measure on $X_1$ whose support includes all states $v$ that have a non-zero probability of occurring in the search space. Similarly, for each $v$, we define a probability space $\Upsilon(v) = (X_v,\Sigma_v,p_v)$ with $p_v(\mu_v(\vec{x})) > 0$ for each $\mu_v(\vec{x}) \in X_v$, denoting the probability of occurrence of the corresponding scene sample function given sensor parameter values $v$. The underlying probability measure models the sensed scene uncertainty (e.g., image noise, varying illumination conditions, dead-reckoning errors, etc.) and it is largely unknown and difficult to model in practice. Since we do not know the distribution of $p_1$, $p_v$, we approximate them by using a finite sample of optimally selected $v$, $\mu_v$.
Definition 4. (Sequence Cost) Given a sequence $v_1,\ldots,v_n$ of sensor parameter states, the cost $T(n)$ associated with executing the sequence is given by $T(n) \triangleq T(n-1) + t_o(v_1,\ldots,v_n)$, where $t_o(v_1,\ldots,v_n) > 0$ denotes the cost of moving to state $v_n$ given all previous states and $T(1)$ is the cost of reaching state $v_1$ from the initial sensor state.
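The recursion in Definition 4 can be sketched in a few lines of Python. This is only an illustration: the tuple encoding of a sensor state, the helper `euclidean_step_cost`, its fixed overhead of 1.0, and the choice of the origin as the initial sensor state are all assumptions made for the example and do not come from the report.

```python
from typing import Callable, Sequence, Tuple

State = Tuple[float, ...]  # hypothetical encoding of a sensor parameter state


def sequence_cost(states: Sequence[State],
                  step_cost: Callable[[Sequence[State]], float]) -> float:
    """Return T(n) for v_1..v_n via T(k) = T(k-1) + t_o(v_1..v_k), with T(0) = 0."""
    total = 0.0
    for k in range(1, len(states) + 1):
        cost = step_cost(states[:k])  # t_o may depend on all previous states
        if cost <= 0:
            raise ValueError("each movement cost t_o must be positive")
        total += cost
    return total


def euclidean_step_cost(prefix: Sequence[State]) -> float:
    """Toy t_o (assumption): distance from the previous state plus a fixed overhead.

    The initial sensor state is assumed to be the origin.
    """
    origin = tuple(0.0 for _ in prefix[-1])
    prev = prefix[-2] if len(prefix) > 1 else origin
    dist = sum((a - b) ** 2 for a, b in zip(prefix[-1], prev)) ** 0.5
    return dist + 1.0
```

A search cost bound such as the one used in the constrained problems is then a simple comparison of `sequence_cost(...)` against the bound.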
We define the 3D object localization and constrained active object localization (CAOL) problems as follows:
Definition 5. (3D Object Localization) Find the cell $\hat{i}_t = \arg\max_i \int\!\!\int p(c_i|\mu_v(\vec{x}))\,dp_v\,dp_1$, where we are taking the Lebesgue integrals [14] over $\Upsilon$ and $\Upsilon(v)$ and $c_i$ denotes the event that the target object’s centroid is in cell $i$. $p(c_i|\mu_v(\vec{x}))$ is a recognition algorithm depending on $v$, $\mu_v$. If $p(c_i|\mu_v(\vec{x}))$ is a “good” algorithm, $\hat{i}_t = i_t$, where $i_t$ is defined as the cell containing the target’s centroid.
Definition 6. (Constrained Active Object Localization) Find the cell $\hat{i}_t \in C$ maximizing $p(c_{\hat{i}_t}|\mu_{v_n}(\vec{x}),\ldots,\mu_{v_1}(\vec{x}))$ across all $n > 0$, all sequences $v_1,\ldots,v_n$ of sensor states and all corresponding $\mu_{v_1},\ldots,\mu_{v_n}$, under the constraint $T(n) \leq T'$, where $T'$ is a search cost bound.
Solutions to the CAOL problem must compensate for (i) our limited knowledge of $\Upsilon$, $\Upsilon(v)$ and (ii) the need to minimize sensor movements, by finding a finite sample $\mu_{v_n}(\vec{x}),\ldots,\mu_{v_1}(\vec{x})$ that best samples the unknown probability spaces without exceeding the maximum allotted search cost. Even if we know the distributions of the probability spaces $\Upsilon$, $\Upsilon(v)$, eliminating point (i) and potentially even making $p(c_i|\mu_{v_n}(\vec{x}),\ldots,\mu_{v_1}(\vec{x}))$ a function of $v_1,\ldots,v_n$, the problem remains intractable. As we show later, the CAOL problem belongs to the class of NP-Hard problems [7], implying that there is no known polynomial time algorithm that solves the problem. One can attempt to make it tractable by using variants of the problem:
Definition 7. (Constrained Active Object Localization: Variant 1) Find a sequence $v_1,\ldots,v_n$ of sensor states and the cells $\hat{i}_t \in C$ satisfying $p(c_{\hat{i}_t}|\mu_{v_n}(\vec{x}),\ldots,\mu_{v_1}(\vec{x})) \geq \theta$ and $T(n) \leq T'$ for some $\mu_{v_1},\ldots,\mu_{v_n}$, where $T'$ is a search cost bound and $\theta$ is a probability threshold.
Definition 8. (Constrained Active Object Localization: Variant 2) Find the cell $\hat{i}_t \in C$ maximizing $p(c_{\hat{i}_t}|\mu_{v_n}(\vec{x}),\ldots,\mu_{v_1}(\vec{x}))$ across all $n > 0$, all sequences $v_1,\ldots,v_n$ of sensor states and all corresponding $\mu_{v_1},\ldots,\mu_{v_n}$, under the constraint $T(n) \leq T'$, where $T'$ is a search cost bound and each movement cost $t_o(v_1,\ldots,v_i)$ is bounded from below by a positive non-zero constant $C'$.
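Theorem 1. (Simplified Bayesian Updating)
Assume $p(\mu_{v_n}|c_i,\mu_{v_{n-1}},\ldots,\mu_{v_1}) = p(\mu_{v_n}|c_i)$. Then,
$$p(c_i|\mu_{v_n},\ldots,\mu_{v_1}) = \frac{p(c_i|\mu_{v_{n-1}},\ldots,\mu_{v_1})\,p(\mu_{v_n}|c_i)}{\sum_j p(c_j,\mu_{v_n}|\mu_{v_{n-1}},\ldots,\mu_{v_1})}.$$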
Proof. $p(c_i|\mu_{v_n},\ldots,\mu_{v_1})\,p(\mu_{v_n},\ldots,\mu_{v_1}) = p(c_i,\mu_{v_{n-1}},\ldots,\mu_{v_1})\,p(\mu_{v_n}|c_i) \Leftrightarrow p(\mu_{v_n}|c_i,\mu_{v_{n-1}},\ldots,\mu_{v_1}) = p(\mu_{v_n}|c_i)$. Notice also that $\sum_j p(c_j,\mu_{v_n}|\mu_{v_{n-1}},\ldots,\mu_{v_1}) = \sum_j p(\mu_{v_n}|c_j)\,p(c_j|\mu_{v_{n-1}},\ldots,\mu_{v_1})$.
When we are not using the simplifying assumption stated in Theorem 1, we say we are using normal Bayesian updating. Theorem 1 assumes that the scene sample functions are conditionally independent given the cell $i$ where the target is centred. By Assumption 1, exactly one instance of the target exists in the scene, which implies that event $c_i$ is sufficient to determine which regions of $\mu_{v_n}$ (if any) correspond to the projection of the target object on the image plane and which regions correspond to the background. We are implicitly assuming that $p(\mu_{v_n}|c_i)$ denotes a generative modeling of the recognition algorithm’s resultant binary segmentation into the foreground (target position) and the background, based on a single view. Similarly $p(c_i|\mu_{v_n},\ldots,\mu_{v_1})$ denotes the corresponding probability of event $c_i$, based on the Bayesian fusion of multiple views $\mu_{v_n},\ldots,\mu_{v_1}$. Notice that for a uniform prior $p(c_i)$, $\arg\max_i p(\mu_{v_n}|c_i) = \arg\max_i p(c_i|\mu_{v_n})$. The greater the uncertainty implicit in spaces $\Upsilon(v)$, the weaker the assumption of conditional independence becomes, due to increased sources of error. Nevertheless, it is convenient to use Theorem 1 to model various localization and detection tradeoffs.
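The update of Theorem 1 can be written as a short routine over the target map. The sketch below is a minimal illustration under the conditional independence assumption of the theorem; the function names `bayes_update` and `fuse_views` and the list-based target map are choices made for this example and are not part of the report.

```python
from typing import List, Sequence


def bayes_update(prior: Sequence[float], likelihood: Sequence[float]) -> List[float]:
    """One simplified Bayesian update: posterior_i is proportional to prior_i * p(mu_vn | c_i).

    prior[i]      : p(c_i | mu_v(n-1), ..., mu_v1), the current target map
    likelihood[i] : p(mu_vn | c_i), the single-view generative probability
    """
    if len(prior) != len(likelihood):
        raise ValueError("prior and likelihood need one entry per cell")
    joint = [p * l for p, l in zip(prior, likelihood)]
    evidence = sum(joint)  # sum_j p(mu_vn | c_j) p(c_j | earlier views)
    if evidence == 0:
        raise ValueError("observation has zero probability under every cell")
    return [j / evidence for j in joint]


def fuse_views(prior: Sequence[float],
               likelihoods: Sequence[Sequence[float]]) -> List[float]:
    """Fuse several views by applying bayes_update once per view, in order."""
    target_map = list(prior)
    for likelihood in likelihoods:
        target_map = bayes_update(target_map, likelihood)
    return target_map
```

With a uniform prior, a single call to `bayes_update` returns the normalized likelihoods, which is the observation that the arg max over cells is the same for the generative and the discriminative probabilities.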
In the next section, we prove that if we know the distributions of $\Upsilon$, $\Upsilon(v)$ and under normal Bayesian updating, Variant 1 of the CAOL problem (Def.7) and the corresponding detection problem are NP-Hard and NP-Complete respectively. It is easy to see that Def.7 is reducible to the similarly discretized version of Def.6 and thus, the CAOL problem is NP-Hard. Variant 2 of the problem has a high-order pseudo-polynomial solution: since there are at most $\lfloor T'/C' \rfloor$ sensor settings to execute within time $T'$, an enumeration and evaluation of all candidate solutions runs in $\Omega(m^{\lfloor T'/C' \rfloor})$, where $m$ is the total number of possible states. But this solution remains exponential in terms of the size of $T'$. Using a reduction from Def.7, we notice that Def.8 is NP-Hard and if we add to Def.7 the minimum cost constraint of Def.8, the resulting problem remains NP-Hard; the reductions involve setting $C'$ to the minimum sensor state pair cost. We could also approach the localization problem by thresholding the generative probability $p(\mu_{v_n}(\vec{x})|c_i)$ rather than the discriminative probability $p(c_i|\mu_{v_n}(\vec{x}),\ldots,\mu_{v_1}(\vec{x}))$. Ye [21] uses a binary classifier with a presumed zero false positive rate, to show that a similar problem is NP-Complete.
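The enumeration argument for Variant 2 can be made concrete with a brute-force search. The following is a hedged sketch, not the authors' algorithm: `brute_force_caol`, its arguments, and the decision to allow repeated states in a sequence are assumptions for illustration. The callable `posterior` stands in for whatever recognition and fusion scheme produces the target map for a given sequence.

```python
from itertools import product
from typing import Callable, Hashable, List, Optional, Sequence, Tuple


def brute_force_caol(
    states: Sequence[Hashable],
    step_cost: Callable[[Tuple[Hashable, ...]], float],
    posterior: Callable[[Tuple[Hashable, ...]], List[float]],
    budget: float,
    min_cost: float,
) -> Tuple[Tuple[float, Optional[int], Tuple[Hashable, ...]], int]:
    """Enumerate every state sequence that could fit in the budget.

    Because each movement costs at least min_cost, no feasible sequence is
    longer than budget // min_cost, so m + m^2 + ... + m^max_len sequences
    are examined for m = len(states).
    Returns ((best probability, best cell, best sequence), sequences examined).
    """
    max_len = int(budget // min_cost)
    best: Tuple[float, Optional[int], Tuple[Hashable, ...]] = (0.0, None, ())
    examined = 0
    for n in range(1, max_len + 1):
        for seq in product(states, repeat=n):
            examined += 1
            cost = sum(step_cost(seq[:k]) for k in range(1, n + 1))
            if cost > budget:
                continue  # violates the search cost bound
            target_map = posterior(seq)
            cell = max(range(len(target_map)), key=target_map.__getitem__)
            if target_map[cell] > best[0]:
                best = (target_map[cell], cell, seq)
    return best, examined
```

The count of examined sequences grows geometrically with the ratio of the budget to the minimum movement cost, which is why the approach is only pseudo-polynomial.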
3. The Constrained Active Object Localization Problem: Variant 1, is NP-Hard
To analyze the complexity of the constrained active object localization problem when we know the distributions of $\Upsilon$, $\Upsilon(v)$, we first reformulate the problem into the corresponding detection problem, taking into account the finite precision of floating point arithmetic, and the finite set $V$ that is necessary to represent the space of scene sample functions ($X_v$) achievable across the sensor parameter states ($X_1$). Let $Q^+ \triangleq \{\frac{p}{q} : p,q \in Z^+\}$ denote the set of positive rational numbers. We model each probability by a non-negative rational in $Q^+_1 \triangleq \{x \in Q^+ \cup \{0\} : x \leq 1\}$.
Definition 9. (Valid Sequence) Let $v'_i = (v_{\pi_i(1)},\ldots,v_{\pi_i(l(i))})$ denote an ordered set of length $l(i)$, where $\pi_i : Z^+ \rightarrow Z^+$ is a one-to-one mapping. A sequence $v'_{i_1},\ldots,v'_{i_n}$ of ordered sets is valid if $l(i_1) = 1$, $l(i_{k+1}) = l(i_k) + 1$ for each $1 \leq k \leq n-1$ and $\pi_{i_k}(j) = \pi_{i_{k+1}}(j)$, $\forall j$ $1 \leq j \leq l(i_k)$. We define an ordered set of length zero as $v'_0 \triangleq ()$. For any ordered set $v'_i = (v_{\pi_i(1)},\ldots,v_{\pi_i(l(i))})$ let $v'_i(v_c) \triangleq (v_{\pi_i(1)},\ldots,v_{\pi_i(l(i))},v_c)$. Also $v'_0(v_c) \triangleq (v_c)$ and $i_0 \triangleq 0$.
Definition 10. ($\Pi_{j_t}$: Constrained Active Object Detection Problem: Variant 1)
INSTANCE: A finite set $V = \{v_1,\ldots,v_{|V|}\}$. A cost constraint $B' \in Q^+$, and a cost function $C(v'_i) \in Q^+$ where $v'_i = (v_{\pi_i(1)},\ldots,v_{\pi_i(l(i))})$, $i \in Z^+$, $v_{\pi_i(1)},\ldots,v_{\pi_i(l(i))} \in V$. $S' \in Z^+$ denoting the number of cells in the target map. A function $f_1(v'_i,j) \in Q^+_1$ such that for any ordered set $v'_i$ and any $1 \leq j \leq S'$, $\sum_{v_c \in V} f_1(v'_i(v_c),j) = 1$. A function $f_2(v'_i,j) \in Q^+_1$ defined for $1 \leq j \leq S'$, such that $\sum_{j=1}^{S'} f_2(v'_i,j) = 1$ for all ordered sets $v'_i$ and
$$f_2(v'_{i_n},j) \triangleq f_2(v'_0,j) \prod_{k=1}^{n} \frac{f_1(v'_{i_k},j)}{\sum_{c=1}^{S'} f_2(v'_{i_{k-1}},c)\,f_1(v'_{i_k},c)}.$$
A recognition threshold $\theta \in Q^+_1$. A query cell $1 \leq j_t \leq S'$.
QUESTION: Is there a valid sequence $v'_{i_1},\ldots,v'_{i_n}$ so that $\sum_{k=1}^{n} C(v'_{i_k}) \leq B'$ and $f_2(v'_{i_n},j_t) \geq \theta$?
Definition 11. ($\Pi$: Constrained Active Object Localization Problem: Variant 1)
INSTANCE: Same as in $\Pi_{j_t}$ ($j_t$ can be arbitrary). We use a bar to differentiate the input variables from those of $\Pi_{j_t}$.
TASK: Find a valid sequence $v'_{i_1},\ldots,v'_{i_n}$ and the corresponding cells $j$, $1 \leq j \leq \bar{S}'$, which satisfy $\sum_{k=1}^{n} \bar{C}(v'_{i_k}) \leq \bar{B}'$ and $\bar{f}_2(v'_{i_n},j) \geq \bar{\theta}$.
As $\theta$, $\bar{\theta}$ decrease, the expected running times of $\Pi_{j_t}$, $\Pi$ do not increase (e.g., for $\theta,\bar{\theta} = 0$, solutions in $O(|V|)$ are trivial to find). Notice that for $\bar{\theta} > \frac{1}{2}$, there is at most one cell that $\Pi$ can output. We quote the Knapsack problem (an NP-Complete problem) as given by Garey and Johnson [7]:
Definition 12. ($\Pi'$: Knapsack Problem)
INSTANCE: A finite set $U$, a “size” $s(u) \in Z^+$ and a “value” $w(u) \in Z^+$ for each $u \in U$, a size constraint $B \in Z^+$, and a value goal $K \in Z^+$.
QUESTION: Is there a subset $U' \subseteq U$ such that $\sum_{u \in U'} s(u) \leq B$ and $\sum_{u \in U'} w(u) \geq K$?
$\Pi_{j_t}$ is in NP, since any candidate solution is verifiable in polynomial time. We assume $\frac{K}{\sum_{u \in U} w(u)} \leq 1$ since otherwise $\nexists U' \subseteq U$ that satisfies $\Pi'$. We define a mapping $f$ from $\Pi'$ to $\Pi_{j_t}$ for which $\Pi'$ is true iff $\Pi_{j_t}$ is true:
1. $V \leftarrow U$
2. $B' \leftarrow B$
3. $C(v'_i) = s(v_{\pi_i(l(i))})$
4. $S' \leftarrow 2$
5. $\theta \leftarrow \frac{K}{\sum_{u \in V} w(u)}$
6. We need to define $f_1(v'_i,j)$ and $f_2(v'_i,j)$ for all ordered sets $v'_i$ that are composed of elements in $V$ and all $j$, $1 \leq j \leq S'$, such that $f_1$, $f_2$ satisfy their preconditions stated in $\Pi_{j_t}$.
For each distinct set $U' \subseteq V$ and each distinct ordering $o$ of the elements in $U'$, we assume $d(U',o) \in Z^+$ is unique and denotes the identifier of the corresponding ordered set $v'_{d(U',o)} = (v_{\pi_{d(U',o)}(1)},\ldots,v_{\pi_{d(U',o)}(l(d(U',o)))})$ where $l(d(U',o)) = |U'|$. Furthermore, $d(U',o,k)$, for $1 \leq k \leq l(d(U',o))$, denotes the ordered set composed of the first $k$ elements of $v'_{d(U',o)}$, i.e., $v'_{d(U',o,k)} = (v_{\pi_{d(U',o)}(1)},\ldots,v_{\pi_{d(U',o)}(k)})$ and $v'_{d(U',o,k)} = v'_j$ iff $d(U',o,k) = j$. For any ordering $o$ and set $U' = \{v_{\pi_{d(U',o)}(1)},\ldots,v_{\pi_{d(U',o)}(l(d(U',o)))}\} \subseteq V$, we need to define $f_1(v'_{d(U',o,k)},j)$ and $f_2(v'_{d(U',o,k)},j)$ for all $1 \leq k \leq l(d(U',o))$. We also need to make sure $f_1(v'_{d(U',o,k)},j)$ and $f_2(v'_{d(U',o,k)},j)$ satisfy the requirements set in the definition of $\Pi_{j_t}$ and only depend on $j$ and the first $k$ parameters of $v'_{d(U',o)}$. For each instance of $\Pi'$ we define $f_2$ in $\Pi_{j_t}$ by
$$f_2(v'_i,j) = \begin{cases} \dfrac{\sum_{k=1}^{l(i)} w(v_{\pi_i(k)})}{\sum_{u \in V} w(u)} & \text{if } j = j_t \\[2ex] \dfrac{1}{S'-1}\left(1 - \dfrac{\sum_{k=1}^{l(i)} w(v_{\pi_i(k)})}{\sum_{u \in V} w(u)}\right) & \text{otherwise} \end{cases}$$
Since $\sum_{j=1}^{S'} f_2(v'_i,j) = 1$, $f_2(v'_i,j)$ satisfies the requirements in $\Pi_{j_t}$. Notice from Def.10 that if $f_2(v'_{i_k},j) = 1$, then $f_1(v'_{i_k},j) \neq 0$. Also, if $0 < f_2(v'_{i_k},j) < 1$, then $0 < f_2(v'_{i_{k-1}},j) < 1$. From the definition of $\Pi_{j_t}$, for each subset $U'$, each ordering $o$ and each $1 \leq k \leq l(d(U',o))$, we want to define $f_1$ so that
$$f_2(v'_{i_k},j) = \frac{f_2(v'_{i_{k-1}},j)\,f_1(v'_{i_k},j)}{\sum_{j'=1}^{S'} f_2(v'_{i_{k-1}},j')\,f_1(v'_{i_k},j')} \qquad (2)$$
where $i_k = d(U',o,k)$, $1 \leq k \leq l(d(U',o))$, is used to denote a valid sequence of ordered sets. From Lemma 1 below, we know that for each sensor setting $v'_{i_k}$ and $\forall j$, there exists an assignment to function $f_1(v'_{i_k},j)$ that satisfies Eq.(2) and depends only on the parameters $v'_{i_k}$, $j$, i.e., given parameters $v'_{i_k}$ and $j$, $f_1$ is independent of set $U'$. Also Eq.(2) is independent of scaling factors applied on $f_1$, implying that we can assume that $\sum_{v_c \in V} f_1(v'_i(v_c),j) = 1$ as wanted. We see that mapping $f$ runs in polynomial time.
We now show that there exists a valid sequence $v'_{i_1},\ldots,v'_{i_n}$ that satisfies $\Pi_{j_t}$, iff $\exists U' \subseteq U$ that satisfies $\Pi'$: If $\Pi_{j_t}$ holds, $f_2(v'_{i_n},j_t) \geq \theta \Rightarrow \sum_{u \in U'} w(u) \geq K$ where $U' = \{v_{\pi_{i_n}(1)},\ldots,v_{\pi_{i_n}(l(i_n))}\} \subseteq U$. Conversely, assume that for a subset $U' \subseteq U$ problem $\Pi'$ holds. Choose an arbitrary ordering $o$ and let $i_k = d(U',o,k)$, $1 \leq k \leq l(d(U',o)) = n$. We see that $f_2(v'_{i_n},j_t) \geq \theta$. The converse direction of the proof holds regardless of the ordering assigned to $U'$. Regardless of the ordering $o$ assigned to $U'$, $\sum_{k=1}^{l(i_n)} C(v'_{i_k}) \leq B'$ iff $\sum_{u \in U'} s(u) \leq B$, which proves that there is a subset $U'$ satisfying $\Pi'$ iff an ordered set satisfies $\Pi_{j_t}$. This proves that $\Pi_{j_t}$, under normal Bayesian updating, is NP-Complete.
To prove that $\Pi$ is NP-Hard, we define a mapping from $\Pi_{j_t}$ to $\Pi$ as follows: $\bar{V} \leftarrow V$, $\bar{B}' \leftarrow B'$, $\bar{C}(v'_i) = C(v'_i)$, $\bar{S}' \leftarrow 2$, $\bar{\theta} \leftarrow \frac{2}{3}$, $\bar{f}_2(v'_{i_k},1) = \frac{2}{3} I_{A(k)} + \frac{1}{2} I_{\bar{A}(k)}$, $\bar{f}_2(v'_{i_k},2) = \frac{1}{3} I_{A(k)} + \frac{1}{2} I_{\bar{A}(k)}$, where $I_X \in \{0,1\}$ is an indicator function that takes a value of 1 iff boolean variable $X$ is true and $A(k)$, $\bar{A}(k)$ are true iff $f_2(v'_{i_k},j_t) \geq \theta$ or $f_2(v'_{i_k},j_t) < \theta$ respectively. By Lemma 1, this also implicitly defines $\bar{f}_1$. We see that $\Pi_{j_t}$ holds iff $\Pi$ finds a valid sequence that is satisfied by cell $j = 1$. This shows that $\Pi$ is NP-Hard.
In the reduction from $\Pi'$ to $\Pi_{j_t}$, each call to $f_2$ is in $O(|V|)$ and takes $O(|V| \cdot S')$ space to encode. We are making the implicit assumption that $f_1$, $f_2$ in $\Pi_{j_t}$ and $\bar{f}_1$, $\bar{f}_2$ in $\Pi$ have running times and encoding sizes that are polynomial functions of $|V|$, $S'$ and $|\bar{V}|$, $\bar{S}'$ respectively, implying that the scene structure must exhibit a minimum degree of “non-randomness”. From the above proofs and Lemma 1, we notice that $f_1(v'_{i_k},j)$ and $\bar{f}_1(v'_{i_k},j)$ correspond to $p(\mu_{v_k}|c_j,\mu_{v_{k-1}},\ldots,\mu_{v_1})$. Only if $\bar{f}_1(v'_{i_k},j)$ depended exclusively on $j$ and $v_{\pi_{i_k}(l(i_k))}$, would this constitute a proof that Def.11 is NP-Hard under simplified Bayesian updating. $f_2(v'_0,j)$ and $\bar{f}_2(v'_0,j)$ denote the prior distributions of the target maps and are typically set to a uniform distribution.
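The mapping from Knapsack to the detection problem can be exercised on tiny instances. The sketch below is an illustration under stated assumptions: it evaluates $f_2$ directly from its piecewise definition and never constructs $f_1$ (whose existence the text obtains from Lemma 1), it uses 0-based cell indices with the query cell fixed at 0, and the names `reduce_knapsack`, `detection_holds` and `knapsack_holds` are hypothetical helpers written for this example.

```python
from fractions import Fraction
from itertools import combinations, permutations


def reduce_knapsack(sizes, values, size_bound, value_goal):
    """Map a Knapsack instance (s, w, B, K) to the detection-problem inputs."""
    total = sum(values.values())

    def f2(ordered_set, j):
        # Share of the total value collected by the ordered set (exact rational).
        share = Fraction(sum(values[u] for u in ordered_set), total)
        return share if j == 0 else 1 - share  # S' = 2, so S' - 1 = 1

    return {
        "V": list(sizes),                                        # V <- U
        "B": size_bound,                                         # B' <- B
        "cost": lambda ordered_set: sizes[ordered_set[-1]],      # C(v'_i) = s(last element)
        "S": 2,                                                  # S' <- 2
        "theta": Fraction(value_goal, total),                    # theta <- K / sum w(u)
        "f2": f2,
    }


def detection_holds(instance):
    """Brute-force the detection question (exponential; tiny inputs only)."""
    for n in range(1, len(instance["V"]) + 1):
        for seq in permutations(instance["V"], n):
            cost = sum(instance["cost"](seq[:k]) for k in range(1, n + 1))
            if cost <= instance["B"] and instance["f2"](seq, 0) >= instance["theta"]:
                return True
    return False


def knapsack_holds(sizes, values, size_bound, value_goal):
    """Brute-force the Knapsack question for comparison."""
    for n in range(1, len(sizes) + 1):
        for subset in combinations(sizes, n):
            if (sum(sizes[u] for u in subset) <= size_bound
                    and sum(values[u] for u in subset) >= value_goal):
                return True
    return False
```

On small inputs the two brute-force answers agree, which is the "iff" direction of the reduction; exact rationals are used so that the threshold comparison is not affected by rounding.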
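Lemma 1. Let $\beta,\alpha_1,\ldots,\alpha_m \in Q^+_1$ such that $\sum_{i=1}^{m} \alpha_i = 1$, if $\beta = 1$, then $\alpha_1 \neq 0$ and if $0 < \beta < 1$, then $0 < \alpha_1 < 1$. If $m > 1$, $\exists x_1,\ldots,x_m \in Q^+_1$ such that $\frac{\alpha_1 x_1}{\sum_{i=1}^{m} \alpha_i x_i} = \beta$.
Proof. If $\beta = 1$, let $x_1 = 1$ and let $x_i = 0$ for $i \neq 1$. If $\beta = 0$, let $x_2 = 1$ and let $x_i = 0$ for $i \neq 2$. Otherwise, if $0 < \beta < 1$, assume $x_1 > 0$ and notice that $\frac{\alpha_1 x_1}{\sum_{i=1}^{m} \alpha_i x_i} = \beta \Leftrightarrow \alpha_1 - \beta\alpha_1 = \sum_{i=2}^{m} (\beta\alpha_i) y_i$, a linear equation of $y_i = \frac{x_i}{x_1}$. Since $0 < \beta < 1$, $0 < \alpha_1 < 1$ and consequently $\sum_{i=2}^{m} \alpha_i > 0$, which implies $\alpha_1 - \beta\alpha_1 > 0$ and $\sum_{i=2}^{m} \beta\alpha_i > 0$. Therefore, there exist $y_2,\ldots,y_m \geq 0$ which satisfy the linear equation. We leave it as an exercise for the reader to verify that for any $y_2,\ldots,y_m \in Q^+ \cup \{0\}$, $\exists x_1,x_2,\ldots,x_m \in Q^+_1$ ($x_1 \neq 0$) which satisfy $y_i = \frac{x_i}{x_1}$.
4. Localization vs. Detection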
We formalize some of the tradeoffs of single-view and
multiple-view recognition schemes for localizing and de-
tecting a target object under simplified Bayesian updating
and under a number of different sources of errors. In Sec.
4.1 we define and discuss the problems and in Sec. 4.2-4.3
we prove the respective theorems.
4.1. Definitions and Discussion
Definition 13. (Correspondence Error) Any error in the calculation of the correspondence(s) between the index value $\vec{x}$ of a scene sample function $\mu_v(\vec{x})$ and the target map cell indices whose structure projects on $\vec{x}$.
Definition 14. (Dead-Reckoning Errors) We are dealing with dead-reckoning errors when there exists a rigid transformation $RT(\cdot)$ of the sensor’s estimated coordinate frame with respect to the inertial coordinate frame of the search space, that corrects all correspondence errors without introducing any new correspondence errors.
Definition 15. (Visibility) Cell $i$ is visible for state $v_n$, if it falls in the sensor’s field of view and satisfies a set of necessary conditions for localizing a target centered in $i$, that only depend on the coordinates of a point in $i$ and the depth map of $\mu_{v_n}$ with respect to the sensor coordinate frame.
Definition 16. (Good Single-View Recognition) We have good single-view recognition at step $n$ if $p(\mu_{v_n}|c_{i_t})$ is not affected by changes to the inertial coordinate frame. Also, under dead-reckoning errors, $p(\mu_{v_n}|c_i) \geq p(\mu_{v_n}|\neg c_i)$ for all target map distributions at step $n-1$ iff $i \in \hat{V}(v_n)$ and $RT(i_t) = i$, or, $i \notin \hat{V}(v_n)$ and $RT(i_t) \notin \hat{V}(v_n)$.
$RT(i_t)$ denotes the cell containing the transformation of the target’s centroid under $RT(\cdot)$ (Def.14). $p(\mu_{v_n}|\neg c_i)$ is defined in Sec.4.2. $V(v_n)$ is the ground truth of visible cells for $\mu_{v_n}$, $v_n$ and no correspondence errors, while $\hat{V}(v_n)$ denotes the calculated visible cells based on our estimate of the sensor coordinate frame and under no guarantee of perfect correspondences. Under perfect correspondences $\hat{V}(v_n) = V(v_n)$, but the converse does not hold. For good single-view recognition, as the correspondence errors increase, it is more likely that $p(\mu_{v_n}|c_{i_t}) < p(\mu_{v_n}|\neg c_{i_t})$. Def.16 implies that if $i_1,i_2 \notin \hat{V}(v_n)$, $i_3 \in \hat{V}(v_n)$ and $RT(i_t) \notin \hat{V}(v_n)$, $p(\mu_{v_n}|c_{i_1}) = p(\mu_{v_n}|c_{i_2})$ and $p(\mu_{v_n}|c_{i_3}) < p(\mu_{v_n}|c_{i_1})$. Also, if $RT(i_t) \in \hat{V}(v_n)$, $p(\mu_{v_n}|c_{RT(i_t)}) > p(\mu_{v_n}|c_j)$ $\forall j \neq RT(i_t)$ (see Sec. 4.2).
Theorem 2. (Detection Tradeoff)
Assume $i_t \in V(v_1),\ldots,i_t \in V(v_n)$. Assume a uniform target map prior and good single-view recognition. Let $X_i^{(n)}$, $Y_i^{(n)}$ denote Bernoulli random variables with probability of success $p(c_i|\mu_{v_n},\ldots,\mu_{v_1})$, $p(c_i|\mu_{v_n})$ respectively. Detection at step $n$ is based on $\max_{j \in \hat{V}(v_n)} E(X_j^{(n)})$ or $\max_{j \in \hat{V}(v_n)} E(Y_j^{(n)})$ being above a given threshold.
(i) Given $v_n$, $\mu_{v_n}$, single-view detection at step $n$ is independent of dead-reckoning errors.
(ii) If $p(\mu_{v_n}|c_i) \leq p(\mu_{v_n}|\neg c_i)$, $E(X_i^{(n)}) \leq E(X_i^{(n-1)})$.
(iii) If $p(\mu_{v_n}|c_i) \geq p(\mu_{v_n}|\neg c_i)$, $E(X_i^{(n)}) \geq E(X_i^{(n-1)})$.
(iv) If $\hat{i}_t = \hat{j}_t$, $\hat{i}_t = \arg\max_{j \in C} E(Y_j^{(n)})$ and $\hat{j}_t = \arg\max_{j \in C} E(Y_j^{(n-1)})$, $E(X_{\hat{i}_t}^{(n)}) \geq E(X_{\hat{j}_t}^{(n-1)})$.
(v) If $\hat{i}_t \neq \hat{j}_t$, $\hat{i}_t = \arg\max_{j \in C} E(Y_j^{(n)})$ and $\hat{j}_t = \arg\max_{j \in C} E(Y_j^{(n-1)})$, then it is not necessarily the case that $E(X_{\hat{i}_t}^{(n)}) \geq E(X_{\hat{j}_t}^{(n-1)})$.
Case (iv) shows that with good correspondences, detection based on fusing multiple views becomes more reliable than single-view detection (since $\hat{i}_t,\hat{j}_t \in \hat{V}(v_n)$). Case (v) shows that under dead-reckoning errors, there is an increased likelihood that fusing multiple views will lead to more false negative detections (since $\hat{i}_t,\hat{j}_t \in \hat{V}(v_n)$), and thus, single-view detection (case (i)) might be preferable when dead-reckoning errors occur. Despite the strong assumption of Def.16, correspondence or dead-reckoning errors make the detection problem significantly harder.
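A small numeric example may help with case (v). The numbers below are invented for illustration only: a four-cell target map with a uniform prior, a first view whose single-view likelihood peaks at cell 1, and a second view whose peak is assumed to have shifted to cell 2 because of a correspondence error. The helper `update` applies the simplified Bayesian update and is written just for this sketch.

```python
def update(prior, likelihood):
    """Simplified Bayesian update of the target map (posterior ~ prior * likelihood)."""
    joint = [p * l for p, l in zip(prior, likelihood)]
    total = sum(joint)
    return [j / total for j in joint]


prior = [0.25] * 4
view_1 = [0.1, 0.7, 0.1, 0.1]  # assumed p(mu_v1 | c_j): peak at cell 1
view_2 = [0.1, 0.1, 0.7, 0.1]  # assumed p(mu_v2 | c_j): peak shifted to cell 2

fused_1 = update(prior, view_1)     # multiple-view target map after step 1
fused_2 = update(fused_1, view_2)   # multiple-view target map after step 2
single_2 = update(prior, view_2)    # single-view target map at step 2

# Under a uniform prior the single-view arg max equals the likelihood arg max.
j_hat = max(range(4), key=view_1.__getitem__)  # best single-view cell at step 1
i_hat = max(range(4), key=view_2.__getitem__)  # best single-view cell at step 2

print(fused_1[j_hat], fused_2[i_hat], single_2[i_hat])
```

Here the two single-view winners differ, and the fused probability of the step-2 winner is lower than the fused probability of the step-1 winner was, while the single-view probability at step 2 is unaffected. A detection threshold that the first fused map passed can therefore be missed after the second view.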
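Definition 17. (Dual Support) Let $x_i^{(n)} \triangleq p(c_i|\mu_{v_n},\ldots,\mu_{v_1})$. A single-view recognition algorithm has dual support at step $n$ if $\forall i$, $x_i^{(n)} \notin [\frac{1}{e},\frac{1}{2}]$. Equivalently $\forall i$,
$$\frac{p(\mu_{v_n}|\neg c_i)}{p(\mu_{v_n}|c_i)} > \frac{x_i^{(n-1)}}{1-x_i^{(n-1)}}(e-1) \quad \text{or} \quad \frac{p(\mu_{v_n}|c_i)}{p(\mu_{v_n}|\neg c_i)} > \frac{1-x_i^{(n-1)}}{x_i^{(n-1)}}.$$
Definition 18. (Flipped Cells) We say that there exist flipped cells at step $n$, if there exist two cells $i_1$, $i_2$, such that $x_{i_1}^{(n-1)} > \frac{1}{2}$, $x_{i_2}^{(n-1)} < \frac{1}{2}$, $x_{i_1}^{(n)} = x_{i_1}^{(n-1)} - x_1 < \frac{1}{2}$, $x_{i_2}^{(n)} = x_{i_2}^{(n-1)} + x_2 > \frac{1}{2}$ for positive $x_1$, $x_2$.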
Under Def.16 and a uniform target map prior, flipped
cells can only occur due to correspondence errors.
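The two definitions above translate directly into predicates on a target map. The following sketch assumes that each cell is updated through the two-hypothesis form $x^{(n)} = \frac{x a}{x a + (1-x) b}$, with $a = p(\mu_{v_n}|c_i)$, $b = p(\mu_{v_n}|\neg c_i)$ and $x = x_i^{(n-1)}$, and that $a, b > 0$ and $0 < x < 1$. The function names are hypothetical.

```python
from math import e


def posterior_step(x_prev, lik_target, lik_other):
    """x^(n) from x^(n-1), given a = p(mu | c_i) and b = p(mu | not c_i)."""
    return x_prev * lik_target / (x_prev * lik_target + (1 - x_prev) * lik_other)


def has_dual_support(x_prev, lik_target, lik_other):
    """Dual support for one cell: the updated probability avoids [1/e, 1/2]."""
    x_new = posterior_step(x_prev, lik_target, lik_other)
    return not (1 / e <= x_new <= 0.5)


def has_dual_support_ratio(x_prev, lik_target, lik_other):
    """Equivalent likelihood-ratio form of the dual support condition."""
    odds = x_prev / (1 - x_prev)
    return (lik_other / lik_target > (e - 1) * odds
            or lik_target / lik_other > 1 / odds)


def flipped_cells(prev_map, new_map):
    """Return (i1, i2) pairs where i1 drops below 1/2 and i2 rises above 1/2."""
    dropped = [i for i, (p, q) in enumerate(zip(prev_map, new_map)) if p > 0.5 > q]
    raised = [i for i, (p, q) in enumerate(zip(prev_map, new_map)) if p < 0.5 < q]
    return [(i1, i2) for i1 in dropped for i2 in raised]
```

The direct test and the ratio test agree away from the interval endpoints; at the endpoints the strict inequalities in the ratio form and the closed interval in the direct form describe the same boundary, but floating point rounding may make the two functions disagree there.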
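Definition 19. (Boundary Constraints) We say that the cells in a set $S$ satisfy the boundary constraints at step $n$ if for each $i \in S$, $p(\mu_{v_n}|c_i) < p(\mu_{v_n}|\neg c_i)$ and
$$p(c_i|\mu_{v_{n-1}},\ldots,\mu_{v_1}) < \frac{p(\mu_{v_n}|\neg c_i) - \sqrt{p(\mu_{v_n}|c_i)\,p(\mu_{v_n}|\neg c_i)}}{p(\mu_{v_n}|\neg c_i) - p(\mu_{v_n}|c_i)},$$
or, $p(\mu_{v_n}|c_i) > p(\mu_{v_n}|\neg c_i)$ and
$$p(c_i|\mu_{v_{n-1}},\ldots,\mu_{v_1}) > \frac{p(\mu_{v_n}|\neg c_i) - \sqrt{p(\mu_{v_n}|c_i)\,p(\mu_{v_n}|\neg c_i)}}{p(\mu_{v_n}|\neg c_i) - p(\mu_{v_n}|c_i)}.$$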
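The boundary constraint can be checked numerically for a single cell. In the sketch below, $a$ and $b$ stand for $p(\mu_{v_n}|c_i)$ and $p(\mu_{v_n}|\neg c_i)$, the cell is assumed to be updated as $x^{(n)} = \frac{x a}{x a + (1-x) b}$, and the function names are hypothetical. The second function tests the property that the proof of Theorem 3 associates with the constraint, namely a strictly smaller Bernoulli variance after the update.

```python
from math import sqrt


def boundary_threshold(a, b):
    """(b - sqrt(a*b)) / (b - a), which equals sqrt(b) / (sqrt(a) + sqrt(b)); needs a != b."""
    return (b - sqrt(a * b)) / (b - a)


def satisfies_boundary_constraint(x_prev, a, b):
    """Boundary constraint for one cell with prior probability x_prev."""
    if a < b:
        return x_prev < boundary_threshold(a, b)
    if a > b:
        return x_prev > boundary_threshold(a, b)
    return False  # a == b leaves the cell probability unchanged


def variance_decreases(x_prev, a, b):
    """True iff the Bernoulli variance x(1-x) is strictly smaller after the update."""
    x_new = x_prev * a / (x_prev * a + (1 - x_prev) * b)
    return x_new * (1 - x_new) < x_prev * (1 - x_prev)
```

For example, with $a = 0.2$ and $b = 0.8$ the threshold is $2/3$: a cell at $0.3$ is pushed further below one half and its variance shrinks, whereas a cell at $0.8$ is pulled to one half and its variance grows.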
Theorem 3. (Localization Tradeoff)
Assume $C$ satisfies the boundary constraints at step $n$. Also assume a uniform prior distribution for the target map. Define $d_i^{(n)} \triangleq x_i^{(n-1)} - x_i^{(n)}$ and $r_{i,k}^{(n)} \triangleq \frac{d_i^{(n)}}{\sum_{j \neq k} d_j^{(n)}}$.
(i) Assume there are no flipped cells at step $n$ and $\forall i$ $x_i^{(n-1)} \leq \frac{1}{2}$. Then, there exists a cell $i_1$ for which $x_{i_1}^{(n)} > \frac{1}{2}$. Furthermore, if $x_{i_1}^{(n-1)} > \prod_{i \neq i_1} (x_i^{(n-1)})^{r_{i,i_1}^{(n)}}$, the target map entropy at step $n$ is smaller than it is at step $n-1$.
(ii) If $x_{i_1}^{(n-1)} > \frac{1}{2}$ for some cell $i_1$, there exists a cell $j_1$, which does not have to equal $i_1$, such that $x_{j_1}^{(n)} > \frac{1}{2}$.
(iii) If there are no flipped cells at step $n$ and there exists a cell $i_1$ satisfying $x_{i_1}^{(n-1)} > \frac{1}{2}$ and $x_{i_1}^{(n)} > \frac{1}{2}$, then, the target map entropy at step $n$ is smaller than it is at step $n-1$.
(iv) If there exist flipped cells $i_1$, $i_2$ at step $n$, the condition $x_1,x_2 > x_{i_1}^{(n-1)} - x_{i_2}^{(n-1)}$ (see Def.18) and single-view recognition with dual support, guarantees that the target map entropy at step $n$ is smaller than it is at step $n-1$.
Any termination condition based on probability thresholding (e.g., Def.7), requires a decreasing target map entropy. The above theorem quantifies a set of sufficient properties of the recognition algorithm, under which, multiple-view localization leads to a decreasing entropy and therefore, after a certain number of steps, a smaller target map entropy than that of a single-view. Theorem 3 lists all possible target map behaviours under the boundary constraints. If we also assume good single-view recognition and that no correspondence errors exist, Theorem 3 defines a set of sufficient properties of the single-view recognition algorithm so that multiple-view recognition leads to a decreasing target map entropy and a smaller bias and variance in the target’s localization at each step. Without the boundary constraints, we have no guarantee of a decreasing entropy. Under good single-view recognition and a uniform target map prior, flipped cells are the result of correspondence errors, implying a possible increased target map entropy and bias in the target localization. Theorem 3 shows that without good single-view recognition, it is possible to have a decreasing target map entropy and an increasing bias in the estimated target position, exemplifying the difficulty of the problem.
Case (i) shows that if $x_{i_1}^{(n-1)}$ is the maximum probability amongst all cells at step $n-1$, the entropy decreases at the next step. It also shows that as the probabilities of the other cells relative to $x_{i_1}^{(n-1)}$ decrease, or as the relative weights $r_i^{(n)}$ for smaller probabilities increase (by decreasing their respective probabilities from step $n-1$ to step $n$, more than the other cells), it becomes more likely that the entropy will decrease in the next step. Case (ii) shows that a localization threshold of over $\frac{1}{2}$ easily leads to biased results under dead-reckoning or correspondence errors. Case (iv) is applicable when the correspondence errors increase and shows that more stringent requirements on the recognition algorithm can compensate for such errors and guarantee a decrease in the entropy (by requiring dual support and $x_1,x_2 > x_{i_1}^{(n-1)} - x_{i_2}^{(n-1)}$). No such requirement is needed in case (iii), which assumes that no flipped cells exist.
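The entropy statements of Theorem 3 can be explored numerically. The sketch below takes the sufficient condition of case (i) to be the weighted geometric-mean inequality $x_{i_1}^{(n-1)} > \prod_{i \neq i_1} (x_i^{(n-1)})^{r_{i,i_1}^{(n)}}$, which is an interpretation adopted for this example. It assumes that every cell other than $i_1$ strictly loses probability, so the weights are well defined, and it does not check the boundary constraints or the absence of flipped cells. The function names are hypothetical.

```python
from math import log2, prod


def entropy(target_map):
    """Shannon entropy (in bits) of a target map; zero cells contribute nothing."""
    return -sum(p * log2(p) for p in target_map if p > 0)


def case_i_condition(prev_map, new_map, i1):
    """Check x_i1^(n-1) > prod_{i != i1} (x_i^(n-1)) ** r_{i,i1}^(n)."""
    others = [i for i in range(len(prev_map)) if i != i1]
    d = {i: prev_map[i] - new_map[i] for i in others}  # d_i^(n) for i != i1
    denom = sum(d.values())                            # sum over j != i1 of d_j^(n)
    weighted_geo_mean = prod(prev_map[i] ** (d[i] / denom) for i in others)
    return prev_map[i1] > weighted_geo_mean


prev_map = [0.4, 0.3, 0.3]  # no cell above 1/2 at step n-1
new_map = [0.6, 0.2, 0.2]   # cell 0 crosses 1/2 at step n

print(case_i_condition(prev_map, new_map, 0))
print(entropy(prev_map), entropy(new_map))
```

In this toy map the weighted geometric mean of the other cells is 0.3, the condition holds for cell 0, and the entropy drops from step $n-1$ to step $n$, as the theorem leads one to expect.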
4.2. Proof of Theorem 2
Let $p(\mu_{v_n}|\neg c_j) \triangleq \frac{\sum_{i \neq j} p(\mu_{v_n}|c_i)\,p(c_i|\mu_{v_{n-1}},\ldots,\mu_{v_1})}{\sum_{i \neq j} p(c_i|\mu_{v_{n-1}},\ldots,\mu_{v_1})}$ (we assume a non-zero denominator). Since we only have dead-reckoning errors in (i), $\exists \hat{j}_t \in \hat{V}(v_n)$ such that $RT(i_t) = \hat{j}_t$. Thus $p(\mu_{v_n}|c_{\hat{j}_t}) > p(\mu_{v_n}|\neg c_{\hat{j}_t})$ regardless of $p(c_i|\mu_{v_{n-1}},\ldots,\mu_{v_1})$ $\forall i \neq \hat{j}_t$, because if $p(\mu_{v_n}|c_{\hat{j}_t}) \geq p(\mu_{v_n}|\neg c_{\hat{j}_t})$ but not $p(\mu_{v_n}|c_{\hat{j}_t}) > p(\mu_{v_n}|\neg c_{\hat{j}_t})$ for all target map distributions at step $n-1$, there exists a cell $j \neq \hat{j}_t$ such that $p(\mu_{v_n}|c_j) = p(\mu_{v_n}|c_{\hat{j}_t})$ and thus $p(\mu_{v_n}|c_j) \geq p(\mu_{v_n}|\neg c_j)$ for all target maps, contradicting Def.16. Thus $\hat{j}_t = \arg\max_{j \in \hat{V}(v_n)} p(\mu_{v_n}|c_j) = \arg\max_{j \in \hat{V}(v_n)} E(Y_j^{(n)})$ (because of the uniform prior). Thus $\max_{j \in \hat{V}(v_n)} E(Y_j^{(n)}) = E(Y_{RT(i_t)}^{(n)})$ and since we have assumed to know the values of $v_n$, $\mu_{v_n}$, any change in the dead-reckoning errors is equivalent to a change to the inertial coordinate frame and potentially to the label $RT(i_t)$ assigned to the structure represented by cell $i_t$, which does not affect $E(Y_{RT(i_t)}^{(n)})$, thus proving (i). Notice that $E(X_i^{(n)}) \leq E(X_i^{(n-1)}) \Leftrightarrow \frac{p(\mu_{v_n}|c_i)}{\sum_{a \in \{c_i,\neg c_i\}} p(a|\mu_{v_{n-1}},\ldots,\mu_{v_1})\,p(\mu_{v_n}|a)} \leq 1$ which in conjunction with Lemma 2 below, proves (ii). The proof of case (iii) is similar to that of case (ii) and we leave it as an exercise. Case (iv) follows trivially from case (iii). Case (v) follows since $E(X_{\hat{i}_t}^{(n-1)})$ can be arbitrarily small and $E(X_{\hat{i}_t}^{(n)})$ is proportional to $E(X_{\hat{i}_t}^{(n-1)})$.
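Lemma 2. Let $g(x,\alpha,\beta) = \frac{\alpha}{\alpha x + \beta(1-x)}$ with $0 \leq \alpha,\beta,x \leq 1$ such that $\alpha x + \beta(1-x) \neq 0$. Then $g(x,\alpha,\beta) \leq 1$ iff $\beta > \alpha$ or $x = 1$ or $\alpha = \beta$.
Proof. Notice that $g(x,\alpha,\beta) \leq 1 \Leftrightarrow \alpha - \beta \leq (\alpha - \beta)x$. If $\alpha < \beta$, then $g(x,\alpha,\beta) \leq 1$ holds iff $x \leq 1$ which we know is always true. If $\alpha > \beta$, then $g(x,\alpha,\beta) \leq 1$ holds iff $x = 1$. If $\alpha = \beta$, then $g(x,\alpha,\beta) = 1$ which proves the lemma.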
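Lemma 2 is easy to sanity-check by exhaustive evaluation on a rational grid. The sketch below is only a finite check, not a proof; the grid resolution and the function names are choices made for this example, and exact `Fraction` arithmetic is used so that the boundary cases $x = 1$ and $\alpha = \beta$ are not blurred by rounding.

```python
from fractions import Fraction
from itertools import product


def g(x, alpha, beta):
    """g(x, alpha, beta) = alpha / (alpha*x + beta*(1 - x))."""
    return alpha / (alpha * x + beta * (1 - x))


def lemma2_predicate(x, alpha, beta):
    """Right-hand side of Lemma 2: beta > alpha, or x = 1, or alpha = beta."""
    return beta > alpha or x == 1 or alpha == beta


def check_lemma2(steps=10):
    """Compare g <= 1 with the predicate on every grid point of [0, 1]^3."""
    grid = [Fraction(k, steps) for k in range(steps + 1)]
    for x, alpha, beta in product(grid, repeat=3):
        if alpha * x + beta * (1 - x) == 0:
            continue  # excluded by the lemma's non-zero denominator assumption
        if (g(x, alpha, beta) <= 1) != lemma2_predicate(x, alpha, beta):
            return False
    return True
```

Calling `check_lemma2()` visits every grid point with a non-zero denominator and reports whether the two sides of the "iff" agree everywhere on the grid.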
4.3. Proof of Theorem 3
To simplify certain arguments, we assume that no cell
ever takes a value of zero. Let X(n)
random variable with probability of success x(n)
p(ci|µvn,...,µv1). By Lemma 3, the boundary constraint
assumption of Theorem 3 is equivalent to V ar(X(n)
V ar(X(n−1)
i
) ∀i ∈ C. Since the variance of a Bernoulli(p)
random variable is equal to p(1 − p), it is maximized at
p =1
that when the variance of X(n)
i
|x(n−1)
all cells i, there exists exactly one cell i1at step n with
x(n)
i1
>
not have decreased and maintained a sum of one across all
target map cells. This proves the first half of Theorem 3(i).
One of the following conditions must hold at each step n:
(1): ∀i, x(n−1)
that satisfies x(n)
i1
>1
(2): There exist two cells i1, i2such that x(n−1)
x(n−1)
i2
<1
i1
>1
i2
i
denote a Bernoulli
i
?
i
) <
2andit is alsosymmetricaroundp =1
2, whichimplies
has decreased, |x(n)
i
) < V ar(X(n−1)
i
−1
2| >
) for
i
−
1
2|.Since V ar(X(n)
i
1
2, since otherwise, the variance of all cells could
i
≤1
2.
2and there exists exactly one cell i1
i1
>
1
2,
2, x(n)
2, x(n−1)
<1
2.
(3): $x^{(n-1)}_{i_1} > \frac{1}{2}$, $x^{(n-1)}_{i_2} < \frac{1}{2}$, $x^{(n)}_{i_1} < \frac{1}{2}$, $x^{(n)}_{i_2} > \frac{1}{2}$.
Assume condition (1) applies. We now prove the second half of Theorem 3(i). For notational simplicity we index the $|C|-1$ cells that are not equal to $i_1$ by the set $\{1,2,\ldots,|C|-1\}$. Let $g(p) \triangleq -p\lg(p)$. We want to show that
$$g(x^{(n-1)}_{i_1}) + \sum_{i=1}^{|C|-1} g(x^{(n-1)}_i) > g(x^{(n)}_{i_1}) + \sum_{i=1}^{|C|-1} g(x^{(n)}_i)$$
or equivalently
$$\sum_{i=1}^{|C|-1} \left[ g(x^{(n-1)}_i) - g(x^{(n-1)}_i - d^{(n)}_i) \right] > g(x^{(n-1)}_{i_1} + x^{(n)}) - g(x^{(n-1)}_{i_1})$$
where $x^{(n)} \triangleq \sum_{i=1}^{|C|-1} d^{(n)}_i$. Notice that because of the boundary constraint and Lemma 3, $d^{(n)}_i > 0$ for $i \neq i_1$ and $x^{(n)}_{i_1} = x^{(n-1)}_{i_1} + x^{(n)}$ since the target map cells have to sum to one at step $n$. By the Mean Value Theorem, for each $i \in \{1,\ldots,|C|-1\}$, $\exists z_i \in [x^{(n-1)}_i - d^{(n)}_i, x^{(n-1)}_i]$ such that $g(x^{(n-1)}_i) - g(x^{(n-1)}_i - d^{(n)}_i) = d^{(n)}_i g'(z_i)$ and $\exists z \in [x^{(n-1)}_{i_1}, x^{(n-1)}_{i_1} + x^{(n)}]$ such that $g(x^{(n-1)}_{i_1} + x^{(n)}) - g(x^{(n-1)}_{i_1}) = x^{(n)} g'(z)$. Notice that $\sum_{i=1}^{|C|-1} r^{(n)}_{i,i_1} = 1$ and $g'(p) = -\frac{\log(p)}{\log(2)} - \frac{1}{\log(2)}$. This in turn implies that the entropy decreases if and only if $\sum_{i=1}^{|C|-1} r^{(n)}_{i,i_1} g'(z_i) > g'(z)$, that is, if and only if $\prod_{i=1}^{|C|-1} z_i^{\,r^{(n)}_{i,i_1}} < z$. But since $\prod_{i=1}^{|C|-1} z_i^{\,r^{(n)}_{i,i_1}} \leq \prod_{i=1}^{|C|-1} (x^{(n-1)}_i)^{r^{(n)}_{i,i_1}}$ and $x^{(n-1)}_{i_1} \leq z$, a sufficient condition for a decrease in the entropy is $x^{(n-1)}_{i_1} > \prod_{i=1}^{|C|-1} (x^{(n-1)}_i)^{r^{(n)}_{i,i_1}}$. This proves (i).
The proof of part (ii) of the theorem follows, since if $x^{(n)}_i \leq \frac{1}{2}$ for all cells $i \in C$, then the probability of cell $i_1$ has decreased at step $n$ ($x^{(n)}_{i_1} \leq \frac{1}{2} < x^{(n-1)}_{i_1}$) and for at least one cell $i_2$, $x^{(n-1)}_{i_2} < x^{(n)}_{i_2} \leq \frac{1}{2}$ so that all cell probabilities sum to one at step $n$. But this contradicts the monotonically decreasing variances implied by Lemma 3, proving (ii).
If condition (2) holds, by a recursive application of Lemma 4 (by setting $\gamma = \frac{1}{2}$), we see that $-\sum_{i\in C} x^{(n)}_i \lg(x^{(n)}_i) < -\sum_{i\in C} x^{(n-1)}_i \lg(x^{(n-1)}_i)$ as desired. This proves part (iii) of the theorem.
For the proof of part (iv) of the theorem, condition (3) applies. Notice that $g(p)$ is monotonically increasing on $(0,\frac{1}{e}]$ and monotonically decreasing on $(\frac{1}{2},1]$. Since we have assumed $x_1, x_2 > x^{(n-1)}_{i_1} - x^{(n-1)}_{i_2}$, it suffices to show that $g(x^{(n-1)}_{i_1}) + g(x^{(n-1)}_{i_2}) > g(x^{(n)}_{i_1}) + g(x^{(n)}_{i_2})$ (since the probabilities of all cells $i \neq i_1, i_2$ have decreased and we assume dual support). Equivalently, we want to show that
$$g(x^{(n-1)}_{i_1}) + g(x^{(n-1)}_{i_2}) > g(x^{(n-1)}_{i_1} - x_1) + g(x^{(n-1)}_{i_2} + x_2).$$
But since $x^{(n-1)}_{i_2} > x^{(n-1)}_{i_1} - x_1$, $x^{(n-1)}_{i_1} < x^{(n-1)}_{i_2} + x_2$ and we have assumed dual support ($x^{(n-1)}_{i_1} > \frac{1}{2}$, $x^{(n-1)}_{i_2} < \frac{1}{e}$), we have proven part (iv) of the theorem.
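The link between per-cell variance and target map entropy used in this proof can be illustrated with a small numerical example. This is a minimal Python sketch under stated assumptions, not the authors' implementation: the helper names and the two target maps `before` and `after` are hypothetical, chosen so that one cell stays above 1/2 while probability mass moves to it from the remaining cells, as in condition (2).

```python
import math


def entropy(target_map):
    """Shannon entropy (in bits) of a target map whose cells sum to one."""
    return -sum(p * math.log2(p) for p in target_map if p > 0)


def bernoulli_variances(target_map):
    """Variance p*(1 - p) of each cell's Bernoulli random variable."""
    return [p * (1.0 - p) for p in target_map]


def all_variances_decrease(before, after):
    """True if every cell's Bernoulli variance is strictly smaller in `after`."""
    pairs = zip(bernoulli_variances(before), bernoulli_variances(after))
    return all(v_new < v_old for v_old, v_new in pairs)


# Hypothetical target maps at steps n-1 and n (illustrative values only).
before = [0.6, 0.3, 0.1]
after = [0.8, 0.15, 0.05]

variances_ok = all_variances_decrease(before, after)
entropy_drop = entropy(before) - entropy(after)
```

For these illustrative values every cell's variance decreases and the entropy difference is positive, in line with part (iii); the sketch does not verify the theorem in general.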
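Lemma 3. $Var(X^{(n)}_i) < Var(X^{(n-1)}_i)$ if and only if
$p(\mu_{v_n}|c_i) < p(\mu_{v_n}|\neg c_i)$ and
$$p(c_i|\mu_{v_{n-1}},\ldots,\mu_{v_1}) < \frac{p(\mu_{v_n}|\neg c_i) - \sqrt{p(\mu_{v_n}|c_i)\,p(\mu_{v_n}|\neg c_i)}}{p(\mu_{v_n}|\neg c_i) - p(\mu_{v_n}|c_i)}$$
or $p(\mu_{v_n}|c_i) > p(\mu_{v_n}|\neg c_i)$ and
$$p(c_i|\mu_{v_{n-1}},\ldots,\mu_{v_1}) > \frac{p(\mu_{v_n}|\neg c_i) - \sqrt{p(\mu_{v_n}|c_i)\,p(\mu_{v_n}|\neg c_i)}}{p(\mu_{v_n}|\neg c_i) - p(\mu_{v_n}|c_i)}$$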
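Proof. For notational simplicity, let $\alpha = p(\mu_{v_n}|c_i)$, $\beta = p(\mu_{v_n}|\neg c_i)$ and $x = p(c_i|\mu_{v_{n-1}},\ldots,\mu_{v_1})$. Notice that $Var(X^{(n)}_i) = E(X^{(n)}_i)(1 - E(X^{(n)}_i))$, $E(X^{(n)}_i) = x\frac{\alpha}{\alpha x + \beta(1-x)}$ and $1 - E(X^{(n)}_i) = (1-x)\frac{\beta}{\alpha x + \beta(1-x)}$. Thus,
$$\frac{Var(X^{(n)}_i)}{Var(X^{(n-1)}_i)} < 1 \Leftrightarrow \frac{\alpha\beta}{(\alpha x + \beta(1-x))^2} < 1 \Leftrightarrow 0 < (\alpha-\beta)^2 x^2 + 2\beta(\alpha-\beta)x + \beta(\beta-\alpha) \triangleq g(x). \qquad (3)$$
The zeros of $g(x)$ are $x = \frac{\beta \pm \sqrt{\alpha\beta}}{\beta - \alpha}$. Notice that the zeros of $g(x)$ are independent of changes in $\alpha$, $\beta$ that retain the ratio of $\alpha$ to $\beta$. Since $g(x)$ is a quadratic function of $x$, we can determine the range of values that satisfy (3): If $\alpha < \beta$, $g(x)$ is concave up, $\frac{\beta + \sqrt{\alpha\beta}}{\beta - \alpha} \geq 1$, and $\frac{\beta - \sqrt{\alpha\beta}}{\beta - \alpha} \leq \frac{\beta + \sqrt{\alpha\beta}}{\beta - \alpha}$, which implies that (3) is satisfied iff $x < \frac{\beta - \sqrt{\alpha\beta}}{\beta - \alpha}$ (see the graph of $f_2(\alpha)$ in Fig.(1)). If $\alpha > \beta$, $g(x)$ is again concave up, $\frac{\beta + \sqrt{\alpha\beta}}{\beta - \alpha} \leq 0$, and $\frac{\beta + \sqrt{\alpha\beta}}{\beta - \alpha} \leq \frac{\beta - \sqrt{\alpha\beta}}{\beta - \alpha}$, which implies that (3) is satisfied iff $x > \frac{\beta - \sqrt{\alpha\beta}}{\beta - \alpha}$ (see the graph of $f_1(\beta)$ in Fig.(1)). The lemma shows that there exists a set of achievable probability values, based on a relationship between the quality of the discriminative and single-view generative probabilities, that is sufficient to decrease the variance of each cell's Bernoulli distribution. In conjunction with Theorem 3, it demonstrates the strong relationship between each cell's variance and the target map entropy.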
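The threshold in Lemma 3 can be checked against a direct Bayes update. The following is a minimal Python sketch, assuming the same shorthand as the proof (`alpha`, `beta`, `x`); the function names are hypothetical and the numeric values are illustrative only.

```python
import math


def bayes_update(x, alpha, beta):
    """Posterior cell probability given prior x and likelihoods alpha, beta."""
    return alpha * x / (alpha * x + beta * (1.0 - x))


def variance_threshold(alpha, beta):
    """Zero (beta - sqrt(alpha*beta)) / (beta - alpha) of g(x); needs alpha != beta."""
    return (beta - math.sqrt(alpha * beta)) / (beta - alpha)


def variance_decreases(x, alpha, beta):
    """True if the Bernoulli variance x*(1 - x) shrinks after the update."""
    x_new = bayes_update(x, alpha, beta)
    return x_new * (1.0 - x_new) < x * (1.0 - x)


# alpha < beta: the variance decreases iff the prior is below the threshold.
t_low = variance_threshold(0.2, 0.8)   # (0.8 - 0.4) / 0.6, i.e. 2/3
# alpha > beta: the variance decreases iff the prior is above the threshold.
t_high = variance_threshold(0.8, 0.2)  # (0.2 - 0.4) / (-0.6), i.e. 1/3
```

With these values, a prior of 0.5 decreases the variance in both cases, whereas a prior of 0.8 (first case) or 0.2 (second case) does not, consistent with the two branches of the lemma.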
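Lemma 4. Let $g(p) \triangleq -p\lg(p)$, where $0 < p \leq 1$. Let $\gamma \in (0,1)$, $0 < \alpha < \gamma < \beta \leq 1$, $0 < x < \alpha$ and $0 < x \leq 1-\beta$. Then $g(\alpha) + g(\beta) > g(\alpha - x) + g(\beta + x)$.
Proof. Notice that $g(\alpha) + g(\beta) > g(\alpha - x) + g(\beta + x) \Leftrightarrow g(\alpha) - g(\alpha - x) > g(\beta + x) - g(\beta) \Leftrightarrow \frac{g(\alpha) - g(\alpha - x)}{x} > \frac{g(\beta + x) - g(\beta)}{x}$. By the Mean Value Theorem, there exist $\alpha' \in [\alpha - x, \alpha]$ and $\beta' \in [\beta, \beta + x]$ such that $g'(\alpha') = \frac{g(\alpha) - g(\alpha - x)}{x}$ and $g'(\beta') = \frac{g(\beta + x) - g(\beta)}{x}$. But since $g'(p) = -\frac{\log(p)}{\log(2)} - \frac{1}{\log(2)}$ is a decreasing function and $\forall \alpha'' \in [\alpha - x, \alpha]$ $\forall \beta'' \in [\beta, \beta + x]$, $g'(\alpha'') > g'(\beta'')$, we have proven the lemma.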
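The inequality of Lemma 4 can be evaluated directly for a concrete choice of values. This is a minimal Python sketch, not part of the original report; the function names are hypothetical and the values of `alpha`, `beta`, `x` are illustrative, chosen to satisfy the hypotheses with $\gamma = \frac{1}{2}$.

```python
import math


def g(p):
    """Entropy term -p * lg(p) for 0 < p <= 1."""
    return -p * math.log2(p)


def lemma4_gap(alpha, beta, x):
    """g(alpha) + g(beta) minus g(alpha - x) + g(beta + x).

    Lemma 4 states that this gap is positive when
    0 < alpha < gamma < beta <= 1, 0 < x < alpha and x <= 1 - beta.
    """
    return (g(alpha) + g(beta)) - (g(alpha - x) + g(beta + x))


# Moving mass x from the smaller cell (alpha) to the larger cell (beta).
gap = lemma4_gap(alpha=0.3, beta=0.6, x=0.1)
```

For these values the gap is positive, meaning the two-cell contribution to the entropy shrinks when mass moves from the smaller probability to the larger one.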
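[Figure 1: plot with $\alpha$, $\beta$ on the x-axis and $f_1(\beta)$, $f_2(\alpha)$ on the y-axis, both ranging over $[0,1]$. Curves: $f_1(\beta) = \frac{\beta - \sqrt{\alpha\beta}}{\beta - \alpha}$ for $\alpha > \beta$, $\alpha = 1$; $f_2(\alpha) = \frac{\beta - \sqrt{\alpha\beta}}{\beta - \alpha}$ for $\beta > \alpha$, $\beta = 1$.]
Figure 1. Graphs showing on the y-axis the zeros of g(x) for various ratios of α, β (see Ineq.(3) in Lemma 3).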
5. Discussion
A number of optimization algorithms for navigation, mapping and next-view-planning have been suggested, based on POMDPs [11], Bayesian methods [12, 13], heuristics [5, 8, 20] and greedy algorithms [15, 21] amongst others. The arguments in Sec. 3 suggest what kind of policies would lead to efficient and reliable solutions for the components of active object detection and localization systems that deal with sensor control for recognition. These include next-view-planners that use efficient approximation algorithms, or algorithms based on greedy and dynamic programming solutions to the Knapsack problem [7, 21], suggesting that a mixture of specialized optimizers, rather than a single kind of optimization, could lead to more efficient solutions, without a significant decrease in reliability.
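As an illustration of the dynamic programming approach to the Knapsack problem mentioned above, the following minimal Python sketch selects a subset of candidate views under a time budget. It is a sketch under stated assumptions and not the planner of [21]: the function `select_views`, the integer per-view time costs and the additive per-view utilities are all hypothetical.

```python
def select_views(costs, values, budget):
    """0/1 knapsack by dynamic programming.

    costs  -- integer time cost of each candidate view (assumed)
    values -- additive utility of each candidate view (assumed)
    budget -- integer maximum total search time
    Returns (best_total_value, sorted list of chosen view indices).
    """
    n = len(costs)
    # best[i][b]: best value using the first i views with total cost <= b.
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost, value = costs[i - 1], values[i - 1]
        for b in range(budget + 1):
            best[i][b] = best[i - 1][b]
            if cost <= b and best[i - 1][b - cost] + value > best[i][b]:
                best[i][b] = best[i - 1][b - cost] + value
    # Backtrack to recover which views were selected.
    chosen, b = [], budget
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            chosen.append(i - 1)
            b -= costs[i - 1]
    chosen.reverse()
    return best[n][budget], chosen


# Three hypothetical views; a budget of 4 time units.
total, views = select_views(costs=[3, 2, 2], values=[0.5, 0.3, 0.3], budget=4)
```

In this example a greedy rule that ranks views by value per unit cost would take the first view alone, while the dynamic program selects the two cheaper views for a higher total, which is one reason a mixture of optimizers may be worth considering.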
Theorems 2 and 3 suggest that single-view localization, where the updated target map is only used to guide the where-to-look-next policy, can lead to fewer false positive/negative detections, at the expense of greater localization bias when dead-reckoning errors occur. Alternatively, if we have some prior knowledge about the expected maximum dead-reckoning error (typically the main source of correspondence errors), we can define appropriate dimensions for cell $i_t$, such that the target's centroid always falls inside cell $i_t$. For example, an adaptive multiscale target map approach could be used. At each step, we could adjust the scale of the cells close to the expected target position, based on the expected dead-reckoning errors, in order to guarantee a monotonically decreasing target map entropy. This would make a termination condition based on probability thresholding (e.g., Def.7) more reliable under modest dead-reckoning errors, despite potentially increased target localization bias. At that point, target re-localization could take place within this region, to refine the target position.
6. Conclusions
We have proven that the active object localization problem and a number of its variants are NP-Hard or NP-Complete. We have studied the tradeoffs of localizing vs. detecting a target object under single-view and multiple-view recognition schemes. We have shown that a number of bias/variance/entropy relationships and tradeoffs emerge under single-view and multi-view localization and detection schemes, which depend on the quality of the recognition algorithm and the magnitudes of the correspondence or dead-reckoning errors. The results motivated a set of properties for active detection and localization algorithms.
References
[1] Aristotle. Περὶ Ψυχῆς (On the Soul). 350 B.C. 1
[2] R. Bajcsy. Active perception vs. passive perception. In IEEE
Workshop on Computer Vision Representation and Control,
Bellaire, Michigan, 1985. 1
[3] F. Callari and F. Ferrie. Active recognition: Looking for dif-
ferences. Int. J. Comput. Vision, 43(3):189–204, 2001. 1
[4] S. Dickinson, H. Christensen, J. Tsotsos, and G. Olofsson.
Active object recognition integrating attention and viewpoint
control. Comput. Vis. Image Und., 67(3):239–260, 1997. 1
[5] S. Dickinson, D. Wilkes, and J. Tsotsos. A computational
model of view degeneracy. IEEE Trans. Patt. Anal. Mach.
Intell., 21(8):673–689, August 1999. 1, 8
[6] S. Ekvall, P. Jensfelt, and D. Kragic. Integrating active mo-
bile robot object recognition and SLAM in natural environ-
ments. In Proc. Intelligent Robots and Systems, 2006. 1
[7] M. R. Garey and D. S. Johnson. Computers and Intractabil-
ity: A guide to the Theory of NP-Completeness. W.H. Free-
man and Company, 1979. 2, 3, 8
[8] T. Garvey. Perceptual strategies for purposive vision. Tech-
nical report, Nr. 117, SRI Int’l., 1976. 1, 8
[9] W. E. L. Grimson. The combinatorics of heuristic search
termination for object recognition in cluttered environments.
IEEE Trans. Patt. Anal. Mach. Intell., 13:920–935, 1991. 1
[10] D. Marr. Vision: A Computational Investigation into the Hu-
man Representation and Processing of Visual Information.
W. H. Freeman and Company, 1982. 1
[11] D. Meger, P. Forssen, K. Lai, S. Helmer, S. McCann,
T. Southey, M. Baumann, J. Little, and D. Lowe. Curious
George: An attentive semantic robot. In Proc. Robot. Auton.
Syst., 2008. 1, 8
[12] R. D. Rimey and C. M. Brown. Control of selective percep-
tion using bayes nets and decision theory. Int. J. Comput.
Vision, 12(2/3):173–207, 1994. 1, 8
[13] S. D. Roy, S. Chaudhury, and S. Banerjee. Isolated 3D object
recognition through next view planning. IEEE Trans. Syst.
Man Cybern. Part A Syst. Humans, 30(1):67–76, 2000. 1, 8
[14] W. Rudin. Principles of Mathematical Analysis. McGraw
Hill, 1976. 2
[15] F. Saidi, O. Stasse, K. Yokoi, and F. Kanehiro. Online object
search with a humanoid robot. In Proc. Intelligent Robots
and Systems, 2007. 1, 8
[16] B. Schiele and J. Crowley. Transinformation for active object
recognition. In Proc. Int. Conf. on Computer Vision, 1998. 1
[17] J. Tsotsos, Y. Liu, J. Martinez-Trujillo, M. Pomplun,
E. Simine, and K. Zhou. Attending to visual motion. Com-
put. Vis. Image Und., 100(1-2):3–40, 2005. 1
[18] J. K. Tsotsos. Analyzing vision at the complexity level. Be-
hav. Brain Sci., 13-3:423–445, 1990. 1
[19] J. K. Tsotsos. On the relative complexity of active vs. passive
visual search. Int. J. Comput. Vision, 7(2):127–141, 1992. 1
[20] L. E. Wixson and D. H. Ballard. Using intermediate objects
to improve the efficiency of visual search. Int. J. Comput.
Vision, 12(2/3):209–230, 1994. 1, 8
[21] Y. Ye. Sensor Planning in 3D Object Search. PhD thesis,
University of Toronto, 1997. 1, 3, 8