
Surface fusion: Unobtrusive tracking of everyday objects in tangible user interfaces


Olwal, A., and Wilson, A. SurfaceFusion: Unobtrusive Tracking of Everyday Objects in Tangible User Interfaces.
Proceedings of GI 2008 (The 34th Canadian Graphics Interface Conference), Windsor, Canada, May 28-30, 2008, pp. 235-242.
SurfaceFusion: Unobtrusive Tracking of Everyday
Objects in Tangible User Interfaces
Alex Olwal
School of Computer Science and Communication, KTH1
Andrew D. Wilson
Microsoft Research2
Interactive surfaces and related tangible user interfaces often in-
volve everyday objects that are identified, tracked, and augmented
with digital information. Traditional approaches for recognizing
these objects typically rely on complex pattern recognition tech-
niques, or the addition of active electronics or fiducials that alter
the visual qualities of those objects, making them less practical for
real-world use. Radio Frequency Identification (RFID) technology
provides an unobtrusive method of sensing the presence of and
identifying tagged nearby objects but has no inherent means of
determining the position of tagged objects. Computer vision, on
the other hand, is an established approach to track objects with a
camera. While shapes and movement on an interactive surface can
be determined from classic image processing techniques, object
recognition tends to be complex, computationally expensive and
sensitive to environmental conditions. We present a set of tech-
niques in which movement and shape information from the com-
puter vision system is fused with RFID events that identify what
objects are in the image. By synchronizing these two complemen-
tary sensing modalities, we can associate changes in the image
with events in the RFID data, in order to recover position, shape
and identification of the objects on the surface, while avoiding
complex computer vision processes and exotic RFID solutions.
CR Categories and Subject Descriptors: H5.2 [Information
interfaces and presentation]: User Interfaces - Graphical user interfaces.
Additional Keywords: RFID, Computer Vision, Fusion, Tangi-
ble User Interface, Tabletop, Surface Computing.
As the cost of large displays and computing hardware continues to
decline, we can expect an increasing variety of computing form
factors arrayed throughout our everyday environment. Interactive
table systems, for example, exploit large displays in combination
with sensing technologies to bring new experiences and interac-
tions to tables, desks and other horizontal surfaces. Many of these
systems augment our interactions with familiar everyday physical
objects, lending them unique capabilities in the virtual world.
The bridging of the physical and virtual can be achieved in nu-
merous ways. The DigitalDesk [21] tracks and augments paper
with an overhead projector and camera, with which the user can
interact using pens or their hands. The metaDESK [18] uses a
rear-projected surface where vision-based tracking is performed
with an IR-camera under the surface. It recognizes objects by the
2D projection of their geometry. SenseTable [9] uses electromag-
netic tablets under the surface to sense special “pucks” and sup-
plements them with front-projected graphics. Augmented Surfaces
[14] use an overhead camera to track and identify objects with
attached visual markers, and add projected imagery. PlayAny-
where [23] expands on this scenario with a portable setup, where
interaction takes place using the hands, or with objects tagged
with visual codes for identification. BlueTable [22] and Light-
Sense [8] detect and track spatially aware mobile devices on sur-
faces to enable context-sensitive feedback and interaction. In or-
der to support simultaneous tracking of multiple devices, these
systems rely on the ability of the device to actively communicate
with the system. There is also an increasing interest in developing
commercial systems that enable interaction with physical objects
on interactive tabletops [2][7][10].
Previous work demonstrates robust detection and tracking of
everyday objects on an interactive tabletop using various types of
fiducial markers and active electronics, such as electromagnetic
sensors. For widespread adoption of such interactive techniques it
is desirable to enable the use of real-world artefacts that are only
minimally modified.
In the present work we propose a lightweight, unobtrusive and
generic sensing framework that enables scenarios where physical
everyday artefacts can be augmented and associated with digital
information, while supporting intuitive direct-manipulative inter-
action. (See Figure 1.)
Our contribution is a framework that enables detection and
tracking of everyday objects without altering their appearance or
employing exhaustive learning processes. We wish to provide
mechanisms for interactive surfaces that are as general as possible, such that
they may be used in a wide range of setups. The techniques
should therefore not be dependent on exotic hardware or compli-
cated configurations.
Figure 1: Our tangible user interface allows a user to intuitively
interact with physical objects and their associated digital informa-
tion. The system fuses RFID sensing and activities detected by
simple computer vision techniques to identify and locate objects
on the table.
1 PDC, KTH, 100 44 Stockholm, Sweden.
2 One Microsoft Way, WA, USA.
We make this possible by fusing activity detection in the RFID
and computer vision domain, combined with classical correspon-
dence tracking. The work leverages the respective strengths of
RFID (identification) and computer vision (location) by combin-
ing them in a complementary way, such that arbitrary, unobtru-
sively tagged objects can be robustly sensed.
We also introduce the Frame Difference Algebra (FDA), a set
of minimal image processing operations for vision-based detec-
tion of scene changes, which allows the fusion framework to
avoid complex computer vision-based recognition techniques.
Instead, it analyzes the temporal correlation of RFID tags and
corresponding objects detected by the camera to establish the
identity of shapes in the scene. The combined use of vision and
RFID also allows us to create a set of techniques that integrate
well with existing rear-projected tabletop systems; these systems
often use diffuse projection surfaces that make imaging of fine
features difficult. In contrast to most previous work, our sensing
technology does not block the camera or projector in such setups.
We believe that by limiting the system to a simple set of tech-
niques, the potential of the fusion with RFID may be illustrated in
the most fundamental way. While we acknowledge the many so-
phisticated object recognition techniques available today, we
would like to make as few assumptions as possible about features available
from computer vision processes, preferring instead to rely more
on the fusion process.
We discuss related work in Section 2, followed by our activity
sensing techniques in Sections 3, 4 and 5. The fusion process is
described in Section 6. In Section 7, we detail how we extended
our fundamental techniques for continuous tracking, and present a
Tangible Image Exploration application to demonstrate the func-
tionality of the framework in Section 8. Finally, we provide Fu-
ture Work in Section 9 and Conclusions in Section 10.
Vision-based systems have traditionally been popular due to their
cost-effectiveness and flexibility in sensing many different types
of real objects. While scene changes can be discovered through
image processing techniques, recognition of arbitrary objects is
significantly harder, and is very limited if we would like to iden-
tify objects through a diffuse projection surface which tends to
blur fine detail. It is thus popular to simplify recognition by
applying visual code markers to the object (e.g., QR code). Such
schemes however require that the system have a clear line of
sight, restricting how the user may position and orient the object.
The markers themselves also alter the object’s appearance in an
undesirable way, making them inappropriate for truly ubiquitous
deployment in everyday objects.
In contrast, radio frequency identification (RFID) tags may be
placed on virtually any object and can be identified reliably [20].
We envision that standard consumer product bar codes will in
many cases be complemented or replaced by RFID tags in the
near future. RFID could be a key component of future ubiquitous
computing scenarios — particularly in applications involving a
large, or perhaps even virtually unlimited, number of unique ob-
jects. Since they can be applied discreetly and unobtrusively, they
avoid cluttering the environment with visual markers. While vis-
ual codes can support a variety of bit depths, longer codes require
more printed space, whereas RFID tags do not.
RFID technology reports the presence of a tag, but does not in-
herently provide means for locating a tag in space. While RFID
technology was not initially designed for localization, several
research projects investigate the use of custom or modified RFID
readers and tags for positioning purposes, in addition to identification.
In Marked-up Maps [13], for example, a map is instrumented
with RFID tags, serving as reference points for an RFID-reader-
equipped PDA, which displays context-sensitive information. The
ePro board [16] uses 480 readers on a 20×24 matrix with 3×3 cm
squares. Parallel processing with 30 units, each controlling 16
readers, is used to reduce delay. In DataTiles [15], a matrix of
RFID readers behind an LCD detects transparent tiles with em-
bedded RFID tags in a 4×3 grid. They serve as graspable interac-
tion devices and also allow interaction with an electromagneti-
cally tracked stylus. The RFIG lamp [12] uses structured light to
send unique binary codes to an RFID tag coupled with a photo-
sensor. The RFIG tag transmits a binary code along with other
data upon RFID interrogation, such that its position can be estab-
lished. RFIG tags thus rely on line-of-sight to the reader. Boukraa
and Ando [1] describe a system where RFID is used to retrieve a
stored appearance model of an object such that computer vision-
based registration can be simplified. Rahimi and Recht [11] pre-
sent the tracking of RFID tags on a version of the SenseTable
where 10 antennas are woven into a 30×30 cm surface — each
reporting RFID signal strength for the tag. It is shown how a
mapping can be learned such that 2D tag position from the 10
sensor readings can be inferred. The tracking of multiple tags is
not discussed. Since each tag affects every other tag’s signal
strength as well as antenna transmission and reception, it is likely
that the training task would become increasingly complex with
multiple simultaneous tags. Krahnstoever et al. [6] modify an
RFID reader to estimate orientation and 3D position of RFID tags,
and associate this data with human motion tracking to infer high-
level interactions between people and objects in an environment.
Many of these projects rely on multiple or modified RFID read-
ers, active tags, or training processes. Our framework instead
focuses on the combination of standard unmodified RFID equip-
ment and vision techniques to avoid such complexity in order to
support multiple configurations and off-the-shelf technology.
Existing display systems that are combined with RFID sensing
tend to employ multiple short-range antennas covering the surface
under the display (e.g., WACOM tablets). Such techniques are
inappropriate for a rear-projection tabletop system since their
antennas and associated electronics would block both camera and projector.
A general goal of the tracking and detection components in a ta-
bletop system is to recognize objects and track them on the sur-
face. The appearance and interactive behaviour of such objects
can be augmented by co-located projection and gesture sensing.
A camera is well suited to discover changes and motion in a
video image using image processing techniques such as frame
differencing. Vision-based recognition, on the other hand, is a
complex and difficult task, especially if the camera is placed
behind a diffuse projection surface. Object recognition systems
typically require training examples for each object we wish to
recognize, a tedious and time-consuming process. The complex-
ity, recognition performance and runtime performance of these
techniques vary widely, and they often do not scale well with an
increasing number of objects. Furthermore, it may be difficult or
impossible to identify objects with similar appearance, particularly
multiple instances of the same consumer product, such as a cam-
era or a mobile phone. For example, how could we, solely based
on their appearances, distinguish the identity of two cell phones of
the same model, belonging to two different persons?
The vision system also depends on the camera-object distance
and on its viewpoint of the objects. Sufficient resolution is critical
for detection and recognition, particularly in the case of small
objects or small, dense visual codes. Finally, recognition perform-
ance tends to be sensitive to varying environmental conditions,
such as lighting, or subtle changes in object appearance.
The use of special visually encoded markers is popular in these
types of applications, but is not applicable in a general scenario
where we want to support arbitrary objects without altering their
appearance. Our framework thus focuses on synchronized activity
sensing and consists of four parallel processes, which we discuss
in the following sections:
1) Detection of activity in the camera image.
2) Detection of RFID tags.
3) Temporal synchronization of vision and RFID activities.
4) Frame-to-frame correspondence tracking for interactivity.
For activity sensing, traditional object detection and tracking
techniques are unnecessary, as we may instead focus on using
computer vision to merely detect changes in the scene, such as the
addition, removal and movement of objects on the table. This
approach minimizes assumptions about lighting conditions, object
appearance, tracking and other factors that lead to the complexity
of many computer vision techniques.
We are especially interested in finding image capture frames
that are representative of a change of state on the surface. Each
such still frame summarizes the complete, stable state of the ob-
jects on the surface. By comparing a still with the previous still, it
is possible to deduce whether an object has been added, removed
or moved. In the following, we describe image processing opera-
tions available to detect such changes. These do not rely on corre-
spondence, object recognition or other complex image processing
techniques.
A limitation of this approach is that only activity for one object
at a time can be detected. But by combining the event detection
with frame-to-frame correspondence tracking of the shapes (as
discussed in Section 7), we enable fluid interaction and simulta-
neous manipulation of multiple objects. Our only restriction lies in
the possible ambiguity caused by the unlikely event of the user
adding two objects at the exact same time.
4.1 Event Detection with Connected Components
Given an image of the tabletop surface, under some assumptions it
may be possible to determine the set of objects on the surface
through traditional image segmentation techniques using binariza-
tion and connected components analysis. Once the set of objects is
determined, it is relatively straightforward to detect object activ-
ity, particularly when the number of objects undergoing change is
small.
A background image is stored when the scene is empty and ab-
solute difference images are calculated from the background im-
age and subsequent images. Candidate objects are detected
through connected component analysis: groups of connected pix-
els are classified as distinct, independent objects. (See Figure 2.)
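The first three steps named in Figure 2 (background subtraction, binarization, noise reduction) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: images are plain nested lists of grey values, the parameters `threshold`, `radius` and `min_count` are hypothetical, and the noise filter (keep a foreground pixel only if enough foreground pixels lie in its neighbourhood, counted with an integral image) is only one plausible reading of "noise reduction using integral image sums".

```python
def integral_image(img):
    """Summed-area table with one extra row and column of zeros."""
    rows, cols = len(img), len(img[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def box_sum(ii, r0, c0, r1, c1):
    """Sum of the original image over rows r0..r1-1 and columns c0..c1-1."""
    return ii[r1][c1] - ii[r0][c1] - ii[r1][c0] + ii[r0][c0]


def segment(frame, background, threshold=30, radius=1, min_count=3):
    """Return a cleaned binary foreground mask (all parameters are assumptions)."""
    rows, cols = len(frame), len(frame[0])
    # Background subtraction followed by binarization.
    binary = [
        [1 if abs(frame[r][c] - background[r][c]) > threshold else 0
         for c in range(cols)]
        for r in range(rows)
    ]
    # Noise reduction: drop foreground pixels with too few foreground neighbours.
    ii = integral_image(binary)
    cleaned = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            r0, r1 = max(0, r - radius), min(rows, r + radius + 1)
            c0, c1 = max(0, c - radius), min(cols, c + radius + 1)
            if binary[r][c] and box_sum(ii, r0, c0, r1, c1) >= min_count:
                cleaned[r][c] = 1
    return cleaned
```

The integral image makes each neighbourhood count a constant-time lookup, so the filter cost does not grow with the window size.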
Add, remove and move events can be determined by comparing
the list of connected components found in the current frame to
that of the previous frame, using set difference operations. An
increase in the number of objects (by one) indicates the addition
of an object to the surface, while a decrease of one corresponds to
object removal. Without resorting to object feature matching and
recognition techniques, movement is detected as an object being
removed and another object (the same) being added (i.e., the
number of objects is unchanged). A related approach involves
determining that a connected component from the previous frame
and another from the current frame correspond to the same physi-
cal object if they appear at the same location in the image.
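The comparison of component lists described above can be sketched as follows. The sketch assumes noise-free binary frames given as nested lists, 4-connectivity, and that a component counts as "the same" only when it covers exactly the same pixels in both frames; the names `label_components` and `classify_event` are illustrative and do not come from the paper.

```python
from collections import deque


def label_components(binary):
    """Return the connected components as frozensets of (row, col) pixels."""
    rows, cols = len(binary), len(binary[0])
    seen = set()
    components = []
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or (r, c) in seen:
                continue
            # Breadth-first flood fill over the 4-connected neighbours.
            queue = deque([(r, c)])
            seen.add((r, c))
            pixels = set()
            while queue:
                y, x = queue.popleft()
                pixels.add((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            components.append(frozenset(pixels))
    return components


def classify_event(prev_binary, curr_binary):
    """Classify the change between two frames using set differences."""
    prev = set(label_components(prev_binary))
    curr = set(label_components(curr_binary))
    appeared = curr - prev
    disappeared = prev - curr
    if not appeared and not disappeared:
        return "none"
    if len(appeared) == 1 and not disappeared:
        return "add"
    if len(disappeared) == 1 and not appeared:
        return "remove"
    if len(appeared) == 1 and len(disappeared) == 1:
        return "move"  # count unchanged: one component left, one arrived
    return "ambiguous"  # more than one object changed at once
```

Exact pixel equality is a deliberately strict stand-in for "appears at the same location"; a real system would need a tolerance, which is part of why the approach is fragile.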
The detection of events using connected components, as de-
scribed above, is valid only when each connected component
corresponds to exactly one object on the surface. In applications
involving real world objects, this assumption may not hold.
Choosing the binarization threshold can in practice be very diffi-
cult, and may depend greatly on the nature of the objects' visual
appearance, which in many cases may not be under the designer's
control. A single object may lead to multiple connected compo-
nents. In the following, we present an alternative approach which
does not have these requirements.
4.2 Event Detection with Frame Differencing
Changes in the image may be mapped to surface activity by a
Frame Difference Algebra (FDA) that detects scene changes such
as the addition, removal or movement of an object, with a mini-
mum of image processing operations. The FDA allows us to use
fast and robust operations to detect scene changes that take place
between two still frames.
Three images are used for the FDA calculations (See Figure 3):
the background image (BG), the previous frame (P) and the cur-
rent frame (I). Denoting the pixelwise absolute differencing op-
erator Δ, we observe that Δ(I, P) leaves areas of the image which
just changed. Furthermore, we may compute Δ(I, BG) and Δ(P,
BG), which contain the objects that exist in the current and previ-
ous frame, respectively. We mask with Δ(I, P), obtaining images
A = Δ(I, P) AND Δ(I, BG), and D = Δ(I, P) AND Δ(P, BG). Image
A contains objects not present in the previous frame, but present
in the current frame, indicating objects that just appeared. D con-
tains objects present in the previous, but not in the current frame,
indicating objects that disappeared.
The sum of pixels in the difference image indicates if there is a
change significant enough for an event to have occurred. With the
limitation that only one object can be added, removed or moved at
a time, we have the following three cases:
Figure 2: Segmentation. From left to right: a) Background subtraction. b) Binarization.
c) Noise reduction using Integral Image Sums. d) Connected component analysis yields two located objects.
If sum(A) >> sum(D), then an object has been added.
If sum(D) >> sum(A), then an object has been removed.
If sum(A) ≈ sum(D), then an object has been moved.
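The three cases can be sketched in a few lines. The sketch makes several assumptions that the text does not specify: images are nested lists of grey values, the difference images are binarized with a hypothetical `threshold` before the AND, "much greater than" is modelled by a hypothetical factor `ratio`, and changes smaller than `min_pixels` are ignored.

```python
def abs_diff(a, b):
    """Pixelwise absolute difference of two equally sized grey images."""
    return [[abs(x - y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def binary_and(a, b, threshold):
    """1 where both difference images exceed the threshold, else 0."""
    return [
        [1 if x > threshold and y > threshold else 0 for x, y in zip(ra, rb)]
        for ra, rb in zip(a, b)
    ]


def pixel_sum(img):
    return sum(sum(row) for row in img)


def fda_event(bg, prev, curr, threshold=30, min_pixels=2, ratio=3.0):
    """Return (event, A, D) for background, previous and current frame."""
    changed = abs_diff(curr, prev)  # what just changed
    appeared = binary_and(changed, abs_diff(curr, bg), threshold)     # image A
    disappeared = binary_and(changed, abs_diff(prev, bg), threshold)  # image D
    sum_a, sum_d = pixel_sum(appeared), pixel_sum(disappeared)
    if max(sum_a, sum_d) < min_pixels:
        event = "none"     # change too small to count as an event
    elif sum_a > ratio * sum_d:
        event = "added"    # sum(A) >> sum(D)
    elif sum_d > ratio * sum_a:
        event = "removed"  # sum(D) >> sum(A)
    else:
        event = "moved"    # sums are of similar size
    return event, appeared, disappeared
```

Because objects that were already on the surface cancel out in the difference between current and previous frame, the returned masks contain only the object that changed.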
The system stores an image mask for each new added object
(A) and its associated RFID. As shown in Figure 3, the mask will
only contain pixels corresponding to the object even if many ob-
jects are already on the surface. The masks are currently com-
puted by binarizing a difference image. This requires threshold-
ing, but since the difference image contains only one object (re-
gardless of the number of objects on the surface) this can be set
very generously. The moved object is determined by finding
which of the stored masks representing the current objects on the
surface is most similar to the new mask D. Almost any image
comparison operation will do; we use the sum of absolute differ-
ence between two binary masks — a small difference will indicate
a match. The mask stored for the object is now updated with mask A.
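A minimal sketch of this lookup, assuming the stored masks are kept in a dictionary keyed by RFID tag ID and that the matched entry is replaced by the mask of the object's new position (A in the FDA notation). The function names and the tag IDs in the usage note are hypothetical.

```python
def mask_distance(a, b):
    """Sum of absolute differences between two equally sized binary masks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def update_moved_object(stored_masks, disappeared, appeared):
    """Find the stored mask most similar to D and replace it with A.

    stored_masks maps a tag ID to the binary mask stored when the object
    was added. Returns the tag ID of the object judged to have moved.
    """
    if not stored_masks:
        raise ValueError("no objects on the surface")
    tag_id = min(
        stored_masks,
        key=lambda t: mask_distance(stored_masks[t], disappeared),
    )
    stored_masks[tag_id] = appeared
    return tag_id
```

For example, with masks stored under the hypothetical IDs `"tag-cup"` and `"tag-phone"`, a disappearance mask that coincides with the cup's stored mask yields `"tag-cup"`, and only that entry is overwritten.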
Note that this process does not rely on segmentation or track-
ing, and easily handles two objects right next to each other before
one of them is moved, or when one object is moved right next to another.
The FDA allows us to store the timestamp and location for an
object that has been added, removed or moved. Because the FDA
involves simple, robust operations on the image, we obtain a fast
detection mechanism that is straightforward to implement, and in
contrast to many traditional vision-based systems, avoids assump-
tions on object shape, appearance, position and orientation.
The FDA can naturally be extended for more complex scenarios
depending on the requirements of the application. In our case, we
find it particularly useful to combine it with continuous tracking
of objects, in order to allow fluid interaction and the manipulation
of multiple objects. In Section 7, we describe such an extension
where the FDA is complemented with vision-based correspon-
dence-based tracking.
While our computer vision-based activity sensing provides us
with the position and shape of our objects, it lacks means to iden-
tify them. The unobtrusive nature of RFID tags, which can be
embedded into almost any physical object, as well as the robust
identification qualities of RFID, motivate the use of this com-
plementary sensing technology. While there have been extensive
investigations into custom, modified and special-purpose RFID
readers in previous work [1][11][12][13][15][16], we find it inter-
esting and practical to leverage commercially available, unmodi-
fied RFID readers and antennas, and passive RFID tags.
Whereas RFID tags can be detected reliably in the range of the
antennas, our application requires that the tags are only detected
when they are present on the tabletop surface, and thus visible in
the camera image. This requirement ensures that activity detected
by the vision system and the RFID reader can be synchronized.
5.1 Frequency of operations
There are four frequency groups in which commercial RFID tech-
nology operates: Low Frequency (LF, 125-148 kHz), High Fre-
quency (HF, 13.56 MHz), Ultra High Frequency (UHF, 902-928
MHz) and Microwave (2.4 GHz). Differences in range (a few
inches to hundreds of feet) and tag cost ($0.5-$25) can determine
suitability to a given application. We find UHF technology appro-
priate for tabletop systems given the larger tracking/display areas
involved, in our case a 32×24 inch surface. Another important
motivation for using the longer range UHF is that it is impractical
to place sensing technology directly under the surface (as required
for short range RFID) for rear-projected systems since it might
block the camera and projector.
5.2 Reader
We use the XR400 from Symbol Technologies, a long-range UHF
reader made for industrial use, such as the scanning of large pal-
lets and items on conveyor belts. It can use 1–4 read points, where
each consists of a transmitting and receiving antenna pair. The
XR400 has a read rate of about 1 Hz, which is comparable to the
performance of other products. The reader works by transmitting
energy and sequentially reading the backscattered energy from the
tags. It will sequentially perform reads for each of the read points
and within each read point sequentially interrogate the three tag
types (class 0, class 1, class 1 gen2). The read rate is therefore
affected by the number of active read points, number of tag types
currently set to detect, and finally, the number of tags present in
the system. We reduce delay in the system by using only one read
point and one active tag class. The number of simultaneous tags in
our system ranges from 1–10, matching the expected number of
objects in a tangible tabletop interface. Additionally, we were
advised by the manufacturer to set the reader in “conveyor belt
mode” to further increase the performance. While the reader will
not report signal strength of a tag, it is possible to attenuate the
transmitted energy to limit the effective read range.
Figure 3: The Frame Difference Algebra uses absolute difference
images and binary image operations for robust and fast detection
of scene changes under the constraint that only one object is
manipulated at a time. The background image (BG), current frame
(I) and previous frame (P) are used in the calculations. By
comparing the number of shapes in the resulting images with shapes
that appeared (A) and disappeared (D), it is possible to infer
whether an object was added, moved or removed.
5.3 Antenna Configurations
The antenna type must be chosen carefully to support tabletop
applications. It is desirable that the antennas can be placed unob-
trusively, ideally integrated with the surface, and do not interfere
with a rear- or front-projected imaging system.
5.3.1 Wire-loop antenna
In our efforts to limit the RFID readings to the surface, we inves-
tigated the use of a design consisting of two custom elongated
transmitting and receiving wire loops, placed directly on the sur-
face, on opposite sides of the area to be monitored. (See Figure 4.)
With full gain we obtained a working design where the tags were
detected on the surface only, as desired. The sensing area meas-
ured approximately 32×12 inches, which is about half the size of
our target display. Readers that support multiplexed transmission
and reception on the same antenna could potentially address this
problem by doubling the sensing range. Our reader requires a
separate transmit (TX) and receive (RX) antenna, which we place
on opposite sides, whereas a multiplexing reader could have two
TX/RX antennas placed on opposite sides such that each antenna
need only cover half of the surface. The advantage of the wire
loop antenna is that it effectively restricts the RFID sensing to just
the surface, but our design was not powerful enough to cover a
sufficiently large area.
5.3.2 Area antennas
Our second configuration, shown in Figure 5, has the advantage of
using commercially available area antennas. A transmitting and a
receiving area antenna are placed opposite one another, under the
surface and angled towards the center, such that they monitor the
display. These antennas are powerful and work well with the
reader set to 20% gain. It is important that the antennas are placed
at a sufficiently steep angle, otherwise an object that is held di-
rectly above the surface for an extended period of time might be
prematurely detected.
The area antennas are significantly more powerful than our wire
loop antenna and can easily cover larger surfaces.
The vision-based event detection gives us accurate shape and
position information for objects in the scene. Simultaneously, our
RFID sensing accurately identifies present objects. The fusion
framework provides a mechanism to synchronize the information
from these two modalities such that each object can be identified.
We employ a database of events where detected vision and
RFID events are continuously stored. When a new object appears,
disappears or is moved, as detected by our image processing tech-
niques, we add a timestamped entry, with a reference to the corre-
sponding still image in the FDA. Similarly, we store a timestam-
ped event when an RFID tag appears or disappears. Our database
thus contains all state changes that have occurred, such that the
state of objects on the surface may be retrieved at any time.
When a new event occurs in either modality, a matching proc-
ess searches backwards in time for a corresponding, unmatched
event in the other modality. If found, the two events are marked as
Figure 6: The fusion pipeline. As new events appear, the system tries to match them with previously unmatched events in order to associate
localize and identify shapes on the surface.
Figure 4: The rear-projected setup with our custom-made wire-
loop antennas. RFID tags are detected only on the surface and in
an area about half the size of the display due to limited range of
the antenna.
Figure 5: Area antennas can be unobtrusively added to front- and
rear-projected systems. RFID tags are detected in the overlapping
volume of the transmission (TX) and reception antenna (RX).
matched and the still image from the FDA is associated with the
RFID data. (See Figure 6.) Upon identification we perform a
lookup in an object database, where the RFID is associated with
additional metadata, such as the name of the object. That informa-
tion can now be displayed at the location of the shape, as indi-
cated by the FDA.
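The matching step can be sketched as a small event log. The sketch rests on assumptions the text leaves open: events arrive in timestamp order, two events "correspond" when they come from different modalities and have the same kind (appear or disappear), and the backward search stops after a hypothetical `max_gap` seconds. The class and field names are illustrative.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Event:
    modality: str     # "vision" or "rfid"
    kind: str         # "appear" or "disappear"
    timestamp: float  # seconds
    payload: str      # still-image reference (vision) or tag ID (rfid)
    matched: bool = False


class EventLog:
    """Stores all events and pairs each new one with an earlier partner."""

    def __init__(self, max_gap: float = 5.0):
        self.events: List[Event] = []
        self.max_gap = max_gap  # assumed search window, not from the paper

    def add(self, event: Event) -> Optional[Event]:
        """Append the event; return the matched partner event, if any."""
        partner = None
        # Search backwards in time for an unmatched event in the other modality.
        for old in reversed(self.events):
            if event.timestamp - old.timestamp > self.max_gap:
                break  # everything earlier is older still
            if (not old.matched
                    and old.modality != event.modality
                    and old.kind == event.kind):
                old.matched = event.matched = True
                partner = old
                break
        self.events.append(event)
        return partner
```

When `add` returns a partner, one of the two events carries the still-image reference and the other the tag ID, which is the association needed for the object database lookup.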
This fusion process suffers if a match was made incorrectly or
missed altogether due to a great difference in time between RFID
and vision events, RFID read failure or large spurious events in
the vision system. We note that this has not been a problem in our
experiments, due to the use of still images, each of which repre-
sents the stable state of surface objects. We also emphasize that
typical tabletop scenarios do not have a high rate of objects being
placed and removed from the surface, for practical reasons. It is
thus likely that the user will introduce a smaller number of objects
to work with. But while the working set might be kept small, there
is a simultaneous need to be able to support a very large number
of objects without having to prepare them or train the system for
While our activity sensing provides us with a mechanism to iden-
tify and track objects in the scene, it limits the activity to one ob-
ject at a time and only in-between still frames. These restrictions
allow the approach to work in front-projection systems. While
there are ways in which we can enable interaction in a front-
projection system, the required computer vision reasoning be-
comes significantly more complex [23].
Our rear-projected system allows fluid tracking of objects while
they are in motion, since the hand does not occlude objects it in-
teracts with (from the camera’s point of view) and shapes touch-
ing the surface can be robustly and reliably detected.
We employ frame-to-frame correspondence tracking to associate
moving objects with the same ID they had in the previous
frame. Correspondence is determined by computing the distance
from each shape in the new frame to every shape in the previous
frame, such that a shape in the new frame inherits the ID of the
closest shape in the previous frame, provided that it is not a
newly introduced object. The correspondence can be extended with
more sophisticated methods, such as common pattern- and
template-based tracking techniques.
The continuous tracking effectively addresses limitations of the
FDA, such that multiple objects can be simultaneously manipu-
lated on the surface, enabling responsive and fluid interaction.
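The nearest-shape correspondence described above might be sketched as follows. This is an illustrative sketch only: the greedy assignment order, the `max_jump` threshold used to decide that a shape is newly introduced, and the `new-N` placeholder labels are assumptions, not details of our system.

```python
import math


def track(previous: dict, current: list, max_jump: float = 50.0) -> dict:
    """Assign IDs to the shapes detected in the current frame.

    previous: {object_id: (x, y)} centroids from the previous frame.
    current:  [(x, y), ...] centroids detected in the new frame.
    Returns {object_id or "new-N": (x, y)}.
    """
    assigned = {}
    unclaimed = dict(previous)  # IDs not yet inherited by a new shape
    new_count = 0
    for point in current:
        nearest = min(
            unclaimed,
            key=lambda obj_id: math.dist(unclaimed[obj_id], point),
            default=None,
        )
        if nearest is not None and math.dist(unclaimed[nearest], point) <= max_jump:
            assigned[nearest] = point  # inherit the ID of the closest shape
            del unclaimed[nearest]
        else:
            # Too far from every known shape: treat as a newly introduced
            # object, whose identity would come from the fusion step.
            assigned[f"new-{new_count}"] = point
            new_count += 1
    return assigned
```

Because the assignment is greedy, two shapes that cross paths between frames could swap IDs; this is one case where the more sophisticated methods mentioned above would help.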
In the spirit of Tangible User Interfaces [5][17][19], we developed
a prototype tabletop application, which in contrast to previous
work, uses visually unaltered objects with minimal instrumenta-
tion. It is based on the previously described techniques and makes
use of our fusion framework, continuous tracking and touch
screen interaction in a rear-projected setup, as shown in Figure 7.
In our application, imagery is downloaded from the Internet and
projected next to various objects as they are placed on a table, as
shown in Figures 1 and 8. To copy an image of interest, users place
a personal item on the table, such as a badge, and drag the image
to it. The image is then copied to the user’s personal folder on the
network. The next time the badge is placed on the table, the previ-
ously stored images appear.
When a new object is detected by the fusion module, we per-
form a lookup in an object database that stores information about
the object type. Currently there are three types of tagged objects:
query, container and operator objects.
Query objects use pre-stored parameter values such as associ-
ated keywords. When a query object is detected on the table, the
keywords are retrieved from the database and used in a search on
the online Flickr photo database. Matching
images are then downloaded and appear around the object.
Users can interact with the images by changing their size and
moving them, as well as dragging them to other objects on the table.
Container objects act as a physical handle to a collection of
digital images. They can also be used as symbolic links to physi-
cal storage, such as a shared network folder or a USB drive. They
present all images currently stored in the represented location, and
as new images are dragged to them, they are copied to the repre-
sented physical storage.
Operator objects execute a specific function on a dropped im-
age. An ashtray, for example, can represent a trashcan, such that
an image is deleted when dragged to it.
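The behavior of the three object types might be sketched as a simple dispatch on the type stored in the object database. The class, field and function names below are hypothetical, and the `search` argument stands in for the online photo query; this is a sketch of the idea rather than the prototype's code.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class TaggedObject:
    name: str
    kind: str  # "query", "container" or "operator"
    keywords: list = field(default_factory=list)  # used by query objects
    images: list = field(default_factory=list)    # used by container objects
    action: Optional[Callable] = None             # used by operator objects


def on_object_detected(obj: TaggedObject, search: Callable) -> list:
    """Return the images to present around a newly detected object."""
    if obj.kind == "query":
        return search(obj.keywords)  # e.g. a keyword search for photos
    if obj.kind == "container":
        return list(obj.images)      # everything stored in the location
    return []                        # operators present no images


def on_image_dropped(obj: TaggedObject, image: str, workspace: list) -> None:
    """Apply the effect of dragging an image onto an object."""
    if obj.kind == "container":
        obj.images.append(image)     # copy to the represented storage
    elif obj.kind == "operator" and obj.action is not None:
        obj.action(image, workspace)  # e.g. delete the dropped image
```

An ashtray acting as a trashcan would then be `TaggedObject("ashtray", "operator", action=lambda image, workspace: workspace.remove(image))`.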
There are several ways in which our prototype application
could benefit from additional functionality. In order to support
more complex queries, we need mechanisms for authoring tags
and keywords, as well as introducing more operators. Similarly,
while the current prototype is limited to performing queries re-
turning existing photos stored on Flickr, it would be beneficial to
allow photos to be transferred directly from a portable device such
as a digital camera or a camera phone, in the spirit of BlueTable
[22]. New photos could spill out onto the table when the camera is
placed on it, and be associated with other objects (thus assigning
tags to the photo), or deleted. We are also interested in other applications,
such as a tabletop slideshow controlled by the configuration of the
objects placed on the table. The relative position of the various
objects on the surface can be used to build a database query,
where we can extend the discrete control in previous work [17]
with our continuous 2D parameter space.
There are many ways in which we can make use of the digital
surface as a platform for extending and augmenting physical ob-
jects, both thanks to the availability of a large display, and also
given new means for interaction through a multi-touch sensitive
surface. Editing documents and pictures on a device would, for
instance, be much easier if we could extend the interface to the
surface. Linked service manuals that dynamically visualize the
functionality associated with parts on a camera are another example.
Figure 7: Application system architecture, including object database.
The simple fusion technique presented in this paper is clearly not
limited to RFID and vision modalities. While we would like to
improve the performance of these existing modalities, it may be
interesting to consider complementing RFID and computer vision
with additional sensing modalities. We emphasize that the modularity
of the approach demonstrates that the combined power of
multiple modalities can achieve satisfactory results with less effort.
9.1 Extracting data using the RFID reader
RFID readers can potentially provide information that goes be-
yond the detection of a tag’s presence. Such capability may be
useful in multimodal fusion systems.
Previous work [11][6] has shown ways in which RFID readers
can sense additional properties, such as orientation or the location
of a tag in space. Currently, such approaches require training or
access to more data than is typically exposed in commercial sys-
tems. While the RFID technology we used is designed for com-
mon commercial applications and does not expose a number of
low-level features, we find its availability as a commercial prod-
uct compelling. It would be especially advantageous to use a more
advanced commercial reader that provides signal strength, as this
would allow more sophisticated reasoning about sensed objects on
the surface. Higher read rates would improve overall system per-
formance and interactivity. The ability to transmit and receive on
the same antenna could allow twice the number of read points,
increase sensing range and simplify antenna design.
We explored a set of features that might be useful in inferring
data about objects in the system, especially in combination with
our fusion framework. Most of these features are readily available
without modification to the reader or tags:
9.1.1 Signal strength
Signal strength might be used as a coarse indication of distance,
refined by subsequent fusion with other modalities, or for detect-
ing interaction with the tagged object [3].
9.1.2 Response rate
Most commercial readers do not report signal strength, including
the device we used. Fishkin et al. [3], however, discuss how signal
strength can be approximated with the response rate: the number
of successful responses divided by the number of attempted polls.
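The response rate defined above might be computed over a sliding window of recent polls, as in the following sketch. The class name and the window length are assumptions chosen for illustration.

```python
from collections import deque


class ResponseRate:
    """Response rate of one tag over the most recent polls."""

    def __init__(self, window: int = 10):
        # Each entry records whether the tag responded to one poll.
        self.results = deque(maxlen=window)

    def record(self, responded: bool) -> None:
        self.results.append(responded)

    def value(self) -> float:
        """Successful responses divided by attempted polls in the window."""
        if not self.results:
            return 0.0
        return sum(self.results) / len(self.results)
```

A window of N polls means the estimate lags by up to N poll periods, which is the update-rate cost that makes this feature less attractive for interactive use.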
9.1.3 Time-multiplexed gain attenuation
Some readers provide software control over gain attenuation at
runtime. We achieved a radar-like functionality by increasing the
energy over a number of reads. The major drawback is the update
rate: at 1 Hz, a scan with 10 different energy levels takes 10
seconds, which is prohibitive for most interactive applications.
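The scan and its cost might be sketched as follows. The `read_at_level` callable is a hypothetical stand-in for a reader-specific call that sets the energy level and reports whether the tag responded; no particular reader API is implied.

```python
from typing import Callable, Iterable, Optional


def scan_time_s(levels: int, update_rate_hz: float) -> float:
    """Duration of one full scan when each energy level costs one read."""
    return levels / update_rate_hz


def first_level_detected(
    read_at_level: Callable[[int], bool], levels: Iterable[int]
) -> Optional[int]:
    """Step through increasing energy levels and return the lowest level
    at which the tag responds, or None if it never does."""
    for level in levels:
        if read_at_level(level):
            return level
    return None
```

Here `scan_time_s(10, 1.0)` gives the 10 seconds quoted above. A tag that first responds at a low level is presumably closer to the antenna than one that needs a high level, which is what gives the scan its radar-like character.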
9.1.4 Multiple antennas/readers
Depending on the available data, one can use multiple antennas
and readers with varying position, orientation, gain and other pa-
rameters in order to extract more information about the tags being
read. For example, signal strength from multiple antennas could
be used for coarse position triangulation of a tag.
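One simple way to obtain such a coarse estimate is a strength-weighted centroid of the antenna positions, sketched below. This is a stand-in for triangulation proper, chosen for brevity; the function name and the assumption that strength is a non-negative number (signal strength or response rate) are ours.

```python
from typing import Optional


def coarse_position(readings: list) -> Optional[tuple]:
    """Estimate a tag position from per-antenna readings.

    readings: [((x, y), strength), ...] with one entry per antenna,
    where (x, y) is the antenna position and strength is non-negative.
    Returns the strength-weighted centroid, or None if nothing was read.
    """
    total = sum(strength for _, strength in readings)
    if total <= 0:
        return None
    x = sum(pos[0] * strength for pos, strength in readings) / total
    y = sum(pos[1] * strength for pos, strength in readings) / total
    return (x, y)
```

The estimate is pulled toward the antennas that read the tag most strongly, and could then be refined by fusion with the vision system.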
9.2 Exploiting RFID tag specific properties
There are many factors that determine whether a tag is read suc-
cessfully. Occlusion, tag geometry and orientation are examples
of issues that can affect how much energy the tag can absorb and
reflect through backscattered energy. This could bring additional
factors to help the fusion process.
9.2.1 Geometry
Tag antenna design varies greatly and is critical to how well the
tag absorbs and reflects energy. Besides using different designs,
we can also modify the performance, by cutting off half of the
antenna, for example. We found this to be useful when it is desirable
to limit the reading range of certain tags. A related possibility
is to place multiple tags with varying geometry on an object, and
use the resulting variation in sensitivity as an indication of signal
strength.
9.2.2 Orientation
Tag geometry also plays an important role in how detection performance
varies with orientation. Tag detection is typically less
reliable when the (flat) tags are oriented perpendicular to the
antenna, rather than face-on. We also discovered that elongated
tags are not as robustly read as symmetrical tags with a 90 degree
orientation when used with our wire-loop antenna. Multiple orien-
tation-sensitive tags on an object could both increase robustness
and provide an indication of orientation.
9.2.3 Occlusion
Like any RF technology, sensing degrades in the presence of liq-
uids or metal. Given that the human body is largely composed of
water, we have been able to reliably block a tag from being read
by occluding it with our hand. By tracking the hand in the camera
image, we might correlate that motion with the varying readability
of the blocked tags, such that the tags will also act as sensors.
9.2.4 Memory
The on-board memory on RFID tags has already surpassed many
applications’ requirements for storing the identification number
and we can expect it to continue growing. By having a passive tag
with general purpose on-board storage we could make use of this
to aid the recognition process. For instance, we envision that the
interactive surface could update the object’s tag with detected
tracking features as it learns new properties about the object. In-
stead of storing the object features in a central repository (as in
[1]), we could store all information directly with the object itself.
Even simple information, such as shape, size and color, could
provide valuable information to the fusion framework, such that
objects could be more robustly disambiguated on the surface.
9.3 RFID sensing for tabletop systems
We have experimented with both reader parameters and tag prop-
erties and conclude that the most important features are interactive
rates of operation and robust tag detection on the surface only.
Response rate and time-multiplexed gain attenuation are thus not
particularly useful, given the severely reduced update rate. Signal
strength is perhaps the most promising technique, but is unfortu-
nately still not exposed in most commercial readers. We also note
that the use of multiple readers and the performance-affecting tag
properties require training specific to the particular configuration
and tags used. The control of tag geometry did, however, prove
useful for minimizing accidental tag detection above the surface.
9.4 Global fusion
Building upon our current fusion framework, we envision a global
fusion process that is capable of resolving ambiguities over a his-
tory of activity, such that the reasoning becomes an integrated,
online probabilistic process. Specifically, motion and RFID activ-
ity may be modelled for each pixel of the input image. Each tag
would have a probability distribution in the image, where each
pixel stores the likelihood of its association with a specific RFID
tag. These probability images would be continuously updated as
new events occur, allowing them to update the model, resolve
ambiguities and repair incorrect associations.
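The per-tag probability images might be updated as in the following sketch. Since this process is only envisioned, everything here is an assumption: the Bayesian-style update rule, the likelihood values and the representation of an image as a list of rows.

```python
def uniform_image(width: int, height: int) -> list:
    """Probability image in which every pixel is equally likely."""
    p = 1.0 / (width * height)
    return [[p] * width for _ in range(height)]


def update(image: list, motion_mask: list,
           likelihood_in: float = 0.9, likelihood_out: float = 0.1) -> list:
    """Update one tag's probability image after an RFID event for that tag.

    Pixels where motion coincided with the event (mask is True) are
    weighted by likelihood_in and all others by likelihood_out; the
    result is renormalized so that it sums to one.
    """
    weighted = [
        [p * (likelihood_in if moved else likelihood_out)
         for p, moved in zip(row, mask_row)]
        for row, mask_row in zip(image, motion_mask)
    ]
    total = sum(map(sum, weighted))
    return [[p / total for p in row] for row in weighted]


def most_likely_pixel(image: list) -> tuple:
    """Return the (x, y) pixel most strongly associated with the tag."""
    pixels = ((x, y) for y, row in enumerate(image) for x in range(len(row)))
    return max(pixels, key=lambda xy: image[xy[1]][xy[0]])
```

Because each event only reweights the image rather than fixing an association, a later event that contradicts an earlier one can shift the probability mass, which is how incorrect associations could be repaired.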
We have presented an approach that combines RFID technol-
ogy and image processing to support tangible interactions with
visually unaltered everyday objects on interactive surfaces. We
use a camera to detect shapes and motion in the video image,
whereas an RFID reader senses tag presence. By synchronizing
these two sensing modalities in time, we can associate a located
shape with the ID provided by the RFID reader. The ID can be
used to index into a stored table of known objects far larger than
what is practical with most visual codes. Our approach takes ad-
vantage of each modality’s strength; the vision component moni-
tors the camera image for activities of interest, while the RFID
component monitors the RF domain to sense tags. We also note
that the unobtrusive nature of our techniques allows them to coexist
with other approaches, such as fiducial tracking, for example when
additional robustness and redundancy are desired.
The fusion of these complementary sensors allows the use of a
single standard RFID reader and robust vision techniques. The
frame difference algebra, for example, makes very few assump-
tions about the nature and appearance of objects, and is thereby
widely applicable. Likewise, the use of standard RFID equipment
allows the unambiguous identification of multiple physical objects
of almost any type, using inexpensive and unobtrusive tags that
we expect to be ubiquitously embedded in future products, replac-
ing today’s visual barcodes. We believe that this approach pro-
vides new opportunities in bridging physical and virtual worlds,
using interactive surfaces and everyday physical objects.
[1] Boukraa, M. and Ando, S. Tag-based vision: assisting 3D scene
analysis with radio-frequency tags. Image Processing 2002 (2002), I-
[2] Dietz, P. and Leigh, D. DiamondTouch: a multi-user touch technol-
ogy. UIST '01 (2001), 219-226.
[3] Fishkin, K., Jiang, B., Philipose, M. and Roy, S. I Sense a Disturbance
in the Force: Unobtrusive Detection of Interactions with
RFID-tagged Objects. IRS-TR-04-013 (2004).
[4] Gonzalez, R.C. and Woods, R.E. Digital Image Processing. Addison-Wesley
(1993).
[5] Ishii, H. and Ullmer, B. Tangible bits: towards seamless interfaces
between people, bits and atoms. CHI ’97 (1997), 234-241.
[6] Krahnstoever, N., Rittscher, J., Tu, P., Chean, K. and Tomlinson, T.
Activity Recognition using Visual Tracking and RFID.
WACV/MOTIONS '05 (2005), 494-500.
[7] Microsoft Surface. (Sep 2007).
[8] Olwal, A. LightSense: Enabling Spatially Aware Handheld Interaction
Devices. ISMAR '06 (2006), 119-122.
[9] Patten, J., Ishii, H., Hines, J., and Pangaro, G. Sensetable: a wireless
object tracking platform for tangible user interfaces. CHI '01 (2001),
[10] Philips Entertaible. (Sep 2007).
[11] Rahimi, A, and Recht, B. Estimating Observation Functions in Dy-
namical Systems using Unsupervised Regression. NIPS (2006).
[12] Raskar, R., Beardsley, P., Dietz, P., and van Baar, J. Photosensing
wireless tags for geometric procedures. Commun. ACM 48, 9
(2005), 46-51.
[13] Reilly, D., Rodgers, M., Argue, R., Nunes, M., and Inkpen, K.
Marked-up maps: combining paper maps and electronic information
resources. Personal Ubiquitous Comput. 10, 4 (2006), 215-226.
[14] Rekimoto, J. and Saitoh, M. Augmented surfaces: a spatially con-
tinuous work space for hybrid computing environments. CHI '99
(1999), 378-385.
[15] Rekimoto, J., Ullmer, B., and Oba, H. DataTiles: a modular platform
for mixed physical and graphical interactions. CHI '01 (2001), 269-
[16] Sugimoto, M., Kusunoki, F., and Hashizume, H. Supporting Face-to-face
Group Activities with a Sensor-Embedded Board. CSCW
Workshop on Shared Environments to Support Face-to-Face Collaboration
(2000).
[17] Ullmer, B., Ishii, H., and Jacob, R. Tangible Query Interfaces:
Physically Constrained Tokens for Manipulating Database Queries.
INTERACT'03 (2003), 279-286.
[18] Ullmer, B. and Ishii, H. The metaDESK: models and prototypes
for tangible user interfaces. UIST '97 (1997), 223-232.
[19] Ullmer, B., Ishii, H., and Glas, D. mediaBlocks: physical containers,
transports, and controls for online media. SIGGRAPH '98 (1998),
[20] Want, R., Fishkin, K. P., Gujar, A., and Harrison, B. L. Bridging
physical and virtual worlds with electronic tags. CHI '99 (1999),
[21] Wellner, P. Interacting with paper on the DigitalDesk. Commun.
ACM 36, 7 (1993), 87-96.
[22] Wilson, A. D. and Sarin, R. BlueTable: connecting wireless mobile
devices on interactive surfaces using vision-based handshaking.
Graphics Interface 2007 (2007), 119-125.
[23] Wilson, A. D. PlayAnywhere: a compact interactive tabletop projection-vision
system. UIST '05 (2005), 83-92.
Figure 8: Our tangible image explorer enables interaction with unobtrusively tagged physical objects on interactive surfaces.
... AR has been studied by the scientists [3,6,18,25,32] representations of dynamic digital content [19], sensing for embedded devices and tangible user interfaces [27]. Thus, the computer technologies may provide enormous possibilities for educational instructions, giving to teachers and their students plenty of tools for multimodal learning. ...
... However, it should be mentioned that QR codes are perfect for only either text objects or website links. This technology also provides opportunities for being used in the classroom still AR based on 3D objects visualizing and photo/video incorporation into its digital component gives wider perspective of application in the learning process.Scientists are getting involved in describing the devices which can support the technology[3,7, 19, 20,25,26,27] and suggest different variants of combining and applying them. Azuma[3] gives detailed descriptions of displays: head-worn devices, namely head-mounted, (HWD), handheld (HHD) and projection displays. ...
... The latest research go much further. For example, Olwal´s scientific laboratory[10, 19,25,26,27] generally focuses on the tools, techniques and devices that enable new interaction concepts for the augmentation and empowerment of the human senses, basically dealing with physical and visual Professional science applies the Creative Commons Attribution (CC BY 4.0) license to the materials published| International Scientific Conference of Young Researchers for Academic Disciplines | SECTION 3. TEACHING AND EDUCATION ...
... Both were not fine-tuned to primarily track finger touches; additionally, they had high resolution, which is beneficial to track an object's imprint on the screen either through visual markers (e.g., Kaltenbrunner and Bencina [23]) or object contours (e.g., Wilson and Sarin [66]). This enabled a seamless integration of Tangible User Interfaces (TUIs), e.g., [5,6,20,43,[54][55][56]. Such approaches include using the Touch API of the smartphone to detect markers, e.g., [9,70]. ...
Full-text available
While tangibles enrich the interaction with touchscreens, with projected capacitive screens being mainstream, the recognition possibilities of tangibles are nearly lost. Deep learning approaches to improve the recognition of conductive triangles require collecting huge amounts of data and domain-specific knowledge for hyperparameter tuning. To overcome this drawback, we present a toolkit that allows everyone to train a deep learning tangible recognizer based on simulated data. Our toolkit uses a pre-trained Generative Adversarial Network to simulate the imprint of fiducial tangibles, which we then use to train a deployable recognizer based on our pre-defined neuronal network architecture. Our evaluation shows that our approach can recognize fiducial tangibles such as AprilTags with an average accuracy of 99.3% and an average rotation error of only 4.9°. Thus, our toolkit is a plug-and-play solution requiring no domain knowledge and no data collection but allows designers to use deep learning approaches in their design process.
... Ces tags (Finkenzeller, 2003) permettent de garantir l'unicité de l'objet et sa localisation. Par exemple, Olwal & Wilson (2008) proposent SurfaceFusion qui s'appuie sur le suivi d'objets tangibles munis de tag RFID sur table. Dans la même logique, Hosokawa et al. (2008) proposent un système support à la conception qui utilise des objets tangibles (RFID) ; leur exemple suggère la conception d'une nouvelle maison à l'aide d'objets murs, portes, etc., équipés d'étiquettes RFID. ...
International audience In recent years, tangible user interfaces, which imply interactions performed with one or several objects, gain more and more interest in research in Human-Computer Interaction (HCI). The tangible object represents a subject or an action. It acts on the system, as an action in classical user interfaces (e.g,. GUI). Interaction on a table, which is a common furniture in everyday life and used in multiple activities (desktop, coffee table, kitchen table, etc.), opens a new way for research and development in HCI. In this article, we present definitions, models, and key issues elicited from the literature that enable understanding and reasoning about the couple < interactive tabletop, tangible object> within an interactive system. Then, we propose a framework that allows to characterize applications supported by the couple <interactive tabletop, tangible object> in a domain-independent manner. Depuis quelques années les interfaces tangibles impliquant des interactions réalisées via un objet (ou plusieurs) prennent de plus en plus d’importance dans les recherches en interaction homme-machine. L’objet tangible représente un sujet ou une action ; l’objet agit sur le système, telle une action sur une interface « classique ». L’interaction sur table, c’est-à-dire sur un meuble présent dans la vie courante et utilisé à diverses fins (bureau, table à manger, table de salon, table bar, etc.), ouvre un champ nouveau de recherche et de développement. La mise en exergue, issue de l’état de l’art, des définitions, modèles et problématiques, permet d’abord d’appréhender le couple (table, objet tangible) au sein d’un système interactif. Puis, nous proposons un cadre qui permet de positionner des applications mettant en oeuvre le couple (table, objet tangible). Le cadre est décrit de manière à être utilisé pour positionner des applications indépendamment du domaine.
... Many of the very earliest explorations of touch screens and "surface computing" incorporated tangibles; e.g., [6,13,25,39,48,49]. In the 2000s, a wide range of "multi-touch" research systems were developed [7,20], and in 2007, Microsoft launched a commercial "Surface" computer (later rebranded PixelSense) [37]. ...
... Ces tags (Finkenzeller, 2003) permettent de garantir l'unicité de l'objet et sa localisation. Par exemple, Olwal & Wilson (2008) proposent SurfaceFusion qui s'appuie sur le suivi d'objets tangibles munis de tag RFID sur table. Dans la même logique, Hosokawa et al. (2008) proposent un système support à la conception qui utilise des objets tangibles (RFID) ; leur exemple suggère la conception d'une nouvelle maison à l'aide d'objets murs, portes, etc., équipés d'étiquettes RFID. ...
Full-text available
Depuis quelques années les interfaces tangibles impliquant des interactions réalisées via un objet (ou plusieurs) prennent de plus en plus d'importance dans les recherches en interaction homme-machine. L'objet tangible représente un sujet ou une action ; l'objet agit sur le système, telle une action sur une interface « classique ». L'interaction sur table, c'est-à-dire sur un meuble présent dans la vie courante et utilisé à diverses fins (bureau, table à manger, table de salon, table bar, etc.), ouvre un champ nouveau de recherche et de développement. La mise en exergue, issue de l'état de l'art, des définitions, modèles et problématiques, permet d'abord d'appréhender le couple (table, objet tangible) au sein d'un système interactif. Puis, nous proposons un cadre qui permet de positionner des applications mettant en oeuvre le couple (table, objet tangible). Le cadre est décrit de manière à être utilisé pour positionner des applications indépendamment du domaine.
... Radio frequency identification (RFID) technologies have been used to associate digital content with physical objects on tabletops (e.g., [45]). RFID has also been used with optical tracking as a means to uniquely identify objects [26]. Work including WISP [5], RapID [39] and IDSense [20] has demonstrated the ability to track objects tagged with parasitically-powered UHF RFID tags at room-scale distances, but such systems require a relatively large and high-powered reader antenna to operate. ...
Conference Paper
We present Project Zanzibar: a flexible mat that can locate, uniquely identify and communicate with tangible objects placed on its surface, as well as sense a user's touch and hover hand gestures. We describe the underlying technical contributions: efficient and localised Near Field Communication (NFC) over a large surface area; object tracking combining NFC signal strength and capacitive footprint detection, and manufacturing techniques for a rollable device form-factor that enables portability, while providing a sizable interaction area when unrolled. In addition, we detail design patterns for tangibles of varying complexity and interactive capabilities, including the ability to sense orientation on the mat, harvest power, provide additional input and output, stack, or extend sensing outside the bounds of the mat. Capabilities and interaction modalities are illustrated with self-generated applications. Finally, we report on the experience of professional game developers building novel physical/digital experiences using the platform.
... Furthermore, identification and tracking of tangibles can be achieved by computer vision methods, which requires explicit hardware (i.e., optical markers and cameras) and suffers from multiple limitations like occlusion and costs [8,9]. The tracking of tangibles is possible via microcontrollers [10], RFID (Radio Frequency Identification) [11] or magnetic sensors [12,13], too, but needs a much higher level of assembly effort and complexity compared to a low-cost prototyping environment. Regarding the potential and limitations for future dissemination, those approaches seem more realistic. ...
Full-text available
With Tangible User Interfaces, the computer user is able to interact in a fundamentally different and more intuitive way than with usual 2D displays. By grasping real physical objects, information can also be conveyed haptically, i.e., the user not only sees information on a 2D display, but can also grasp physical representations. To recognize such objects (“tangibles”) it is skillful to use capacitive sensing, as it happens in most touch screens. Thus, real objects can be located and identified by the touch screen display automatically. Recent work already addressed such capacitive markers, but focused on their coding scheme and automated fabrication by 3D printing. This paper goes beyond the fabrication by 3D printers and, for the first time, applies the concept of capacitive codes to laser cutting and another immediate prototyping approach using modeling clay. Beside the evaluation of additional properties, we adapt recent research results regarding the optimized detection of tangible objects on capacitive screens. As a result of our comprehensive study, the detection performance is affected by the type of capacitive signal processing (respectively the device) and the geometry of the marker. 3D printing revealed to be the most reliable technique, though laser cutting and immediate prototyping of markers showed promising results. Based on our findings, we discuss individual strengths of each capacitive marker type.
... Yet another possibility is to make use of Radio Frequency Identification (RFID) technology in order to identify and track tangible objects. SurfaceFusion [7] combines both optical and RFID tracking by integrating RFID tags without altering the visual appearance of the objects. Tangibles can also be tracked by magnetic sensors. ...
Conference Paper
Electronic markers can be used to link physical representations and virtual content for tangible interaction, such as visual markers commonly used for tabletops. Another possibility is to leverage capacitive touch inputs of smartphones, tablets and notebooks. However, existing approaches either do not couple physical and virtual representations or require significant post-processing. This paper presents and evaluates a novel approach using a coding scheme for the automatic identification of tangibles by touch inputs when they are touched and shifted. The codes can be generated automatically and integrated into a great variety of existing 3D models from the internet. The resulting models can then be printed completely in one cycle by off-the-shelf 3D printers; post processing is not needed. Besides the identification, the object’s position and orientation can be tracked by touch devices. Our evaluation examined multiple variables and showed that the CapCodes can be integrated into existing 3D models and the approach could also be applied to untouched use for larger tangibles.
Conference Paper
Full-text available
A novel method to infer interactions with passive RFID tagged ob- jects is described. The method allows unobtrusive detection of human interac- tions with RFID tagged objects without requiring any modifications to existing communications protocols or RFID hardware. The object motion detection al- gorithm was integrated into a RFID monitoring system and tested in laboratory and home environments. The paper catalogs the experimental results obtained, provides plausible models and explanations and highlights the promises and outstanding future challenges for the role of RFID in ubicomp applications.
Conference Paper
Full-text available
In this paper we present a system that electromagnetically tracks the positions and orientations of multiple wireless objects on a tabletop display surface. The system offers two types of improvements over existing tracking approaches such as computer vision. First, the system tracks objects quickly and accurately without susceptibility to occlusion or changes in lighting conditions. Second, the tracked objects have state that can be modified by attaching physical dials and modifiers. The system can detect these changes in real-time.We present several new interaction techniques developed in the context of this system. Finally, we present two applications of the system: chemistry and system dynamics simulation.
Conference Paper
Full-text available
This paper describes our design and implementation of a computeraugmented environment that allows users to smoothly interchangedigital information among their portable computers, table and walldisplays, and other physical objects. Supported by a camera-basedobject recognition system, users can easily integrate theirportable computers with the pre-installed ones in the environment.Users can use displays projected on tables and walls as a spatiallycontinuous extension of their portable computers. Using aninteraction technique called hyperdragging, users can transferinformation from one computer to another, by only knowing thephysical relationship between them. We also provide a mechanism forattaching digital data to physical objects, such as a videotape ora document folder, to link physical and digital spaces.
Conference Paper
Full-text available
Associating and connecting mobile devices for the wireless transfer of data is often a cumbersome process. We present a technique of associating a mobile device to an interactive surface using a combination of computer vision and Bluetooth technologies. Users establish the connection of a mobile device to the system by simply placing the device on a table surface. When the computer vision process detects a phone-like object on the surface, the system follows a handshaking procedure using Bluetooth and vision techniques to establish that the phone on the surface and the wirelessly connected phone are the same device. The connection is broken simply by removing the device. Furthermore, the vision-based handshaking procedure determines the precise position of the device on the interactive surface, thus permitting a variety of interactive scenarios which rely on the presentation of graphics co-located with the device. As an example, we present a prototype interactive system which allows the exchange of automatically downloaded photos by selecting and dragging photos from one cameraphone device to another.
Conference Paper
The vision of spatially aware handheld interaction devices has been hard to realize. The difficulties in solving the general tracking problem for small devices have been addressed by several research groups and examples of issues are performance, hardware availability and platform independency. We present LightSense, an approach that employs commercially available components to achieve robust tracking of cell phone LEDs, without any modifications to the device. Cell phones can thus be promoted to interaction and display devices in ubiquitous installations of systems such as the ones we present here. This could enable a new generation of spatially aware handheld interaction devices that would unobtrusively empower and assist us in our everyday tasks.
This paper describes a system called ePro for supporting face-to-face group activities. ePro connects a sensor-embedded board and a computer simulation, and is currently used to discuss urban planning and environmental problems. Group members collaboratively construct a town by placing pieces such as houses on the board. The computer simulation program automatically recognizes the arrangement of pieces on the board. It then visualizes environmental changes of the town through simulations. The visualization shown to the group members amplifies interaction between them, and gives them feedback for their further actions.
Keywords: interaction, face-to-face collaboration, combining physical and virtual worlds, sensor-embedded board
AUTHORS' BACKGROUNDS AND MOTIVATIONS
Our research group is composed of three main investigators. Their backgrounds are computer science, cognitive science, electrical engineering, respectively. The goal of our research is to develop a new computational medium for supporting group activities and evaluate it. We have constructed an electronically enhanced board that can quickly recognize objects placed on its surface. This board is applied to a system for supporting people working for urban planning and environmental problems in a "face-to-face" situation. In this workshop, we would like to talk about the technological aspects and effects of our system, and explore the possibility of new applications through discussions.
We derive a cost functional for estimating the inverse of the observation function in nonlinear dynamical systems. Limiting our search to invertible observation functions confers numerous benefits, including a compact representation and no local minima. Our approximate algorithms for optimizing this cost functional are fast, and give diagnostic bounds on the quality of their solution. Our method can be viewed as a manifold learning algorithm that utilizes a prior on the low-dimensional manifold coordinates. The benefits of taking advantage of such priors in manifold learning, and searching for the inverse observation functions in system identification, are demonstrated empirically by learning to track moving targets from raw measurements in a sensor network setting and in an RFID tracking experiment.