Sensor-offset HMD perception and performance
James E. Melzer
& Kirk Moffitt
Rockwell Collins Optronics, 2752 Loker Ave. West, Carlsbad, CA 92010-9731;
La Quinta CA
The perceptual and performance effects of viewing HMD sensor-offset video were investigated in a series of small
studies and demonstrations. A sensor-offset simulator was developed with three sensor positions relative to left-eye
viewing: inline and forward, temporal and level, and high and centered. Several manual tasks were used to test the effect
of sensor offset: card sorting, blind pointing and open-eye pointing. An obstacle course task was also used, followed by a
more careful look at avoiding specific obstacles. Once the arm and hand were within the sensor field of view, the user
demonstrated the ability to readily move to the target regardless of the sensor offset. A model of sensor offset was
developed to account for these results.
Keywords: Sensor offset, HMD, perception, performance
1. INTRODUCTION
The easiest approach to designing a helmet-mounted sensor/display system is to bolt displays and sensors onto the
helmet. For example, a sensor/display module can be suspended in front of the viewing eye, sensor facing outwards, in
the manner typical of night vision goggles (NVG). This approach, however, results in a very forward center of gravity, with
the potential to cause neck strain for the soldier. A more ambitious design approach is to integrate displays and sensors
into the helmet so as to minimize bulk and protrusions and to optimize weight and balance in an attractive package. But
this integrated approach creates an offset of the sensor with respect to the wearer’s normal line of sight. An example of
the integrated approach is the Soldier Mobility and Rifle Targeting System (SMaRTS), shown in Figure 1, where the
sensor is above eye-level and is head-centered.
Figure 1. The Soldier Mobility and Rifle Targeting System (SMaRTS), with its two imaging sensors
(visible on top, long-wave infrared on the bottom) located high and centered on the
head, and the digital HMD worn over the right eye.
Our current interest is in an integrated sensor/display system that is monocular, with a moderate field of view (FOV) and
unity magnification. A necessary trade-off with this design is the desire to mount the sensor package in a location that is
not directly in line with the user’s eye. Data are very limited on the perceptual and performance effects of sensor offset,
and there are no engineering guidelines.
James E. Melzer is Manager of Research and Technology: phone 1-760-438-9255: firstname.lastname@example.org
Kirk Moffitt is a human factors consultant: phone 1-760-360-0204: email@example.com
Head- and Helmet-Mounted Displays XII: Design and Applications
edited by Randall W. Brown, Colin E. Reese, Peter L. Marasco, Thomas H. Harding
Proc. of SPIE Vol. 6557, 65570G, (2007) · 0277-786X/07/$18 · doi: 10.1117/12.721156
Proc. of SPIE Vol. 6557 65570G-1
2. BACKGROUND
2.1 Displaced vision
A well-developed literature describes the effects of displaced vision on manual tasks at a near distance that can be
characterized as vision intensive. The typical methodology uses one or two prisms to displace the visual scene and
usually minimizes viewing the arms or hands. Pointing or reaching initially overshoots the target in the direction of the
displaced image. This is followed by gradual adaptation to the displacement, though perfect adaptation does not always
occur. Following removal of the prism displacement, a negative aftereffect is temporarily reported where pointing and
reaching errors are in the opposite direction.
2.2 Angular error
Reports of human-performance problems with HMD offset-sensor systems can sometimes be attributed to angular error
(i.e., the sensor line of sight not aligned with the display line of sight). An informal test was conducted at Rockwell Collins Optronics
(RCO) in 2005 using a simulated SMaRTS system. Several of the relevant sensor/display offset conditions were
evaluated in terms of walking and reaching, as well as general perceptions of height and tilt. This simulator consisted of
an RCO monocular display and a forehead-mounted daylight camera. The horizontal FOV was approximately 32°. The
camera was slewed, tilted and rotated while the display remained in front of the right eye. Five subjects were tested, and
their behavior and observations were all in general agreement.
Initial testing with the camera and display both aligned straight ahead (i.e., boresighted to each other) resulted in
minimal problems. Participants were able to walk across the room, grasp objects, and move in straight lines. As
expected, movement was slower and slightly more hesitant than with naked-eye viewing, likely due to the field of view
restriction over normal viewing. With the camera slewed to the side by approximately 10°, each participant was asked to
walk across the room to grab a door handle, and then back to grab an object sitting on a bench. Walking was noticeably
slowed and hesitant, and started in the direction opposite the camera direction. Walking across the room from another
direction towards a doorway resulted in an arced path for all participants. The distance was approximately 20 feet, and
the arc was about 2 feet off-line at the halfway point. The endpoint was within the doorway.
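As a rough consistency check (our illustration, not the authors' analysis), a walker holding a constant 10° heading error drifts laterally by about distance × tan(10°); over the first 10 feet of the 20-foot path this predicts roughly 1.8 feet, close to the ~2-foot arc observed at the halfway point:

```python
import math

# Hypothetical check: lateral drift from walking with a constant heading error.
heading_error_deg = 10.0   # camera slew used in the informal test
distance_walked_ft = 10.0  # halfway point of the ~20-foot path
offset_ft = distance_walked_ft * math.tan(math.radians(heading_error_deg))
print(round(offset_ft, 2))  # ~1.76 ft, in line with the ~2 ft observed
```

In practice participants corrected continuously rather than holding a fixed heading, which is why the path was an arc ending within the doorway rather than a straight line ending off-target.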
Tilting the camera up approximately 10° simulated the effect of looking down at the display. After walking back and
forth across the room, participants said they felt “tall” and as if they were “on stilts.” Another observation was that the
room appeared to tip forwards, resulting in the impression of walking downhill.
Rotating the camera approximately 10° was disturbing and not easily corrected. Given the instruction to regain
gravitational upright by using head tilting, it was initially unclear whether to move the head left or right. Furthermore,
large head tilts never seemed to make the image upright. Some of this can be explained by the difference between the
centers of rotation of the HMD and of head tilting: head tilt rotates about the base of the neck and describes a wide arc. To
complicate matters, most head tilting is accompanied by head rotation. Further complicating things, this head motion
stimulates the vestibular system and sense of balance, and stimulates both vertical and small rotational eye movements.
The results of the testing indicate that in the case of a digital sensor and imaging display, the two need to be boresighted
to reduce perceptual and locomotion effects. Although there were no specific tests conducted to evaluate a numerical
alignment tolerance, it is recommended that the angular error between the two be kept to a maximum of 0.5°. Note that the
effects associated with an angular error between the sensor and display should not be confused with those of sensor offset. The
remainder of this report assumes the two are boresighted, with no angular error.
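The sensitivity to angular error can be illustrated with simple trigonometry (a back-of-the-envelope sketch of ours, not from the study): the apparent lateral shift of a scene point is distance × tan(error).

```python
import math

def lateral_displacement_cm(angular_error_deg: float, distance_m: float) -> float:
    """Apparent lateral shift (cm) of a point at distance_m caused by an
    angular sensor/display misalignment of angular_error_deg."""
    return math.tan(math.radians(angular_error_deg)) * distance_m * 100.0

# The recommended 0.5 deg tolerance shifts a point 5 m away by ~4.4 cm,
# while the 10 deg slew used in the informal test shifts it by ~88 cm.
print(round(lateral_displacement_cm(0.5, 5.0), 1))
print(round(lateral_displacement_cm(10.0, 5.0), 1))
```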
2.3 Early sensor-offset studies
A 1998 study of the effect of offset binocular cameras on eye-hand coordination used cameras positioned forward 165
mm and upward 62 mm, with the image seen on a head-mounted display [3]. Measures of performance using a pegboard
task showed significant cost. There was adaptation over time, though performance never returned to the baseline level.
Negative aftereffects were also observed after removal of the apparatus. Only the one camera position was tested.
What about manual tasks that involve distances greater than arm's length? One study measured the effects of several
stereo NVG configurations on grenade-tossing performance to a target at a distance of 20 feet [4]. Compared to the control
condition where the NVG objectives were separated by the nominal distance of the eyes, a hyperstereo configuration
where the lateral NVG separation was twice the nominal eye separation significantly degraded tossing performance, and
this was attributed to exaggerated stereo. When the NVG objectives were vertically displaced, but with the same lateral
separation as the eyes, no performance degradation was found. While the imaging apparatus studied by these researchers
was binocular, it provides some indication that a simple vertical offset does not affect a medium-distance task involving throwing.
What about walking and driving with offset vision? Walking and driving involve the picking-up of flow-field
information in our visual periphery and the direction of waypoints rather than size and distance computations. To
approach an object, we make it expand in our field of view while the ground flows backward. The kinesthetic feedback
from our feet on the ground simplifies the act of walking. This sensation of grounding precludes the need to directly
observe our feet. During these activities, we are generally looking forward. One researcher used himself as a subject in
extended testing of a displaced camera system similar to that used by Biocca and Rolland [5]. He wore the head-mounted
apparatus for several days, and found that walking around his building, up and down stairs, and through doorways was
not a problem.
2.4 FOV effects
What will have an effect on mobility is the limited FOV of the sensor/display system. FOVs of 12° and 40° have been
shown to result in significant errors in a navigation task, with some degradation still present with a larger FOV of 90°.
Performance degradation has also been reported on search and maze tasks with FOVs of 48° up to 112°. Peripheral
vision is important to self-movement not because of retinal organization, but because the periphery is where the highest rate of
optic flow occurs. If vision is limited to the sensor/display FOV, an increase in head movement and slower overall movement is likely.
3. CURRENT INVESTIGATION
The available data provide little design guidance for the location of offset-sensors on an HMD system. We devised a
plan to use helmet-mounted cameras and a head-mounted display to test the effects of sensor offset. Manual coordination
and mobility tasks were used in testing with small numbers of subjects.
3.1 Sensor-offset apparatus
A simulator was constructed using an eMagin HMD and three miniature monochrome daylight cameras. Night-vision
sensors were not used due to cost, weight and complexity. The sensor-offset simulator is diagrammed in Figure 2. The
three basic components are the helmet and cameras, the eMagin binocular HMD, and the electronics: video switcher,
video interface, laptop PC and eMagin controller. The Watec cameras are 537x597-pixel, >380-line monochrome
systems with a 6 mm focal-length lens and an approximate FOV of 32°.

Figure 2. Block diagram of the sensor-offset simulator.
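The quoted camera FOV is consistent with the pinhole relation FOV = 2·atan(w / 2f). A quick check (our sketch; the 1/4-inch sensor format, about 3.6 mm wide, is an assumption, as the paper states only the 6 mm focal length):

```python
import math

def horizontal_fov_deg(sensor_width_mm: float, focal_length_mm: float) -> float:
    """Horizontal field of view of an ideal pinhole camera."""
    return math.degrees(2.0 * math.atan(sensor_width_mm / (2.0 * focal_length_mm)))

# Assumed 1/4-inch sensor (3.6 mm wide) behind the 6 mm lens:
print(round(horizontal_fov_deg(3.6, 6.0), 1))  # ~33.4 deg, close to the quoted ~32 deg
```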
The eMagin HMD is a 800x600 pixel SVGA binocular display with a horizontal FOV of approximately 32°. Viewing
was left-eye-only, with the right eyepiece covered. The helmet was a large-size bicycle helmet with a camera platform
on the front-left and counter-balancing weights on the back right. The weight of the helmet system was approximately
one kilogram. The helmet assembly and HMD are shown in Figure 3, and the backpack is added in Figure 4. The camera
offsets are described in Table 1. All outside vision was shielded with a black drape attached to the helmet with Velcro
and tied around the waist. The complete package (HMD, helmet and cameras, black drape, and backpack with electronics)
is shown in Figure 5.
Table 1. Camera positions relative to left eye
Offset Lateral Vertical Longitudinal
Forward 0 0 12 cm forward
High & Centered 3 cm nasal 15 cm high 7 cm forward
Side 12 cm temporal 0 0
Figure 3. Sensor-offset simulator ensemble that includes a helmet, three cameras, mounting platform, power-switching
control, video connectors, counterweights and eMagin HMD. Note the locations of the three cameras (circled)
Fig. 4. Sensor-offset simulator ensemble with backpack used
in the obstacle course and avoidance studies. The backpack held
a laptop computer, video switch unit, and the eMagin controller.
Fig. 5. Sensor-offset simulator ensemble with black drape to limit visibility to the 32° x 24° camera video.
4. STUDIES AND DEMONSTRATIONS
4.1 General procedure and observations
Subjects were affiliated with RCO, and were verified to have at least 20/30 left-eye acuity using a vision wall-chart at a
distance of 10 meters. Eye dominance was assessed for the initial study, and each subject had an unambiguous dominant
eye. No eye dominance perceptual or performance effects were noted. The cameras were prefocused for near or far
depending on the task. Each camera was boresighted to the HMD for all tasks.
For each task, subjects were first measured without the simulator to establish a naked-eye baseline. Within each task, the
order of sensor-offset was randomly selected for each subject.
Relative to the baseline naked-eye vision, all camera conditions with a FOV of 32° x 24° slowed down movement. The
most noticeable perception was a downward slant of the floor and a minification of distant objects with the top-mounted camera.
4.2 Manual tasks
The first test was card sorting. This was a simple task where cards from one suit were laid out at the clock positions on a
piece of felt on a table. The subject, using one hand, simply made a pile in the center starting with the “2” and ending
with the “King.” The time for this task was recorded for three repetitions, after which each of the three subjects was
asked to rate the effort required for that task on a scale of 1 (no effort) to 10 (extreme effort). A photo of this task is
shown in Figure 6.
Median times for the card-sorting task are shown in Table 2 for each of the three subjects. The difference between the
baseline and sensor times reflects the cost of limiting the FOV plus offsetting vision. The times for the front and forward
sensor are less than for the side and top sensors for all three subjects. Workload estimates were inconsistent and did not
correspond to response times.
Fig. 6. Card-sorting task. Fig. 7. This photo represents both the blind- and open-eye pointing tasks.
Table 2. Median card-sorting times (seconds).
Subject Baseline Front Side Top
S1 9 25 36 32
S2 9 20 25 21
S3 11 27 30 31
The small differences in Table 2, combined with the inconclusive workload data, led us to develop another study of
manual performance and sensor offset. We decided to separate the perceptual and performance aspects of a manual
task. A pointing task was developed in which subjects stood on a line 120 cm from the wall, then stepped forward
and pointed at an "X" target at a height of 150 cm. For the first part of this pointing study, subjects were instructed to look
at the target, close their eyes, and step forward and place their index finger on the “X” target. The experimenter promptly
noted the finger position on the sheet of paper. This “blind” pointing task represents the perceptual component of where
the target appears, with no visual guidance of their hand and finger. The pointing task is shown in Figure 7.
Three trials were run for each sensor condition, and the centroid of the resulting triangle of points used as the summary
statistic. Figure 8 shows the results for this study. These results correspond to the prism displacement studies, where the
apparent target position is opposite to the sensor position. Specifically, the top sensor results in the lowest perceived
target, and the (left) side sensor results in the target appearing to the right. The control or baseline condition with naked-
eye vision always resulted in the most accurate performance.
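The centroid used as the summary statistic is simply the mean of the three trial coordinates. A minimal sketch (the trial coordinates below are illustrative, not the study's data):

```python
def centroid(points):
    """Centroid (mean position) of a list of (x, y) pointing positions in cm."""
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))

# Three hypothetical blind-pointing trials, in cm relative to the "X" target:
trials = [(2.0, -4.0), (3.0, -6.0), (4.0, -5.0)]
print(centroid(trials))  # (3.0, -5.0)
```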
We next asked subjects to point to the same “X” target with their eyes open. Each trial started with the experimenter
removing a card that hid the target “X”. Since the end result was the index finger pointing at the target, we took video
recordings of each trial and noted the time from a go signal to finger-on-target plus any apparent strategies. The data for
the “eyes open” pointing task for a representative subject are shown in Figure 9. The control or baseline condition
showed the quickest response. We expected that response times would improve and level off with trials. This did not generally
occur, which may be because subjects tended to stab at the target sheet and then drag the index finger to the target in the first
few trials, but then began guiding the finger to the target; the net result was little difference in pointing time over the
nine trials. The hand and finger started each trial outside the 32° x 24° FOV. Based on video evidence, we speculate that
the two strategies were to stab with the finger into the FOV and then drag it onto the target, or to move the finger into the
FOV and then visually guide it to the target—with both taking about the same amount of time. No sensor position effect
can be discerned from the data from the four subjects.
Fig. 8. Pointing performance for the blind pointing task for four subjects (TP, TO, KM and JM).
Fig. 9. Representative pointing data for the open-eye pointing task for one subject (TO).
4.3 Mobile tasks
We first asked subjects to describe their perceptions of the experimental room, in terms of the floor slanting or objects
looking distorted. A common response was that the floor looked slanted downwards, with objects at 5 to 10 meters
looking small, with the top-mounted camera. No consistent perceptual effects were noted with the front or side mounts.
We tested the effects of sensor offset on mobility by constructing a simple obstacle course. Subjects briskly walked
through a course defined by cardboard boxes, stepping over two boxes one foot high and deep, and ducking under
five-foot-high entryways. Subjects also had to avoid several tall boxes on the left and right, and execute a hairpin left-hand turn.
The entire course was approximately 50 feet in length. Subjects were instructed to move briskly but not to purposely
knock over boxes. Figure 10 shows two views of the course.
Fig. 10. Obstacle course constructed of stacked boxes in a U-shaped 50-foot course.
Completion times for the obstacle-course task were 10 seconds for the baseline naked-eye condition, and between 18 and
42 seconds for the three sensor offsets. No sensor-offset trends were evident. Similarly, workload ratings also showed no
evident trends. Subjects hit a number of boxes in stepping over, going around and ducking under obstacles. We think
that subjects felt with their arms and hands and readily kicked the boxes to make their way through the course. We
decided to follow up with a closer look at components of this task.
We recorded video of two subjects stepping over a box one foot high and wide, then circling around and passing close
by six-foot-high stacked boxes on the left, and then circling around and walking towards these boxes and stopping at a
distance of one foot (chest to box). This sequence was repeated for each camera offset. Subjects did not reach out to
touch any boxes. Representative video frames are shown in Figure 11.
The results of this demonstration are shown in Table 4. As with the other studies in this investigation, the naked-eye
control condition was associated with superior performance. The cost of the 32° x 24° FOV HMD view was misjudging
distances and sometimes running into obstacles. Arms and hands were not used to reach out and feel the obstacles. Both
subjects maintained a relatively large clearance in passing by an obstacle on the left with the left-side-mounted camera.
Similarly, the approach distance was overestimated with the front-mounted camera. Both of these findings correspond to
camera offset, and demonstrate that effects linked to a specific sensor offset are more likely to emerge in isolated,
simple tasks than in complex tasks.
Figure 11. Video frame sequences (1/10 second between frames) of stepping over, passing by and approaching tasks.
Table 4. Stepping over, passing by, and approaching obstacles (clearances in cm; "Hit" = obstacle contacted)

Subject  Task       Control  Forward  Side  Top
S1       Step-over  10       Hit      Hit   30
S1       Pass-by    10       20       40    30
S1       Approach*  +1       +60      +40   +40
S2       Step-over  20       30       30    20
S2       Pass-by    5        Hit      40    Hit
S2       Approach*  -11      +29      +20   +29

* Relative to the instruction to stop at a distance of one foot.
5. SUMMARY AND MODEL
Task performance was degraded and workload estimates were higher for all sensor positions for manual and mobile
tasks relative to naked-eye vision. The likely cause of this global effect is the limited vision from the 32° x 24° HMD
FOV. If a sensor/display system has an angular error, performance will be dramatically affected. The current study only
used straight-ahead and aligned sensors.
The current study only used left-eye monocular imagery. The subjects presented a mix of left and right eye dominance,
and left- and right-handedness. There were no comments or concerns about not seeing with both eyes, or even which eye
was used for viewing.
In agreement with the prism displacement literature, pointing without real-time visual feedback errs in the direction
opposite the sensor-offset position. Once the arm and hand are visible within the sensor FOV, the user can readily reach
a target position regardless of sensor-offset position. The user can either stab at the apparent target location and then
drag their hand to the target, or make a reaching motion and then guide their hand to the target. The distinction between
the hand not being visible (obscured or outside the sensor FOV) and visible (within the sensor FOV) is critical to
understanding reaching and pointing performance.
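The blind-pointing component of this model can be stated compactly: with no view of the hand, the predicted pointing error is directed opposite the sensor offset. A sketch under that assumption (the offset vectors loosely follow Table 1; the gain parameter is hypothetical, as the study did not measure error magnitude as a function of offset):

```python
def predicted_blind_error(sensor_offset_cm, gain=1.0):
    """Predicted blind-pointing error (cm): opposite the sensor offset.

    sensor_offset_cm: (lateral, vertical) position of the sensor relative
    to the viewing eye; positive vertical = up.  gain is a hypothetical
    scale factor relating offset to pointing error.
    """
    lateral, vertical = sensor_offset_cm
    return (-gain * lateral, -gain * vertical)

# Top sensor, 15 cm high: target perceived low, so pointing errs downward.
top_error = predicted_blind_error((0.0, 15.0))
# Side sensor, 12 cm temporal: pointing errs toward the opposite side.
side_error = predicted_blind_error((12.0, 0.0))
```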
The current study was limited in that it tested only a small number of subjects for a limited number of trials. The apparatus
had no look-around vision, which required total visual reliance on sensor imagery. The tasks imposed minimal stress on
the subjects, unlike many tasks that would be encountered in the real world. Table 5 presents a simple model of a sensor-
offset helmet display system as it relates to perception and performance.
Table 5. Sensor-offset model. Each entry lists the system configuration, the associated perception and performance effects, and the evidence.

- Sensor/display angular error: walking in a curved path to a waypoint; distracting slant and rotation effects. Evidence: informal testing at RCO.
- Sensor offset, hands and feet visible: noticeable slant with the high sensor; misjudged closeness of right-side objects when walking; no large sensor-offset performance or effort differences for manual and mobile tasks. Evidence: current study, literature.
- Sensor offset, no visual feedback: reaching and blind pointing opposite to the sensor position. Evidence: current study, large literature on prism displacement.
- Left-eye display: no evidence of eye awareness or dominant-eye effects; monocular versus binocular viewing not tested. Evidence: current study, literature.
- Moderate 32° FOV: large decrements in performance and increases in reported effort relative to naked-eye vision. Evidence: current study, literature.
REFERENCES
1. J. Pelz, M. Hayhoe, and R. Loeber. "The coordination of eye, head, and hand movements in a natural task,"
Experimental Brain Research, 139, 266-277 (2001).
2. R. B. Welch. "Adaptation of space perception," in K. R. Boff, L. Kaufman and J. P. Thomas (eds.), Handbook of
perception and human performance, Volume I. New York: Wiley (1986).
3. F. A. Biocca and J. P. Rolland. “Virtual eyes can rearrange your body: Adaptation to visual displacement in see-
through head-mounted displays,” Presence, 7, 262-277 (1998).
4. V. G. CuQlock-Knopp, K. P. Myles, F. J. Malkin, and E. Bender. The effects of viewpoint offsets of night vision
goggles on human performance in a simulated grenade throwing task. ARL-TR-2401. Aberdeen Proving Ground MD:
Army Research Laboratory (2001).
5. S. Mann. "Fundamental issues in mediated reality, WearComp, and camera-based augmented reality," in W.
Barfield and T. Caudell (eds.), Fundamentals of wearable computers and augmented reality. Mahwah NJ: Erlbaum (2001).
6. P. L. Alfano and G. F. Michel. “Restricting the field of view: perceptual and performance effects,” Perceptual and
Motor Skills, 70(1), 35-45 (1990).
7. K. W. Arthur. Effect of field of view on performance with head-mounted displays. Dissertation thesis, University of
North Carolina (2000).
8. A. P. Mapp, H. Ono, and R. Barbeito. “What does the dominant eye dominate? A brief and somewhat contentious
review,” Perception & Psychophysics, 65, 310-317 (2003).