A SYSTEM FOR HIGH PRECISION END-TO-END DELAY MEASUREMENTS IN VIDEO
COMMUNICATION
Christoph Bachhuber and Eckehard Steinbach
Technical University of Munich
Chair for Media Technology
Munich, Germany
ABSTRACT
Ultra low delay video transmission is becoming increasingly important. Video-based applications with ultra low delay requirements range from teleoperation scenarios such as controlling drones or telesurgery to autonomous control of dynamic processes using computer vision algorithms applied to real-time video. To evaluate the performance of the video transmission chain in such systems, it is important to be able to precisely measure the glass-to-glass (G2G) delay of the transmitted video. In this paper, we present a low-complexity system that takes a series of pairwise independent measurements of G2G delay and derives performance metrics such as the mean or minimum delay from the data. The precision is in the sub-millisecond range, mainly limited by the sampling rate of the measurement system. In our implementation, we achieve a G2G measurement precision of 0.5 milliseconds at a sampling rate of 2 kHz.
Index Terms— Video signal processing, glass-to-glass delay measurement, video delay distribution
1. INTRODUCTION
With the advent of 5G networks [1] and the prospect of the tactile internet [2], End-to-End (E2E) delays of 1 millisecond are requested for future communication systems. These ultra low delay systems enable applications such as networked control for fast assembly robots, highly dynamic teleoperation [3] in virtual or augmented reality [4], car-to-X communication [5, 6] to improve safety and efficiency in transport, and many more.
In all these scenarios, ultra low delay video transmission is an important component. Therefore, we need a precise measurement of the video delay. For video transmission systems that present the video to a user on a display, this is called Glass-to-Glass (G2G) delay. It describes the time from when the photons of a visible event pass through the lens of a camera until the corresponding photons of the event shown on a display pass through the display glass.
G2G measurements should preferably be non-intrusive, so that they can be applied to a wide range of systems.
Author          Automatic  Non-Intrusive  Decorrelated  Cost  Precision
Hill/MC [7, 8]  no         yes            no            med   low
Jacobs [9]      no         yes            no            med   high
Sielhorst [10]  yes        yes            no            med   low
Boyaci [11]     yes        no             no            none  low
Jansen [12]     yes        yes            no            high  low
Our method      yes        yes            yes           low   high

Table 1: Comparison of delay measurement methods. Justification of the classification is given in Section 1.1. Our method is presented in Section 2.
Furthermore, a video camera typically has a fixed refresh rate, producing new images at constant time intervals. Real-world events are virtually never synchronized to the camera frame capture instants. To make real-time measurements, real-world events have to be triggered; because of this, non-deterministic G2G delay values are obtained. By repeating the measurement process several times, a distribution of delay values is obtained.
Measuring partial delays such as the processing delay of a camera or the encoding latency is a standard task in system design. For both, the signal propagation time through the circuit has to be measured. But there are few approaches available for measuring the G2G latency of the more complex system of an entire video transmission chain. This measurement also comprises the delays from data transmission between processing blocks and the synchronization effects between blocks operating at fixed rates.
1.1. Related Work
Several methods to measure G2G delay in video transmission
have previously been proposed. An overview of their system
characteristics is given in Table 1.
The approaches in [7, 8] rely on the presentation of a run-
ning clock, for example on a computer screen. This clock is
filmed, and the video of it is transmitted and displayed by the video transmission system under test. Another camera films both the real clock and the clock displayed by the video transmission system. By comparing the clock states in the resulting image, the G2G delay can be obtained. These methods suffer from several issues: without image processing algorithms, the delay has to be calculated manually by reading the numbers from the final image. For the measurement system, an additional camera has to be purchased to record the entire scene. Further, the achievable precision is low because the monitor displaying the running clock and the second camera are refreshed at their individual frame rates, e.g., $f_{\mathrm{Dis}} = f_{\mathrm{Cam}} = 60\,\mathrm{Hz}$.
Jacobs et al. [9] set the basis for our system: the authors use a blinking light-emitting diode (LED) in the field of view of the camera as signal generator and tape a photoelectric sensor to the spot where the LED is shown on the display. The LED triggers an oscilloscope, which also records the signals from the photoelectric sensor. This allows them to manually extract the G2G delay of individual samples. The problem is that this method is not automated on simple circuitry and therefore requires high manual effort and expensive equipment.
Sielhorst et al. [10] propose a system that comprises moving LEDs. From the difference between the positions of the LEDs in the real world and in the video, the delay is automatically computed by a computer vision algorithm. This method does not include the exposure delay of the camera, since the source continuously creates events (new translation positions). Furthermore, the recording rate of their measurement camera is at most 200 Hz, which introduces an average imprecision of 5 milliseconds.
Boyaci et al. [11] measure the capture-to-display latency
between a caller and a callee in a video conferencing applica-
tion. They embed timing information in the form of an EAN-8
barcode in the recorded frames. This information is decoded
on the callee PC and compared to the internal clock in soft-
ware. The method is constrained to desktop computers, since
it is intrusive and requires custom software to be executed
on the caller and callee machines. The authors assume synchronized clocks and take no further measures to analyze or ensure synchronization. Finally, the method does not include
the delay introduced by the graphics buffer and the display,
since the timestamp is compared to the current time immedi-
ately after decoding.
Jansen et al. [12] utilize QR codes to mark time. A measurement system feeds QR codes from a display to the camera of the system under test; the video is then displayed and again recorded by the measurement system, which decodes the QR code and computes the G2G delay. The problem is that a camera is not a time-precise recording tool. Further, a computer or laptop and a camera have to be used as the measurement system, which makes this one of the most expensive options considered here.
Fig. 1: Delay measurement principle. Light from the source (LED) passes through the camera, the processing and transmission stages, and the display to the light sink (PT); the G2G delay is $T = t_1 - t_0$.
1.2. Contribution
We propose a G2G delay measurement system that unifies most of the benefits of the existing systems, as shown in Table 1. It is an advancement of Jacobs' [9] system and comprises an LED as light source and a phototransistor (PT) as light detector. The LED covers only a small area of the video image so as not to bias the coding process. The analysis of the data is done not manually with an oscilloscope, but automatically with a microcontroller board. We propose a theoretical model for the G2G delay and relate initial measurements obtained with the new system to it.
The remainder of this paper is organized as follows: Section 2 describes the system principle, the hardware and software implementation, and a theoretical delay model. Section 3 presents and discusses results obtained with the measurement system. Section 4 summarizes the results and gives an outlook on future work in this field.
2. SYSTEM DESCRIPTION
2.1. Concept and Realization
The G2G delay measurement process is based on the idea that the video transmission system delays the propagation of light, as depicted in Figure 1. An initially disabled light source is placed in the field of view of the camera. After the light source is enabled, the video transmission system requires the G2G delay $T$ to transport this information to the display, where it is picked up by the light sink. The proposed approach assumes an ideal system without any reaction delay in the light source and sink and without noise.
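As a minimal illustration of this principle, the following Python sketch simulates one idealized measurement. The chain delay is an assumed input rather than a real video system, and the only imperfection modeled is the discrete sampling of the light sink (at the 2 kHz rate of our prototype, see Section 2.2):

```python
def measure_g2g(chain_delay_s: float, fs_hz: float = 2000.0) -> float:
    """Simulate one idealized G2G measurement.

    The light source turns on at t0 = 0, the (assumed) video chain
    shows the event on the display chain_delay_s later, and t1 is the
    first sampling instant of the light sink at which the lit display
    is visible. The measured delay is T = t1 - t0.
    """
    t0 = 0.0
    n = 0
    while n / fs_hz < t0 + chain_delay_s:  # event not yet visible at sample n
        n += 1
    t1 = n / fs_hz
    return t1 - t0

# A true delay of 33.30 ms is reported as 33.50 ms: sampling at 2 kHz
# quantizes the measurement to 0.5 ms, the precision stated above.
print(measure_g2g(0.0333))
```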
We created a prototype with an Arduino® Uno, depicted in Figure 2. It can be connected to a PC via USB or to mobile devices via Bluetooth. An LED acts as light source in the field of view of the camera. In LEDs, the time between the start of an electrical current pulse and the start of photon emission is typically below one microsecond. Since our measurements are on the order of milliseconds, the delay contributed by the LED is negligible. The light sink is a phototransistor (PT) with rise and fall times of 10 microseconds, which is also small compared to the G2G delays we want to measure. To suppress noise, we use the detection algorithm proposed in Section 2.2.
Fig. 2: Prototype (Arduino® board with LED and phototransistor, connected via USB).
2.2. Signal Processing
The voltage drop across the PT is sampled at 2 kHz in our prototype. The voltage resolution is 10 bit, resulting in 1024 brightness levels. To extract the time at which the event appears on the display, the sampled data undergoes a two-step processing: first a maximum smoothing filter and second a rising edge detection algorithm are applied (both steps are described below). The algorithm has been validated by comparing the resulting G2G delays with values read manually from an oscilloscope connected to the LED and PT.
The maximum smoothing is required to suppress false detections caused by the pulse width modulation (PWM) of LCD display backlights or the short light pulses of CRT and plasma monitors. The filter has two tasks: smooth unwanted oscillations out of the signal, and let the resulting signal increase immediately if the input signal increases. This is achieved by a maximum filter of length $k$. For every new raw sample $a_i$, the maximum

$$b_i = \max_{\max(0,\, i-k) \,\le\, j \,\le\, i} a_j$$

of the sample itself and the previous $k$ samples is stored as the processed value $b_i$.
To automatically find the sample at which a consistent increase of the sample values begins, we apply rising edge detection based on slope thresholding to the processed samples $b_i$. A cumulative increase of 20 brightness levels over the duration of 3 subsequent samples, or the same increase from one sample to the next, triggers the flag that the picture of the lit LED is now visible on the display. These parameters make the algorithm robust against noise from external lighting and panel refresh on the one hand; on the other hand, they enable us to reliably recognize the lighting up of the LED in typical measurement environments without further precautions.
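A compact Python sketch of this two-step processing is given below. The threshold of 20 levels and the 3-sample window are taken from the text; the filter length k = 20 and the example samples are illustrative assumptions, since the paper does not specify them:

```python
from collections import deque
from typing import Iterable, Optional

def detect_rising_edge(samples: Iterable[int], k: int = 20,
                       threshold: int = 20) -> Optional[int]:
    """Return the index of the first sample at which the lit LED is
    considered visible on the display, or None if no edge is found.

    Step 1: maximum filter b_i = max(a_j) over max(0, i-k) <= j <= i,
            which smooths out backlight PWM dips but still follows
            any rise in brightness immediately.
    Step 2: slope thresholding on b_i: a cumulative increase of
            `threshold` levels over 3 samples, or the same increase
            from one sample to the next, triggers the detection.
    """
    window = deque(maxlen=k + 1)   # holds a_{i-k} .. a_i
    b = []                         # filtered samples b_i
    for i, a in enumerate(samples):
        window.append(a)
        b.append(max(window))
        if (i >= 1 and b[i] - b[i - 1] >= threshold) or \
           (i >= 3 and b[i] - b[i - 3] >= threshold):
            return i
    return None

# 2 kHz samples: dark panel with mild PWM flicker, then the LED lights up.
raw = [40, 55, 41, 40, 56, 42, 40, 41, 300, 610, 680, 690]
print(detect_rising_edge(raw))  # -> 8 (first sample after the LED lit up)
```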
With constant inter-measurement intervals, a measurement sequence on a simple camera-to-PC setup exhibits strong correlations between measurement samples, considerably reducing their significance. This is because of the constantly changing phase shifts between the sampling processes in the camera and the display. To avoid these correlations, we use random inter-measurement intervals.
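A sketch of how this could look on the measurement controller; the interval bounds are illustrative assumptions, not values from the paper:

```python
import random
import time

def wait_before_next_measurement(min_s: float = 0.5, max_s: float = 1.5) -> None:
    """Sleep for a uniformly random interval so that consecutive
    measurements hit the camera and display sampling phases at
    uncorrelated offsets (bounds are illustrative assumptions)."""
    time.sleep(random.uniform(min_s, max_s))
```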
2.3. Delay Distribution
To explain the G2G measurements obtained with the proposed system, we model the G2G delay distribution of a simple video transmission system consisting of a camera, a PC and a display. We first define three partial delays. The camera sampling delay $p_{\mathrm{Cam}}(t) \sim \mathcal{U}(t_{\min},\, f_{\mathrm{Cam}}^{-1} + t_{\min})$ is uniformly distributed because the turn-on time of the LED is independent of the frame period $f_{\mathrm{Cam}}^{-1}$ of the camera. The LED has to light up at least $t_{\min}$ before the end of a frame period $f_{\mathrm{Cam}}^{-1}$ to be part of the current frame. This frame is read out of the sensor and transmitted at the end of the current frame period, leading to a delay in the interval $[t_{\min},\, f_{\mathrm{Cam}}^{-1}]$. If the LED turns on later than that during the current frame period, the light-up information is transmitted at the end of the next frame period, causing a delay in $(f_{\mathrm{Cam}}^{-1},\, f_{\mathrm{Cam}}^{-1} + t_{\min}]$. These two possibilities together form the uniform distribution of $p_{\mathrm{Cam}}(t)$ stated above. The second possibility occurs for two reasons: either the LED lights up so late during the exposure that the correspondingly dark depiction on the display does not trigger the rising edge detection, or the LED lights up during the part of the frame period after the exposure has ended. The minimum exposure required for triggering and the difference between a frame period $f_{\mathrm{Cam}}^{-1}$ and the exposure time add up to $t_{\min}$.
The display refresh also contributes a uniform delay $p_{\mathrm{Ref}}(t) \sim \mathcal{U}(0,\, f_{\mathrm{Dis}}^{-1})$, upper bounded by the inverse of the display refresh rate $f_{\mathrm{Dis}}$. This delay is uniformly distributed because the display refreshes independently of when the computer fills the graphics buffer.
All remaining parts, such as the processing in the camera, the PC and the display as well as the interface delays, are modeled as deterministic and are thus represented by a single value $t_{\mathrm{Proc}}$, i.e., $p_{\mathrm{Proc}}(t) = \delta(t - t_{\mathrm{Proc}})$. In reality, there will be deviations from this ideal deterministic delay, for example because we do not use a real-time operating system.
Since the G2G delay $T$ is the sum of these three mutually independent delays, the corresponding probability density

$$p_T(t) = p_{\mathrm{Cam}}(t) * p_{\mathrm{Proc}}(t) * p_{\mathrm{Ref}}(t)$$

is their convolution. Overall, we expect the G2G delays to approximate an isosceles trapezoid shape that is centered around the mean $t_{\mathrm{Proc}} + t_{\min} + \frac{1}{2 f_{\mathrm{Cam}}} + \frac{1}{2 f_{\mathrm{Dis}}}$, with minimum delay $t_{\mathrm{Proc}} + t_{\min}$ and maximum delay $t_{\mathrm{Proc}} + t_{\min} + \frac{1}{f_{\mathrm{Cam}}} + \frac{1}{f_{\mathrm{Dis}}}$. In real measurements, the non-deterministic processing delay will smooth the corners of this shape.
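The model is easy to check numerically. The following Python sketch convolves the two uniform densities on a discrete time grid and shifts the result by $t_{\mathrm{Proc}} + t_{\min}$ (the deterministic part $p_{\mathrm{Proc}}$ and the offset $t_{\min}$ only translate the shape); the parameter values are taken from the measurements in Section 3:

```python
import numpy as np

# Discrete-time model of p_T = p_Cam * p_Proc * p_Ref (Section 2.3).
dt = 1e-4                    # 0.1 ms grid
f_cam, f_dis = 50.0, 60.0    # frame/refresh rates from Section 3
t_offset = 19.1e-3           # t_Proc + t_min, taken from the measured minimum

def box(width_s: float) -> np.ndarray:
    """Uniform density of the given width, discretized on the dt grid."""
    n = int(round(width_s / dt))
    return np.full(n, 1.0 / width_s)

# Convolution of the two uniform parts, scaled by dt to stay a density.
p_t = np.convolve(box(1 / f_cam), box(1 / f_dis)) * dt
t = t_offset + dt * np.arange(p_t.size)

# The density is an isosceles trapezoid: it rises linearly over 1/f_Dis,
# stays flat, and falls over 1/f_Dis again.
print(f"support: [{t[0]*1e3:.1f}, {t[-1]*1e3:.1f}] ms")   # ~[19.1, 55.6] ms
print(f"mean:    {np.sum(t * p_t) * dt * 1e3:.1f} ms")    # ~37.4 ms
```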
3. MEASUREMENTS
We present measurements conducted with the prototype described in Section 2. The video transmission system is a Fedora 20 PC with an AlliedVision Guppy PRO F-031C IEEE 1394 camera and a Samsung 2233BW monitor at $f_{\mathrm{Dis}} = 60\,\mathrm{Hz}$. We parametrized the camera such that the exposure time equals the frame period, up to a negligible sub-millisecond difference. As displaying software, we use coriander 2.0.2.
The G2G delay distribution of 250 measurements with $f_{\mathrm{Cam}} = 50\,\mathrm{Hz}$ is shown in Figure 3a. The minimum delay is $19.1\,\mathrm{ms} = t_{\mathrm{Proc}} + t_{\min}$; the individual summands cannot be distinguished using the data produced by the proposed measurement system. With this minimum delay, it takes at least 19.1 ms from an event taking place until it is shown on the display; this can be thought of as the best case. Conversely, the maximum delay is $52.4\,\mathrm{ms} = t_{\mathrm{Proc}} + t_{\min} + \frac{1}{f_{\mathrm{Cam}}} + \frac{1}{f_{\mathrm{Dis}}}$, representing the worst case delay from the event until its display. The 95% confidence interval for the mean, obtained by fitting a Student's t-distribution to the histogram in Figure 3a, ranges from 32.4 ms to 34.1 ms. The standard deviation is 6.9 ms. The histogram in Figure 3a also confirms the assumptions from Section 2.3: it approximates an isosceles trapezoid and has a width of $52.4\,\mathrm{ms} - 19.1\,\mathrm{ms} = 33.3\,\mathrm{ms}$. This is a few milliseconds smaller than $\frac{1}{f_{\mathrm{Cam}}} + \frac{1}{f_{\mathrm{Dis}}} \approx 20\,\mathrm{ms} + 16.7\,\mathrm{ms} = 36.7\,\mathrm{ms}$ because the ideal worst and best case delays are so improbable that they did not occur in this series of measurements. Performing more measurements reduces the difference in width between theory and practice, but with an increasing number of measurements the difference only approaches zero without ever reaching it exactly. This is why we did not perform more measurements here.
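For reproducibility, the reported statistics can be computed from a series of measured delays as in the following sketch; `delays_ms` stands in for the 250 recorded values, which are not listed here, and the data file name is hypothetical:

```python
import numpy as np
from scipy import stats

def summarize(delays_ms: np.ndarray, confidence: float = 0.95) -> dict:
    """Minimum/maximum delay, standard deviation, and the confidence
    interval for the mean based on a Student's t-distribution."""
    n = delays_ms.size
    mean = delays_ms.mean()
    sem = delays_ms.std(ddof=1) / np.sqrt(n)   # standard error of the mean
    t_crit = stats.t.ppf((1 + confidence) / 2, df=n - 1)
    return {
        "min": delays_ms.min(),
        "max": delays_ms.max(),
        "sd": delays_ms.std(ddof=1),
        "mean_ci": (mean - t_crit * sem, mean + t_crit * sem),
    }

# e.g. summarize(np.loadtxt("g2g_delays_ms.txt"))  # hypothetical data file
```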
In Figure 3b, we plot the maximum G2G delay, the bounds of the 95% confidence interval for the estimated mean, the minimum delay and the standard deviation of the delay as a function of the camera frame rate. For every frame rate setting, 250 G2G measurements have been performed. The statistics of the measurements in Figure 3a can be seen at 50 Hz in Figure 3b. All statistics decrease monotonically with increasing frame rate. This is because $f_{\mathrm{Cam}}^{-1}$, which determines the camera sampling delay, gets smaller. $t_{\min}$ decreases because with increasing frame rate we increase the gain of the camera sensor, which allows the LED to be turned on later during exposure and still be detected by the PT. The 95% confidence interval for the mean estimate lies between the curves MeanUpper and MeanLower. The delay distributions for the other frame rates resemble the distribution in Figure 3a; they provide no further insight and are therefore not depicted.
The triple (minimum delay / mean delay / maximum delay) sufficiently describes the G2G delay characteristics of a system, so this is the metric we report.
Fig. 3: Measurements. (a) G2G delay distribution for a 50 Hz camera frame rate (relative frequency over G2G delay in ms). (b) Distribution characteristics (Max, MeanUpper, MeanLower, Min, SD) for camera frame rates from 0 to 300 Hz.
For the $f_{\mathrm{Cam}} = 25\,\mathrm{Hz}$ and $f_{\mathrm{Cam}} = 300\,\mathrm{Hz}$ camera frame rates, these triples are (24.8/50.4/78.7) ms and (8.1/15.5/23.0) ms, respectively. For $f_{\mathrm{Cam}} = 25\,\mathrm{Hz}$, the width of the histogram, i.e., the difference between maximum and minimum delay, is 53.9 ms, approximating $\frac{1}{f_{\mathrm{Cam}}} + \frac{1}{f_{\mathrm{Dis}}} \approx 40\,\mathrm{ms} + 16.7\,\mathrm{ms} = 56.7\,\mathrm{ms}$. An analogous approximation holds over all measured camera frequencies, which again confirms the model from Section 2.3.
4. CONCLUSIONS
We proposed an inexpensive, automatic and highly precise G2G delay measurement system. It unifies the advantages of previously proposed implementations and can be used to independently assess larger, more complex video transmission systems. Furthermore, we briefly discussed the origins of delay in video transmission and showed that the measurements fit the proposed model.
5. REFERENCES
[1] Federico Boccardi, Robert W. Heath, Angel Lozano, Thomas L. Marzetta, and Petar Popovski, "Five disruptive technology directions for 5G," IEEE Communications Magazine, vol. 52, no. 2, pp. 74–80, 2014.
[2] Gerhard P. Fettweis, "The tactile internet: Applications and challenges," IEEE Vehicular Technology Magazine, vol. 9, no. 1, pp. 64–70, 2014.
[3] Mitchell J. H. Lum, Diana C. W. Friedman, Hawkeye H. I. King, Regina Donlin, Ganesh Sankaranarayanan, Timothy J. Broderick, Mika N. Sinanan, Jacob Rosen, and Blake Hannaford, "Teleoperation of a surgical robot via airborne wireless radio and transatlantic internet links," in Field and Service Robotics, Springer, 2008, pp. 305–314.
[4] Curtis W. Nielsen, Michael Goodrich, Robert W. Ricks, et al., "Ecological interfaces for improving mobile robot teleoperation," IEEE Transactions on Robotics, vol. 23, no. 5, pp. 927–941, 2007.
[5] Klaus David and Alexander Flach, "Car-2-X and pedestrian safety," IEEE Vehicular Technology Magazine, vol. 5, no. 1, pp. 70–76, 2010.
[6] Andreas Festag, Roberto Baldessari, Wenhui Zhang, Long Le, Amardeo Sarma, and Masatoshi Fukukawa, "Car-2-X communication for safety and infotainment in Europe," NEC Technical Journal, vol. 3, no. 1, pp. 21–26, 2008.
[7] Rhys Hill, Christopher Madden, Anton van den Hengel, Henry Detmold, and Anthony Dick, "Measuring latency for video surveillance systems," in Digital Image Computing: Techniques and Applications (DICTA '09), IEEE, 2009, pp. 89–95.
[8] John MacCormick, "Video chat with multiple cameras," in Proceedings of the 2013 Conference on Computer Supported Cooperative Work Companion, ACM, 2013, pp. 195–198.
[9] Marco C. Jacobs, Mark A. Livingston, et al., "Managing latency in complex augmented reality systems," in Proceedings of the 1997 Symposium on Interactive 3D Graphics, ACM, 1997, pp. 49–ff.
[10] Tobias Sielhorst, Wu Sa, Ali Khamene, Frank Sauer, and Nassir Navab, "Measurement of absolute latency for video see-through augmented reality," in Proceedings of the 2007 6th IEEE and ACM International Symposium on Mixed and Augmented Reality, IEEE Computer Society, 2007, pp. 1–4.
[11] Omer Boyaci, Andrea Forte, Salman Abdul Baset, and Henning Schulzrinne, "vDelay: A tool to measure capture-to-display latency and frame rate," in 2009 11th IEEE International Symposium on Multimedia (ISM '09), IEEE, 2009, pp. 194–200.
[12] Jack Jansen and Dick C. A. Bulterman, "User-centric video delay measurements," in Proceedings of the 23rd ACM Workshop on Network and Operating Systems Support for Digital Audio and Video, ACM, 2013, pp. 37–42.
Registration (or alignment) of the synthetic imagery with the real world is crucial in augmented reality (AR) sys- tems. The data from user-input devices, tracking devices, and imaging devices need to be registered spatially and tem- porally with the user's view of the surroundings. Each device has an associated delay between its observations of the world and the moment when the AR display presented to the user appears to be aected by a change in the data. We call the dierences in delay the relative latencies. Relative latency is a source of misregistration and should be reduced. We give general methods for handling multiple data streams with dieren t latency values associated with them in a working AR system. We measure the latency dierences (part of the system dependent set of calibrations), time-stamp on-host, adjust the moment of sampling, and interpolate or extrapo- late data streams. By using these schemes, a more accurate and consistent view is computed and presented to the user. CR Categories and Subject Descriptors: I.3.7 (Three- Dimensional Graphics and Realism): Virtual Reality; I.3.1 (Hardware Architecture): Three-dimensional displays; I.3.6 (Methodology and Techniques): Interaction Techniques.