All content in this area was uploaded by Arne Roennau on Oct 23, 2020
Distributed and Synchronized Setup towards Real-Time Robotic Control
using ROS2 on Linux
L. Puck¹, P. Keller¹, T. Schnell¹, C. Plasberg¹, A. Tanev¹, G. Heppner¹, A. Roennau¹, R. Dillmann¹
Abstract— Modern industrial robots are commonly controlled by a
monolithic black-box controller made by the individual robot
manufacturer. The individual components of such a setup are neither
accessible nor exchangeable, often because they are specially
tuned and adjusted to fulfill the demanding requirements of
robotic control. The open-source framework ROS makes it possible to
combine these monolithic controllers through simple interfaces,
thereby allowing more complex robotic applications. The
next generation, ROS2, targets highly modular systems of
sensors, actuators and controllers, each being interchangeable,
and further provides real-time capabilities by employing DDS
as middleware. This study uses system-inherent tools alongside
non-invasive measurements to gain comprehensive insights, thereby
providing guidance for ROS2 applications on an underlying distributed
and synchronized real-time Linux system.
I. INTRODUCTION
Modern industrial robots are commonly controlled by
a monolithic black-box controller made by the individual
manufacturers.
Breaking up such monolithic designs is one of the conceptual
goals of the open-source ROS2 (Robot Operating System
2) [1], which uses DDS (Data Distribution Service) [2] as real-time
capable middleware. However, choosing a modular architecture
imposes new challenges such as time synchronization
between modules. Modularity makes modules reusable
in similar systems and allows specialization for
different tasks. Moreover, it even allows modules to be
separated onto distributed hosts. A system that is modular and
distributable adheres to the core principles of ROS2 and
its predecessor.
As ROS2 is under ongoing development, new features and
improvements are integrated quickly in every new release.
Current research by Casini et al. [3] explores the response
time of ROS2 in processing chains. The authors provide an
analysis to determine the expected worst case of a robotic
application. Furthermore, they present a real-time scheduling
model for ROS2. Gutiérrez et al. [4] evaluate how CPU
load and network communication affect the latencies of
ROS2. Their work includes a thorough analysis of the Linux
network stack and different DDS implementations. Moreover, they
propose methods to obtain bounded latencies.
This work explores the fundamental setup for a distributed
and time-critical real-time system, enabling future
and improved ROS2 evaluations. Current research explores
the limitations of ROS2, but it lacks benchmarks of the
underlying system as well as a high-precision time synchronization
between the individual components of the system.
Minimizing and finding bounds for the latencies and jitter
of the ROS2 communication is the main challenge for the
distributed system. Since ROS2 applications are limited
by the configuration of the underlying operating system (OS), this
configuration has to be set up and evaluated first. As a possible next
step, such thoroughly optimized real-time capabilities can be used
in scenarios with distributed ROS2 hardware controllers.
¹Department of Interactive Diagnosis and Service Systems (IDS), FZI
Research Center for Information Technology, Haid-und-Neu-Straße 10–14,
76131 Karlsruhe, Germany.
Fig. 1. Evaluation setup and communication structure. Time synchronization
is started by the PTP grandmaster on the top left. The time is transferred
over the switch to the PTP slaves, i.e. the industrial PCs (IPCs) and the PTP
time converter (TICRO). The slaves generate pulses on their digital outputs
(DOs), which are measured by the oscilloscope. The precise periodic pulse
of the TICRO is used as reference. The IPCs communicate using ROS2 with
DDS OpenSplice as underlying middleware. IPC1 runs two publisher nodes,
which publish every millisecond. On IPC1 and on IPC2 a node subscribes
to the corresponding publisher.
In this research the underlying OS is prepared and evaluated
for real-time requirements. Robustness of the system
is examined by exploring different benchmark scenarios.
Further, multi-perspective evaluation methods are included,
because each method interferes differently with the system.
With different measurements, critical configurations are less
likely to be missed. A small ROS2 application designed
for test purposes serves as the high-level software example
that is examined. The long-term evaluation is conducted
by measuring response times and communication latencies
externally using directly addressed GPIOs.
Core components of this study are the real-time capable
setup and the evaluation of a system and its applications,
thereby minimizing external perturbations that
conflict with the real-time constraints.
© 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media,
including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to
servers or lists, or reuse of any copyrighted component of this work in other works. DOI: 10.1109/CASE48305.2020.9217010
IEEE 16th International Conference on Automation Science and Engineering (CASE), Hong Kong, Hong Kong, pp. 1287-1293, 2020.
The structure of the paper is as follows:
• In Section II, a review of the current research is
presented.
• Section III highlights the setup of a distributed real-time
capable system.
• In Section IV, the presented approach is confirmed
by multi-perspective experiments.
• Afterwards, Section V highlights the key findings.
• Finally, in Section VI a conclusion and an outlook are
given.
II. STATE OF THE ART
Real-time capabilities can be divided into soft, firm and
hard real-time, with the latter meaning that a missed deadline
leads to system failure. A common approach to enhance
Linux kernels with real-time capabilities is the Preempt-RT
patch [5]. In contrast to commercial or open-source real-time
operating systems, such as FreeRTOS [6] for micro-controllers, the
kernel patch does not provide hard real-time. Hard real-time
requires proofs that task deadlines are guaranteed to be
met. Such proofs become infeasible for
complex and customizable operating systems. Therefore, the
Preempt-RT patch for the standard Linux kernel does not
come with a mathematical proof. The goal of the patch is to
minimize the impact of general non-deterministic behavior,
thereby allowing a more predictable execution of tasks by
the operating system. The open-source accessibility and
advanced development status also attract industrial
associations such as the Open Source Automation Development
Lab (OSADL) [7].
A survey of the use and evaluation methods of
Preempt-RT Linux systems is provided by Reghenzani et
al. [8]. The authors conclude that, although the kernel patch
is not hard real-time and therefore not suitable for security-related
applications, the Preempt-RT patch enables research
and test applications with little effort and high usability.
Furthermore, the authors state that dual-kernel approaches
reach better performance regarding worst-case scheduling
latencies at the expense of increased development effort for
user-space applications and limited portability. One possible
application area for real-time Linux systems is real-time networks,
a group to which distributed ROS2 applications belong.
Dantam et al. [9] and Gutiérrez et al. [10] assess the network
communication performance using the Preempt-RT patch.
They execute their benchmarks on a single host using both
Preempt-RT Linux and Xenomai, the latter as a representative
of the dual-kernel scheme. They show that Xenomai has a
better worst-case latency of 7µs compared to the Preempt-RT
patch with latencies of up to 21µs. The evaluation was
conducted using the loopback devices of each system.
Local wired network communication on embedded multi-core
devices using the Linux Preempt-RT patch is evaluated
by Gutiérrez et al. [10]. The setup of the patched
Linux OS is evaluated employing cyclictest [11], which detects
a maximum task scheduling latency of 110µs (88µs with
CPU isolation). For their evaluation, the authors measured
the round-trip time of a 500-byte UDP message sent at a
rate of 1kHz between two identical systems. Furthermore, CPU
affinity and CPU isolation, two tools for managing CPU
usage for user-space applications, are examined. Using a
Linux system without RT capabilities, the worst-case round-trip
latency was ≈1.5ms without any load on the devices.
Applying the Preempt-RT patch reduced this by more than
60% to 522µs. CPU affinity (644µs) and isolation (592µs)
could not improve the result. However, under CPU and memory
load or increased network traffic (up to 100 Mbps), these
CPU management tools showed their potential by avoiding any
missed deadline. When the concurrent network traffic on the
devices (TX/RX) was increased, high latency spikes could be
observed even with affinity or isolation applied.
The network stack of Linux is a key element in ROS2
communication. The UDP transport is especially important,
since most DDS implementations use UDP as
their default protocol [2]. Extensive tests evaluating ROS2
are performed by Maruyama et al. [12]. The authors test local
and remote communication of ROS2 nodes employing three
different DDS implementations (Connext [13], OpenSplice
[14], FastRTPS [15]). Real-time capabilities are enabled
using the FIFO scheduling policy (SCHED_FIFO) and memory
locking (mlockall). Cerqueira et al. [16] provide an in-depth
comparison of real-time priorities on mainline Linux
kernels and patched versions (Preempt-RT), thereby revealing
significant differences. The authors state that task
scheduling is performed faster and with increased robustness
on systems with the preemptive patch applied, even in idle
scenarios. Once CPU- or IO-bound load is generated, latencies
become infeasibly large with peaks over 1ms. Neither
these types of stress sources nor others were applied in
the benchmarks of [12]. The tests in [12] expose various ROS2
performance characteristics, especially for the integrated
DDS implementations. Primarily, they show that DDS is
responsible for ≈70% of the time in the communication
stack. The rest is split nearly equally between the type conversion
procedures for the message data from ROS2 to DDS
and vice versa. A negligibly small amount of the transmission
delay is caused by other processes. In addition, the authors notice
that there can be significant differences in communication
performance in ROS2. In their evaluation, OpenSplice
(Community Edition) [14] performs better than Connext
[13]. FastRTPS [15] did not support enough functionality
to allow for a fair comparison at the time of their study. The
development and improvements since 2016 have certainly
changed the performance of DDS implementations overall.
Nonetheless, the study demonstrates that performance levels
might differ significantly between DDS vendors. Furthermore,
Maruyama et al. [12] evaluate ROS2 against ROS version 1.
ROS2 communication with reliable quality of service (QoS)
is comparable to the TCP communication commonly
used in the predecessor. In their evaluation, both versions
perform almost equally in remote communication scenarios
between multiple machines. The authors state that in local
scenarios ROS1 is faster than ROS2 by a noticeable factor.
Similar to this work, Gutiérrez et al. [4] examine the real-time
capabilities and performance of ROS2 on Linux systems
patched with Preempt-RT. Three different DDS implementations,
varying CPU workload and concurrent network
traffic are taken into account. The round-trip time between a
standard PC and an embedded device (dual core) is measured
using software timestamping. Despite carefully assigning
real-time priorities to all involved threads and using real-time
compatible allocation in ROS2, the measured worst-case
round-trip time with applied system stress and 1Mbps
concurrent network traffic reaches 2182µs. With 40Mbps
network traffic they observe a worst-case round-trip time of
4942µs. A thorough evaluation of the real-time performance
of the underlying Preempt-RT systems is missing.
Two more adaptations of ROS2 are under development, both with
ongoing research regarding real-time ROS2 applications.
Micro-ROS [17] uses the ROS2 stack but targets embedded
devices. For this use case, dedicated real-time operating systems
are used and DDS implementations for limited hardware resources
are integrated. Apex.OS [18], a commercial OS, is based
on ROS2 and runs on a real-time operating system. It
provides ROS2 functionality in a modified version to
guarantee hard real-time. Both projects actively contribute
to the open-source ROS2 community.
Previous work revealed insights into crucial points of
ROS2 and real-time Linux systems and their performance. In
this work we examine the real-time capabilities of the latest
ROS2 release, Eloquent Elusor, on a real-time Linux OS using
various evaluation scenarios and strategies. Furthermore,
ROS2 and the underlying preemptive real-time Linux OS
are subject to a holistic, iterative evaluation and optimization
procedure to improve the performance.
III. DISTRIBUTED REAL-TIME SYSTEM
The hardware used and the setup for external measurements
are introduced in this section. A comprehensive multi-perspective
evaluation of the system capabilities follows.
To conclude this section, methods to inspect the
network latencies are presented.
A. Hardware and Software Setup
The off-the-shelf hardware consists of two identical industrial PCs,
abbreviated as IPCs in the following. The IPCs and the
processor type were chosen according to their promising OSADL
benchmark results. This industrial association regularly
tests complete systems for their real-time capabilities using
cyclictest [11]. Those results can further be used as baselines
to evaluate the configuration of comparable setups. Table I
details the hardware and OS specifications.
Both IPCs are initially set up with a clean “out-of-the-box”
install of Debian 10 (buster). Linux 4.19 is chosen as the kernel
version, because the already released 5.x
versions revealed worse real-time capabilities according to
several OSADL benchmark results. As evaluated in previous
work [19], [16], [8], a well-performing real-time system can
be set up with the Preempt-RT Linux patch; in this work
the 4.19.72-rt25 patch is applied. With the Preempt-RT patch,
the additional requirements imposed on the software are negligible,
contrary to other approaches like Xenomai, which require
special libraries [20].
TABLE I
SYSTEM SETUP

Hardware
  Platform                 Shuttle SH370 R8
  Chipset                  Intel H370
  Processor                Intel Core i9-9900K (3600 MHz, 8 Cores)
  RAM                      G.Skill DIMM 16GB DDR4-3200
Setup
  OS                       Debian 10
  Kernel (Preempt Patch)   4.19.72(-rt25)
  BIOS                     2.20.1271 (American Megatrends)
PTP
  Grandmaster              OTMC 100i (Omicron)
  PTP time converter       TICRO 100 - OCXO 25 (Omicron)
  Switch                   IDS 509 (Perle)
Software
  ROS2                     Eloquent Elusor
  DDS                      Vortex OpenSplice Community Edition
Regarding the time synchronization, a Precision Time Protocol
(PTP) grandmaster sends out master clock messages.
The PTP switch is set to forward mode, thereby allowing
pass-through of all PTP messages within the network.
The PTP slaves, i.e. the IPCs and the PTP time converter
(TICRO), are synchronized using their network interface
controllers (NICs) with hardware timestamping. Moreover,
the PTP time converter emits a periodic pulse signal, which
serves as the reference for the oscilloscope
measurements. These pulses are recorded as the ground truth
with respect to the timing of the other components, which
generate pulses themselves on their GPIOs to non-invasively
measure scheduling latencies or communication delays. The
serial interface pins on the motherboards of the IPCs are used
as GPIOs. They are set directly via the corresponding registers
to keep the overhead minimal.
Figure 1 displays the communication setup of the PTP
time synchronization and of the ROS2 test application. IPC1
contains two publishers, which are scheduled to wake up
every millisecond according to the internal synchronized
clock and publish a ROS2 message. When woken up, they
immediately enable the corresponding GPIO, then publish
the message and finally disable the GPIO again. To cover
both use cases, local and remote communication, one publisher
has a subscriber that runs locally on IPC1, while the subscriber
of the other publisher runs on the remote IPC2. The local publisher
publishes every millisecond with an offset of 150µs to ensure that no
race conditions appear between the publishers. Figure 1 also
visualizes the different communication protocols. The network
is synchronized using PTP, and the IPCs communicate
on the same network using DDS (via ROS2). Both protocols
rely on UDP messages. For this work Vortex OpenSplice
(Community Edition) is chosen; however, any DDS implementation
can be evaluated in the same manner. Furthermore,
the ROS2 version Eloquent Elusor is used. The TICRO
provides pulse-per-second (PPS) signals; similar pulses are
sent from the GPIOs of the IPCs, thereby allowing
external measurement and validation with an oscilloscope.
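The wake-up schedule described above (a 1 ms period on the synchronized clock, with the local publisher shifted by 150µs) can be illustrated with a small calculation. The following Python sketch is not the authors' publisher code, which is not shown in the paper; the function name and the nanosecond representation are assumptions made for illustration only.

```python
PERIOD_NS = 1_000_000  # 1 ms publishing period (value taken from the text)
OFFSET_NS = 150_000    # 150 µs offset of the local publisher (from the text)


def next_wakeup_ns(now_ns: int, period_ns: int = PERIOD_NS, offset_ns: int = 0) -> int:
    """Return the first instant strictly after now_ns on the period grid.

    The grid consists of all instants k * period_ns + offset_ns, so two
    publishers with different offsets never share a wake-up instant.
    The function name and signature are hypothetical.
    """
    cycles = (now_ns - offset_ns) // period_ns + 1
    return cycles * period_ns + offset_ns


# Remote publisher (no offset) and local publisher (150 µs offset)
# evaluated at the same synchronized clock reading.
now = 1_234_567
print(next_wakeup_ns(now))                       # next full millisecond
print(next_wakeup_ns(now, offset_ns=OFFSET_NS))  # next millisecond + 150 µs
```

In a real node, `now_ns` would come from the PTP-disciplined system clock, and the process would sleep until the returned absolute time rather than for a relative duration, so that wake-up jitter does not accumulate over cycles.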
B. Real-time evaluation
The system, once set up, has to be configured and tuned
for real-time capabilities. Starting from a clean “out-of-the-box”
install of Debian 10, the most important steps to ensure
that the system fulfills real-time requirements are
demonstrated in the following. Each system is analyzed and
evaluated using known methods such as the OSADL cyclictest [11]
benchmark, showing the limitations of such tests
and how to overcome them. The main challenge in this
section is to minimize the response times of each system,
initially with regard to the latencies of scheduled interrupts.
Afterwards, ROS2 benchmark tests are conducted for intra-process
communication, excluding all network connections.
The systems are time-synchronized using PTP before
any external evaluation measurements are performed. Those
non-invasive measurements are used to validate the response
time of the time-synchronized systems against real-time
requirements. The response times are set in relation to the
reference time from the TICRO. To comply with real-time
behavior, they need to exhibit an upper bound and should ideally
be as short as possible. Once established, this system can then
be used to develop a modular robotic controller.
The proposed method first analyzes how real-time
wake-up times can be measured using cyclictest benchmarks
and further ROS2 real-time tests. Thereafter, the setup is
configured and fine-tuned, showing the limitations of the tests
and the need for a multi-perspective analysis. Once the system is
optimally configured, a ROS2 application is used to conduct scheduled
wake-ups in order to measure the response-time latencies of
higher-level software applications. It is ensured that the ROS2 node
runs with a real-time priority of 95 under FIFO scheduling.
To analyze the periodic wake-up, which is triggered every
millisecond, the GPIOs instantly emit a pulse. By setting this
pulse in relation to the reference signal from the TICRO, the
response time of a time-synchronized system can be evaluated.
To further extend this research, the tests are also performed
in a stressed environment¹. Since this workload challenges
the real-time capabilities of the operating system, approaches
to specifically seclude the relevant real-time processes are
evaluated. First, CPUs are isolated using the isolcpus
kernel boot parameter, and processes are moved onto the isolated
cores using taskset. The second approach uses shielded CPUs,
keeping the Linux scheduler and load balancer active for the
respective cores. This keeps essential kernel threads inside
the shield. The shielding is realized using cset shield.
This evaluation strategy guides the setup of a real-time
capable system and provides in-depth insight into the
execution of real-time critical applications, thereby making it
possible to extend this research into a time-synchronized
real-time capable system on consumer-based hardware.
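The two process-level measures named above, pinning work to selected cores (as taskset does) and running under SCHED_FIFO with priority 95, can also be requested from within a program. The sketch below uses Python's standard `os` scheduling calls on Linux purely as an illustration; the paper's ROS2 nodes are not Python scripts configured this way, and the helper name is hypothetical. Switching to SCHED_FIFO normally requires root privileges or CAP_SYS_NICE, so the sketch reports failure instead of raising.

```python
import os


def pin_and_prioritize(cpus, priority=95):
    """Pin the caller to `cpus` and try to switch it to SCHED_FIFO.

    Returns (effective_affinity, fifo_enabled). The helper name is an
    assumption for illustration; priority 95 is the value from the text.
    """
    # pid 0 addresses the caller, comparable to `taskset -p` on oneself.
    os.sched_setaffinity(0, cpus)
    try:
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
        fifo_enabled = True
    except OSError:
        # Typically PermissionError without root or CAP_SYS_NICE.
        fifo_enabled = False
    return os.sched_getaffinity(0), fifo_enabled


if __name__ == "__main__":
    allowed = os.sched_getaffinity(0)
    affinity, fifo = pin_and_prioritize({min(allowed)})
    print("affinity:", affinity, "SCHED_FIFO:", fifo)
```

Note that this only affects the calling thread. An application that spawns many processes and threads, as ROS2 with OpenSplice does, needs every relevant thread moved, which is one reason external tools such as taskset or cset shield are used in the text.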
C. Network evaluation
ROS2 is developed with real-time capabilities in mind.
Before further evaluating ROS2 and the underlying DDS
middleware, the time-synchronized network has to be assessed.
As established in the previous section, both IPCs are
synchronized and set up for real-time applications. The Linux
network stack is a potential risk to all distributed real-time
systems [10]. Therefore, an analysis is conducted to evaluate
the network latencies in the local network.
¹Command used: stress -c 8 -i 8 -m 8 -d 8
This is achieved by evaluating the timestamps of the network
events on both IPCs using Wireshark. The packets can
be matched by their unique identifiers, and since both IPCs
are time-synchronized, their timestamps can be set in relation.
Thus, instead of the round-trip time with its additional
overhead, as measured in most related work [10], [4],
the one-way transmission time of the messages is measured.
Furthermore, the network is then put under load using iperf²
to generate RX and TX traffic, respectively.
By evaluating the transmission time, further research can
focus on the overlying applications and middleware for
real-time purposes. Together with the previous evaluation of
the real-time capabilities of the system, this research enables
modular approaches for real-time critical applications using
Linux.
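The one-way measurement described above reduces to matching packets by identifier in the two captures and subtracting the synchronized timestamps. A minimal Python sketch follows, assuming the capture data has already been exported into dictionaries that map a packet identifier to a timestamp in nanoseconds; the export step, the function names and the sample values are assumptions and do not come from the paper.

```python
def one_way_delays_ns(tx_ns, rx_ns):
    """Match packets by identifier and return {packet_id: delay in ns}.

    tx_ns: timestamps captured on the sending host
    rx_ns: timestamps captured on the receiving host
    Both clocks are assumed to be PTP-synchronized. Packets that were
    captured on only one host are skipped.
    """
    return {pid: rx_ns[pid] - t_tx for pid, t_tx in tx_ns.items() if pid in rx_ns}


def summarize_ns(delays):
    """Return (min, mean, max) of the delays, in nanoseconds."""
    values = list(delays.values())
    return min(values), sum(values) / len(values), max(values)


# Illustrative values only, not measurements from the paper.
sent = {1: 1_000_000, 2: 2_000_000, 3: 3_000_000}
received = {1: 1_109_000, 2: 2_091_000}  # packet 3 was not captured
delays = one_way_delays_ns(sent, received)
print(delays)
print(summarize_ns(delays))
```

Because the difference is taken between two different clocks, any residual synchronization error between the hosts appears directly in the computed delay, which is why the quality of the PTP synchronization matters for this method.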
IV. EVALUATION
Following this research, anyone should be able to set
up and validate a real-time capable system. This means
that the system is robust against perturbations and that there are
empirically confirmed upper bounds on the system’s
latencies. Furthermore, the system should act deterministically,
taking into account already known limitations of the
Linux network stack.
The results of this research lay the foundation for upcoming
research on the ROS2 communication process. Therefore,
the results are validated from different perspectives. This
work introduces optimization techniques for the complete
setup. They are generally applicable to applications on
real-time Linux systems and not limited to ROS2. Further
optimizations specific to ROS2 or to particular DDS
middleware implementations are only addressed briefly.
The final evaluation shows a validated, stress-robust distributed
network with no additional overhead in the communication.
Moreover, even though this research does not
include mathematical proofs, the setup withstands long-term
tests under heavy computational load.
²Command used: iperf -u -c -b $bandwidth
(a) Response time of the out-of-the-box system
(b) Response time after changing to performance mode
(c) Response time in performance mode with disabled debug options
Fig. 2. Evaluation of system response time using OSADL’s cyclictest benchmark. Different setups improve the response time drastically. 2c shows response
times in line with the benchmark baselines, therefore the system appears to be set up correctly. Command used for evaluation is cyclictest -lNUMCYCLES
-m -Sp95 -i200 -h400 with NUMCYCLES being 10^8.
The first view into the real-time capabilities is the OSADL
benchmark test using cyclictest [11]. The hardware and kernel
version have been chosen according to this test; thus, this
setup is expected to be real-time capable. Figure 2 visualizes
the plots for different settings. Figure 2a shows the
initial setting of a clean install with only hyperthreading and
virtualization technologies disabled. The maximum latency
is rather high at 425µs. Changing the CPU frequency scaling
mode from powersave to performance
drastically reduces the maximum latency to 38µs. This shows that
consumer-based technologies focus mainly on power saving.
Further improvement is gained by disabling all
kernel debug options, which are enabled by default. Response
times shrink to a maximum of 21µs. All tests are based on 10^8
cycles in cyclictest. This last result is among the best of the
real-time benchmark baselines given by OSADL. Based on
these observations, the system appears to be set up correctly.
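Since the switch from powersave to performance mode accounts for the largest single improvement reported above, it is worth confirming the active governor on every core before a benchmark run. The following Python sketch reads the standard Linux cpufreq sysfs entries; the function names are hypothetical, and the paper does not state how the authors checked or changed the governor.

```python
from pathlib import Path


def scaling_governors(sysfs_cpu_root="/sys/devices/system/cpu"):
    """Return {cpu_name: governor} for every CPU that exposes cpufreq.

    CPUs without a cpufreq directory are simply absent from the result.
    """
    pattern = "cpu[0-9]*/cpufreq/scaling_governor"
    return {
        gov_file.parent.parent.name: gov_file.read_text().strip()
        for gov_file in sorted(Path(sysfs_cpu_root).glob(pattern))
    }


def all_in_performance_mode(governors):
    """True only if at least one CPU was found and all use 'performance'."""
    return bool(governors) and all(g == "performance" for g in governors.values())


if __name__ == "__main__":
    governors = scaling_governors()
    print(governors)
    print("ready for benchmark:", all_in_performance_mode(governors))
```

The root path is a parameter so that the check can be exercised against a directory tree that mimics sysfs, which is also how it can be tested without touching the real system.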
Additional experiments to examine the response times
and latencies of ROS2 applications are performed using
the pendulum real-time test provided by ROS2. It becomes
apparent that further investigations are needed to validate a
system. Figure 3a shows the result of the test before disabling
the processors’ C-states. The histogram shows two peaks,
at ≈4µs and at 80µs. Further in-depth insights
can be gathered by combining this demo with ftrace [21],
which revealed occurrences of idle states. These idle states, or
C-states, were already disabled in the BIOS; however, the
Linux kernel overrides some BIOS settings with its own
configuration. They can be disabled with little effort in the
corresponding GRUB configuration. The latencies after
correctly disabling the C-states are
visualized in Figure 3b. Except for an initial
delay, the results show stable latencies below 5µs. These values
hold true even after inducing stress on the system (see Fig. 3c).
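Because the kernel can re-enable idle states that were switched off in the BIOS, it helps to inspect what the running kernel actually exposes. The Python sketch below lists the idle states that are still enabled according to the Linux cpuidle sysfs interface. It is a diagnostic aid under the assumption that the cpuidle directories are present, and it is not the procedure used in the paper, which relied on ftrace and the GRUB configuration.

```python
from pathlib import Path


def enabled_idle_states(sysfs_cpu_root="/sys/devices/system/cpu"):
    """Return {cpu_name: [names of idle states that are not disabled]}.

    Reads cpuN/cpuidle/stateM/name and .../disable ("0" means enabled).
    An empty result means that no enabled cpuidle state was found, for
    example because idle states were switched off at boot.
    """
    result = {}
    pattern = "cpu[0-9]*/cpuidle/state[0-9]*"
    for state_dir in sorted(Path(sysfs_cpu_root).glob(pattern)):
        if (state_dir / "disable").read_text().strip() == "0":
            cpu = state_dir.parent.parent.name
            name = (state_dir / "name").read_text().strip()
            result.setdefault(cpu, []).append(name)
    return result


if __name__ == "__main__":
    for cpu, states in enabled_idle_states().items():
        print(cpu, states)
```

If deep states still appear in this list after changing the BIOS settings, that is consistent with the second latency peak described above and suggests that the kernel configuration still needs to be adjusted.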
The experiments shown so far validated the response
time of the preemptive Linux operating system using two
procedurally different benchmarking systems. To validate the
wake-up times on the application layer, non-invasive
external tests with the oscilloscope have been conducted.
Triggering the GPIOs from within the application’s
source code adds an overhead of a few microseconds, which
is accepted in favor of an external measurement that
does not otherwise interfere with the system’s performance.
The application consists of a simple ROS2 subscriber and
its publisher, which is triggered to publish a message every
millisecond. The wake-up times of the different scenarios can
be seen in Table II and in the corresponding Figure 4. Each
scenario contains over 2×10^6 samples.
TABLE II
MEASUREMENTS OF RESPONSE TIME

                  Min        Max         Mean       Std Dev
Idle              14.31 µs   43.45 µs    18.18 µs   294 ns
Stress             9.89 µs   45.60 µs    17.43 µs   1.49 µs
Isolated           0.0 µs    1000.0 µs   17.20 µs   1.72 µs
Shielded 2CPUs     9.71 µs   69.56 µs    17.34 µs   1.49 µs
Shielded 4CPUs    11.43 µs   41.47 µs    17.45 µs   1.47 µs

The results show that, on average, the systems perform
similarly under stress and in idle states. Furthermore, the
stress increases the maximum response time of the application by
only about 2µs. However, core isolation and shielding are still realized
to reduce perturbations. It has to be noted that ROS2 and
OpenSplice spawn 58 processes, where most computation
time is consumed by the ROS2 nodes and the OpenSplice
transmitting and receiving processes. Isolation is done by
adding the isolcpus parameter to the GRUB configuration
and then using taskset to move the processes to the
isolated cores. For this study, two cores have been isolated for
the processes of the test application. The results show that
core isolation leads to inaccurate timings, since
the scheduler and load balancer are no longer used on those cores.
As a result, the system wakes up at desynchronized
times, and thus an underflow occurs. Due to the limitations
of the oscilloscope’s measuring method, min and max cannot
be measured for isolated cores (Table II). The underflow can
be seen in Figure 4. The true maximum lies around 160µs,
which still means a loss of accuracy. Shielding with cset
shield slightly improves over the idle state. Tests with
two and with four shielded cores were conducted, showing that,
due to the many ROS2 and DDS processes, it is advisable to have
more cores inside the shield. However, the standard deviation
increases for all stressed systems compared to the idle state.
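The columns of Table II are ordinary summary statistics over the recorded response times. The Python sketch below shows how they could be computed from a list of samples, and it flags the case where a value sits on the edge of the measurement window, as the Isolated row does with 0.0µs and 1000.0µs. The function name, the clipping heuristic and the 1000µs window parameter are assumptions for illustration; the paper's values come from the oscilloscope, not from such a script.

```python
from statistics import mean, pstdev


def summarize_response_times(samples_us, window_us=1000.0):
    """Summarize response times given in microseconds.

    'clipped' is set when the minimum or maximum lies on the boundary of
    the measurement window, in which case min and max should not be
    trusted. The window of 1000 µs corresponds to the 1 ms test interval.
    """
    lowest, highest = min(samples_us), max(samples_us)
    return {
        "min": lowest,
        "max": highest,
        "mean": mean(samples_us),
        "std": pstdev(samples_us),
        "clipped": lowest <= 0.0 or highest >= window_us,
    }


# Illustrative samples only, not data from the paper.
print(summarize_response_times([14.0, 18.0, 22.0]))
print(summarize_response_times([0.0, 17.0, 1000.0])["clipped"])
```

For an empirical upper bound, the maximum is the relevant column. The mean and standard deviation describe typical behavior but say little about the worst case.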
(a) Latencies of intra-process test after system setup solely based on cyclictest performance.
(b) Response time after disabling C-states for the CPU.
(c) Response time of a stressed system after disabling C-states.
Fig. 3. Further evaluation of the real-time system using the ROS2 pendulum demo for intra-process communication. Shows that, even though a system
appears to be set up correctly, there still might be deficits. Panels (b) and (c) show the results after the final optimization with the processors’
C-states completely disabled.
Fig. 4. Shows the different response times as measured by the oscilloscope.
This is a selection of the oscilloscope recordings. Only the first half of the
test interval (1ms) is shown. At the bottom is the reference signal from the
TICRO. The other lines are the summed results of 2×10^6 measurements for
each trial. Especially striking is the wide spread of the isolated trial with
the underflow due to desynchronization, which can be seen in Table II.
Finally, the transmission times between the two network
interfaces of IPC1 and IPC2 have been evaluated. This excludes
all delays that are introduced above the physical
layer (e.g. by the application) and focuses on the network
transmissions with induced traffic on top. Traffic was created
using iperf. For the results of the evaluation in Table III,
only TX traffic is considered.

TABLE III
NETWORK DELAYS WITH INDUCED TRAFFIC

Traffic in Mbit/s   Min       Mean       Max
0                   91.1 µs   109.7 µs   299.8 µs
10                  89.1 µs   124.7 µs   435.2 µs
100                 87.5 µs   255.1 µs   622.6 µs
500                 91.5 µs   225.8 µs   553.1 µs

The results show that increased network traffic further
increases the latencies of the network. The latencies grow
with higher network load until the threshold of 300 Mbit/s
is exceeded. Then, instead of using interrupt requests, a
continuous polling mode is automatically activated by the
system to reduce load. This explains the minor improvement
between 100 Mbit/s and 500 Mbit/s. When the same test is run
with RX traffic on IPC1, PTP reaches its limitations in a
busy network. Since PTP, by design, assumes stable round-trip
times, changes in latency lead to a loss in accuracy.
This leads to the conclusion that for time synchronization an
additional network dedicated only to PTP is preferable to
a shared network.
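The sensitivity of PTP to varying latency follows from how the clock offset is estimated. In the standard delay request-response exchange, the slave computes its offset from four timestamps under the assumption that the path delay is the same in both directions. The Python sketch below shows this textbook computation with illustrative numbers. It is not taken from the paper, and real PTP implementations add filtering and hardware timestamping on top of it.

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """Standard PTP delay request-response computation.

    t1: master sends Sync          (master clock)
    t2: slave receives Sync        (slave clock)
    t3: slave sends Delay_Req      (slave clock)
    t4: master receives Delay_Req  (master clock)
    Returns (offset of slave from master, mean path delay), in the unit
    of the inputs. Both results are exact only for symmetric path delays.
    """
    offset = ((t2 - t1) - (t4 - t3)) / 2
    delay = ((t2 - t1) + (t4 - t3)) / 2
    return offset, delay


# Perfectly synchronized clocks, 100 µs delay in both directions (in ns):
print(ptp_offset_and_delay(0, 100_000, 200_000, 300_000))

# Same clocks, but the master-to-slave path is congested (300 µs) while
# the return path still takes 100 µs: an offset is estimated although the
# clocks agree, and it equals half of the delay asymmetry.
print(ptp_offset_and_delay(0, 300_000, 400_000, 500_000))
```

In the second case the slave would adjust its clock by the spurious offset, which illustrates why traffic that delays one direction more than the other degrades the synchronization and why a dedicated PTP network avoids the problem.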
This research shows that a stable real-time capable system
can be configured using consumer-based hardware. The
results were confirmed in repeated long-term runs and hold
true over a large time frame. Furthermore, it is shown that
system interrupts and tasks on the application layer are
handled with low latencies and therefore react within a sufficient
time frame.
The results have been evaluated using an external,
non-invasive measurement method and show maximum reaction times of
about 45µs on a stressed system. Even though the
control of digital outputs introduces a slight overhead, it can
be used to measure the system’s latencies against a
reference signal. Regarding the order of magnitude, the results
are comparable to the benchmark tests of OSADL, but are
obtained for a high-level application. Moreover, the experiments show
that with proper shielding methods the real-time critical
components still work correctly in an otherwise stressed
system. Shielding appears to be less important for
scheduled wake-ups; it is, however, presumed to have a
greater impact once the whole stack is examined.
The main limitation, as expected, is the Linux
network stack, which introduces high latencies and jitter
in busy networks. Additionally, the PTP synchronization is
easily disturbed by concurrent RX network traffic.
In conclusion, with a correct system setup a real-time
capable system can be established, and disturbances can even be
reduced using the correct shielding. Somewhat surprisingly, however,
isolating cores can have negative effects, because the load
balancer and scheduler are no longer active on those cores.
Therefore, using cset shield is advised instead.
V. RESULTS
This work evaluates the setup of a distributed, time-
synchronized system. Although other work has already shown
that a real-time capable system can be set up, this
work additionally evaluates the response time of such a system
using external measurements, which gives the measurements a
higher confidence. It also opens up the possibility for future
research to set up time-critical applications on consumer-grade
hardware. In the context of this work, it enables
an in-depth evaluation of ROS2 and its underlying structures,
with the final goal of developing a modular and
distributed real-time robotic controller.
In contrast to the work of Gutiérrez et al. [10], this work
first evaluates the real-time capabilities of the system
itself. Their focus was more on the evaluation of
the network stack, which is a secondary aspect of
this research. In their ongoing work [4], the authors further
evaluated round-trip times of the ROS2 stack using different
DDS implementations, a step yet to be conducted for this
work. Casini et al. [3] present a processing chain analysis
method, taking a theoretical approach to examine the real-time
capabilities of ROS2 applications.
The core contributions of this work are:
• Real-time capabilities of consumer-grade hardware
This study uses off-the-shelf hardware to set up a real-
time capable system. A very precise distributed time
synchronization is realized using PTP.
• Evaluation of real-time systems
This work has introduced a multi-perspective evaluation
of the real-time capabilities of a system. Even though
cyclictest already showed promising results, it is
worth performing additional tests to also detect failures
that might occur only under certain circumstances.
Tooling such as ftrace in combination with the required
applications helps in understanding what is going on in
which layer of the system.
• External validation
Using the serial interface of the IPCs as GPIOs, the
time-synchronized setup can be validated. Furthermore,
the application response time can be evaluated. Although
this introduces a minor overhead, it allows precise
measurements with regard to a reference time.
• Evaluation of network latencies
An evaluation of the network latencies that can be
expected under higher network load has been conducted.
The main insight is that PTP should be used on a
dedicated or low-traffic network to ensure the best possible
synchronization, since PTP becomes unstable with
variable network latencies caused by network traffic.
VI. CONCLUSIONS AND FUTURE WORKS
The presented approach evaluates the real-time capabilities
of a distributed time-synchronized system as a potential
basis for robotic control applications with ROS2 on Linux.
A multi-perspective analysis was carried out rather than
relying on single benchmark tests: various tests of the
use-case scenario were performed with different evaluation
methods. In addition, an external measurement approach for
non-invasive validation was applied using a high-precision
reference signal.
As expected, the Linux network stack causes non-
deterministic latencies and jitter. Therefore, it is advisable not
to overload the local network and to use a dedicated network
for the time synchronization with PTP when possible.
This research demonstrates the configuration and versatile
validation of real-time capabilities with consumer-grade
hardware and open-source software. The external measurements
make it possible to validate the robustness of the system even
on extremely stressed systems and networks. For this, the serial
ports of the hosts are utilized as digital outputs to examine
the response time and the synchronicity.
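How a serial port can serve as a digital output may be sketched as follows, assuming a Linux host where the modem-control lines are driven through the TIOCMBIS and TIOCMBIC ioctls. This section does not state which line or which API was used in the experiments; the choice of RTS and the helper names are assumptions made for illustration.

```python
import fcntl
import struct
import termios


def set_rts(fd, asserted, ioctl=fcntl.ioctl):
    """Assert or clear the RTS modem-control line of an open serial port.

    TIOCMBIS sets and TIOCMBIC clears the bits given in the packed mask.
    The ioctl parameter exists only so the call can be replaced in tests.
    """
    request = termios.TIOCMBIS if asserted else termios.TIOCMBIC
    mask = struct.pack("I", termios.TIOCM_RTS)
    ioctl(fd, request, mask)


def pulse_rts(fd, ioctl=fcntl.ioctl):
    """Emit one rising and one falling edge on RTS for an external probe."""
    set_rts(fd, True, ioctl)
    set_rts(fd, False, ioctl)
```

On a real host, fd would be a descriptor of an opened serial device, and the resulting edges would be captured by an external instrument and compared against the reference signal.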
An evaluation of the real-time capabilities of the
operating system was conducted using cyclictest as a
baseline. From there on, the system was further evaluated
using network and ROS2 benchmarks, as well as ftrace
to analyze the current configurations in depth. Finally, the
response time was evaluated using an external measurement,
including high-workload situations. Furthermore, the limitations
of isolating cores versus the benefit of shielding cores
to cope with stress were inspected. This research thereby
allows the recreation of a stable real-time ready Linux system
and shows where pitfalls reside and how to overcome
them.
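The principle behind a cyclic wake-up latency baseline can be sketched as follows. This is not cyclictest itself but a simplified illustration of the idea: sleep until an absolute deadline and record how late the wake-up actually occurs. The function name, interval and loop count are hypothetical, and the interpreter overhead of Python means the absolute numbers are not representative of a real-time application.

```python
import time


def measure_wakeup_latencies(interval_ns=1_000_000, loops=200):
    """Sleep until evenly spaced deadlines and record each wake-up delay in ns."""
    latencies = []
    deadline = time.clock_gettime_ns(time.CLOCK_MONOTONIC) + interval_ns
    for _ in range(loops):
        remaining = deadline - time.clock_gettime_ns(time.CLOCK_MONOTONIC)
        if remaining > 0:
            time.sleep(remaining / 1e9)
        woke = time.clock_gettime_ns(time.CLOCK_MONOTONIC)
        latencies.append(woke - deadline)
        # Advance by a fixed step so that lateness does not accumulate.
        deadline += interval_ns
    return latencies


if __name__ == "__main__":
    samples = measure_wakeup_latencies()
    print("min/avg/max latency [us]:",
          min(samples) / 1e3,
          sum(samples) / len(samples) / 1e3,
          max(samples) / 1e3)
```

Running such a loop with and without background load gives a first impression of how stress affects scheduled wake-ups, before moving on to tracing and external measurements.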
After establishing a robust fundamental system and evaluating
the network latencies, an analysis of ROS2 and different
DDS implementations as middleware can be carried out.
With all parts thoroughly tested and optimized, an evaluation
of ROS2 as real-time capable software for distributed
modular robotic control is intended.
REFERENCES
[1] “ROS2.” [Online]. Available: https://index.ros.org/doc/ros2/
[2] “DDS Interoperability Wire Protocol.” [Online]. Available:
https://www.omg.org/spec/DDSI-RTPS/
[3] D. Casini, T. Blaß, I. Lütkebohle, and B. B. Brandenburg, “Response-
time analysis of ROS 2 processing chains under reservation-based
scheduling,” in 31st Euromicro Conference on Real-Time Systems
(ECRTS 2019). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik,
2019.
[4] C. S. V. Gutiérrez, L. U. S. Juan, I. Z. Ugarte, and V. M. Vilches,
“Towards a distributed and real-time framework for robots: Evaluation
of ROS 2.0 communications for real-time robotic applications,” arXiv
preprint arXiv:1809.02595, 2018.
[5] “Linux Real-Time [Wiki].” [Online]. Available:
https://wiki.linuxfoundation.org/realtime/start
[6] “FreeRTOS - Real-time operating system for microcontrollers.”
[Online]. Available: https://www.freertos.org
[7] “Open Source Automation Development Lab.” [Online]. Available:
https://www.osadl.org
[8] F. Reghenzani, G. Massari, and W. Fornaciari, “The real-time Linux
kernel: A survey on PREEMPT_RT,” ACM Computing Surveys (CSUR),
vol. 52, no. 1, pp. 1–36, 2019.
[9] N. T. Dantam, D. M. Lofaro, A. Hereid, P. Y. Oh, A. D. Ames,
and M. Stilman, “The Ach library: A new framework for real-time
communication,” IEEE Robotics & Automation Magazine, vol. 22, no. 1,
pp. 76–85, 2015.
[10] C. S. V. Gutiérrez, L. U. S. Juan, I. Z. Ugarte, and V. M. Vilches,
“Real-time Linux communications: An evaluation of the Linux
communication stack for real-time robotic applications,” arXiv preprint
arXiv:1808.10821, 2018.
[11] “Manpages for cyclictest on Debian Buster.” [Online]. Available:
https://manpages.debian.org/buster/rt-tests/cyclictest.8.en.html
[12] Y. Maruyama, S. Kato, and T. Azumi, “Exploring the performance
of ROS2,” in Proceedings of the 13th International Conference
on Embedded Software - EMSOFT ’16. New York, New
York, USA: ACM Press, 2016, pp. 1–10. [Online]. Available:
http://dl.acm.org/citation.cfm?doid=2968478.2968502
[13] “Connext DDS Professional by RTI.” [Online]. Available:
https://www.rti.com/products/connext-dds-professional
[14] “Adlink Vortex OpenSplice.” [Online]. Available:
https://www.adlinktech.com/en/vortex-opensplice-data-distribution-service.aspx
[15] “eProsima FastRTPS.” [Online]. Available:
https://www.eprosima.com/index.php/products-all/eprosima-fast-rtps
[16] F. Cerqueira and B. Brandenburg, “A comparison of scheduling latency
in Linux, PREEMPT-RT, and LITMUS RT,” in 9th Annual Workshop on
Operating Systems Platforms for Embedded Real-Time Applications.
SYSGO AG, 2013, pp. 19–29.
[17] “Micro ROS.” [Online]. Available: https://micro-ros.github.io/
[18] “Apex.OS.” [Online]. Available: https://www.apex.ai/apex-os
[19] H. Fayyad-Kazan, L. Perneel, and M. Timmerman, “Linux PREEMPT-RT
vs commercial RTOSs: How big is the performance gap?” GSTF Journal
on Computing, vol. 3, no. 1, 2013.
[20] J. H. Brown and B. Martin, “How fast is fast enough? Choosing
between Xenomai and Linux for real-time applications,” in Proc. of
the 12th Real-Time Linux Workshop (RTLWS’12), 2010, pp. 1–17.
[21] “ftrace - Function Tracer.” [Online]. Available:
https://www.kernel.org/doc/Documentation/trace/ftrace.txt