Figure - uploaded by José Martins
Average L2 miss rate, data TLB miss rate, and memory access stall cycle rate for the small and large variants of MiBench's qsort benchmark.
Source publication
Given the increasingly complex and mixed-criticality nature of modern embedded systems, virtualization emerges as a natural solution to achieve strong spatial and temporal isolation. Widely used hypervisors such as KVM and Xen were not designed with embedded constraints and requirements in mind. The static partitioning architecture pioneered by...
Contexts in source publication
Context 1
... of the benchmark, we collected information on L2 cache miss rate, data TLB miss rate, and stall cycle rate for memory access instructions for the qsort benchmarks. Table 4 shows the results for the small and large qsort benchmarks for each scenario. ...
Context 2
... observe that hosted execution causes a marginal decrease in performance. This is reflected in Table 4 by a small increase in both L2 cache and data TLB miss rates, which in turn explains the increase in memory access stall rate. As expected, this stems from the virtualization overheads of 2-stage address translation. ...
Context 3
... when coloring is enabled, the performance overhead is further increased. This is supported by the results in Table 4 that show an already noticeable increase across all metrics. Again, as expected, this can be explained by the fact that only half of L2 is available, and that coloring precludes the use of superpages, significantly increasing TLB pressure. ...
Context 4
... the interference scenario, there is significant performance degradation. The results in Table 4 confirm that this is due to the foreseen explosion of L2 cache misses. Finally, we can see that cache partitioning through coloring can significantly reduce interference. ...
Context 5
... we can see that cache partitioning through coloring can significantly reduce interference. Table 4 shows that coloring can fully reduce the L2 miss rate back to the levels of the solo colored scenario. However, looking back at Figure 2, we can see that this cutback is not mirrored in the observed performance degradation, which is still higher in the interf-col than the solo-col scenario. ...
Context 6
... visible trend in Figure 2 is that performance degradation is always more evident in the small data set variation of the benchmark. When comparing the small and large input data set variants, we see that, despite the increase in L2 cache miss rate in Table 4 being similar, the small variant experiences greater performance degradation. We believe this might be due to the fact that, given that the small input data set benchmarks have shorter total execution times, the cache miss penalty will more heavily impact them. ...
Context 7
... believe this might be due to the fact that, given that the small input data set benchmarks have shorter total execution times, the cache miss penalty will more heavily impact them. This idea is supported by the observed memory access stall cycle rate in Table 4, which shows a much higher percentage increase for the small input data set case. ...
Similar publications
Virtualization is already a key-enabling technology for mixed-criticality embedded systems. Open-source hypervisors such as KVM or Xen were not originally tailored for embedded constraints and real-time requirements, and depend on Linux, resulting in large TCBs and wide attack-surfaces. Furthermore, they do not address the numerous microarchitectur...
Citations
... Mixed-criticality Systems (MCS) are embedded and/or real-time systems that consolidate workloads with two or more distinct criticality levels (e.g., safety-critical and non-safety-critical) [1]-[4]. There are two conflicting requirements in the design of such systems. ...
... We implemented and evaluated the IRQ Coloring on a real modern Arm high-performance multi-core platform (Xilinx ZCU102) running a static partitioning hypervisor (Bao [4]) and multiple Virtual Machines (VMs). Results for multiple system configurations (i.e., dual and quad-VMs) demonstrated negligible overhead (1%) and reasonable throughput guarantees for medium-critical workloads. ...
... Other works have proposed dynamic re-coloring schemes [27]-[29]. Cache coloring has been implemented in several hypervisors such as Bao [4], Jailhouse [13], and XVisor [14]. DRAM Bank Partitioning. This technique leverages the parallelism in DRAM bank access to avoid contention among different workloads. ...
Integrating workloads with differing criticality levels presents a formidable challenge in achieving the stringent spatial and temporal isolation requirements imposed by safety-critical standards such as ISO26262. The shift towards high-performance multicore platforms has been posing increasing issues to the so-called mixed-criticality systems (MCS) due to the reciprocal interference created by consolidated subsystems vying for access to shared (microarchitectural) resources (e.g., caches, bus interconnect, memory controller). The research community has acknowledged all these challenges. Thus, several techniques, such as cache partitioning and memory throttling, have been proposed to mitigate such interference; however, these techniques have some drawbacks and limitations that impact performance, memory footprint, and availability. In this work, we look from a different perspective. Departing from the observation that safety-critical workloads are typically event- and thus interrupt-driven, we mask "colored" interrupts based on the QoS assessment, providing fine-grain control to mitigate interference on critical workloads without entirely suspending non-critical workloads. We propose the so-called IRQ coloring technique. We implement and evaluate the IRQ Coloring on a reference high-performance multicore platform, i.e., Xilinx ZCU102. Results demonstrate negligible performance overhead, i.e., <1% for a 100 microseconds period, and reasonable throughput guarantees for medium-critical workloads. We argue that the IRQ coloring technique presents predictability and intermediate guarantees advantages compared to state-of-art mechanisms.
... Timing and safety-critical flight control modules are implemented as latency-sensitive threads in a lightweight real-time OS (Quest (Danish et al. 2011)), alongside mission control tasks in Yocto Linux. FlyOS's separation kernel works on the principle of partitioning hypervisors (Cesarano et al. 2022; Li et al. 2014a; Martins et al. 2020; Ramsauer et al. 2017; Technology 2014; West et al. 2016) whereby each guest directly manages its own set of allocated resources without any run-time intervention of the most trusted compute base (TCB) of the hypervisor. It differs in its partitioning scheme compared to the state-of-the-art ARINC-653 extended architectures, which predominantly employ consolidating hypervisors (Craveiro et al. 2009; VanderLeest 2010). ...
Autonomous multicopters often feature federated architectures, which incur relatively high communication costs between separate hardware components. These costs limit the ability to react quickly to new mission objectives. Additionally, federated architectures are not easily upgraded without introducing new hardware that impacts size, weight, power and cost constraints. In turn, such constraints restrict the use of redundant hardware to handle faults. In response to these challenges, we propose FlyOS, an Integrated Modular Avionics approach to consolidate mixed-criticality flight functions in software on heterogeneous multicore aerial platforms. FlyOS is based on a separation kernel that statically partitions resources among virtualized sandboxed OSes. We present a dual-sandbox prototype configuration, where timing- and safety-critical flight control tasks execute in a real-time OS alongside mission-critical vision-based navigation tasks in a Linux sandbox. Low latency shared memory communication allows flight commands and data to be relayed in real-time between sandboxes. A hypervisor-based fault-tolerance mechanism is also deployed to ensure failover flight control in case of critical function or timing failures. We validate FlyOS’s performance and showcase its benefits when compared against traditional architectures in terms of predictable, extensible and efficient flight control.
... Xen Project and Jailhouse ARM versions are some of the hypervisors that are already compatible with such technique, as successfully shown in the European projects I-MECH [27] and Hercules [28], respectively. Xvisor [29] and Bao [30] also presented positive results while using cache colouring. Other approaches were also proposed to improve isolation and guarantee real-time requirements. ...
With the emergence of the Industry 4.0 paradigm, there is a need to introduce a significant degree of flexibility, security and resilience in automation infrastructures, while keeping up with real-time requirements that are characteristic of such domains. Interestingly, many of these driving principles are the same that encouraged the adoption of virtualization technologies in the IT domain, suggesting that the same benefits could be realisable for Industrial and Automation Control Systems, making it possible to virtualize servers and cyber–physical system control devices. However, the suitability of using off-the-shelf hypervisor technologies to address the specific real-time requirements of automation infrastructures remains unclear, due to their focus on maximising system throughput and capacity, often at the expense of determinism and latency.
This work addresses this problem, presenting a discussion and an empirical evaluation on the feasibility of using general purpose off-the-shelf hypervisors to virtualize cyber–physical systems’ servers and control devices. While the evaluation concludes that some of these hypervisors are already capable of dealing with typical real-time workloads, this cannot be generalised to all types of real-time systems.
... Timing and safety-critical flight control modules are implemented as latency-sensitive threads in a lightweight real-time OS (Quest (Danish et al., 2011)), alongside mission control tasks in Yocto Linux. FlyOS's separation kernel works on the principle of partitioning hypervisors (Cesarano et al., 2022; Li et al., 2014a; Martins et al., 2020; Ramsauer et al., 2017; S. C. Technology, 2014; West et al., 2016) whereby each guest directly manages its own set of allocated resources without any run-time intervention of the most trusted compute base (TCB) of the hypervisor. ...
Autonomous multicopters often feature federated architectures, which incur relatively high communication costs between separate hardware components. These costs limit the ability to react quickly to new mission objectives. Additionally, federated architectures are not easily upgraded without introducing new hardware that impacts size, weight, power and cost (SWaP-C) constraints. In turn, such constraints restrict the use of redundant hardware to handle faults. In response to these challenges, we propose FlyOS, an Integrated Modular Avionics (IMA) approach to consolidate mixed-criticality flight functions in software on heterogeneous multicore aerial platforms. FlyOS is based on a separation kernel that statically partitions resources among virtualized sandboxed OSes. We present a dual-sandbox prototype configuration, where timing- and safety-critical flight control tasks execute in a real-time OS alongside mission-critical vision-based navigation tasks in a Linux sandbox. Low latency shared memory communication allows flight commands and data to be relayed in real-time between sandboxes. A hypervisor-based fault-tolerance mechanism is also deployed to ensure failover flight control in case of critical function or timing failures. We validate FlyOS's performance and showcase its benefits when compared against traditional architectures in terms of predictable, extensible and efficient flight control.
... It introduces the notion of cells, with statically assigned resources that are exclusively mapped to one guest operating system (OS) and its applications called inmates, that, in our vision, can host the functions of a Real-Time FaaS. Bao [8] is a lightweight bare-metal hypervisor for mixed-criticality IoT systems, focusing on security and safety requirements by providing strong isolation, fault-containment and real-time features. Xtratum [9] is a paravirtualized partitioning hypervisor certifiable for the avionic domain according to the ARINC 653 standard. ...
... In this context, hypervisor design must balance, on one side, minimality for safety and security, and feature-richness and efficient sharing of resources on the other. While traditional hypervisors were optimized for the latter [5], [6], on the opposite end of the spectrum we have static partitioning hypervisors (SPH) specifically designed for MCS [7], [8]. Besides statically assigning system resources (e.g., CPUs, memory, or devices) to virtual machines (VMs), SPH must provide latency and isolation guarantees at the microarchitectural level to comply with the freedom from interference requirements of industry safety standards such as ISO 26262 [1], [9]-[11]. ...
... Bao Hypervisor. Bao [8] is an open-source static partitioning hypervisor that was made publicly available in 2020. It implements the pure static partitioning architecture, i.e., a minimal, thin-layer of privileged software which leverages the existing ISA virtualization primitives to partition the hardware. ...
... Bao has no scheduler and does not rely on any external libraries or privileged VM (e.g., Linux), consisting of a standalone component which depends only on standard firmware to initialize the system and perform platform-specific tasks such as power management. Bao originally targeted Armv8-A [8]. ...
In this paper, we aim to understand the properties and guarantees of static partitioning hypervisors (SPH) for Arm-based mixed-criticality systems (MCS). To this end, we performed a comprehensive empirical evaluation of popular open-source SPH, i.e., Jailhouse, Xen (Dom0-less), Bao, and seL4 CAmkES VMM, focusing on two key requirements of modern MCS: real-time and safety. The goal of this study is twofold. Firstly, to empower industrial practitioners with hard data to reason about the different trade-offs of SPH. Secondly, we aim to raise awareness of the research and open-source communities to the still open problems in SPH by unveiling new insights regarding lingering weaknesses. All artifacts will be open-sourced to enable independent validation of results and encourage further exploration on SPH.
... Other lightweight approaches are those based on partitioning techniques, such as Jailhouse [25], Bao [23], and Xtratum [21]. These tiny hypervisors are designed to statically partition the hardware resources among the guest VMs, minimizing hardware interference at the cost of less efficient use of resources. ...
The spread of cutting-edge virtualization methods for modern embedded architectures like MultiProcessor Systems on Chip (MPSoCs) opens the door to the development of powerful and dependable hypervisors which might help developers to deal with the complexity and heterogeneity of modern technologies while maintaining real-time requirements for critical applications. Following the needs of industrial applications, virtualization has been proven to be among the best approaches for the realization of mixed criticality systems embracing the fog computing paradigm while lowering the space, weight, power and cost (SWaP-C) of the deployment. However, the virtualization support for important hardware accelerators present on MPSoCs, such as Real-Time Processing Units (RPUs), used for real-time and/or safety-critical workloads, is still overlooked. In this paper we propose the concept of the Omnivisor, a software layer that virtualizes an entire MPSoC, improving resource utilization while simplifying the implementation of a mixed criticality system over these heterogeneous boards. We identify RPU virtualization as a building block for its realization. Therefore, as a major contribution of this paper, we design and implement a component, named RPUGuard, which is able to guarantee isolated communication channels with a fixed bandwidth between virtual machines, running on regular Application Processing Units (APUs), and the RPU on the same MPSoC. We evaluated RPUGuard on the Zynq Ultrascale+ board, in the context of a challenging case study concerning the magnetic control system of the ITER experimental nuclear fusion reactor. Results demonstrate how our solution can mitigate the weaknesses of current asymmetric communication techniques, while providing isolation guarantees to critical communication channels.
... Bao [102] is also a proposal based on this concept of static partitioning, in which the hypervisor is freed from resource management, once the CPU cores, memory or I/O devices have been assigned to each guest OS. One of the limitations of this simplistic approach is that the number of guest OSs is limited by the number of physical CPUs, unless other virtualization technology runs on top of the static partitioning hypervisor. ...
Virtualization has become one of the main tools for making efficient use of the resources offered by multicore embedded platforms. In recent years, even sectors such as space, aviation, and automotive, traditionally wary of adopting this type of technology due to the impact it could have on the safety of their systems, have been forced to introduce it into their day-to-day work, as their applications are becoming increasingly complex and demanding. This article provides a comprehensive review of the research work that uses or considers the use of a hypervisor as the basis for building a virtualized safety-critical embedded system. Once the hypervisors developed or adapted for this type of system have been identified, an exhaustive qualitative comparison is made between them. To the best of our knowledge, this is the first time that all this information is collected in a single article. Therefore, the main contribution of this article is that it collects and categorizes the information of each hypervisor and compares them with each other, so that this article can be used as a starting point for future researchers in this area, who will be able to quickly check which hypervisor is best suited to their research needs.
... There are also hypervisors that were designed in a lightweight manner, with the intention to be applied in resource-constrained environments at the Edge level. Examples of lightweight hypervisors are Bao [47] and ACRN [48], among others. ...
... It strongly focuses on isolation for fault containment and real-time behavior. Its implementation comprises a thin layer of privileged software leveraging ISA virtualization support to implement the static partitioning hypervisor architecture [47]. ACRN targets itself to IoT and Edge systems, placing a lot of emphasis on performance, real-time capabilities, and functional safety. ...
Virtualization plays an essential role in providing security to computational systems by isolating execution environments. Many software solutions, called hypervisors, have been proposed to provide virtualization capabilities. However, only a few were designed for being deployed at the edge of the network in devices with fewer computation resources when compared with servers in the Cloud. Among the few lightweight software solutions that can play the hypervisor role, seL4 stands out by providing a small Trusted Computing Base and formally verified components, enhancing its security. Although seL4 microkernel technology has existed for more than a decade, its existing userland and tools are still scarce and not very mature. Over the last few years, the main effort has been to increase the maturity of the kernel itself, and not the tools and applications that can be hosted on top. Therefore, it currently lacks proper support for a full-featured userland Virtual Machine Monitor, and the existing one is quite fragmented. This article discusses the potential directions to a standard VMM by presenting our view of design principles and the feature set needed. This article does not intend to define a standard VMM; rather, we intend to instigate this discussion through the seL4 community.
... There are also hypervisors that were designed in a lightweight manner, with the intention to be applied in resource-constrained environments at the Edge level. Examples of lightweight hypervisors are Bao [32] and ACRN [33], among others. ...
... It strongly focuses on isolation for fault-containment and real-time behavior. Its implementation comprises a thin-layer of privileged software leveraging ISA virtualization support to implement a static partitioning hypervisor architecture [32]. ACRN targets itself to IoT and Edge systems, placing a lot of emphasis on performance, real-time capabilities and functional safety. ...
Virtualization plays an essential role in providing security to computational systems by isolating execution environments. Many software solutions, called hypervisors, have been proposed to provide virtualization capabilities. However, only a few were designed for being deployed at the edge of the network, in devices with fewer computation resources when compared with servers in the Cloud. Among the few lightweight software solutions that can play the hypervisor role, seL4 stands out by providing a small Trusted Computing Base and formally verified components, enhancing its security. Although seL4 microkernel technology has existed for more than a decade, its existing userland and tools are still scarce and not very mature. Over the last few years, the main effort has been put into increasing the maturity of the kernel itself and not the tools and applications that can be hosted on top. Therefore, it currently lacks proper support for a full-featured userland Virtual Machine Monitor, and the existing one is quite fragmented. This article discusses the potential directions to a standard VMM by presenting our view of design principles and the feature set needed. This article does not intend to define a standard VMM; rather, we intend to instigate this discussion through the seL4 community.