About
95
Publications
42,270
Reads
2,099
Citations
Publications (95)
This paper presents an open-source kernel-level heterogeneous memory characterization framework (MemScope) for embedded systems that enables users to understand and precisely characterize the temporal behavior of all available memory modules under configurable contention stress scenarios. Since kernel-level provides a high degree of control over al...
Practitioners designing reinforcement learning policies face a fundamental challenge: translating intended behavioral objectives into representative reward functions. This challenge stems from behavioral intent requiring simultaneous achievement of multiple competing objectives, typically addressed through labor-intensive linear reward composition...
Virtualization has become widespread across all computing environments, from edge devices to cloud systems. Its main advantages are resource management through abstraction and improved isolation of platform resources and processes. However, there are still some important tradeoffs as it requires significant support from the existing hardware infras...
A key design decision for data systems is whether they follow the row-store or the column-store paradigm. The former supports transactional workloads, while the latter is better for analytical queries. This decision has a significant impact on the entire data system architecture. The multiple-decade-long journey of these two designs has led to a new...
Embodied vision-based real-world systems, such as mobile robots, require a careful balance between energy consumption, compute latency, and safety constraints to optimize operation across dynamic tasks and contexts. As local computation tends to be restricted, offloading the computation, i.e., to a remote server, can save local resources while provid...
The ever-increasing demand for high performance in the time-critical, low-power embedded domain drives the adoption of powerful but unpredictable, heterogeneous Systems-on-Chip. On these platforms, the main source of unpredictability—the shared memory subsystem—has been widely studied, and several approaches to mitigate undesired effects have been...
In today’s multiprocessor systems-on-a-chip, the shared memory subsystem is a known source of temporal interference. The problem causes logically independent cores to affect each other’s performance, leading to pessimistic worst-case execution time analysis. Memory regulation via throttling is one of the most practical techniques to mitigate interf...
Transactional and analytical database management systems (DBMS) typically employ different data layouts: row-stores for the first and column-stores for the latter. In order to bridge the requirements of the two without maintaining two systems and two (or more) copies of the data, our proposed system Relational Memory employs specialized hardware th...
Temporal isolation is one of the most significant challenges that must be addressed before Multi-Processor Systems-on-Chip (MPSoCs) can be widely adopted in mixed-criticality systems with both time-sensitive real-time (RT) applications and performance-oriented non-real-time (NRT) applications. Specifically, the main memory subsystem is on...
The correctness of safety-critical systems depends on both their logical and temporal behavior. Control-flow integrity (CFI) is a well-established and understood technique to safeguard the logical flow of safety-critical applications. But unfortunately, no established methodologies exist for the complementary problem of detecting violatio...
Analytical database systems are typically designed to use a column-first data layout to access only the desired fields. On the other hand, storing data row-first works great for accessing, inserting, or updating entire rows. Transforming rows to columns at run-time is expensive, hence, many analytical systems ingest data in row-first form and trans...
Newly emerging multiprocessor system-on-a-chip (MPSoC) platforms provide hard processing cores with programmable logic (PL) for high-performance computing applications. In this paper, we take a deep look into these commercially available heterogeneous platforms and show how to design mixed-criticality applications such that different processing com...
Reinforcement Learning (RL) agents trained in simulated environments and then deployed in the real world are often sensitive to the differences in dynamics presented, commonly termed the sim-to-real gap. With the goal of minimizing this gap on resource-constrained embedded systems, we train and live-adapt agents on quadrotors built from off-the-she...
Prompted by the ever-growing demand for high-performance System-on-Chip (SoC) and the plateauing of CPU frequencies, the SoC design landscape is shifting. In a quest to offer programmable specialization, the adoption of tightly-coupled FPGAs co-located with traditional compute clusters has been embraced by major vendors. This CPU+FPGA architectural...
The proliferation of multi-core, accelerator-enabled embedded systems has introduced new opportunities to consolidate real-time systems of increasing complexity. But the road to build confidence on the temporal behavior of co-running applications has presented formidable challenges. Most prominently, the main memory subsystem represents a performan...
Unarbitrated contention over shared resources at different levels of the memory hierarchy represents a major source of temporal interference. Hardware manufacturers are increasingly more receptive to issues with temporal interference and are starting to propose concrete solutions to mitigate the problem. Intel Resource Director Technology (RDT) rep...
Benchmarking is crucial for testing and validating any system, including, and perhaps especially, real-time systems. Typical real-time applications adhere to well-understood abstractions: they exhibit a periodic behavior, operate on a well-defined working set, and strive for stable response time, avoiding non-predictable factors such as page fau...
We explore if unikernel techniques can be integrated into a general-purpose OS while preserving its battle-tested code, development community, and ecosystem of tools, applications, and hardware support. Our prototype demonstrates both a path to integrate unikernel techniques in Linux and that such techniques can result in significant performance ad...
Benchmarking is crucial for testing and validating any system, even more so in real-time systems. Typical real-time applications adhere to well-understood abstractions: they exhibit a periodic behavior, operate on a well-defined working set, and strive for stable response time, avoiding non-predictable factors such as page faults. Unfortunately, avai...
With the increasing popularity of unmanned aircraft for research, military, and commercial applications, significant effort has been undertaken to improve these aircraft's performance and flight characteristics. Unmanned aircraft research and development is often based on or culminates with flight testing and significant effort has be...
We focus on the problem of reliably training Reinforcement Learning (RL) models (agents) for stable low-level control in embedded systems and test our methods on a high-performance, custom-built quadrotor platform. A common but often under-studied problem in developing RL agents for continuous control is that the control policies developed are not...
Analytical database systems are typically designed to use a column-first data layout that maps better to the access patterns of analytical queries. This choice is premised on the assumption that storing data in a row-first format leads to accessing unwanted fields; moreover, transforming rows to columns at runtime is expensive. On the other hand, new d...
Actors and critics in actor-critic reinforcement learning algorithms are functionally separate, yet they often use the same network architectures. This case study explores the performance impact of network sizes when considering actor and critic architectures independently. By relaxing the assumption of architectural symmetry, it is often possible...
The growing application space of Unmanned Aerial Vehicles (UAVs) is creating the need for aircraft capable of autonomous, long-distance, and long-endurance flights. The two main challenges are the limited power capacity of UAVs, as well as the adaptation to real-time detected stimuli, changing the course of the mission. The UIUC-TUM Solar Flyer add...
There exists a divide between the ever-increasing demand for high-performance embedded systems and the availability of practical methodologies to understand the interplay of complex data-intensive applications with hardware memory resources. On the one hand, traditional static analysis approaches are seldom applicable to latest-generation multi-c...
The sharp increase in demand for performance has prompted an explosion in the complexity of modern multi-core embedded systems. This has led to unprecedented temporal unpredictability concerns in Cyber-Physical Systems (CPS). On-chip integration of programmable logic (PL) alongside a conventional Processing System (PS) in modern Systems-on-Chip (S...
A critical problem with the practical utility of controllers trained with deep Reinforcement Learning (RL) is the notable lack of smoothness in the actions learned by the RL policies. This trend often presents itself in the form of control signal oscillation and can result in poor control, high power consumption, and undue system wear. We introduce...
Multicore processors provide great average-case performance. However, the use of multicore processors for safety-critical applications can lead to catastrophic consequences because of contention on shared resources. The problem has been well-studied in literature, and solutions such as partitioning of shared resources have been proposed. Strict par...
In modern real-time multicore systems, understanding and adequately managing shared caches is essential to ensure the temporal isolation of critical tasks. Recent research has identified and extensively studied the sources of unpredictability imputable to shared caches, heavily promoting techniques such as cache partitioning and internal resources...
The vast majority of high-performance embedded systems implement multi-level CPU cache hierarchies. But the exact behavior of these CPU caches has historically been opaque to system designers. Absent expensive hardware debuggers, an understanding of cache makeup remains tenuous at best. This enduring opacity further obscures the complex interplay a...
Clouds inherit CPU scheduling policies of operating systems. These policies enforce fairness while leveraging best-effort mechanisms to enhance responsiveness of all schedulable entities, irrespective of their service level objectives (SLOs). This leads to unpredictable performance that forces cloud providers to enforce strict reservation and isola...
Consolidating hard real-time systems onto modern multi-core Systems-on-Chip (SoC) is an open challenge. The extensive sharing of hardware resources at the memory hierarchy raises important unpredictability concerns. The problem is exacerbated as more computationally demanding workload is expected to be handled with real-time guarantees in next-gene...
The growing application space of Unmanned Aerial Vehicles (UAVs) is creating the need for aircraft capable of autonomous, long-distance, and long-endurance flights. The two main challenges are the limited power capacity of UAVs, as well as the adaptation to real-time detected stimuli, changing the course of the mission. This paper describes the con...
This paper presents a flight and ground testing data set for a trainer-type unmanned aircraft, a Great Planes Avistar Elite, which is in the series of aircraft data sets that are being published online and freely available as part of the Unmanned Aerial Vehicle Database (UAVDB). The Unmanned Aerial Vehicle Database is being continually expanded to...
Multi-core processors have replaced single-core systems in almost every segment of the industry. Unfortunately, their increased complexity often causes a loss of temporal predictability which represents a key requirement for hard real-time systems. Major sources of unpredictability are shared low level resources, such as the memory hierarchy and th...
Real-time and cyber-physical systems need to interact with and respond to their physical environment in a predictable time. While multicore platforms provide incredible computational power and throughput, they also introduce new sources of unpredictability. Large fluctuations in latency to access data shared between multiple cores is an important c...
Cache memories in modern embedded processors are known to improve average memory access performance. Unfortunately, they are also known to represent a major source of unpredictability for hard real-time workload. One of the main limitations of typical caches is that content selection and replacement is entirely performed in hardware. As such, it is...
Unikernels have demonstrated enormous advantages over Linux in many important domains, causing some to propose that the days of Linux's dominance may be coming to an end. On the contrary, we believe that unikernels' advantages represent the next natural evolution for Linux, as it can adopt the best ideas from the unikernel approach and, along with...
Multiprocessor Systems-on-Chip (MPSoC) integrating hard processing cores with programmable logic (PL) are becoming increasingly common. While these platforms have been originally designed for high performance computing applications, their rich feature set can be exploited to efficiently implement mixed criticality domains serving both critical hard...
One of the main predictability bottlenecks of modern multi-core embedded systems is contention for access to shared memory resources. Partitioning and software-driven allocation of memory resources is an effective strategy to mitigate contention in the memory hierarchy. Unfortunately, however, many of the strategies adopted so far can have unforese...
Little innovation has been made to low-level attitude flight control used by unmanned aerial vehicles, which still predominantly uses the classical PID controller. In this work we introduce Neuroflight, the first open source neuro-flight controller firmware. We present our toolchain for training a neural network in simulation and compiling it to ru...
Unmanned aerial vehicles (UAVs) are rapidly increasing in popularity for civilian, military, and research applications, and as part of this uptrend, significant effort has been undertaken to integrate an increasing amount of sensing into these vehicles. This sensing, or in other words, acquisition of sensor data, is part of the core functionality o...
This paper presents a data set for subscale general aviation aircraft, a 26%-scale Cub Crafters CC11-100 Sport Cub S2, which will be the first of a series of aircraft data sets that will be published online and freely available as part of the Unmanned Aerial Vehicle Database (UAVDB). The Unmanned Aerial Vehicle Database will be expanded to include...
One of the primary sources of unpredictability in modern multi-core embedded systems is contention over shared memory resources, such as caches, interconnects, and DRAM. Despite significant achievements in the design and analysis of multi-core systems, there is a need for a theoretical framework that can be used to reason on the worst-case behavior...
This paper presents the evaluation of the memory subsystem of the Xilinx Ultrascale+ MPSoC. The characteristics of various memories in the system are evaluated using carefully instrumented micro-benchmarks. The impact of micro-architectural features like caches, prefetchers and cache-coherency are measured and discussed. The impact of multi-core co...
In recent years, we have seen an uptrend in the popularity of UAVs driven by the desire to apply these aircraft to areas such as precision farming, infrastructure and environment monitoring, surveillance, surveying and mapping, search and rescue missions, weather forecasting, and more. The traditional approach for small size UAVs is to capture data...
Autopilot systems are typically composed of an "inner loop" providing stability and control, while an "outer loop" is responsible for mission-level objectives, e.g., way-point navigation. Autopilot systems for UAVs are predominantly implemented using Proportional-Integral-Derivative (PID) control systems, which have demonstrated exceptional perform...
In this article, we describe a general methodology for enhancing sensing accuracy in cyber-physical systems that involve structured human interactions in a noisy physical environment. We define structured human interactions as those that follow a domain-specific workflow. A novel workflow-aware sensing model is proposed to jointly correct unreliable sensor data and keep...
Achieving strong real-time guarantees in multi-core platforms is challenging due to the extensive hardware resource sharing in the memory hierarchy. Modern platforms and OS's, however, provide no means to appropriately handle memory regions that are crucial for real-time performance. In this paper, we propose a new OS-level abstraction, namely Dete...
In the last decade there has been a steady uptrend in the popularity of embedded multi-core platforms. This represents a turning point in the theory and implementation of real-time systems. From a real-time standpoint, however, the extensive sharing of hardware resources (e.g. caches, DRAM subsystem, I/O channels) represents a major source of unpre...
A rolling rig for propeller performance testing was developed. The rolling rig presented was used for performance testing of a Mejzlik 27 x 12 TH propeller, which is used on the UIUC AeroTestbed and the UIUC Subscale Sukhoi unmanned research aircraft. The performance parameters measured for the propeller will be used in the future to aid in the cal...
Embedded systems in safety-critical environments are continuously required to deliver more performance and functionality, while expected to provide verified safety guarantees. Nonetheless, platform-wide software verification (required for safety) is often expensive. Therefore, design methods that enable utilization of components such as real-time o...
Multiple resource co-scheduling algorithms and pipelined execution models are becoming increasingly popular, as they better capture the heterogeneous nature of modern architectures. The problem of scheduling tasks composed of multiple stages tied to different resources goes under the name of "flow-shop scheduling". This problem, studied since the ?...
Architects of multicore chips for avionics must define and bound intercore interference, which requires assuming a constant worst-case execution time for tasks executing on the chip. With the Single Core Equivalent technology package, engineers can treat each core as if it were a single-core chip.
Multicore processors are being extensively used by real-time systems, mainly because of their demand for increased computing power. However, multicore processors have shared resources that affect the predictability of real-time systems, which is the key to correctly estimate the worst-case execution time of tasks. One of the main factors for unpred...
In this paper, we describe a general methodology for enhancing measurement accuracy in cyber-physical systems that involve structured human interactions with a noisy physical environment. We define structured human interactions as those that follow a domain-specific workflow. The idea of the paper is simple: we exploit knowledge of the workflow to...
This paper presents a sensor data acquisition unmanned aerial system (SDAC-UAS) for flight state monitoring and aerodynamic data collection research on small to mid-sized unmanned aerial vehicles (UAVs). The SDAC-UAS was developed to provide the ground-based human (safety) pilot an easily discernible display of sensor and state data for aircraft mo...
DRAM consists of multiple resources called banks that can be accessed in parallel and independently maintain state information. In Commercial Off-The-Shelf (COTS) multicore platforms, banks are typically shared among all cores, even though programs running on the cores do not share memory space. In this situation, memory performance is highly unpre...
An increasing demand for high-performance systems has been observed in the domain of both general purpose and real-time systems, pushing the industry towards a pervasive transition to multi-core platforms. Unfortunately, well-known and efficient scheduling results for single-core systems do not scale well to the multi-core domain. This justifies th...
As real-time embedded systems become more complex, there is a need to build them using high-performance commercial off-the-shelf (COTS) components. However, tasks can exhibit hard-to-predict worst-case execution times (WCET) when executing on commodity hardware, due to contention among shared physical resources. Past work has introduced the PRedi...
This paper describes a high-frequency sensor data acquisition system (SDAC) for flight control and aerodynamic data collection research on small to mid-sized unmanned aerial vehicles (UAVs). The system is both low weight and low power, operates at 100 Hz and features: a high-frequency, high-resolution six degree-of-freedom (6-DOF) inertial measureme...
Unmanned Aerial Vehicles (UAVs) are becoming increasingly popular thanks to an increase in the accessibility of components with high reliability and reduced cost, making them suitable for civil, military and research purposes. Vehicles classified as UAVs can have largely different properties in terms of physical design, size, power, capabilities, a...
Modern industrial plants, vehicles and other cyber-physical systems are increasingly being built as an aggregation of embedded platforms. Together with the soaring number of such systems and the current trends of increased connectivity, new security concerns are emerging. Classic approaches to security are not often suitable for embedded platforms....
Multi-core architectures are shaking the fundamental assumption that in real-time systems the WCET, used to analyze the schedulability of the complete system, is calculated on individual tasks. This is not even true in an approximate sense in a modern multi-core chip, due to interference caused by hardware resource sharing. In this work we propose...