USE OF MULTICORE PROCESSORS IN AVIONICS SYSTEMS AND ITS
POTENTIAL IMPACT ON IMPLEMENTATION AND CERTIFICATION
Larry M. Kinnan, Wind River, Stow, Ohio
Abstract
With the wide availability of multiple core
(multicore) processors, their reduced space, weight
and power (SWaP) properties make them extremely
attractive for use in avionics systems. In order to
implement a solution on a multicore platform, the
developer will be confronted with numerous
implementation and certification issues that are not
present in unicore or discrete multiple processor
implementations. These issues involve both hardware
and software aspects of certification and the
interoperation of the two. This paper will provide
guidance to the developer on the issues that must be
addressed from both a hardware and software aspect
in order to understand the potential and limitations of
multicore solutions.
Current multiple uniprocessor
implementation
To meet system reliability requirements, most
modern avionics systems employ fault-tolerant
design. This is true for both military and commercial
aircraft systems, although military systems typically
do not obtain DO-178B/DO-254 (ED-12B/ED-80)
certification for their aircraft [2][3] (typically military
programs are compliant rather than actually
certified). To provide fault tolerance, a number of
methods are employed, including redundant
computing elements, multiple communication paths
and other techniques [4]. Avionics computer
platforms typically now include multiple, discrete
processors in duplex, triplex or quad arrangements
forming a redundant set allowing for redundant
computation steps which are then compared either on
a real time basis or at pre-defined synchronization
points in the computational time line. Each processor
has its own memory and cache systems that are
completely isolated from the other processors
contained in the module (typically these modules are
single printed circuit cards containing the
processors). This provides a significant fault isolation
mechanism that prevents unintended interactions at
the processor level. These processors are then
interconnected by some type of hardware
interconnect fabric, typically a Field Programmable
Gate Array (FPGA), to allow their operation to be
synchronized either in real time or at pre-defined
synch points in the computation time line. In these
types of multiple processor systems, the
results are compared and should a discrepancy occur
the majority of the processors that agree are deemed
to be correct. The processor that is in disagreement is
voted out of the set and remedial action is then taken.
This action typically takes the form of a reset or
restart of the processor to determine whether the error
was a single incident (event) or a more fundamental
failure has occurred. Should it be determined that the
failure is not a single point event, that processor is
effectively removed from system operation and the
aircraft continues to operate safely. Additionally,
pertinent information is also logged to non-volatile
storage to allow for later maintenance once the
aircraft is on the ground. One such notional system is
shown in Figure 1 implementing a dual processor
arrangement [5].
DO-254/ED-80 hardware certification relies
upon what is referred to as service experience in
order to augment design assurance when using
commercial off-the-shelf (COTS) components such
as the processors discussed above [1][3]. This allows the
developer and certification authority to take and
acknowledge industry life cycle experience for parts
such as processors, FPGAs, ASICs, etc. This permits
the developer to take credit for this service
experience (history) so long as it is clearly identified
up front in the Plan for Hardware Aspects of
Certification (PHAC) and documented in the
Hardware Accomplishment Summary (HAS) as
mandated by DO-254/ED-80 [3]. The credit that
service experience provides for these COTS
components during certification will be critical to our
discussion of multicore processors and their use in
avionics systems.
Figure 1. Notional dual processor avionics compute element
Multicore processors
The drive towards multicore solutions has been
primarily due to the physical limits of
semiconductor electronics, chiefly heat dissipation
and computational capacity; these limits place a
finite bound on how far the operating frequency can
be increased. Multicore processors provide a method
of increasing scale without the inherent limitations
of multiple unicore processing elements,
particularly in the areas of size, heat and power
consumption (also known as SWaP – Space, Weight
and Power). By using multiple, proven processing
cores on a single underlying die, along with reduced
lead runs, significant gains can be achieved in many
areas of performance. The close proximity of the
individual cores also allows the cache coherency
circuitry to operate at much higher clock rates than
is physically possible with discrete uniprocessors.
While these advantages offer a compelling reason to
adopt multicore solutions, there are some
drawbacks. Multicore processors are typically more
expensive to manufacture and have lower
production yields, which tend to drive costs
upwards. While these can be detriments to their use,
the advantages to developers are clear and the
industry is continuing in this direction in order to
avoid obsolescence.
In order to take advantage of multicore
processors, design changes need to be made to the
software running on them. Simply adding cores does
not automatically improve the performance of the
software. To gain significant performance, the
software must make use of thread-level parallelism
to use the multiple cores efficiently. The other way
software can benefit from multicore processors is by
running a virtual machine on each core, since the
cores can run independently and execute in parallel.
This virtual machine concept also allows for
partitioning of functions, which can help to achieve
safety through isolation of those functions.
Multicore hardware certification issues
While most multicore processor designs use
existing core designs arranged in two, four, eight or
sixteen (or more) core arrangements, the ability to
use service experience in these processors is
reduced considerably since the implementation is
new and has limited service history, especially in
the area of avionics computational elements
requiring certification. This, coupled with the
conservative approach used by certification
authorities, can lead to significant hurdles to
hardware certification under DO-254/ED-80
guidelines and processes [1][3]. In addition, most
semiconductor manufacturers view the aerospace
market as ancillary to their primary markets of
networking and consumer devices, since the volume
of multicore chips it consumes is quite small
compared to these primary markets. This has led to
underlying design fabric choices in multicore chips
that may make achieving hardware certification
extremely difficult, or impossible in some cases. The
following examples illustrate some, but not all, of
these issues.
Shared cache certification issues
A significant number of multicore chip designs
provide separate L1 instruction and data caches but
then share a common L2 cache between cores. This
can be seen in the notional multicore chip layout
shown in Figure 2. Use of L2 cache is extremely
important to performance especially in time and
space partitioned operating systems used in
Integrated Modular Avionics (IMA) core modules.
Having both cores share a common L2 cache can
create a number of undesirable effects and
interactions. The first such effect is the possibility
that one core could block the other core’s access to
the L2 cache which could lead to significant
processing delays and hence non-determinism in the
operation of the overall system. This is especially
true if both cores are running the same software in a
synchronized fashion. To avoid such issues, some
multicore chips permit the L2 cache to be
segregated such that each core has exclusive access
to its own private segment of L2 cache. While this
may avoid the possible contention issue, as well as
the intermixing of data within the cache that leads
to cache thrashing, it effectively reduces the
available L2 cache for each core, thereby reducing
the performance of the software. Segregation of the
L2 cache also typically prevents use of the built-in
hardware invalidate of the L2 cache, since all of
the L2 cache would be invalidated. This has
significant performance impacts in a time and space
partitioned operating system since the caches must
be flushed on each partition switch in order to
maintain low jitter margins for determinism and
hence certification. One positive note is that new
multicore parts are being developed and will be
available shortly that have completely separated and
isolated L2 cache memory and control on a per core
basis that should eliminate this particular issue.
Shared peripherals coherency fabric
certification issues
Most multicore chips as well as a number of
uniprocessor chips are commonly referred to as
System on Chip (SoC) meaning that they not only
contain computational core(s) but also additional
specialized peripherals such as Ethernet, serial and
other I/O as well as possibly other specialized
elements that are available for use by any of the
cores in the chip. In order to provide orderly access
to these shared peripherals these chips implement
what is called a coherency fabric to arbitrate those
accesses. In order to assign access in a hierarchical
fashion, this coherency fabric implements a priority
scheme typically set by default by the chip
manufacturer. In most uses, such as network or
consumer devices, this pre-assigned priority is of
little concern. But in a safety critical system that is
destined to be certified such configurations cannot
be ignored since they may prevent access by one or
more cores to time critical usage of the peripheral.
In at least one case, on a major program, this led to
access to a PCI Express bus being blocked, which
was discovered during certification testing. For a
number of manufacturers, how to configure these
priorities beyond the default settings is not clearly
documented in the data sheets. Some of the
hesitancy regarding disclosing
the internal registers of this coherency fabric is
related to exposing details of the underlying
intellectual property of the manufacturer. This runs
counter to the required transparency needed to
achieve certification and places the certification
authority and the manufacturer in conflict since
hardware certification is based on service history
and direct access to the underlying implementation
of the fabric being used to communicate between
cores and the external world. It is clear that in order
to resolve this issue, semiconductor manufacturers
will need to provide transparency in their designs
sufficient to allow certification authorities the same
level of confidence they now have with discrete,
multiprocessor designs using FPGA and ASIC
components.

Figure 2. Notional dual core layout with shared L2 cache
Shared memory controller certification issues
Similar to the shared cache issue, multicore
processors typically use a single memory controller
for RAM. This RAM is then partitioned among the
processor cores for run time operation. In many
cases there may be the possibility of access being
blocked depending on the operations of one core.
This problem can be exacerbated by the software
running on each core, especially if the cores differ
significantly in functionality, such as in the case of
a dual core processor with one core running an
ARINC 653 partitioned operating system and the
second core running a simple operating system,
using a linear address space, that implements an
I/O offload engine. Even when
running identical software loads, this interaction at
the memory controller level may prevent the
synchronization of the processors as is typically
done today. Without the possibility of providing
some hardware method of synchronizing the cores
and their operation, this is then left to software
which could introduce schedule skid and jitter
resulting in non-deterministic behavior. Such
behavior would not be certifiable unless large jitter
margins were provided for in the requirements.
Figure 3 shows a notional implementation of such a
multicore processor using two cores.
Additional factors with multicore processors
As discussed previously, use of multiple
processors provides redundancy and reliability in
avionics systems. A subtle factor that may not be
apparent with multicore processors is the loss of
redundancy in clock signal feeds and power
distribution. With multiple, discrete processors it is
possible to have separate and isolated power feeds
to each processor in the system, thus eliminating
any potential for a single point failure causing
catastrophic failure modes in the module. The same
is true of clock source being fed from multiple,
synchronized sources preventing single point
failures. The same cannot be said of multicore
designs that typically do not have such redundant
feeds for power and clocks. This can lead to
possible single point failures disabling an entire
module, even though the failure that occurred
should only have affected a single core. This again is an
issue of transparency in the internal design and
layout of the multiple cores and the underlying
substrate from which they derive their clock and
power feeds.
A final factor to consider is the inter-processor
(or in this case inter-core) communications. While
in most cases shared memory is easily provided
with the hardware design of multicore chips,
implementing a safe and robust communications
mechanism over such shared memory is not as
clear.

Figure 3. Notional dual core processor with shared memory controller

This is especially true in the case of a time and
space partitioned system that uses the ARINC 653
specification [6] and hence ARINC ports for
communications [5]. Care must be taken to ensure
that the underlying shared memory implementation
does not compromise the port communications in
such a way as to invalidate its use. This is easily
seen when shared memory is not protected, such
that each core has equal write access permissions
to the memory space.
Summary
While multicore processors are compellingly
attractive to system designers, especially for their
reduced space, weight and power (SWaP)
properties, it is clear that there are numerous issues
related to the hardware design of the processors
themselves, as well as how they interact with
software, that could complicate eventual
certification of both the hardware and software
employing these devices. It is clear that
semiconductor manufacturers must take into
account safety in the implementation of these
devices in order to allow their usage and eventual
certification for use in avionics systems even
though the aerospace industry comprises only a
small segment of their market for multicore chips.
References
[1] Vance Hilderman and Tony Baghai, 2007.
“Avionics Certification”, Avionics Communication,
Inc. Leesburg, VA. USA
[2] RTCA DO-178B “Software Considerations
in Airborne Systems and Equipment Certification”
www.rtca.org
[3] RTCA DO-254 “Design Assurance
Guidelines for Airborne Electronic Hardware”
www.rtca.org
[4] Albert Helfrick, 2004. “Principles of
Avionics”, Avionics Communication, Inc.
Leesburg, VA. USA
[5] Cary R. Spitzer, 2007. "Avionics
Development and Implementation", CRC Press,
Taylor and Francis Group, Boca Raton, FL. USA
[6] Avionics Application Software Standard
Interface Part 1 Required Services (ARINC
Specification 653-2), 2006. Aeronautical Radio,
Inc., Annapolis, Maryland
Email Address
larry.kinnan@windriver.com
Conference Identification
28th Digital Avionics Systems Conference
October 25-29, 2009