MULCORS Presentation
Multicore For Avionics
Certification Issue
2013 – 03 – 22
Context 1/2
This presentation is based on the final report
that concludes the MULCORS project contracted
with EASA.
The report provides the main outputs,
recommendations and conclusions per the EASA
Specifications attached to the Invitation to
Tender EASA.2011.OP.30.
Access to MULCORS report
https://www.easa.europa.eu/safety-andresearch/research-projects/large-aeroplanes.php
Context 2/2
CONTEXT
Provide a survey of Multi-core processors market availability
Define multi-core processors assessment & selection criteria
Perform investigations on a representative multi-core processor
Identify mitigation means, design and usage rules & limitations
Suggest recommendations for multi-core processor introduction
Suggest complements or modifications to EASA guidance
BACKGROUND
Digital Embedded Aircraft Systems
Use of COTS processors in Embedded Aircraft Equipment
Use of Multi-Core in Embedded Military Aircraft Equipment
AGENDA
Multi-core:
Introduction
Problems to Solve
Regarding certification
Software Aspects
Failure Mitigation Means & COTS Relative Features
Conclusion
Introduction
MULTI-CORE
Multi-Core: Introduction
Multi-Core processor Architecture: Unified Memory Access
Multi-core processor architecture is organized around one memory shared
between all cores
This architecture requires arbitration management on the one hand and integrity
mechanisms on the other to manage communication between cores,
and synchronization if required
In multi-core processors, care must be taken over how cache memory
coherency is ensured
Multi-Core processor Architecture: Distributed Architecture
Each core has the use of a dedicated memory with or without dedicated cache
depending on the processor architecture
Memory Cache Management is simplified and occurs in the same way as in a
single core processor (separate cache and memory are dedicated to each
core).
Multi-Core processor Architecture: Single Address space,
Distributed Memory
Cores have their own cache; they can also have dedicated memory, and they
can access the memories of other cores using the bus or the network
In some multi-core architectures, the cluster bus is also part of the global
network. In this variant, the bandwidth is dimensioned at least
to sustain all the transfers in a cluster without perturbing the
others
[Figure: multi-core platform layers. Software layers per core: Airborne Software (the intended function), Drivers, Operating System, Hypervisor layer (when required), HW adaptation layer (BSP). Hardware: cores with caches, bus or interconnect, registers, external memory, external bus, external network.]
Problems to Solve
MULTI-CORE
What is a multicore processor?
A multicore processor can be characterized by N (N ≥ 2) processing cores + a
set of shared resources (Memories, PCIe, Ethernet, Cache, Registers, etc.)
Two types of processors can be found
The ones where interconnect between cores is based on an arbitrated bus
The ones where interconnect between cores is based on a network
Multicore management can be summarized as the management of shared-resource
conflicts (when SW is at DAL_A, DAL_B or DAL_C)
Access conflicts
To the interconnect between cores
If InterConnect = bus: access arbitration is done at this level
If InterConnect = network: access arbitration depends on the number of authorized parallel
routes (examples: memory accesses, bus accesses, network accesses, etc.)
[Figure: conflict management points on the interconnect]
Access conflicts
To external memories
If InterConnect = bus: access arbitration has already been done at InterConnect level
If InterConnect = network: access arbitration is done at Memory Controller level
With more than one Memory Controller, arbitration can be simplified
[Figure: conflict management points for external memory accesses]
Access conflicts
Accesses to the PCI / PCIe bus or the Ethernet network
If InterConnect = bus: access arbitration has already been done at InterConnect level
If InterConnect = network: access arbitration is done at the level of each controller: the PCI /
PCIe bus controller or the Ethernet network controller
Depending on the number of access controllers, arbitration can be simplified (e.g. with two
access controllers, one for the bus and one for the network, 2 simultaneous accesses can be sustained).
[Figure: conflict management points for PCI / PCIe and Ethernet accesses]
DETERMINISM IN EMBEDDED AIRCRAFT SYSTEMS
Abstract notion partially described in DO-297
Definition based on
Execution Integrity
“Demonstrate that the Embedded Aircraft System mode during non-faulty
software execution remains nominal or degraded into an acceptable state”
WCET analysis
Accumulate sufficient knowledge on the processor’s internal mechanisms.
Platform Usage Domain
More or less difficult to analyze depending on the knowledge of the Airborne Software.
Robust Partitioning (not only for IMA systems)
Ensured by HW mechanisms
Ensured by the Operating System
Ensured at Airborne Software level
Multicore COTS Processors
Conflicts Management
Spatial Management: how to manage accesses to be sure that one core
cannot access a space reserved for another core.
Temporal Management:
How to manage the accesses made by one core to all shared resources
(Memories, I/O, etc.) to be sure that access time can be bounded
whatever the activity of the other cores (normal or abnormal).
The upper bound will be used for WCET computation
For Memory Accesses
Spatial Management is done by the MMU and the IOMMU (when present)
Temporal Management is more complex, as it depends on the interconnect (transaction
management), the Memory Controller and the Memory (transaction realization).
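As a rough illustration of how an upper bound on shared-resource access time can feed a WCET computation, the sketch below assumes a round-robin arbiter in which each core has at most one outstanding request and every transaction is served in at most `t_service_ns`. These assumptions, and all function and parameter names, are hypothetical; real interconnects and memory controllers are often harder to bound.

```python
def worst_case_access_ns(n_cores: int, t_service_ns: float) -> float:
    """Upper bound on one access under the ASSUMED round-robin arbitration.

    In the worst case the requesting core waits for one transaction from
    each of the other (n_cores - 1) cores, then is served itself.
    """
    if n_cores < 1:
        raise ValueError("n_cores must be >= 1")
    return n_cores * t_service_ns


def wcet_with_interference_ns(isolated_wcet_ns: float, n_accesses: int,
                              n_cores: int, t_service_ns: float) -> float:
    """Inflate a single-core WCET by the worst-case waiting time per access."""
    waiting_per_access = worst_case_access_ns(n_cores, t_service_ns) - t_service_ns
    return isolated_wcet_ns + n_accesses * waiting_per_access


if __name__ == "__main__":
    # Hypothetical figures: 4 cores, 50 ns per transaction, 1000 shared accesses.
    print(wcet_with_interference_ns(1_000_000.0, 1000, 4, 50.0))
```

Under these assumptions the bound grows linearly with the number of cores, which is one reason why restricting the number of concurrent masters makes the analysis more tractable.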
Operating System
Architecture choice according to industry needs
Reduction of the number of computers with low impact on legacy applications: AMP
Application performance improvement: SMP
Processor Selection
MULTI-CORE
Processor Selection: Selection Criteria
Selection criteria regarding the manufacturer situation
Manufacturer has experience in the avionic domain
Manufacturer is involved in the certification process
Manufacturer publishes specific communications
Manufacturer has a sufficient life expectancy
Manufacturer ensures a long term support
Selection criteria regarding the manufacturer's openness
about design and test information
Design information on a COTS processor is mandatory to certify an avionic
platform
Strong impact on the performance of the chip.
Some manufacturers may not agree to communicate the specific design information required to
ensure determinism; it is relevant to favor manufacturers who do.
Moreover, for an avionic component, it is necessary to perform specific
robustness tests, such as SEE (Single Event Effect) or SER (Soft Error Rate) tests
Focus on Architecture: Virtual Memory Management
Virtual memory service (Memory Management Unit):
Translating virtual addresses into physical addresses,
Verifying that the requesting software has sufficient access rights.
MMU components:
Address translator and access rights checker.
Storage device, the Translation Lookaside Buffer (TLB), to save the address
translation rules locally.
On multicore platforms, the VMM can be located at core level, at processor level, or at both.
Virtual memory is defined with page frames (size & offset).
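The two MMU services listed above (address translation and access rights checking) can be sketched as follows. This is a minimal software model, not a description of any real MMU: the 4 KiB page size, the `PageEntry` fields and the `translate` function are assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size in bytes


@dataclass(frozen=True)
class PageEntry:
    frame: int       # physical frame number
    readable: bool
    writable: bool


class AccessFault(Exception):
    """Raised when a page is unmapped or the access right is missing."""


def translate(page_table: dict[int, PageEntry], vaddr: int, write: bool) -> int:
    """Translate a virtual address, checking access rights first."""
    page, offset = divmod(vaddr, PAGE_SIZE)
    entry = page_table.get(page)
    if entry is None:
        raise AccessFault(f"page {page} is not mapped")
    if write and not entry.writable:
        raise AccessFault(f"page {page} is not writable")
    if not write and not entry.readable:
        raise AccessFault(f"page {page} is not readable")
    return entry.frame * PAGE_SIZE + offset


if __name__ == "__main__":
    table = {0: PageEntry(frame=5, readable=True, writable=False)}
    print(hex(translate(table, 0x10, write=False)))
```

Giving each core (or partition) its own page table with disjoint frames is one way such a mechanism supports spatial management.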
Focus on Architecture: Private cache & Scratchpad
Use of hierarchical memory (caches and scratchpads) improves the
performance of software.
Scratchpad: usually viewed as a cache whose management is implemented by
software.
In general, timing variability when accessing private caches and
scratchpads is considered to be bounded. Content prediction depends on the
cache replacement policy.
Focus on HW assists for Debug & Monitoring
COTS processors provide debug mechanisms that enable
breakpoint insertion, single step execution
Usual way to debug bare metal software is to use the JTAG
interface.
On top of an operating system, debuggers such as GDB can be
used.
Regarding Certification
MULTI-CORE
Multi-Core Processor features: INTERCONNECT
INTERCONNECT
Overview
The interconnect is the first shared resource between cores.
It interleaves the concurrent transactions sent by the cores to the shared
resources like caches, memories and I/O mapped in the address space.
Its architecture has a strong impact on determinism, on partitioning
assurance, and on the complexity of worst-case analyses.
The interconnect usually implements the following services:
Arbitration of incoming requests. This stage depends on several parameters:
Arbitration rules
Arbiter internal logic
Network topology
Allocation of the physical destination devices when they are duplicated.
For example, when there is more than one memory controller.
Allocation of a path to the destination.
When several paths exist between the source and the destination (depends
on routing rules).
Support for atomic operations, hardware locking mechanisms
Snooping mechanisms for cache coherency
Inter-Processor Interrupts (IPI) for inter-core communications
Multi-Core Processor features: SHARED CACHE
SHARED CACHE
Use of a shared cache in Embedded Aircraft Systems requires a solution to the
following problems:
Shared cache content prediction. WCET calculability and robust partitioning requirements.
Cache content integrity. Take care of SEU/MBU.
Concurrent accesses impact. Potential restrictions on concurrent accesses to shared cache
have to appear in the Interconnect Usage Domain in the same way as concurrent accesses to
shared memory.
Cache organizations
Fully associative: Each memory row may be stored anywhere in the cache.
N-way set associative cache: Each memory row may be stored in any way of some specific
sets of cache lines.
Direct mapped cache: Each memory row may be stored in a single cache line.
Classic replacement policies are:
Least Recently Used
Pseudo Least Recently Used
Most Recently Used
First In First Out
Random
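To make the link between cache organization, replacement policy and content prediction concrete, here is a small simulation sketch of an N-way set associative cache with Least Recently Used replacement. The class name, the 64-byte line size and the addresses are assumptions for illustration; setting `n_ways=1` gives a direct mapped cache and `n_sets=1` a fully associative one.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Toy model of an N-way set associative cache with LRU replacement."""

    def __init__(self, n_sets: int, n_ways: int, line_size: int = 64):
        self.n_sets = n_sets
        self.n_ways = n_ways
        self.line_size = line_size
        # One ordered dict per set: keys are tags, oldest entry first.
        self.sets = [OrderedDict() for _ in range(n_sets)]

    def access(self, address: int) -> bool:
        """Return True on a hit, False on a miss (the line is then loaded)."""
        line = address // self.line_size
        index = line % self.n_sets
        tag = line // self.n_sets
        ways = self.sets[index]
        if tag in ways:
            ways.move_to_end(tag)      # mark as most recently used
            return True
        if len(ways) >= self.n_ways:
            ways.popitem(last=False)   # evict the least recently used line
        ways[tag] = None
        return False


if __name__ == "__main__":
    direct = SetAssociativeCache(n_sets=4, n_ways=1)
    two_way = SetAssociativeCache(n_sets=2, n_ways=2)
    for addr in (0, 256, 0):           # two lines mapping to the same set
        print(addr, direct.access(addr), two_way.access(addr))
```

In the direct mapped case the two addresses evict each other on every access, while the two-way cache keeps both; with a shared cache, accesses from another core play the role of the second address, which is why content prediction becomes harder.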
Multi-Core Processor features: impact on Determinism
CACHE COHERENCY MECHANISM
Required in architectures that integrate several storage devices
hosting the same data.
Two families of coherency protocols:
Invalidate protocols:
The accessed cache line is marked as invalid in all locations.
Further accesses will miss and require a load from main memory.
This class of protocols is easier to implement and offers better performance.
Update protocols:
The accessed cache line is updated.
The update request is broadcast to all nodes: the ones containing the cache line are
automatically updated.
Benefit: cache accesses will always hit without requesting the interconnect, so traffic on
the interconnect may be easier to control.
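The difference between the two protocol families can be seen in a toy model. The sketch below is a deliberately simplified illustration (write-through memory, one value per address, no line states) and does not implement any real protocol such as MESI; the class and attribute names are hypothetical.

```python
class CoherentCaches:
    """Toy model of private caches kept coherent by invalidate or update."""

    def __init__(self, n_cores: int, protocol: str):
        if protocol not in ("invalidate", "update"):
            raise ValueError("protocol must be 'invalidate' or 'update'")
        self.protocol = protocol
        self.memory = {}
        self.caches = [dict() for _ in range(n_cores)]
        self.memory_loads = 0          # loads that had to go to main memory

    def read(self, core: int, addr: int) -> int:
        cache = self.caches[core]
        if addr not in cache:          # miss: fetch from main memory
            self.memory_loads += 1
            cache[addr] = self.memory.get(addr, 0)
        return cache[addr]

    def write(self, core: int, addr: int, value: int) -> None:
        self.memory[addr] = value      # write-through, assumed for simplicity
        self.caches[core][addr] = value
        for other, cache in enumerate(self.caches):
            if other == core or addr not in cache:
                continue
            if self.protocol == "invalidate":
                del cache[addr]        # the next read on that core will miss
            else:
                cache[addr] = value    # the copy is refreshed in place


if __name__ == "__main__":
    for protocol in ("invalidate", "update"):
        c = CoherentCaches(2, protocol)
        c.read(1, 0x100)
        c.write(0, 0x100, 7)
        print(protocol, c.read(1, 0x100), c.memory_loads)
```

Both variants return the new value to the reader; the invalidate variant costs one extra main memory load after the write, whereas the update variant hits in the cache.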
Multi-Core Processor features: SHARED SERVICES
SHARED SERVICES
Airborne Embedded Equipment is in charge of providing shared
services among the cores.
Shared services:
Interrupts generation and routing to cores
Core and processor clock configurations
Timer configurations
Watchdog configurations
Power supply and reset
Support for atomic operations
Multi-Core Processor features: CORES
CORES
The cores support the execution of multiple software instances in
parallel.
They interact through two mechanisms:
Inter-core interrupts
Shared memory
In the Embedded Aircraft Systems context, inter-core
interrupts (point-to-point or broadcast) might be used in the same way as any
external interrupt. Their use is acceptable under some conditions including
(but not restricted to):
As a protection mechanism (a core can interrupt another core if it detects a faulty
execution inside it)
When the destination core is actively waiting to be interrupted.
Memory mapping is defined in the Memory Management Unit.
Multi-core platforms embed one MMU per core.
Memory mapping definition is distributed among the cores.
This raises the issue of maintaining coherency between all MMUs.
A non-coherent configuration may weaken Robust Partitioning.
Multi-Core Processor features: PERIPHERALS
PERIPHERALS: MAIN MEMORY AND I/O’S
Sharing the main memory means sharing the physical storage
resources and the memory controllers.
Storage resources can be partitioned when necessary (space partitioning).
Sharing accesses to the memory controllers may in some cases increase the timing
variability of a transaction by a factor higher than the number of accessing masters.
Shared I/O features are similar to shared services configuration:
Like shared services, concurrent accesses to shared I/O may occur simultaneously
from different cores.
Some I/O are accessed according to a protocol, others are accessed through a read
and/or write buffer. Atomic access patterns have to be ensured.
Simultaneous accesses to read and/or write buffers:
Classic rules of time and space partitioning can apply; when this is not possible, ensure
that concurrent accesses occur in disjoint time windows.
Initiating specific protocol operations: uninterrupted access is required during the
protocol execution to fulfill the protocol correctly.
Software Aspects
MULTI-CORE
Multitasks scheduling features
The classic approach for a multitasked system is the hierarchical model
based on processes (or partitions) and threads
(in ARINC 653, the equivalent components are partitions and processes).
Processes (or partitions) are executed from isolated memory areas.
Inside a process, one or more threads are executed in the same address space.
Parallel programming models include two kinds of tasks: periodic
and sporadic.
Process and thread activation depends on a scheduling
algorithm.
For an Embedded Aircraft System, a scheduling algorithm
shall verify the following properties:
Feasibility: critical property ensuring that a set of tasks will meet its deadlines.
Predictability:
Pre-emptive and priority-based scheduling algorithms are preferred
for single-core processors
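One classical way to check feasibility for a pre-emptive, fixed-priority scheduler on a single core is the Liu and Layland utilization bound for rate-monotonic scheduling. This test is not named in the presentation; it is used here only as an example, and it assumes independent periodic tasks whose deadlines equal their periods. The function names are hypothetical.

```python
def rm_utilization_bound(n_tasks: int) -> float:
    """Liu and Layland bound n * (2^(1/n) - 1) for rate-monotonic scheduling."""
    if n_tasks < 1:
        raise ValueError("at least one task is required")
    return n_tasks * (2 ** (1 / n_tasks) - 1)


def passes_rm_test(tasks: list) -> bool:
    """Sufficient (not necessary) feasibility test.

    tasks is a list of (wcet, period) pairs in the same time unit.
    """
    utilization = sum(wcet / period for wcet, period in tasks)
    return utilization <= rm_utilization_bound(len(tasks))


if __name__ == "__main__":
    # Hypothetical task set: (WCET, period) in milliseconds.
    print(passes_rm_test([(1, 4), (1, 5), (2, 10)]))
```

Because the test is only sufficient, a task set that fails it may still be schedulable; an exact response-time analysis would then be needed. The WCET values fed into it must already include multi-core interference if the result is to mean anything on a multi-core platform.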
Airborne Software migration from single-core to multi-core
Porting multitasked Airborne Software from a single-core to a multi-core platform requires that:
Airborne Software execution will still be correct
Worst Case Execution Time will be calculated for each task or process.
Multitasked airborne software may not be efficiently executed on a
multi-core platform if its tasks have dependencies requiring a
specific execution order.
Care has to be taken if the Airborne Software is implemented with
a cooperative task model.
Such an implementation usually removes protections on critical section
accesses.
In multi-core execution, a critical section might be executed in parallel by
different tasks, resulting in erroneous execution: critical sections require
semaphore protection
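The need for semaphore protection can be sketched with a shared counter updated by several threads. This is only an illustration of the pattern using Python's standard `threading.Semaphore`; the class and function names are hypothetical, and an avionics implementation would rely on the services of its own operating system.

```python
import threading


class SharedCounter:
    """Counter whose update is a critical section guarded by a semaphore."""

    def __init__(self):
        self.value = 0
        self._sem = threading.Semaphore(1)   # binary semaphore

    def increment(self) -> None:
        with self._sem:                      # only one task inside at a time
            self.value += 1


def run_parallel(n_threads: int, n_increments: int) -> int:
    """Run n_threads workers that each increment the counter n_increments times."""
    counter = SharedCounter()

    def worker() -> None:
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


if __name__ == "__main__":
    print(run_parallel(4, 10_000))
```

In a cooperative single-core model the read-modify-write of `value` is never interrupted, so the guard can be omitted; once tasks run truly in parallel, the guard is what keeps the final count correct.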
Partitioned system features
Components evolution to take advantage of multi-core platforms
The most “flexible” component is the
integration software layer. Possible designs:
A single OS instance shared among all the cores
A private OS instance per core
A virtualization layer hosting several operating systems
in dedicated virtual machines.
Partition Deployment
Symmetrical Multi-processing (SMP): one partition is activated on all cores and has exclusive access to platform
resources.
Asymmetrical Multi-processing (AMP): each partition is activated on one core, with true parallelism between partitions.
Operating System global view
From Single Core to Multi-Core in AMP (Asymmetric multi-processing)
[Figure: AMP. Labels: APP1 to APP3 with tasks T1 to T5; one Space & Time Partitioning layer and one Operating System per CORE (three cores); INTERCONNECT marked "Solve Conflict"; bridge, memory controllers, I/O controllers, bus/network interfaces.]
From Single Core to Multi-Core in SMP (Symmetric multi-processing)
[Figure: SMP. Labels: APP1 to APP3 with tasks T1 to T5; Space & Time Partitioning and Operating System layers; COREs; INTERCONNECT marked "Solve Conflict"; bridge, memory controllers, I/O controllers, bus/network interfaces.]
Current mono-core concept
[Figure: current mono-core concept. Labels: APP1 to APP3 with tasks T1 to T5; Space & Time Partitioning; Operating System; CORE; bridge, memory controller, I/O controller, bus/network interface. Timeline (time axis): Partition 1 to Partition 4 on one OS / Core, threads/processes of Appli. 1 to Appli. 3, idle.]
AMP
When AMP mode is selected, the use of a Hypervisor is recommended to master the behavior of the Interconnect Usage Domain.
[Figure: AMP. Labels: APP1 to APP5 with tasks T1 to T5; one Space & Time Partitioning layer and one Operating System per CORE (two cores); INTERCONNECT; memory controller, I/O controller, bus/network interface. Timeline (time axis): Partitions 1.1 to 1.4 on OS 1 / Core 1 and Partitions 2.2 to 2.4 on OS 2 / Core 2, threads/processes of Appli. 1 to Appli. 7, idle.]
SMP
When SMP mode is selected, Processes, Threads or Tasks should be allocated to cores statically to achieve determinism.
[Figure: SMP. Labels: APP1 to APP3 with tasks T1 to T5; a single Space & Time Partitioning layer and Operating System over two COREs; INTERCONNECT; memory controller, I/O controller, bus/network interface. Timeline (time axis): Partition 1 to Partition 4 on one OS over Core 1 and Core 2, threads/processes of Appli. 1 to Appli. 3, idle.]
Failure Mitigation Means & COTS Relative Features
MULTI-CORE
Multi-Core: Failure Mitigation
FMEA and/or FFPA for a single or a multi-core processor is
not achievable at processor level
Soft Error Rate (SER) and Single Event Effects (SEE)
Mitigation has to be provided by the equipment provider at board level,
where the processor is used
SER measurements are usually performed by the manufacturers on
their own
Deep Sub-Micron (DSM)
DSM has an impact on long-term reliability
An example of Stress Tests
Stress tests have been kept identical from generation to generation to be
able to guarantee, in the industrial grade, a usable life compatible with
avionics requirements
CONCLUSION
CONCLUSIONS
The complexity of Multi-Core Processors has increased over the
past few years, while the level of demonstration for design
assurance should remain at least the same as, or better than,
for COTS without such an increase in complexity.
A COTS component remains a COTS component (features
proprietary data from the COTS manufacturer).
Approaches:
Access to additional data under agreements with the COTS manufacturer
And/or mitigation of potential COTS faults or errors at board or equipment
level,
In this report we put emphasis on specific Multi-Core
features linked to Shared Resource Accesses like Memory,
Bus, Network, Internal Registers, Clock Management, etc.
These features are the main differences between single-core and multi-core devices that have to be managed
At Airborne Software level
If Airborne Software behavior is well known and well managed, then by allocating
Airborne Software applications to cores, we can demonstrate the non-interaction
between cores.
The interconnect behavior shall be well known and well managed
At Hypervisor level
In this configuration, the Hypervisor is used to constrain the behavior of the
interconnect. These constraints reduce the global performance of the multi-core
processor but offer determinism, so the global behavior can be demonstrated.
COMPLEMENTARY INFORMATION
Multi-core Processor Usage Domain
Definition, Validation and Verification of a Usage Domain (UD) for
such highly complex COTS Multi-Core processors is required.
This approach is already known and offered by existing certification
guidance for Complex and Highly Complex COTS.
One recommendation would be to distinguish between the UD rules
related to segregation constraints (e.g. segregation between cores),
from the UD rules related to local limitations (within a single core).
Significant features
Determining WCET: given the high variability of execution time,
the following step-by-step approach can be one solution to
ensure the temporal deterministic behavior of processors; such an
approach is also valid for multi-core processors:
Characterization of execution time jitters of the operating system services,
Determination of the Worst Case Execution Time (WCET) plus allowed
margins,
Incorporated real-time monitoring of actual execution time versus allowed WCET,
Collect data for assessment of the processor + Airborne Software operating
behavior,
Depending on the above assessment, establish additional rules or limitations,
Apply necessary modifications
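The monitoring and data collection steps above could look like the following sketch. It is a minimal illustration under stated assumptions: the `WcetMonitor` class, its fields and the 25 % margin are hypothetical, and measured times are passed in by the caller rather than read from a hardware timer.

```python
from dataclasses import dataclass, field


@dataclass
class WcetMonitor:
    """Compare measured execution times with an allowed WCET plus margin."""

    wcet_us: float
    margin: float = 0.25                   # hypothetical allowed margin (25 %)
    samples: list = field(default_factory=list)
    overruns: int = 0

    @property
    def budget_us(self) -> float:
        return self.wcet_us * (1.0 + self.margin)

    def record(self, measured_us: float) -> bool:
        """Store one measurement; return False if it exceeds the budget."""
        self.samples.append(measured_us)
        within = measured_us <= self.budget_us
        if not within:
            self.overruns += 1
        return within

    def jitter_us(self) -> float:
        """Spread between the longest and shortest recorded execution."""
        if not self.samples:
            return 0.0
        return max(self.samples) - min(self.samples)


if __name__ == "__main__":
    monitor = WcetMonitor(wcet_us=100.0)
    for measured in (90.0, 110.0, 130.0):  # hypothetical measurements
        monitor.record(measured)
    print(monitor.overruns, monitor.jitter_us())
```

The collected samples and overrun count are the kind of data that can then be assessed to decide whether additional rules or limitations are needed.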
Robust partitioning
Mitigation to cater for the inherent complexity of multi-core processors
via functional robustness at Airborne Software level is possible
whenever the developer has access to, and detailed knowledge
of, the computing platform.
Defensive programming techniques can be used to compensate for
potential misbehaviors. This possibility is not available for software
execution platforms where Airborne Software developers only have
access to an allocated portion of the platform, with strict rules and
requirements to meet in order to allow adequate operation of the whole
integrated system.
Multi-software architectures are now common, hence robust
partitioning of Airborne Software must be ensured. For example, an
essential point is that execution time variations due to jitter on
partition switching should be minimized to allow time-deterministic
behavior. Indeed, guidance is that temporal determinism shall be
ensured knowing given criteria.