Optimizing ARM Cortex-A and Cortex-M based heterogeneous multiprocessor systems for rich embedded applications
Kinjal Dave
Senior product manager, CPU Group
Embedded World, Nuremberg, 2017
16th March 2017
© ARM 2017
Apr 12, 2017
Topics
Introduction
System design
Software
Topics
Introduction
§ Why heterogeneous processing?
§ Use cases
§ Terminology
System design
Software
Modern compute systems have diverse workloads
Figure: power over time across sleep mode, ambient mode, and interactive mode.
Diversity of workloads across markets
Why heterogeneous computing?
“Right-sized processing”
Increase system performance
Increase system efficiency
Reduce system cost
ARM architecture for diverse computing needs
Cortex-A: Highest performance
Optimized for rich operating systems
Cortex-R: Fast response
Optimized for high-performance, hard real-time applications
Cortex-M: Smallest / lowest power
Optimized for discrete processing and microcontrollers
Heterogeneous systems are extremely diverse
Figure: A heterogeneous system using different compute elements. Blocks shown: MCU, CPU, GPU, ISP, Video, Display, Audio, DSP, DDR, Interconnect.
Figure: A heterogeneous subsystem using ARM Cortex processors.
Heterogeneous multicore processors
Multicore
Homogeneous
§ Same ISA
§ Same microarchitecture
§ Same view of memory
Heterogeneous: performance asymmetry
§ Same ISA
§ Different microarchitecture
§ Same view of memory
§ OS/software symmetry
Heterogeneous: functional asymmetry (Cortex-A + Cortex-M systems)
§ Different ISA
§ Different microarchitecture
§ Different view of memory
§ OS/software asymmetry
Use cases of HMP systems
Cortex-A: Rich OS, high performance
Cortex-R: Modem, real-time control
Cortex-M: System control, sensor fusion
Markets shown: Mobile, ADAS, Wearables
Use cases of HMP systems in embedded
Cortex-A: Rich UI and OS, high performance
Cortex-M: Real-time control and monitoring, deterministic sensor control, real-time monitoring
Markets shown: Medical, Consumer, Industrial
Topics
Introduction
System design
§ System design considerations
§ Security models
Software
ARM activities
Architectural differences between Cortex families
Columns: Cortex-A, Cortex-R, Cortex-M (higher performance toward Cortex-A; lower power, smaller area toward Cortex-M)
Operating system: Rich OS/RTOS; RTOS only
Instruction set: 32/64-bit ARM & Thumb ISA; 32-bit ARM & Thumb ISA; Thumb ISA
Interrupts: SW-managed interrupts; HW-managed interrupt
Bus interface: AMBA AXI; AMBA AHB/AXI; AMBA AXI
Deterministic SW-managed
System design considerations
§ How to address the memory map differences?
§ How to distribute interrupts?
§ How to handle inter-processor communication?
§ How to handle Secure/Non-secure state communication?
Figure: Generic HMP compute subsystem using Cortex processors. Blocks shown: Cortex-A subsystem, Cortex-R subsystem, Cortex-M subsystem, shared L2, interconnect, AHB interconnects, local memory, timer, sensor, SRAM, DMC, DDR.
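The first question above, memory map differences, often comes down to the same physical memory appearing at different addresses on each processor. A minimal sketch of a translation table follows, assuming two shared windows. Every address, size, and name here (`shared_window_t`, `a_to_m`, `m_to_a`) is hypothetical and would be replaced by values from the actual SoC memory map.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One shared region as seen from both processors.
 * All addresses below are hypothetical placeholders. */
typedef struct {
    uint64_t a_base; /* base address in the Cortex-A view */
    uint32_t m_base; /* base address in the Cortex-M view */
    uint32_t size;   /* region size in bytes */
} shared_window_t;

static const shared_window_t windows[] = {
    { 0x80000000ull, 0x60000000u, 0x00100000u }, /* 1 MiB of DDR   */
    { 0x2F000000ull, 0x20000000u, 0x00010000u }, /* 64 KiB of SRAM */
};

#define NUM_WINDOWS (sizeof windows / sizeof windows[0])

/* Translate a Cortex-A address into the Cortex-M view.
 * Returns false if the address is outside every shared window. */
bool a_to_m(uint64_t a_addr, uint32_t *m_addr)
{
    for (size_t i = 0; i < NUM_WINDOWS; i++) {
        const shared_window_t *w = &windows[i];
        if (a_addr >= w->a_base && a_addr - w->a_base < w->size) {
            *m_addr = w->m_base + (uint32_t)(a_addr - w->a_base);
            return true;
        }
    }
    return false;
}

/* Translate a Cortex-M address into the Cortex-A view. */
bool m_to_a(uint32_t m_addr, uint64_t *a_addr)
{
    for (size_t i = 0; i < NUM_WINDOWS; i++) {
        const shared_window_t *w = &windows[i];
        if (m_addr >= w->m_base && m_addr - w->m_base < w->size) {
            *a_addr = w->a_base + (m_addr - w->m_base);
            return true;
        }
    }
    return false;
}
```

Pointers passed between the processors would go through such a translation, or be exchanged as offsets from a window base, so that neither side dereferences an address that is only valid in the other view.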
Architectural support for ARM TrustZone
Architectures shown: ARMv6-M, ARMv7-M, ARMv7-A, ARMv8-A, ARMv8-M
Figure: a secure system split into non-trusted (normal) and trusted (Secure) worlds, with trusted software and trusted hardware (crypto, TRNG, secure storage).
TrustZone
§ Isolate trusted resources from non-trusted
§ Isolate non-trusted software
§ Reduce attack surface of key components
TrustZone for ARMv8-A and ARMv8-M
TrustZone for ARMv8-A
§ Non-secure states: Rich OS, e.g. Linux
§ Secure states: Secure app/libs, Secure OS
§ Secure monitor
TrustZone for ARMv8-M
§ Non-secure states: Non-secure app, Non-secure OS
§ Secure states: Secure app/libs, OS support API / Secure OS
§ Secure transitions handled by the processor to maintain real-time latency
TrustZone security using Cortex-M processors
ARMv6-M and ARMv7-M processors
§ Always Secure or always Non-secure
§ Use case dependent
Designer needs to be careful
§ Power management – Secure access only
§ Debug system needs to match security domain for each processor
Figure: example system with a Cortex-A subsystem (shared L2) and Cortex-M subsystems. The System Control Processor (SCP) with power control is always Secure (secure boot); the audio subsystem with audio interface is always Non-secure. Other blocks shown: interconnect, AHB interconnects, local memory, DMC, DDR.
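The "Secure access only" rule for power management can be pictured as a check on a security attribute attached to each bus master. The sketch below is a minimal model of that rule, assuming one always-Secure master (the SCP) and one always-Non-secure master (the audio Cortex-M); the enum names and tables are hypothetical and do not describe any particular interconnect or filter IP.

```c
#include <stdbool.h>

/* Hypothetical bus masters, each fixed to one security domain. */
typedef enum { MASTER_SCP, MASTER_AUDIO_M, MASTER_COUNT } master_t;

/* Hypothetical targets reachable over the interconnect. */
typedef enum { TARGET_POWER_CTRL, TARGET_AUDIO_IF, TARGET_COUNT } target_t;

/* Security domain of each master, fixed at design time. */
static const bool master_is_secure[MASTER_COUNT] = {
    [MASTER_SCP]     = true,  /* always Secure (secure boot) */
    [MASTER_AUDIO_M] = false, /* always Non-secure */
};

/* Targets that accept Secure accesses only. */
static const bool target_secure_only[TARGET_COUNT] = {
    [TARGET_POWER_CTRL] = true,  /* power management: Secure access only */
    [TARGET_AUDIO_IF]   = false,
};

/* A Secure master may reach any target; a Non-secure master may
 * reach only targets that are not marked Secure-only. */
bool access_allowed(master_t m, target_t t)
{
    return master_is_secure[m] || !target_secure_only[t];
}
```

With this table, the audio processor can use the audio interface but is refused at the power controller, while the SCP can reach both.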
TrustZone for ARMv8-M: More flexibility for designers
Figure: a Cortex-A processor and a Cortex-M processor (ARMv8-M), each with Secure and Non-secure states, sharing a secure world (e.g. Cortex-M as a smart DMA engine).
Figure: example system with a Cortex-A subsystem (shared L2) and a Secure/Non-secure Cortex-M subsystem containing an audio subsystem with audio interface and a smart DMA engine. Other blocks shown: interconnect, AHB interconnects, local memory, DMC, DDR.
Topics
Introduction
System design
Software
§ Overview of software challenges
§ Making software development easier
Overview of software challenges
Developer productivity
Usability, portability, debugging
Data sharing
Is coherency necessary?
Task partitioning
How to optimally partition tasks?
Standardization of software interfaces
§ CMSIS adopting OpenAMP
  § Cortex Microcontroller Software Interface Standard (CMSIS)
  § Now open source on GitHub
§ OS support for HMP systems
  § Remote Processor Messaging (RPMsg) for inter-processor communication
  § Management framework using remoteproc
Figure: a Cortex-A processor (rich OS/RTOS) and a Cortex-M processor (RTOS) linked by RPMsg and remoteproc, with the memory subsystem and SRAM.
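To make the idea of inter-processor messaging through shared memory concrete, here is a minimal sketch of a single-producer, single-consumer message ring. It is not the RPMsg or OpenAMP API; the names (`msg_ring_t`, `ring_send`, `ring_receive`) and sizes are hypothetical. It also assumes both processors see the ring coherently, for example because it sits in non-cacheable shared SRAM; a real system would add the cache maintenance and the interrupt or mailbox notification that the platform requires.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define RING_SLOTS 8u   /* hypothetical depth, must be a power of two */
#define MSG_BYTES  32u  /* hypothetical fixed message size */

typedef struct {
    _Atomic uint32_t head; /* advanced only by the producer */
    _Atomic uint32_t tail; /* advanced only by the consumer */
    uint8_t slots[RING_SLOTS][MSG_BYTES];
} msg_ring_t;

void ring_init(msg_ring_t *r)
{
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
}

/* Producer side: copy up to MSG_BYTES into the next free slot.
 * Returns false if the message is too long or the ring is full. */
bool ring_send(msg_ring_t *r, const void *msg, uint32_t len)
{
    if (len > MSG_BYTES)
        return false;
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head - tail == RING_SLOTS)
        return false; /* full */
    uint8_t *slot = r->slots[head % RING_SLOTS];
    memset(slot, 0, MSG_BYTES);
    memcpy(slot, msg, len);
    /* Release: the payload must be visible before the new head is. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

/* Consumer side: copy the oldest message (MSG_BYTES bytes) into out.
 * Returns false if the ring is empty. */
bool ring_receive(msg_ring_t *r, void *out)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (head == tail)
        return false; /* empty */
    memcpy(out, r->slots[tail % RING_SLOTS], MSG_BYTES);
    /* Release: the slot is handed back only after the copy is done. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}
```

Because each index has exactly one writer, no lock is needed. Two such rings, one per direction, would give a bidirectional channel between the rich OS side and the RTOS side.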
Debug Linux and RTOS apps from a single tool
Figure: Cortex-A running the Linux kernel and a Linux application, Cortex-M running an RTOS system and a microcontroller application, with debug connections over JTAG, TCP/IP, and CoreSight.
Debug:
✓ The Cortex-M application
✓ The Cortex-A Linux kernel
✓ The Cortex-A Linux application
The DS-MDK debugger gives complete visibility into all software applications in the heterogeneous system
Performance tuning for HMP systems
§ Quickly discover hot spots in your application
§ Simplify efficient task partitioning on your system
§ Use system-wide views, as bottlenecks are often outside the CPU
Summary
You can build heterogeneous multicore systems today using different Cortex processors and other system IP
ARM architecture enhancements make future HMP systems better
System and software considerations are required in the context of use cases
ARM is working on several activities to make HMP easier
The trademarks featured in this presentation are registered and/or unregistered trademarks of ARM Limited (or its subsidiaries) in the EU and/or elsewhere. All rights reserved. All other marks featured may be trademarks of their respective owners.
Copyright © 2017 ARM Limited