Linaro/UDS plenary Orlando, 03-Nov-2011 David Brash ARM Technology Update
Nov 18, 2014
Agenda
ARMv7-A update
Cortex-A7 announcement
Energy efficient processing
big.LITTLE: Cortex-A15 & Cortex-A7
Eco-system development
The architecture roadmap: ARMv7 => ARMv8
ARMv8-A announcement at TechCon 2011
ARM Cortex-A15 Momentum
Expanding list of ARM Partners with designs in progress
…and 5 other ARM partners
Products expected in 2012
Introducing the Cortex-A7
A highly efficient core for future smartphones
Entry-level, some mainstream workloads
...and more
Redefines mobile computing
big.LITTLE processing model
Figure: Power (vertical axis) against Performance (horizontal axis) for Cortex-A15 and Cortex-A7, marking the lowest and highest operating point of each core and an overdrive condition.
Cortex-A7 is ~1/6th the power, but half the performance, at the nominal operating point
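As a rough worked example of that trade-off, the sketch below estimates the energy the Cortex-A7 would need for a fixed task relative to the Cortex-A15. It assumes that energy is power multiplied by run time, that run time scales inversely with performance, and that only core power matters; none of these simplifications is stated on the slide, and the function name is hypothetical.

```python
from fractions import Fraction


def relative_energy(power_ratio, performance_ratio):
    """Energy for a fixed task on core B relative to core A.

    energy = power * time, and time is assumed to scale as 1 / performance,
    so the energy ratio is power_ratio / performance_ratio.
    """
    return power_ratio / performance_ratio


# Approximate slide figures at the nominal operating point.
a7_power = Fraction(1, 6)        # Cortex-A7 power relative to Cortex-A15
a7_performance = Fraction(1, 2)  # Cortex-A7 performance relative to Cortex-A15

a7_energy = relative_energy(a7_power, a7_performance)
a7_saving = 1 - a7_energy

print(a7_energy)   # 1/3
print(a7_saving)   # 2/3
```

Under these assumptions the task takes twice as long on the Cortex-A7 but uses about a third of the energy.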
Full backward compatibility with Cortex-A processors
Feature set and software compliant with Cortex-A15
Virtualization
Large Address Extensions
Scalable and Extensible
Multi-processor
System Coherency
Small: <0.5 mm² in 28nm process
ARM Cortex-A7 RTL available now
Cortex-A15/7 big.LITTLE Processing
Diagram: a Cortex-A15 MPCore cluster and a Cortex-A7 MPCore cluster, each with its CPUs and an L2 Cache, connected through the CCI-400 Coherent Interconnect, together with Interrupt Control.
Uses the right processor for the right job
Up to 70% energy savings on common workloads
Flexible and transparent to apps – importance of seamless software handover
big: “Demanding tasks”
LITTLE: “Always on, always connected tasks”
Performance and Energy-Efficiency
Cortex-A7 (LITTLE)
Simple, in-order, 8 stage pipeline
Performance better than today’s mainstream, high-volume smartphones
Most energy-efficient applications processor from ARM
Cortex-A15 (big)
Complex, out-of-order, multi-issue pipeline
Up to 5x the performance of today’s mainstream, high-volume smartphones
Highest performance in mobile power envelope
Pipeline diagram labels: Queue, Issue, Integer
big.LITTLE Cluster Migration Mechanics
Migration Stimulus Received
Save State
Normal Operation
Snooping Allowed
Outbound Processor(s): Cluster B
Cache Invalidate
Ready for migration
Switch State (Snoop Outbound Processor)
Inbound Processor(s): Cluster A
Outbound Processor OFF
Stimulus from OS/Virtualizer via system firmware interface
Enable Snooping
Restore State
Normal Operation
Power Down
Power On & Reset
Disable Snooping
Clean Cache
Less than 100 cycles
Less than 20 microseconds. This is the “critical period” where no work is being done on either cluster
Cycle count is OS dependent
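The step labels above can be read as a handover sequence between the two clusters. The sketch below is one possible ordering, reconstructed from the labels alone; the slide's arrows did not survive extraction, so the order, the `Cluster` enum, and the `critical_period` helper are all assumptions. The labels "Switch State (Snoop Outbound Processor)" and "Outbound Processor OFF" are treated as annotations rather than steps.

```python
from enum import Enum


class Cluster(Enum):
    OUTBOUND = "outbound (Cluster B)"
    INBOUND = "inbound (Cluster A)"


# Assumed ordering: the inbound cluster is prepared while the outbound
# cluster keeps working, state is handed over, then the outbound cluster
# is cleaned and powered down.
MIGRATION_SEQUENCE = [
    (Cluster.OUTBOUND, "Normal Operation"),
    (Cluster.OUTBOUND, "Migration Stimulus Received"),
    (Cluster.INBOUND, "Power On & Reset"),
    (Cluster.INBOUND, "Cache Invalidate"),
    (Cluster.INBOUND, "Enable Snooping"),
    (Cluster.INBOUND, "Ready for migration"),
    (Cluster.OUTBOUND, "Save State"),
    (Cluster.INBOUND, "Restore State"),
    (Cluster.INBOUND, "Normal Operation"),
    (Cluster.OUTBOUND, "Snooping Allowed"),
    (Cluster.OUTBOUND, "Clean Cache"),
    (Cluster.OUTBOUND, "Disable Snooping"),
    (Cluster.OUTBOUND, "Power Down"),
]


def critical_period(sequence):
    """Steps during which, in this assumed ordering, neither cluster runs work.

    Taken as everything from the outbound state save up to (not including)
    the inbound cluster resuming normal operation.
    """
    start = sequence.index((Cluster.OUTBOUND, "Save State"))
    end = sequence.index((Cluster.INBOUND, "Normal Operation"))
    return sequence[start:end]


for cluster, step in critical_period(MIGRATION_SEQUENCE):
    print(f"{cluster.value}: {step}")
```

In this reading the critical period covers only the state save and restore, while cache cleaning on the outbound cluster happens after the inbound cluster has resumed work.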
Leading Software Ecosystem
Broad support for Cortex-A processors
100,000s of apps already optimized
Increasing ARM focus on the platform
1TB of physical address space (Cortex-A7/A15 systems) meets a wide spectrum of developer needs
a vehicle for software development and sharing
Linaro key to Linux and other open-source software and tools deployment
Software stack: Virtualization and Firmware, OS, Power Management Software, Applications and Middleware
Many ARMv7-A software developments logically extend into ARMv8-A
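The 1TB figure quoted for Cortex-A7/A15 systems can be checked with a line of arithmetic. The sketch assumes the Large Address Extensions provide a 40-bit physical address, a width the slides do not state; the function name is hypothetical.

```python
def addressable_bytes(address_bits):
    """Number of bytes reachable with a physical address of the given width."""
    return 1 << address_bits


PA_BITS = 40  # assumption: physical address width with Large Address Extensions

size = addressable_bytes(PA_BITS)
print(size // 1024**4)                  # size in TiB
print(size // addressable_bytes(32))    # multiple of a 32-bit (4 GiB) space
```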
Focus for ARM system and software development
Cortex-A15 cluster
Cortex-A7 cluster
Mali graphics support + Memory, IO, debug etc...
Increasing use of “models-first”: processor, memory & IO
Cortex-A15/A7/MALI platform
Block diagram: 2012 Compute Subsystem. Labelled blocks: Cortex-A15 Cluster (CPU 0 to CPU 3, L2 Cache); Cortex-A7 Cluster (CPU 0 to CPU 3, L2 Cache); Mali T600 series GPU (Shader Core 0 to Shader Core 3); L2 Cache; SMMU; Cache Coherent Interconnect (CCI-400); DMC-400 LPDDR2/DDR3 Controller; DDR PHY or DDR Memory; NIC 400 (two instances); AMBA Extensions Interface (Slave); AMBA Extensions Interface (Master); CoreSight Debug & Trace; JTAG & Trace; System Power; Resources Mgt; PMIC/APB Bus; On-Chip Memories (RAM, ROM); Base Peripheral.
ARMv8-A (announced 27-Oct-2011)
What is ARMv8?
Next version of the ARM architecture
First release covers the Applications profile only: ARMv8-A
Addition of a 64-bit operating capability
Introduction of new 64-bit execution state – AArch64
Maintain low power heritage – critique features against PPA* impact
ARMv7-A compatibility a critical consideration – AArch32
Interprocessing: defined relationship between 32- and 64-bit execution
Maintain ARMv7-A (AArch32) momentum alongside AArch64
Strong compatibility plus ongoing evolution
*PPA: Power Performance Area
ARMv8-A – Context
• ARMv8
• A-profile only (at this time)
• 64-bit architecture support
AArch64 - registers
X0 X8 X16 X24
X1 X9 X17 X25
X2 X10 X18 X26
X3 X11 X19 X27
X4 X12 X20 X28
X5 X13 X21 X29
X6 X14 X22 X30*
X7 X15 X23
                                        EL0      EL1        EL2        EL3
Stack Ptr                               SP_EL0   SP_EL1     SP_EL2     SP_EL3     (PC)
Exception Link Register                          ELR_EL1    ELR_EL2    ELR_EL3
Saved/Current Process Status Register            SPSR_EL1   SPSR_EL2   SPSR_EL3   (CPSR)
* X30: procedure LR
V0 V8 V16 V24
V1 V9 V17 V25
V2 V10 V18 V26
V3 V11 V19 V27
V4 V12 V20 V28
V5 V13 V21 V29
V6 V14 V22 V30
V7 V15 V23 V31
64-bit registers
{32-bit SP, 64-bit DP} scalar FP / 128-bit vectors
Exception model overview
Diagram: exception levels EL0 to EL3 for AArch32 and AArch64, each shown for Secure and Non-secure state.
AArch64: EL0; EL1h, EL1t; EL2h, EL2t; EL3h, EL3t ('h'andler & 't'hread stack options)
AArch32 (ARMv7-A compatibility): User; Svc, Abt, Und, FIQ, IRQ, Sys; Hyp; Mon
IF EL3 is 64-bit: Svc, Abt, Und, FIQ, IRQ, Sys
IF EL3 is 32-bit: Svc, Abt, Und, FIQ, IRQ, Sys, Mon
Interprocessing:
• EL3: Secure Monitor => EL2: Hypervisor => EL1: OS => EL0: Application
• AArch64 → AArch32 transition can occur on a transition down the hierarchy (EL3 → EL0)
• AArch32 → AArch64 transition can occur on a transition up the hierarchy (EL0 → EL3)
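One consequence of the two transition rules above is that an AArch64 level never sits beneath an AArch32 level. The sketch below checks a proposed assignment of execution states to exception levels against that reading; the function name and the dictionary layout are hypothetical, and this is an illustration of the rule rather than anything taken from the specification text.

```python
EL_ORDER = ["EL3", "EL2", "EL1", "EL0"]  # highest to lowest privilege
VALID_STATES = {"AArch64", "AArch32"}


def is_valid_assignment(states):
    """Check an {exception level: execution state} mapping.

    Going down the hierarchy a level may switch from AArch64 to AArch32,
    but never from AArch32 back to AArch64. Levels missing from the
    mapping are treated as not implemented and skipped.
    """
    seen_aarch32 = False
    for level in EL_ORDER:
        state = states.get(level)
        if state is None:
            continue
        if state not in VALID_STATES:
            raise ValueError(f"unknown execution state: {state!r}")
        if state == "AArch32":
            seen_aarch32 = True
        elif seen_aarch32:
            return False  # AArch64 below an AArch32 level
    return True


# A 64-bit monitor, hypervisor and OS running a 32-bit application.
print(is_valid_assignment(
    {"EL3": "AArch64", "EL2": "AArch64", "EL1": "AArch64", "EL0": "AArch32"}))

# A 32-bit OS cannot host a 64-bit application under this rule.
print(is_valid_assignment({"EL1": "AArch32", "EL0": "AArch64"}))
```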
Interprocessing & AArch32 save/restore
AArch32 register diagram: R0 to R12, R13 (SP), R14 (LR)
Banked registers: SP_svc, LR_svc, SP_irq, LR_irq, SP_und, LR_und, SP_abt, LR_abt, SP_fiq, LR_fiq, R8_fiq to R12_fiq, SP_hyp, SP_mon, LR_mon
AArch64   AArch32
X0        R0
X1        R1
X2        R2
X3        R3
X4        R4
X5        R5
X6        R6
X7        R7
X8        R8_usr
X9        R9_usr
X10       R10_usr
X11       R11_usr
X12       R12_usr
X13       R13_usr
X14       R14_usr
X15       R13_hyp
X16       R14_irq
X17       R13_irq
X18       R14_svc
X19       R13_svc
X20       R14_abt
X21       R13_abt
X22       R14_und
X23       R13_und
X24       R8_fiq
X25       R9_fiq
X26       R10_fiq
X27       R11_fiq
X28       R12_fiq
X29       R13_fiq
X30       R14_fiq
AArch32: PC, A/CPSR, SPSR_svc, SPSR_abt, SPSR_und, SPSR_irq, SPSR_fiq, SPSR_hyp, SPSR_mon, ELR_hyp
AArch64: PC, PSTATE, SP_EL0, SP_EL1-3, ELR_EL1-3, SPSR_EL1-3
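The X-register to AArch32-register correspondence on this slide is regular enough to generate. The sketch below rebuilds it as a lookup table; the function names are hypothetical, and the `_usr` suffix is a normalised spelling of the slide's "R8usr" style labels.

```python
def build_register_map():
    """Map each AArch64 register X0..X30 to the AArch32 register it holds,
    following the pairs listed on the slide."""
    mapping = {}
    for n in range(8):                       # X0..X7: unbanked R0..R7
        mapping[f"X{n}"] = f"R{n}"
    for n in range(8, 15):                   # X8..X14: user-mode R8..R14
        mapping[f"X{n}"] = f"R{n}_usr"
    mapping["X15"] = "R13_hyp"
    for i, mode in enumerate(["irq", "svc", "abt", "und"]):
        mapping[f"X{16 + 2 * i}"] = f"R14_{mode}"   # banked LR
        mapping[f"X{17 + 2 * i}"] = f"R13_{mode}"   # banked SP
    for n in range(8, 14):                   # X24..X29: FIQ R8..R13
        mapping[f"X{16 + n}"] = f"R{n}_fiq"
    mapping["X30"] = "R14_fiq"
    return mapping


def invert(mapping):
    """AArch32 register name -> AArch64 register that holds it."""
    return {r32: x for x, r32 in mapping.items()}


X_TO_R = build_register_map()
R_TO_X = invert(X_TO_R)

print(X_TO_R["X19"])      # R13_svc
print(R_TO_X["R14_fiq"])  # X30
```

A table like this is the kind of thing a 64-bit hypervisor or OS could consult when saving and restoring the state of a 32-bit guest, though the slides do not describe any particular implementation.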
Summary
Cortex-A7 a highly efficient application processor
Cortex-A7 enables big.LITTLE Processing to expand performance and battery life
Seamless and transparent to application software
ARM increasing its platform software investments
A catalyst for many activities
The ARM architecture roadmap is now clearer
ARMv8-A architecture development is well advanced (Specification release expected 2H-2012)