Technische Universiteit Eindhoven
Department of Electrical Engineering, Electronic Systems
Platform-based Design 5KK70
MPSoC: Controlling the Parallel Resources
Bart Mesman & Henk Corporaal
Contents
• GPUs
• PicoChip
• Real-Time Scheduling basics
• Resource Management
GPU basics
• Synthetic objects are represented as a set of triangles (3D) plus texture, in a language/library like OpenGL or DirectX
• Triangles are represented with 3 vertices
• A vertex is represented with 4 coordinates in floating-point precision
• Objects are transformed between coordinate representations
• Transformations are matrix-vector multiplications
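As a small illustration of the last two points, the sketch below moves a triangle with a single 4x4 matrix applied to each 4-coordinate vertex. It is plain Python rather than OpenGL or DirectX; the helper names `mat_vec` and `translation` and the numbers are assumptions made for this example only.

```python
def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of 4 rows) by a 4-component vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def translation(tx, ty, tz):
    """Homogeneous 4x4 matrix that shifts a point by (tx, ty, tz)."""
    return [
        [1.0, 0.0, 0.0, tx],
        [0.0, 1.0, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


# A triangle: 3 vertices, each with 4 coordinates (x, y, z, w); w = 1 marks a point.
triangle = [
    [0.0, 0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 1.0],
]

# Transforming the object costs one matrix-vector multiplication per vertex.
moved = [mat_vec(translation(2.0, 3.0, 4.0), v) for v in triangle]
print(moved)
```

Every vertex is processed independently with the same matrix, which is why this kind of work maps well onto many parallel SIMD processors.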
GPU DirectX 10 pipeline
NVIDIA GeForce 6800 3D Pipeline
GeForce 8800 GPU
330 Gflops, 128 processors with 4-way SIMD
GPU: Why more general-purpose programmable?
• All transformations are shading
• Shading is all matrix-vector multiplications
• Computational load varies heavily between different sorts of shading
• Programmable shaders allow dynamic resource allocation between shaders
Result: Modern GPUs are serious competitors for general-purpose processors!
Pico Chip
Fault-Tolerance
Real-time systems (Reinder Bril)
• Correct result at the right time: timeliness
• Many products contain embedded computers, e.g. cars, planes, medical and consumer electronics equipment, industrial control.
• In such systems, it’s important to deliver correct functionality on time.
• Example: inflation of an air bag
Example: Multimedia Consumer Terminals
[Figure: multimedia consumer terminal with its interfaces: DVD/CDx front end, YC interface, IEEE 1394 interface, DVB tuner, cable modem, CVBS interface, VGA, RF tuner (by courtesy of Maria Gabrani)]
Example: High quality video & real time
• TV companies invest heavily in video enhancement, e.g. temporal up-scaling
• Input stream: 24 Hz (movie)
• Rendered stream: 60 Hz (TV screen)
[Figure: original, up-scaled and displayed frame sequences]
• Deadline miss leads to “wrong” picture.
• Deadline misses tend to come in bursts (heavy load).
• Valuable work may be lost.
Real-time systems
• Guaranteeing timeliness requirements: real-time tasks with timing constraints; scheduling of tasks
• Fixed-priority scheduling (FPS) is the de facto standard for scheduling in real-time systems.
• FPS: supported by commercially available RTOS; analytic and synthetic methods.
Recap of FPS
• Fixed Priority Pre-emptive Scheduling (FPPS)
• A basic scheduling model
• Analysis
• Example
• Optimality of RMS and DMS
FPPS: A basic scheduling model
• Single processor
• Set of n independent, periodic tasks τ1, …, τn
• Tasks are assigned fixed priorities, and can be pre-empted instantaneously.
• Scheduling: At any moment in time, the processor is used to execute the highest priority task that has work pending.
• Task characteristics: period T, (worst-case) computation time C, (relative) deadline D
• Assumptions: non-idling; context switching and scheduling overhead is ignored; execution of releases in order of arrival; deadlines are hard, and D ≤ T; τ1 has highest and τn has lowest priority; no data-dependencies between tasks
FPPS: Example
• Worst-case response time WR for task τ3: first point in time that τ1, τ2, and τ3 are finished
[Figure: time line (time axis 0 to 60) of Task 1, Task 2 and Task 3, annotated with WR1 = 3, WR2 = 17, WR3 = 56]
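FPPS: Analysis
• Schedulable iff: $WR_i \le D_i$ for $1 \le i \le n$
• Necessary condition:
$$U = \sum_{1 \le i \le n} \frac{C_i}{T_i} \le 1$$
• Sufficient condition for RMS: $U \le LL(n) = n(2^{1/n} - 1)$, where RMS means that task i has higher priority than task j iff $T_i < T_j$, and $D_i = T_i$.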
FPPS: Analysis
• Otherwise, i.e. $U \le 1$ and not RMS, or $n(2^{1/n} - 1) < U < 1$ and RMS: exact condition
• Critical instant: simultaneous release of task i with all higher priority tasks
• $WR_i$ is the smallest positive solution of
$$x = C_i + \sum_{j < i} \left\lceil \frac{x}{T_j} \right\rceil C_j$$
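The worst-case response time defined here can be found by fixed-point iteration: start from x = C_i and repeatedly add the interference ceil(x / T_j) * C_j of every higher-priority task j until x no longer changes. The sketch below assumes integer task parameters and tasks listed in priority order (index 0 highest). The function name and the early exit once x exceeds the deadline are choices made for this example, not part of the slides.

```python
def worst_case_response_time(i, T, C, D):
    """Smallest positive x with x = C[i] + sum_{j<i} ceil(x / T[j]) * C[j].

    T, C, D are lists of periods, computation times and deadlines,
    ordered by priority (index 0 = highest). Returns None when the
    iteration exceeds D[i], i.e. the deadline is missed.
    """
    x = C[i]
    while x <= D[i]:
        # -(-x // t) is the integer ceiling of x / t.
        nxt = C[i] + sum(-(-x // T[j]) * C[j] for j in range(i))
        if nxt == x:          # fixed point reached
            return x
        x = nxt
    return None


# Three-task example from these slides: T = 10, 19, 56 and C = 3, 11, 5, with D = T.
T = [10, 19, 56]
C = [3, 11, 5]
print([worst_case_response_time(i, T, C, T) for i in range(3)])
```

For this task set the iteration yields 3, 17 and 56, which matches the values WR1, WR2 and WR3 shown in the example time line.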
FPPS: Example
Task set Γ consisting of 3 tasks:
Task   Period T   Computation time C   Utilization U
τ1     10         3                    0.3
τ2     19         11                   0.58
τ3     56         5                    0.09
Notes:
• RM priority assignment and Di = Ti (RMS);
• U1 + U2 + U3 = 0.97 ≤ 1, hence Γ could be schedulable;
• Utilization bound: $U(n) \le LL(n) = n(2^{1/n} - 1)$: U1 + U2 = 0.88 > LL(2) ≈ 0.83, therefore U(3) > LL(3), hence another test required.
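The two utilization checks on this task set can be reproduced with a few lines of Python. This is a minimal sketch: the function names and the three result strings are assumptions for illustration, and tasks are given as (T, C) pairs.

```python
def utilization(tasks):
    """Total utilization U = sum of C / T over all (T, C) pairs."""
    return sum(c / t for t, c in tasks)


def ll_bound(n):
    """Liu and Layland bound LL(n) = n * (2^(1/n) - 1)."""
    return n * (2 ** (1 / n) - 1)


def utilization_test(tasks):
    """Classify a task set under RMS using only its utilization."""
    u = utilization(tasks)
    if u > 1:
        return "not schedulable"      # necessary condition violated
    if u <= ll_bound(len(tasks)):
        return "schedulable"          # sufficient condition holds
    return "inconclusive"             # an exact test is required


tasks = [(10, 3), (19, 11), (56, 5)]   # (period T, computation time C)
print(round(utilization(tasks), 2), round(ll_bound(3), 2), utilization_test(tasks))
```

For the example, U is about 0.97 and LL(3) is about 0.78, so the utilization test alone cannot decide and the exact response-time condition is needed.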
FPPS: Example
[Figure: time line (time axis 0 to 60) of Task 1, Task 2 and Task 3, annotated with WR1 = 3, WR2 = 17, WR3 = 56]
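The time line can be reproduced with a small discrete-time simulation of FPPS: in every time unit, the highest-priority task with pending work runs. The sketch below assumes integer parameters, a simultaneous release of all tasks at time 0, and that the first job of each task finishes before that task's next release (true for this example). The function name and the returned data layout are hypothetical.

```python
def simulate_fpps(tasks, horizon):
    """Simulate FPPS in unit time steps.

    tasks: list of (period, computation time), highest priority first.
    Returns (first_response, schedule): the finishing time of each
    task's first job, and the index of the task running in each time
    unit (None when the processor idles).
    """
    n = len(tasks)
    remaining = [0] * n               # pending work per task
    first_response = [None] * n
    schedule = []
    for t in range(horizon):
        for i, (period, comp) in enumerate(tasks):
            if t % period == 0:       # periodic release
                remaining[i] += comp
        # Highest-priority task with work pending pre-empts all others.
        running = next((i for i in range(n) if remaining[i] > 0), None)
        schedule.append(running)
        if running is not None:
            remaining[running] -= 1
            if remaining[running] == 0 and first_response[running] is None:
                first_response[running] = t + 1
    return first_response, schedule


first_response, schedule = simulate_fpps([(10, 3), (19, 11), (56, 5)], 60)
print(first_response)
```

Under these assumptions the first jobs finish at times 3, 17 and 56, the same values as WR1, WR2 and WR3 in the figure.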
FPPS: Optimality of RMS and DMS
• Priority assignment policies:
  • Rate Monotonic (RM): task i has higher priority than task j iff Ti < Tj
  • Deadline Monotonic (DM): task i has higher priority than task j iff Di < Dj
• Under arbitrary phasing:
  • RMS is optimal among FPS when Di = Ti;
  • DMS is optimal among FPS when Di ≤ Ti,
where optimal means: if an FPS algorithm can schedule the task set, so can RMS/DMS.
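Both policies amount to sorting the task set: by period for RM, by relative deadline for DM. A minimal sketch follows; the `Task` record and the two sample tasks are made up for illustration.

```python
from typing import NamedTuple


class Task(NamedTuple):
    name: str
    T: int  # period
    C: int  # worst-case computation time
    D: int  # relative deadline


def rate_monotonic(tasks):
    """Highest priority first: shorter period means higher priority."""
    return sorted(tasks, key=lambda task: task.T)


def deadline_monotonic(tasks):
    """Highest priority first: shorter deadline means higher priority."""
    return sorted(tasks, key=lambda task: task.D)


tasks = [Task("a", T=20, C=2, D=5), Task("b", T=10, C=2, D=10)]
print([t.name for t in rate_monotonic(tasks)])
print([t.name for t in deadline_monotonic(tasks)])
```

The two orders differ here because task a has the longer period but the shorter deadline; when Di = Ti for all tasks, RM and DM produce the same assignment.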
Non-Preemptive Systems (Akash Kumar)
• State-space needed is smaller
• Lower implementation cost
• Less overhead at run-time
• Cache pollution, memory size
Why FPS doesn’t work for “future” high-performance platforms
• Heavy-duty DSPs: preemption is not supported, and if it were, the context-switching overhead would be significant
• Data-dependencies not taken into account
• Multi-processor
Related Research – Feasibility Analysis
[Table: feasibility analysis results, classified as preemptive vs. non-preemptive and homogeneous vs. heterogeneous MPSoC; entries: [Liu, Layland, 1973], [Jeffay, 1991], [Baruah, 2006], [ , 2020??]]
[Figure: actors A, B, C, D and processors P1 to P6]
Unpredictability – Variation in Execution Time
[Figure: schedules of applications A and B on processors P1, P2, P3 over time t0 to t3 (steady state), shown once with execution times of 50 and once with execution times of 49]
Problem
• No good techniques exist to analyze and schedule applications on non-preemptive heterogeneous systems
• Resource Manager proposed to schedule applications such that they meet their performance requirements on non-preemptive heterogeneous systems
Our Assumptions
• Heterogeneous MPSoC
• Applications modeled as SDF
• Non-preemptive system: tasks cannot be stopped; jobs can be suspended
• Lot of dynamism in the system: jobs arriving and leaving at run-time; variation in execution time
• Very simple arbiter at cores
[Figure: actors A2, B2, C2, D2, with labels Job and Task]
Resource Manager
• Application QoS Manager: application level, few sec
• Resource Manager: reconfigure to meet above quality, milliseconds
• Local Processor Arbiter: task level, micro sec
[Figure: core with actors A and B]
Architecture Description
• Computation resources available are described
• Each processor can have a different arbiter
• In this model a First Come First Serve mechanism is used
• Resource manager can configure/control the local arbiters to regulate the progress of an application if needed
[Figure: Resource Manager and processors P1, P2, P3 with Local Arbiters]
Resource Manager
• Responsible for two main things
• Admission control:
  • Incoming application specifies throughput requirement
  • Execution-time and mapping of each actor
  • Repetition vector used to compute expected utilization
  • RM checks if enough resources present
  • Allocates resources to applications if admitted
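The admission steps listed above can be sketched as follows. The slides do not give the exact formula, so this example assumes that an actor's expected utilization of its processor is throughput × repetition count × execution time, and that each processor has a capacity of 1.0. All names and the dictionary layout are hypothetical.

```python
def expected_utilization(app):
    """Expected load per processor for one application.

    Assumption: load = throughput (iterations per time unit)
                       * repetitions (from the repetition vector)
                       * execution time of the actor.
    """
    load = {}
    for actor in app["actors"]:
        demand = app["throughput"] * actor["repetitions"] * actor["exec_time"]
        load[actor["processor"]] = load.get(actor["processor"], 0.0) + demand
    return load


def admit(app, allocated, capacity=1.0):
    """Admit app only if every processor it uses stays within capacity."""
    load = expected_utilization(app)
    if any(allocated.get(p, 0.0) + u > capacity for p, u in load.items()):
        return False                       # resource requirement exceeded
    for p, u in load.items():              # allocate resources on admission
        allocated[p] = allocated.get(p, 0.0) + u
    return True


allocated = {}
app1 = {"throughput": 0.01, "actors": [
    {"processor": "P1", "repetitions": 1, "exec_time": 30},
    {"processor": "P2", "repetitions": 2, "exec_time": 20},
]}
app2 = {"throughput": 0.02, "actors": [
    {"processor": "P1", "repetitions": 1, "exec_time": 40},
]}
print(admit(app1, allocated), admit(app2, allocated))
```

In this made-up scenario the first application is admitted (loads of about 0.3 on P1 and 0.4 on P2), while the second is rejected because it would push P1 beyond its capacity.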
Admission Control
[Figure: applications Typing SMS, Video Conf and Play MP3 on processors P1, P2, P3; resource requirement exceeded!]
Resource Manager
• Admission control
• Budget enforcement:
  • When running, each application signals RM when it completes an iteration
  • RM keeps track of each application’s progress
  • Operation modes: ‘Polling’ mode, ‘Interrupt’ mode
  • Suspends application if needed
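A minimal sketch of one ‘polling’ step, under the assumption that the RM compares each application's achieved throughput (completed iterations divided by elapsed time) with its required throughput and suspends applications that run ahead. The function name, the decision strings and the numbers are illustrative and not taken from the slides.

```python
def poll(completed, required, elapsed):
    """Decide per application whether to suspend it until the next poll.

    completed: iterations signalled to the RM so far, per application
    required:  required throughput (iterations per time unit)
    elapsed:   time since the applications started
    """
    decisions = {}
    for app, iterations in completed.items():
        achieved = iterations / elapsed
        # An application doing better than required gives up its
        # resources; one at or below its requirement keeps running.
        decisions[app] = "suspend" if achieved > required[app] else "run"
    return decisions


decisions = poll(
    completed={"H263": 12, "JPEG": 4},
    required={"H263": 0.1, "JPEG": 0.05},
    elapsed=100,
)
print(decisions)
```

A shorter polling interval lets the RM react sooner to applications running ahead or falling behind, at the cost of invoking the RM more often.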
Budget Enforcement (Polling)
[Figure: Resource Manager in polling mode, annotated with the events: new job enters; performance goes down; better than required; job suspended; job resumed]
Experiments
• A high-level simulation model developed
• POOSL, a parallel simulation language, used
• A protocol for communication defined
• System verified with a number of application SDF models
• Case study done with H263 and JPEG application models
• Impact of varying ‘polling’ interval studied
Performance without Resource Manager
Performance with RM – I (2.5m cycles)
Performance with RM – II (500k cycles)