May 10, 2015
A Software Framework for Embedded Multi-Core Systems
Taehyo Song
Real-Time Operating Systems Laboratory,
Seoul National University, Korea
2008-09-29

Outline
Introduction
Problem Definition and Solution Overview
Framework for Embedded Multi-Core Systems
Experimental Evaluation (not yet done)
Conclusion

Motivation (1)
Multi-core systems are spreading into the embedded domain
• Higher clock rates are hard to achieve because of heat and power consumption problems
• Power is a very important factor in embedded systems
Introduction
Figure: Multi-core performance and power scenario (performance and power of three configurations)
1. Standard single-core processor, over-clocked 20%
2. Standard single-core processor
3. Dual-core processor, each core under-clocked 20%

Motivation (2)
Developers need to be prepared
• Previous development methods cannot take advantage of multiple cores
• Multi-threading is needed to take advantage of them, but it brings:
  • A steep learning curve for multi-threaded programming
  • Deadlocks and complex debugging scenarios
  • Lengthy development time
Figure: Gap between two classes of software developers

Outline
Introduction
Problem Definition and Solution Overview
• Problems
• Problem Definition
• Solution Overview
Framework for Embedded Multi-Core Systems
Experimental Evaluation
Conclusion

Problems (1)
Existing technologies such as OpenMP (OMP), MPI, and TBB support the development of parallel software
• They cannot be applied directly to embedded systems because of resource restrictions
Limitations of automatic parallelization [1]
Problem Definition and Solution Overview
[1] Zhao-Hui Du, Chu-Cheow Lim, Xiao-Feng Li et al., “A Cost-Driven Compilation Framework for Speculative Parallelization of Sequential Programs”, PLDI, 2004
Figure legend: Basic compilation, Current best compilation, Enabling optimization

Problems (2)
Parallel design patterns can help developers
• Learning design patterns, and particularly applying them, is not an easy task [2]
• Converting patterns into implementation code is not straightforward, and without proper guidance and coaching the process can create problems [3]
[2] Masita Abdul Jalil, Shahrul Azman Mohd Noah, “The Difficulties of Using Design Patterns among Novices: An Exploratory Study”, 2007
[3] Ó Cinnéide, M. & Tynan, R., “A Problem-Based Approach to Teaching Design Patterns”, 2004

Problem Definition
How can embedded software that runs on a multi-core system be developed easily, in terms of development time, and effectively, in terms of performance?

Solution Overview
Use design patterns efficiently through a framework
• Reconstruct existing parallel design patterns to suit the framework and add them to it

Outline
Introduction
Problem Definition and Solution Overview
Framework for Embedded Multi-Core Systems
• Embedded System Framework
• Overview of the Apparatus Framework
• Analysis of the Apparatus Framework
• Proposed Framework
Experimental Evaluation
Conclusion

Proposed Framework
Framework for Embedded Multi-Core Systems
• Modify the modules of the Apparatus Framework to suit multi-core systems
• Improve existing parallel design patterns, reconstruct them to suit the framework, and add them to the Apparatus Framework
Framework for Embedded Multi-Core Systems
Figure: architecture diagram of the proposed framework (labels: Hardware; BSP; eCos, OSE RTOS, Linux, uC/OS; Kernel; Pattern; Multi-Core Pattern; RT_STL; Application)

Analysis of Apparatus Framework (1)
Several features are applicable to multi-threaded programming
Inter-task Communication
• Communication between tasks is generally asynchronous
Messages
• A task reads messages and reacts according to the type and content of each message
Resource Protection
• A set of mechanisms serializes access to shared resources: semaphores, mutexes, monitors, and guards
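
The message-driven style described above can be sketched as follows. This is a minimal illustration in standard C++, not the Apparatus Framework API: `Mailbox`, `Message`, `MsgType`, and `run_receiver_demo` are hypothetical names, and a mutex plus condition variable stand in for the framework's own resource protection mechanisms.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical message: a type tag plus a payload.
enum class MsgType { Data, Stop };
struct Message { MsgType type; int payload; };

// Mailbox: senders post and return immediately (asynchronous communication).
class Mailbox {
public:
    void post(const Message& m) {
        { std::lock_guard<std::mutex> guard(mu_); q_.push(m); }
        cv_.notify_one();
    }
    Message receive() {  // blocks until a message is available
        std::unique_lock<std::mutex> lk(mu_);
        cv_.wait(lk, [this] { return !q_.empty(); });
        Message m = q_.front();
        q_.pop();
        return m;
    }
private:
    std::mutex mu_;               // serializes access to the shared queue
    std::condition_variable cv_;
    std::queue<Message> q_;
};

// A receiver task reads messages and reacts according to type and content.
int run_receiver_demo() {
    Mailbox box;
    int sum = 0;
    std::thread receiver([&] {
        for (;;) {
            Message m = box.receive();
            if (m.type == MsgType::Stop) break;  // react to the type
            sum += m.payload;                    // react to the content
        }
    });
    for (int i = 1; i <= 4; ++i) box.post({MsgType::Data, i});
    box.post({MsgType::Stop, 0});
    receiver.join();  // after join, reading sum is safe
    return sum;
}
```

Calling `run_receiver_demo()` from `main` posts four data messages and one stop message; the sender never waits for the receiver to process them.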

Analysis of Apparatus Framework (2)
The Apparatus Framework is not designed for multi-core systems
Modules are not thread-safe
• Queue, State Machine, Device Accessing, Communication, etc.
The Pattern service does not provide patterns used for multi-core systems
• Task parallelization patterns
• Data parallelization patterns

Thread Safety Modules
The Com, Device, RT-STL, and Pattern modules are unsafe under concurrent operations
• Concurrent modifications could corrupt them
Solution
• Wrap a lock around container accesses
• This causes no overhead when a single thread is used, since the lock can be removed at development time using the configuration tool
Modules Modification of Apparatus Framework
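
One way to wrap a lock around container accesses while keeping it removable at build time is a lock policy chosen by configuration. The sketch below is an assumption about how this could look, not the framework's actual mechanism: `SafeQueue`, `NullLock`, and the `APP_SINGLE_THREADED` macro are hypothetical names standing in for whatever the configuration tool generates.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// No-op lock: satisfies the lock()/unlock() interface but does nothing.
struct NullLock {
    void lock() {}
    void unlock() {}
};

// Hypothetical configuration switch: single-threaded builds drop the mutex.
#ifdef APP_SINGLE_THREADED
using ConfiguredLock = NullLock;
#else
using ConfiguredLock = std::mutex;
#endif

// Queue whose every access is wrapped in the configured lock.
template <typename T, typename Lock = ConfiguredLock>
class SafeQueue {
public:
    void push(const T& v) {
        std::lock_guard<Lock> guard(lock_);
        q_.push(v);
    }
    bool try_pop(T& out) {
        std::lock_guard<Lock> guard(lock_);
        if (q_.empty()) return false;
        out = q_.front();
        q_.pop();
        return true;
    }
    std::size_t size() const {
        std::lock_guard<Lock> guard(lock_);
        return q_.size();
    }
private:
    mutable Lock lock_;
    std::queue<T> q_;
};

// Four threads push concurrently; with a real mutex no element is lost.
std::size_t concurrent_push_demo() {
    SafeQueue<int, std::mutex> q;
    std::vector<std::thread> workers;
    for (int t = 0; t < 4; ++t)
        workers.emplace_back([&q] { for (int i = 0; i < 1000; ++i) q.push(i); });
    for (auto& w : workers) w.join();
    return q.size();
}
```

With `NullLock` the guard compiles down to empty inline calls, which is the sense in which a single-threaded configuration pays no locking overhead.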

Types of Parallelization
Parallelization generally falls between two poles
Task Parallelization
• Several nodes may execute different algorithms on the same data source, or alternatively on several data sources [4]
Data Parallelization
• A given data source is processed using the same technique on each node in the parallel architecture [5]
Parallel Design Patterns to Framework
[4] Akram Hameed, “Parallelization of the AAE algorithm”, 2007
[5] Wilkinson, B., Allen, M., “Parallel Programming: Techniques and Applications Using Networked Workstations and Parallel Computers”, Prentice-Hall, 1998

Task Parallelization Patterns
Task Parallel Model
• The program is split into a number of tasks
• Each task is assigned to a specific core
Patterns
• Thread Pool
• Fork/Join Pattern
• Pipeline Pattern

Thread Pool Pattern (1)
Thread Pool
Allocate a number of threads to the thread pool
• N threads are created to perform M tasks (N << M)
• When a thread completes its task, it requests the next task from the queue, until all tasks have been completed
• The thread can then terminate, or sleep until new tasks are available
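
The behaviour listed above can be sketched with standard C++ threads. This is a minimal illustration under assumptions, not the framework's ThreadPoolManager: the class name `ThreadPool`, its `submit` method, and `run_pool_demo` are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

class ThreadPool {
public:
    explicit ThreadPool(std::size_t n) {           // create N worker threads
        for (std::size_t i = 0; i < n; ++i)
            workers_.emplace_back([this] { work(); });
    }
    ~ThreadPool() {                                // drain the queue, then stop
        { std::lock_guard<std::mutex> guard(mu_); stopping_ = true; }
        cv_.notify_all();
        for (auto& w : workers_) w.join();
    }
    void submit(std::function<void()> task) {      // enqueue one of the M tasks
        { std::lock_guard<std::mutex> guard(mu_); tasks_.push(std::move(task)); }
        cv_.notify_one();
    }
private:
    void work() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lk(mu_);
                // Sleep until a task is available or the pool is stopping.
                cv_.wait(lk, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;        // stopping and nothing left
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();                                // run outside the lock
        }
    }
    std::mutex mu_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> tasks_;
    std::vector<std::thread> workers_;
    bool stopping_ = false;
};

// Runs m_tasks trivial tasks on n_threads workers; returns how many completed.
int run_pool_demo(std::size_t n_threads, int m_tasks) {
    std::atomic<int> done{0};
    {
        ThreadPool pool(n_threads);
        for (int i = 0; i < m_tasks; ++i) pool.submit([&done] { ++done; });
    }  // the destructor waits for all queued tasks to finish
    return done.load();
}
```

For example, `run_pool_demo(4, 100)` executes 100 tasks on 4 threads, matching the N << M setting of the slide.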

Thread Pool Pattern (2)
Conceptual Diagram (labels: Task Queue holding Task 1 to Task M, Thread Pool, Thread, Completed Tasks)

Thread Pool Pattern (3)
Hot Spot
• Tasks that will be assigned to threads
• Scheduling policy of threads
Class Diagram (classes and operations)
• ThreadPoolManager: +Initialize(), +AssignThread(), +RemoveThread()
• Client
• Task: +runBody(), +virtual run()
• ConcreteTask: +run()
• ApparatusTask: +virtual runBody()
• SchedulingPolicy: +Policy(), with variants SP1, SP2, …, UDSP
• Queue
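
The two hot spots of the thread pool pattern, the task body and the scheduling policy, could be sketched as below. The inheritance chain ApparatusTask, Task, ConcreteTask is inferred from the operation names in the class diagram and is an assumption; `pick`, `FifoPolicy`, `HighestPriorityPolicy`, `LogTask`, and the priority field are hypothetical, and the demo runs tasks sequentially only to show the order a policy chooses.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Assumed framework base class: the framework invokes runBody().
class ApparatusTask {
public:
    virtual ~ApparatusTask() = default;
    virtual void runBody() = 0;
};

// Pattern-level task: fixes runBody() and exposes run() as the hot spot.
class Task : public ApparatusTask {
public:
    explicit Task(int priority) : priority_(priority) {}
    void runBody() override { run(); }
    virtual void run() = 0;
    int priority() const { return priority_; }
private:
    int priority_;
};

// Second hot spot: which ready task goes to the next free thread.
class SchedulingPolicy {
public:
    virtual ~SchedulingPolicy() = default;
    virtual std::size_t pick(const std::vector<Task*>& ready) const = 0;
};

class FifoPolicy : public SchedulingPolicy {
public:
    std::size_t pick(const std::vector<Task*>&) const override { return 0; }
};

class HighestPriorityPolicy : public SchedulingPolicy {  // a user-defined policy
public:
    std::size_t pick(const std::vector<Task*>& ready) const override {
        std::size_t best = 0;
        for (std::size_t i = 1; i < ready.size(); ++i)
            if (ready[i]->priority() > ready[best]->priority()) best = i;
        return best;
    }
};

// Application task (the ConcreteTask role): records its name when run.
class LogTask : public Task {
public:
    LogTask(std::string name, int priority, std::vector<std::string>& log)
        : Task(priority), name_(std::move(name)), log_(log) {}
    void run() override { log_.push_back(name_); }
private:
    std::string name_;
    std::vector<std::string>& log_;
};

// Runs three tasks one after another in the order chosen by the policy.
std::vector<std::string> run_in_policy_order(const SchedulingPolicy& policy) {
    std::vector<std::string> log;
    LogTask a("a", 1, log), b("b", 3, log), c("c", 2, log);
    std::vector<Task*> ready = {&a, &b, &c};
    while (!ready.empty()) {
        std::size_t i = policy.pick(ready);
        ready[i]->runBody();
        ready.erase(ready.begin() + static_cast<std::ptrdiff_t>(i));
    }
    return log;
}
```

Swapping the policy object changes the execution order without touching the task classes, which is the point of keeping the policy as a separate hot spot.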

Fork/Join Pattern (1)
Fork/Join
A main task forks off some number of other tasks, which then continue in parallel to accomplish some portion of the overall work
• The fork operation starts a new parallel fork/join subtask
• The join operation makes the current task wait until the forked subtask has completed
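
The fork and join operations described above can be illustrated with a recursive parallel sum. This is a sketch in plain C++ threads under assumptions, not the framework's Fork/JoinManager: `parallel_sum`, `sum_first_n`, and the `depth` cut-off are hypothetical choices for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Sums v[lo, hi). While depth > 0 the range is split: one half is forked
// into a new thread, the other half is processed by the current thread.
long long parallel_sum(const std::vector<int>& v, std::size_t lo, std::size_t hi,
                       int depth) {
    if (depth == 0 || hi - lo < 2) {
        return std::accumulate(v.begin() + static_cast<std::ptrdiff_t>(lo),
                               v.begin() + static_cast<std::ptrdiff_t>(hi), 0LL);
    }
    const std::size_t mid = lo + (hi - lo) / 2;
    long long left = 0;
    // Fork: the subtask for the left half starts in parallel.
    std::thread child([&] { left = parallel_sum(v, lo, mid, depth - 1); });
    const long long right = parallel_sum(v, mid, hi, depth - 1);
    // Join: do not proceed until the forked subtask has completed.
    child.join();
    return left + right;
}

// Sums 1..n using two levels of forking (up to four leaf subtasks).
long long sum_first_n(int n) {
    std::vector<int> v(static_cast<std::size_t>(n));
    std::iota(v.begin(), v.end(), 1);
    return parallel_sum(v, 0, v.size(), 2);
}
```

Creating a raw thread per fork keeps the sketch short; the class diagram of this pattern suggests the framework instead hands forked tasks to the thread pool.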

Fork/Join Pattern (2)
Conceptual Diagram (labels: Task 1, Task 2.1, Task 2.2, Thread 1, Thread 2, Fork, Join)

Fork/Join Pattern (3)
Hot Spot
• Tasks that will be processed concurrently
Class Diagram (classes and operations)
• Fork/JoinManager: +addTask(), +fork(), +join()
• FJTask: +runBody(), +virtual run()
• ConcreteTask: +run()
• ApparatusTask: +virtual runBody()
• ThreadPoolManager: +Initialize(), +AssignThread(), +RemoveThread()
• Client

Pipeline Pattern (1)
Pipeline
Execute tasks or groups of tasks in a fixed sequence when their execution order is regular, one-way, and static
• Enables the decomposition of a repetitive sequential process into a succession of distinguishable sub-processes
• Each sub-process can be executed efficiently on a distinct processing element, or on several elements that operate concurrently
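
A two-stage pipeline in the sense described above can be sketched as follows. The code is an illustration in standard C++, not the framework's PipelineManager: `Channel`, `run_pipeline`, and the two stage functions (double, then add one) are hypothetical.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Thread-safe FIFO connecting two stages; close() signals end of input.
template <typename T>
class Channel {
public:
    void send(T v) {
        { std::lock_guard<std::mutex> guard(mu_); q_.push(std::move(v)); }
        cv_.notify_one();
    }
    void close() {
        { std::lock_guard<std::mutex> guard(mu_); closed_ = true; }
        cv_.notify_all();
    }
    bool receive(T& out) {  // returns false once closed and drained
        std::unique_lock<std::mutex> lk(mu_);
        cv_.wait(lk, [this] { return closed_ || !q_.empty(); });
        if (q_.empty()) return false;
        out = std::move(q_.front());
        q_.pop();
        return true;
    }
private:
    std::mutex mu_;
    std::condition_variable cv_;
    std::queue<T> q_;
    bool closed_ = false;
};

// Stage 1 doubles each item, stage 2 adds one; the stages run concurrently,
// each on its own thread, and items flow one way through the channels.
std::vector<int> run_pipeline(const std::vector<int>& input) {
    Channel<int> to_stage1, to_stage2;
    std::vector<int> output;
    std::thread stage1([&] {
        int x;
        while (to_stage1.receive(x)) to_stage2.send(x * 2);
        to_stage2.close();
    });
    std::thread stage2([&] {
        int x;
        while (to_stage2.receive(x)) output.push_back(x + 1);
    });
    for (int x : input) to_stage1.send(x);
    to_stage1.close();
    stage1.join();
    stage2.join();
    return output;
}
```

Because each stage is a single thread reading from a FIFO, the output keeps the input order while different items are processed by both stages at the same time.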

Pipeline Pattern (2)
Conceptual Diagram (labels: Input Data, Client, Task, Data)

Pipeline Pattern (3)
Hot Spot
• Set of tasks that will be processed serially or concurrently
Class Diagram (classes and operations)
• PipelineManager: +virtual runBody(), +addTask(), +removeTask()
• Queue
• FJTask: +runBody(), +virtual run()
• ApparatusTask: +virtual runBody()
• ConcreteTask: +run()
• ThreadPoolManager: +Initialize(), +AssignThread(), +RemoveThread()
• Client

Data Parallelization Patterns
Data Parallel Model
• Data is distributed over the cores
• Each core works on a different part of the same data structure
Patterns
• SPMD Pattern

SPMD Pattern (1)
Single Program, Multiple Data
• The same program is executed on different cores, over distinct data sets
• Each task is characterized by the data over which the common code is executed
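
The SPMD idea can be sketched with one function executed by every thread on its own slice of a shared array. This is an illustrative example in standard C++, not the framework's SPMDManager: `scale_slice`, `spmd_scale`, and the simple block partitioning are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// The "single program": every thread runs this same code on its own slice.
void scale_slice(std::vector<int>& data, std::size_t begin, std::size_t end,
                 int factor) {
    for (std::size_t i = begin; i < end; ++i) data[i] *= factor;
}

// Splits data into n_threads contiguous slices and scales them in parallel.
std::vector<int> spmd_scale(std::vector<int> data, int factor,
                            std::size_t n_threads) {
    assert(n_threads > 0);
    const std::size_t n = data.size();
    const std::size_t chunk = (n + n_threads - 1) / n_threads;  // ceiling
    std::vector<std::thread> threads;
    for (std::size_t rank = 0; rank < n_threads; ++rank) {
        // Each task is characterized only by the data range it receives.
        const std::size_t begin = std::min(n, rank * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        threads.emplace_back(scale_slice, std::ref(data), begin, end, factor);
    }
    for (auto& t : threads) t.join();
    return data;
}
```

The slices do not overlap, so the threads write to distinct elements and need no lock.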

SPMD Pattern (2)
Conceptual Diagram (labels: Input Data, Client, Thread 1 to Thread 4, Data, Task, Thread)

SPMD Pattern (3)
Hot Spot
• Task to be executed in separate threads
• Load balancing policy
Class Diagram (classes and operations)
• SPMDManager: +Initialize(), +SetTask(), +Execute()
• LoadBalancer: +DistributionPolicy(), with variants LBP1, LBP2, …, UDLBP
• Client
• Task: +runBody(), +virtual run()
• ConcreteTask: +run()
• ApparatusTask: +virtual runBody()
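
The load balancing hot spot can be read as a strategy that decides which data items go to which worker. The sketch below assumes that reading: the `distribute` method, `BlockPolicy`, and `CyclicPolicy` are hypothetical stand-ins for the DistributionPolicy() operation and the LBP variants in the class diagram, whose actual policies the slides do not specify.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// assignment[w] lists the item indices given to worker w.
using Assignment = std::vector<std::vector<std::size_t>>;

// Policy interface: maps n_items data items onto n_workers workers.
class LoadBalancer {
public:
    virtual ~LoadBalancer() = default;
    virtual Assignment distribute(std::size_t n_items,
                                  std::size_t n_workers) const = 0;
};

// Block distribution: each worker receives one contiguous range.
class BlockPolicy : public LoadBalancer {
public:
    Assignment distribute(std::size_t n_items,
                          std::size_t n_workers) const override {
        Assignment out(n_workers);
        if (n_workers == 0) return out;
        const std::size_t chunk = (n_items + n_workers - 1) / n_workers;
        for (std::size_t i = 0; i < n_items; ++i) out[i / chunk].push_back(i);
        return out;
    }
};

// Cyclic distribution: items are dealt out round-robin.
class CyclicPolicy : public LoadBalancer {
public:
    Assignment distribute(std::size_t n_items,
                          std::size_t n_workers) const override {
        Assignment out(n_workers);
        if (n_workers == 0) return out;
        for (std::size_t i = 0; i < n_items; ++i)
            out[i % n_workers].push_back(i);
        return out;
    }
};
```

Block distribution keeps each worker's data contiguous, while cyclic distribution spreads neighbouring items across workers, which tends to help when the cost per item varies along the array.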

Outline
Introduction
Problem Definition and Solution Overview
Framework for Embedded Multi-Core Systems
Experimental Evaluation
Conclusion

Experimental Evaluation
This part remains as future work
• Comparison results will be given
Using the framework versus not using the framework
• Throughput time when developing applications that have the same functionality
• Comparison of LOC (lines of code)
A case study is needed for the evaluation

Outline
Introduction
Problem Definition and Solution Overview
Framework for Embedded Multi-Core Systems
Experimental Evaluation
Conclusion

Conclusion
A new software framework for embedded multi-core systems
The framework provides parallel features for multi-core systems
Experimental evaluation will be given
• It is intended to show that using the new software framework gives good results

References
Ralph E. Johnson, “Frameworks=(Components+Patterns)”, 1997
Bruce Powel Douglass, “Real-Time Design Patterns: Robust Scalable Architecture for Real-Time Systems”, 2002
Timothy G. Mattson, Beverly A. Sanders, Berna L. Massingill, “Patterns for Parallel Programming”, 2004
Max Domeika, “Software Development for Embedded Multi-Core Systems”, 2008
A. Plastino, C. C. Ribeiro, N. Rodriguez, “Developing SPMD Applications with Load Balancing”, 2003
Michael A. Bender, Jeremy T. Fineman, Seth Gilbert et al., “On-the-Fly Maintenance of Series-Parallel Relationships in Fork-Join Multithreaded Programs”, 2004

Thank You!!!