Paper Report Presenter: Jyun-Yan Li Multiprocessor System-on-Chip Profiling Architecture: Design and Implementation Po-Hui Chen, Chung-Ta King, Yuan-Ying Chang, Shau-Yin Tseng Institute of Information Systems and Applications, National Tsing Hua University, Hsinchu, Taiwan, ROC Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan, ROC SoC Technology Center, Industrial Technology Research Institute, Hsinchu, Taiwan, ROC 2009 15th International Conference on Parallel and Distributed Systems
Dec 13, 2015

Page 1

Paper Report

Presenter: Jyun-Yan Li

Multiprocessor System-on-Chip Profiling Architecture: Design and Implementation

Po-Hui Chen, Chung-Ta King, Yuan-Ying Chang, Shau-Yin Tseng

Institute of Information Systems and Applications, National Tsing Hua University, Hsinchu, Taiwan, ROC
Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan, ROC
SoC Technology Center, Industrial Technology Research Institute, Hsinchu, Taiwan, ROC

2009 15th International Conference on Parallel and Distributed Systems

Page 2

Abstract

With the growing needs for advanced functionalities in modern embedded systems, it is now necessary to integrate multiple processors in the system, preferably on a single chip, to support the required computing complexity. The problem is that such multiprocessor system-on-chip (MPSoC) architecture is very complex and its internal behavior is very difficult to track. An effective tool for profiling the behavior of the MPSoC system is in great need. Such a tool is very useful during system design for exploiting various options and identifying potential bottlenecks.

In this paper, we introduce the MultiProcessor Profiling Architecture (MPPA) -- a general framework for profiling MPSoC embedded systems. The MPPA framework entails the use of FPGA emulation for the target system, the embedding of performance counters for recording system events, and the development of OS drivers for collecting the profiled data. To demonstrate its use, we show the implementation of an MPSoC emulation system based on Leon3 cores following the MPPA framework. We also show how the MPPA framework and the emulator help the designers to identify performance problems and improve their MPSoC embedded system design.

Page 3

Related work

(Diagram of prior work and how this paper builds on it. Recoverable labels:)

• Hardware & software debugging [7]
• Count events & record [16,18,21]
• Low-level and system-level recording with timestamps, sent to the host
• LEON3 & AMBA [11,13,17][10]
• Linux device driver [3]
• This paper: integration into the MultiProcessor Profiling Architecture (MPPA)
• Hardware implementation: count events
• Software implementation: drive the MPPA
• Debugging

Page 4

What is the problem?

• Supporting a multiprocessor architecture: a processor core cannot access another core without special support.
• Adding a profiling mechanism: it leads to large modifications of the original structure.
• Inserting registers into the architecture: this approach is not systematic.

Page 5

Proposed method

MultiProcessor Profiling Architecture (MPPA)

• Event sensing: detects specific hardware events and notifies event collection.
• Event collection: accumulates the event counts reported by event sensing.
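The split between event sensing and event collection can be illustrated with a small software model. This is only a sketch under stated assumptions: the slides describe hardware blocks, and the class names `EventSensor` and `EventCollector`, the `observe`/`notify` methods, and the per-cycle sample data are all hypothetical. The event names are the two examples mentioned in the presenter's comment (cache miss, pipeline stall).

```python
from collections import Counter


class EventCollector:
    """Hypothetical model of event collection: accumulates counts per event."""

    def __init__(self):
        self.counts = Counter()

    def notify(self, event):
        # Called by a sensor each time it detects its event.
        self.counts[event] += 1


class EventSensor:
    """Hypothetical model of event sensing: watches one signal and
    notifies the collector whenever the signal is active."""

    def __init__(self, event, collector):
        self.event = event
        self.collector = collector

    def observe(self, signal_active):
        if signal_active:
            self.collector.notify(self.event)


collector = EventCollector()
miss_sensor = EventSensor("cache_miss", collector)
stall_sensor = EventSensor("pipeline_stall", collector)

# Made-up per-cycle samples: (cache miss signal, pipeline stall signal).
for miss, stall in [(True, False), (False, False), (True, True)]:
    miss_sensor.observe(miss)
    stall_sensor.observe(stall)

print(dict(collector.counts))  # {'cache_miss': 2, 'pipeline_stall': 1}
```

Keeping the sensors unaware of how counts are stored mirrors the separation on the slide: a new event type only needs a new sensor, not a change to the collector.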

Page 6

Mechanism

Page 7

Design flow

Page 8

Integrate MPPA with LEON3

Page 9

Hardware implementation

(Block diagram. Recoverable annotations:)

• Enable and disable the CVM and clear the ECs
• EC: records event occurrences
• CVM: manipulates the counter value and monitors the input event signal
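The roles of the EC and CVM blocks can be approximated in software as follows. This is a behavioral sketch, not the paper's hardware design: the class names, the `clock` method, and the wrap-around on overflow are assumptions, and the 32-bit width is taken from the counter size quoted in the experiment slide.

```python
MASK32 = 0xFFFFFFFF  # 32-bit width, per the experiment slide


class EventCounter:
    """EC model (hypothetical): stores the number of event occurrences."""

    def __init__(self):
        self.value = 0

    def clear(self):
        self.value = 0


class Cvm:
    """CVM model (hypothetical): when enabled, monitors the input event
    signal and updates the counter value held in the EC."""

    def __init__(self, ec):
        self.ec = ec
        self.enabled = False

    def clock(self, event_signal):
        # Assumption: the counter wraps to zero on overflow.
        if self.enabled and event_signal:
            self.ec.value = (self.ec.value + 1) & MASK32


ec = EventCounter()
cvm = Cvm(ec)

cvm.clock(True)            # ignored, because the CVM is still disabled
cvm.enabled = True
for signal in (True, False, True, True):
    cvm.clock(signal)
count_after_run = ec.value  # three active samples were seen while enabled

cvm.enabled = False         # stop counting
ec.clear()                  # reset the EC for the next measurement
```

The enable flag and the clear operation correspond to the control path on the slide (enable/disable the CVM, clear the ECs).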

Page 10

Software implementation

• Uses a device driver, with small overhead (fewer than 5 cache misses)
• A Power Management Unit (PMU) library for user programs:
  pmu_init: opens the device node and sets up the memory mapping
  pmu_clear: zeros all counter values and enables performance event counting
  pmu_msg: stops monitoring and reads the event statistics
  pmu_end: closes the device node
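The intended call order of the four library functions can be sketched with a mock. Only the function names and their one-line descriptions come from the slide; the arguments, return values, the `MockPmu` class, the `hardware_event` test hook, and the `profile` helper are assumptions made for illustration. The real library talks to a device node, which this mock does not do.

```python
class MockPmu:
    """Stand-in for the PMU library (hypothetical; real signatures are
    not given in the slides)."""

    def __init__(self, num_counters=23):  # 23 counters, per the experiment slide
        self.counters = [0] * num_counters
        self.is_open = False
        self.counting = False

    def pmu_init(self):
        # Real library: open the device node and map counter memory.
        self.is_open = True

    def pmu_clear(self):
        # Zero all counters and start counting.
        self.counters = [0] * len(self.counters)
        self.counting = True

    def pmu_msg(self):
        # Stop monitoring and return a snapshot of the statistics.
        self.counting = False
        return list(self.counters)

    def pmu_end(self):
        # Real library: close the device node.
        self.is_open = False

    def hardware_event(self, index):
        # Test hook standing in for the hardware incrementing a counter.
        if self.counting:
            self.counters[index] += 1


def profile(pmu, workload):
    """Run workload between pmu_clear and pmu_msg, always closing the device."""
    pmu.pmu_init()
    try:
        pmu.pmu_clear()
        workload()
        return pmu.pmu_msg()
    finally:
        pmu.pmu_end()


pmu = MockPmu()
stats = profile(pmu, lambda: [pmu.hardware_event(0) for _ in range(4)])
print(stats[0])  # 4
```

Wrapping the sequence in a helper with `try`/`finally` is a design choice of this sketch: it keeps the measured region tight and guarantees `pmu_end` runs even if the workload raises.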

Page 11

Experiment result

• Target platform: Xilinx ML501 FPGA emulation board, running at about 80 MHz
• Twenty-three 32-bit counters
• Total gate count increases by 0.66%

(Table: Xilinx Virtex-5 FPGA synthesis result of the target platform with the MPPA architecture)
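As a rough worked example of what the 32-bit counter width implies at about 80 MHz, the following computes how long a counter could run before wrapping. It assumes a worst case in which the counted event fires on every clock cycle and that the counter wraps on overflow; neither assumption is stated in the slides, and the function name is hypothetical.

```python
COUNTER_BITS = 32
CLOCK_HZ = 80_000_000  # "about 80 MHz" from the slide


def seconds_until_wrap(bits, events_per_second):
    """Time for a counter of the given width to overflow at a steady event rate."""
    return (2 ** bits) / events_per_second


# Worst case (assumed): one event per clock cycle.
worst_case = seconds_until_wrap(COUNTER_BITS, CLOCK_HZ)
print(f"{worst_case:.1f} s")  # 53.7 s
```

Events that occur less often than once per cycle, such as cache misses, would take proportionally longer to overflow the counter.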

Page 12

Case study

Case 1

Page 13

Case study (cont.)

Case 2

Page 14

Conclusion

• This paper presents MPPA, an efficient, compact, and less intrusive design for performance measurement that does not need a dedicated bus.
• Using MPPA can help designers find MPSoC bottlenecks.

Page 15

My comment

• With this idea, event sensors can be inserted into a processor, or into a master or slave, to detect events such as cache misses and pipeline stalls.
• The paper does not present how to deal with interrupts.