No Free Lunch, No Hidden Cost: How Can Co-Design Help? X. Sharon Hu, Dept. of Computer Science and Engineering, University of Notre Dame. The Salishan Conference on High-Speed Computing.

Dec 31, 2015

Page 1

CSE Dept. (XHU), The Salishan Conference on High-Speed Computing

No Free Lunch, No Hidden Cost

X. Sharon Hu

Dept. Computer Science and Engineering

University of Notre Dame


How Can Co-Design Help?

The Salishan Conference on High-Speed Computing

Page 2

Theme: Exposing Hidden Execution Costs

Cost of execution: performance and power
  Computation
  Communication
  Data motion
  Synchronization
  …

How can we strike a balance between the extremes?
  Hide as much as possible?
  Explicitly manage “all” costs?

My “position”: Expose widely and choose wisely
  Focus on power

Page 3

Why Take This Position?

Expose widely
  Better understanding of the contribution of each component
  Allowing application-specific tradeoffs
  Providing opportunities for powerful co-design tools

Choose wisely
  Requiring sophisticated co-design tools
  Exploring more algorithm/software options

Page 4

But Easier Said Than Done!

Heterogeneity
  Compute nodes: (multi-core) CPU, GP-GPU, FPGA, …
  Memory components: on-chip, on-board, disks, …
  Communication infrastructure: bus, NoC, networks, …

Parallelism (“non-determinism”)
  Data access: movement, coherence, …
  Resource contention
  Synchronization

Page 5

Outline

Why expose widely?

How to benefit from exposing widely?

How to choose wisely?

Going forward

Page 6

Why Expose Widely? (1)

Different programs have different power distributions

[Figure: GPU power distribution (NVIDIA GTX 280), with legend entries Memory, ConstSM, ConstCache, TextCache, and GPU Cores. Source: Hong and Kim, ISCA 2010]

Page 7

Why Expose Widely? (2)

Energy consumption of three sorting algorithms (Pentium 4 + GeForce 570)

Data movement impacts different algorithms differently

Page 8

Why Expose Widely? (3)

Application dependent

Masaaki Kondo et al., SIGARCH 2007

Performance degradation due to memory bus contention

Page 9

Outline

Why expose widely?

How to benefit from exposing widely?

How to choose wisely?

Going forward

Page 10

How to Benefit from “Exposing Widely”?

Co-design is the key

Expose all factors impacting the “execution model”
  Computation: processing resource
  Data motion: memory components and hierarchy
  Communication: bus and network
  Resource contention, synchronization, …

Some examples
  Software macromodeling
  Hardware module-based modeling

Optimize through power management
  Keep in mind Amdahl’s law
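One way to read the Amdahl's law reminder for power: reducing the power of one component only helps in proportion to that component's share of the total. The sketch below is an illustration under that reading, not something taken from the slides; the function name and the example numbers are assumptions.

```python
def overall_power_reduction(fraction, component_reduction):
    """Amdahl-style bound applied to power (illustrative assumption).

    fraction: share of total power drawn by the component being optimized.
    component_reduction: factor by which that component's power is divided.
    Returns the factor by which total power shrinks.
    """
    return 1.0 / ((1.0 - fraction) + fraction / component_reduction)


# Halving the power of a component that draws 50% of the total
# cuts total power by a factor of only about 1.33.
print(overall_power_reduction(0.5, 2.0))
```

Even an unbounded reduction on a component that draws 20% of the power caps the overall gain at 1.25x, which is why knowing the per-component distribution matters before choosing what to optimize.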

Page 11

Macromodeling: Algorithm Complexity Based

Relate the power/energy of a program to its complexity

Example: $E = C_1 S + C_2 S^2 + C_3 S^3$ (Tan et al., DAC’01), where $S$ is the size of the array for a sorting algorithm

Example: $E_{comm} = C_0 + C_1 S$ (Loghi et al., ACM TECS’07), where $S$ is the size of the exchanged messages

More sophisticated models to account for both computing and communication

How to handle resource contention?
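A minimal sketch of how the coefficients of the cubic sorting macromodel might be obtained: measure energy at a few array sizes, then fit C1, C2, C3 by least squares. The fitting procedure, function names, and sample data here are assumptions for illustration; the cited papers may characterize their models differently.

```python
def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_sort_macromodel(sizes, energies):
    """Least-squares fit of E = C1*S + C2*S^2 + C3*S^3 (no constant term)."""
    basis = [[s, s ** 2, s ** 3] for s in sizes]
    # Normal equations: (B^T B) c = B^T e
    normal = [[sum(r[i] * r[j] for r in basis) for j in range(3)] for i in range(3)]
    rhs = [sum(r[i] * e for r, e in zip(basis, energies)) for i in range(3)]
    return solve_linear(normal, rhs)


def predict_energy(coeffs, size):
    """Evaluate the fitted macromodel at a new array size."""
    c1, c2, c3 = coeffs
    return c1 * size + c2 * size ** 2 + c3 * size ** 3


# Synthetic "measurements" generated from known coefficients (hypothetical data).
sizes = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
energies = [2.0 * s + 0.5 * s ** 2 + 0.01 * s ** 3 for s in sizes]
coeffs = fit_sort_macromodel(sizes, energies)
print(coeffs, predict_energy(coeffs, 10.0))
```

Note that a model fitted this way captures only what the measurement runs exercised; contention effects would need additional terms or a separate characterization.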

Page 12

Power Modeling of Bus Contention

Penolazzi, Sander and Hemani, DATE’11

Characterization step
  C%N,1: percentage of cycle difference between the N-processor case and the 1-processor case
  Can be done by IP providers on chosen benchmarks

Prediction step

[Equations for the prediction step were not recoverable from the extraction. Surviving symbols: t, T, C, C%N,1, E_cycle, E_stall(N), E_a, E_idle, n]
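A hedged sketch of what a prediction step could look like, using only the definition of C%N,1 given above. It assumes the percentage is measured relative to the 1-processor cycle count; the stall-energy estimate and all names and numbers are hypothetical and are not the formulas from the cited paper.

```python
# Hypothetical characterization table: N processors -> C%N,1 (percent).
CHARACTERIZED = {1: 0.0, 2: 8.0, 4: 25.0}


def predict_cycles(cycles_1p, c_pct):
    """Cycle count in the N-processor case from the 1-processor count and C%N,1."""
    return cycles_1p * (1.0 + c_pct / 100.0)


def predict_time(cycles_1p, c_pct, clock_period):
    """Execution time in seconds, assuming a fixed clock period."""
    return predict_cycles(cycles_1p, c_pct) * clock_period


def stall_energy(cycles_1p, c_pct, energy_per_stall_cycle):
    """Energy of the extra (stalled) cycles; an assumed model, not the paper's."""
    extra_cycles = predict_cycles(cycles_1p, c_pct) - cycles_1p
    return extra_cycles * energy_per_stall_cycle


print(predict_cycles(1000, CHARACTERIZED[4]))
```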

Page 13

Hierarchical Module-Based Power Modeling

Accumulate energy/power of modules

CPU+GPU example

Access rate: software dependent

Data movement contributes to memory power

Resource contention modifies access rate

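$$P_{total} = \sum_i \left[ \mathrm{Util}(M_i) \cdot P(M_i) + P_{other}(M_i) \right]$$

$$P_{total} = P_{CPU} + P_{GPU} + P_{mem} = \sum_i P(M_i) + P_{idle}$$

$$P(M_i) = \mathrm{AccessRate}(M_i) \cdot \mathrm{ArchScaling}(M_i) \cdot \mathrm{MaxP}(M_i) + \mathrm{NonGatedP}(M_i)$$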

Adapted from Isci and Martonosi, Micro’03
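A minimal sketch of the module-based accumulation, in the style of the Isci and Martonosi model cited on this slide: each module's power is its access rate times an architectural scaling factor times its maximum power, plus a non-gated component, and the total adds idle power. Module names and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Module:
    name: str
    access_rate: float      # software dependent; contention changes this value
    arch_scaling: float     # architectural scaling factor
    max_power: float        # watts at full activity
    non_gated_power: float  # watts drawn regardless of activity

    def power(self):
        return (self.access_rate * self.arch_scaling * self.max_power
                + self.non_gated_power)


def total_power(modules, idle_power):
    """Sum per-module power and add the system idle power."""
    return sum(m.power() for m in modules) + idle_power


# Hypothetical CPU+GPU system.
modules = [
    Module("cpu_core", access_rate=0.5, arch_scaling=1.0,
           max_power=20.0, non_gated_power=2.0),
    Module("gpu_cores", access_rate=0.25, arch_scaling=1.0,
           max_power=80.0, non_gated_power=4.0),
]
print(total_power(modules, idle_power=10.0))
```

Because access rate is the only software-dependent input, the same module table can be reused across applications by swapping in measured or predicted access rates.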

Page 14

Outline

Why expose widely?

How to benefit from exposing widely?

How to choose wisely?

Going forward

Page 15

Managing Bus Contention to Reduce Energy

M. Kondo, H. Sasaki and H. Nakamura, 2006

Counter for memory requests

Register for PU identification

Thresholds for selecting which PU uses what Vdd value
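The three hardware elements listed above suggest a simple control loop: count memory requests per PU over an interval, then compare each count against thresholds to pick that PU's Vdd. The sketch below assumes that PUs issuing more memory requests are given a lower Vdd; that direction, the threshold values, and the voltage levels are all assumptions for illustration and are not taken from the cited work.

```python
# Hypothetical (threshold, Vdd) pairs, highest threshold first.
VDD_LEVELS = [(200, 0.9), (50, 1.1)]
DEFAULT_VDD = 1.3  # used when a PU is below every threshold


def select_vdd(request_counts):
    """Map each PU id to a Vdd based on its memory request count.

    request_counts: dict of PU id -> requests counted in the last interval.
    """
    choice = {}
    for pu, count in request_counts.items():
        vdd = DEFAULT_VDD
        for threshold, level in VDD_LEVELS:
            if count >= threshold:
                vdd = level
                break
        choice[pu] = vdd
    return choice


print(select_vdd({0: 10, 1: 60, 2: 500}))
```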

Page 16

Application Mapping to Reduce Energy (1)

Application mapping for heterogeneous systems

[Figure: jobs J1 to J4, each annotated with ([minRi, maxRi], Di), together with processing elements PE 1 to PE 4 and a memory]

R. Racu, R. Ernst, A. Hamann, B. Mochocki and X. Hu, “Methods for power optimization in distributed embedded systems with real-time requirements,” CASES’06.

Page 17

Application Mapping to Reduce Energy (2)

Optimization:
  Minimize power/energy dissipation
  Satisfy timing properties (e.g., average path latency, average lateness, etc.)
  …

Search space:
  Scheduling parameters, traffic shaping, …
  Task-level DVFS, i.e., task speed assignment
  Resource-level DVFS, i.e., resource speed assignment
  …
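To make task-level DVFS concrete, here is a small sketch that assigns one speed level per task so that energy is minimized while a deadline is still met. It assumes tasks run back to back on one resource, that execution time scales as cycles/speed, and that energy scales as cycles × speed² (a common simplification). The speed levels, the exhaustive search, and the function name are illustrative assumptions, not the method of the cited work.

```python
from itertools import product

SPEEDS = [0.5, 0.75, 1.0]  # hypothetical normalized speed levels


def assign_speeds(task_cycles, deadline):
    """Return (energy, speeds) for the cheapest feasible assignment, or None."""
    best = None
    for speeds in product(SPEEDS, repeat=len(task_cycles)):
        time = sum(c / s for c, s in zip(task_cycles, speeds))
        if time > deadline:
            continue  # violates the timing constraint
        energy = sum(c * s ** 2 for c, s in zip(task_cycles, speeds))
        if best is None or energy < best[0]:
            best = (energy, speeds)
    return best


print(assign_speeds([10, 20], 50))
```

Exhaustive search grows exponentially with the number of tasks, which is one reason the larger search spaces listed above call for heuristic or evolutionary techniques.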

Page 18

Application Mapping (3): Sensitivity Analysis

R. Racu, R. Ernst, A. Hamann, B. Mochocki and X. Hu, “Methods for power optimization in distributed embedded systems with real-time requirements,” CASES’06.

Page 19

Application Mapping (4): GA-Based Approach

[Figure: GA-based mapping flow that includes a PowerAnalyzer block, with steps labeled 2’ (Scheduling Trace) and 3’ (Power Dissipation)]

Power model needed
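A toy sketch of a GA-based mapping loop, to show where a power model plugs in: the `cost` function plays the role of the power analysis, scoring each candidate task-to-PE mapping by energy plus a penalty for missing the deadline. The platform parameters, task workloads, penalty weight, and GA settings are all hypothetical, and the actual tool flow on the slide is likely more elaborate.

```python
import random

# Hypothetical platform: (energy per cycle, cycles per ms) for each PE.
PES = [(1.0, 100.0), (2.5, 300.0), (0.6, 50.0)]
TASK_CYCLES = [400, 250, 900, 120, 600, 330]  # hypothetical workloads
DEADLINE_MS = 10.0
PENALTY = 1e4                                 # cost per ms of deadline overrun
BASELINE = [1] * len(TASK_CYCLES)             # everything on the fastest PE


def cost(mapping):
    """Stand-in power model: energy of a mapping plus deadline penalties."""
    energy = 0.0
    busy = [0.0] * len(PES)
    for task, pe in enumerate(mapping):
        e_per_cycle, speed = PES[pe]
        energy += TASK_CYCLES[task] * e_per_cycle
        busy[pe] += TASK_CYCLES[task] / speed
    overrun = sum(max(0.0, t - DEADLINE_MS) for t in busy)
    return energy + PENALTY * overrun


def evolve(generations=100, pop_size=20, mutation_rate=0.1, seed=0):
    """Elitist GA: keep the better half, refill with mutated crossovers."""
    rng = random.Random(seed)
    n, m = len(TASK_CYCLES), len(PES)
    pop = [[rng.randrange(m) for _ in range(n)] for _ in range(pop_size)]
    pop[0] = list(BASELINE)  # seed with a known feasible mapping
    for _ in range(generations):
        pop.sort(key=cost)
        survivors = pop[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n)          # one-point crossover
            child = a[:cut] + b[cut:]
            for i in range(n):                 # per-gene mutation
                if rng.random() < mutation_rate:
                    child[i] = rng.randrange(m)
            children.append(child)
        pop = survivors + children
    return min(pop, key=cost)


best = evolve()
print(best, cost(best), cost(BASELINE))
```

Because selection is elitist and the population is seeded with the baseline, the returned mapping is never worse than running everything on the fastest PE.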

Page 20

A Sample Result

Page 21

Outline

Why expose widely?

How to benefit from exposing widely?

How to choose wisely?

Going forward

Page 22

Going Forward: Systematic Co-design Effort

Expose more
  More hardware counters / registers
  More efficient/accurate high-level power models
  Better models for resource contention and synchronization

Choose better
  Handling parallelism
    Algorithm, OS, hardware
    Resource contention
    Synchronization
  Handling non-determinism
    Worst case bounds
    Statistical analysis
    Interval-based techniques

Page 23

ES Design vs. HPCS Design

Differences (maybe)
  Application-specific workloads vs. domain-specific workloads
  Constraints, objectives, desirables? Latency, throughput, energy, cost, reliability, fault tolerance, IP protection/privacy, ToM, …
  Other issues: homogeneous vs. heterogeneous, levels of complexity, user expertise, …

Similarities
  Ever-increasing hardware capability: multi-core, multi-thread, complex communication fabrics, memory hierarchy, …
  Productivity gap
  Common concerns: latency, throughput, energy, cost, reliability, fault tolerance, …

Page 24

Leverage Co-Design for HPC

Systematic performance estimation
  Formal methods: scenario-based, statistical analysis
  Hybrid approaches: analytical + simulation
  Seamless migration from one abstraction level to the next

Efficient design space exploration
  Efficient search techniques
  Multiple-level abstraction models
  Multiple-attribute optimization
  Others: memory and communication analysis and design

Page 25

Thank you!