Emerging Workload Performance Evaluation on Future Generation … · 2019. 3. 1. · © 2016 OpenPOWER Foundation Emerging Workload Performance Evaluation on Future Generation OpenPOWER

Sep 19, 2020

© 2016 OpenPOWER Foundation

Emerging Workload Performance Evaluation on Future Generation OpenPOWER Processors

Saritha Vinod

Power Systems Performance Analyst

IBM Systems

[email protected]

Agenda

• Emerging Workloads Characteristics and Performance
• Performance Modelling Lifecycle for Future Generation Processors
• Workload Tracing Process
• Workload Tracing Methods & Tools
• Key Challenges in Workload Tracing
• Performance Evaluations using Traces
    • Microarchitecture Design Analysis
    • Software Performance Optimizations
    • Performance Verification
• Summary

Emerging Workloads Characteristics and Performance

• New industry trends are leading to emerging workloads in domains such as cognitive computing, deep learning, analytics, and cloud.

• To achieve the best performance, the next-generation processor design should address some of the following emerging workload characteristics:
    • Instruction mixes and compute needs
    • Cache access patterns and prefetch
    • Data access patterns
    • Sharing of data
    • Data affinities
    • Branch prediction
    • OS and hypervisor calls
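
As an illustration of the first characteristic, an instruction mix can be summarized from a trace as the fraction of instructions in each class. The sketch below is not from the slides: the `CLASS_OF` mapping, the class names, and the toy trace (a plain list of mnemonics) are assumptions for illustration, and a real trace format would carry far more detail.

```python
from collections import Counter

# Hypothetical mapping from mnemonic to instruction class (illustrative only).
CLASS_OF = {
    "ld": "load", "lwz": "load",
    "std": "store", "stw": "store",
    "add": "fixed-point", "addi": "fixed-point",
    "fadd": "floating-point",
    "bc": "branch", "b": "branch",
}


def instruction_mix(mnemonics):
    """Return the fraction of instructions that fall into each class."""
    counts = Counter(CLASS_OF.get(m, "other") for m in mnemonics)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cls: n / total for cls, n in counts.items()}


# Toy trace, made up for this example.
trace = ["ld", "addi", "add", "std", "bc", "ld", "fadd", "b"]
mix = instruction_mix(trace)
for cls, share in sorted(mix.items()):
    print(f"{cls:15s} {share:.3f}")
```

A mix like this is one way to compare how different workload categories stress the load-store, fixed-point, floating-point, and branch units.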

Performance Modelling Lifecycle for Future Generation Processors

[Figure: Processor Performance Modeling Lifecycle. Flowchart stages: Workloads, Instruction Traces, Develop/Configure Processor Model, Design/Feature Evaluation, Identify Bottlenecks, Design Enhancements, and the decision "Reached Target Performance?", which leads either to Remodel or to the Final Processor Model.]

• Traces provide key workload characteristics
• Enable performance evaluation of future generation processors

Workload Tracing Process

[Figure: Workload tracing process, in two stages.]

Trace Generation
• Workload traced with the Functional Simulator, HW Trace, or Valgrind
• Trace Post-processing & Validation
• Recapture Trace

Performance Modelling
• Input: Instruction Traces
• Models: Core Model, I/O Model, Memory Model
• Output: Model statistics, Pipeline Visualizations

Workload Tracing Methods & Tools

Functional Simulator
• Highly controlled simulation environment
• Supports sampling of multi-phase workloads
• System level tracing
• Not well suited to workloads with a complex stack, large memory, or many threads

Hardware Traces
• Used for commercial workloads with high core counts and memory requirements
• Instruction and bus traces
• System level tracing
• Complex setup process
• Lacks support for generating sampled traces

Valgrind Framework
• Useful for tracing hot functions or problem areas in the application
• Supports sampling
• Provides only application tracing, no system level

Reference: IBM SDK for Linux on Power https://www-304.ibm.com/webapp/set2/sas/f/lopdiags/sdklop.html

Reference: IBM POWER8 Functional Simulator (systemsim) http://www-304.ibm.com/webapp/set2/sas/f/pwrfs/pwrfsinstall.html

Key Challenges in Workload Tracing

Challenges

• Hardware models execute only a subset of instructions; most workloads run into billions of instructions.
• The overall runtime of emerging workloads is increasing.
• A smaller subset of the runtime with representative workload behavior is required for design studies.
• Selection depends on the design needs and the workload characteristics.
• The selected segment needs to retain the original workload characteristics.

Resolutions

• Identify the workload interval to trace: workload steady state, or phases based on performance counter data
• Representative trace segment selection: sampled, contiguous, filtered, or at unit level
• Trace profile validation: capturing the right application runtime and maintaining the CPI characteristics
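
One possible way to combine contiguous segment selection with CPI validation is sketched below. It is not the method used by the author: the per-interval `(cycles, instructions)` samples, the window width, and the 5% tolerance are all assumptions made for illustration.

```python
def cpi(samples):
    """Cycles per instruction over a list of (cycles, instructions) samples."""
    cycles = sum(c for c, _ in samples)
    instructions = sum(i for _, i in samples)
    return cycles / instructions


def pick_window(samples, width):
    """Return (start, window_cpi) for the contiguous window of `width`
    intervals whose CPI is closest to the CPI of the whole run."""
    target = cpi(samples)
    best = None
    for start in range(len(samples) - width + 1):
        window_cpi = cpi(samples[start:start + width])
        if best is None or abs(window_cpi - target) < abs(best[1] - target):
            best = (start, window_cpi)
    return best


def is_representative(window_cpi, target_cpi, tolerance=0.05):
    """Validate that the window CPI is within a relative tolerance (assumed 5%)."""
    return abs(window_cpi - target_cpi) <= tolerance * target_cpi


# Made-up performance counter samples: (cycles, instructions) per interval.
samples = [(300, 100), (150, 100), (160, 100), (140, 100), (250, 100)]
start, window_cpi = pick_window(samples, width=2)
valid = is_representative(window_cpi, cpi(samples))
print(start, window_cpi, valid)
```

CPI is only one of the characteristics a selected segment should retain; the same check could be repeated for other counters such as cache miss or branch misprediction rates.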

Performance Evaluations using Traces

Workload traces are used for:

Microarchitecture Design
• Design evaluations of new processor features
• Tuning and trade-off analysis

Software Performance Optimizations
• Analysis of hot functions and bottlenecks in applications
• Compiler optimizations
• System tuning

Performance Verification
• Hardware model performance verification

Microarchitecture Design Analysis and Optimization

• Tuning and trade-off analysis
    • Determine capacity: cache size, queue size
    • Sensitivity analysis using various categories of workload traces
• New design evaluations
    • New techniques for load-store handling
    • Branch prediction algorithms
    • Data prefetch design
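
A capacity sensitivity study of the kind listed above can be illustrated with a toy model. The sketch below assumes a fully associative LRU cache and a made-up address trace; the function name, the cache sizes swept, and the 128-byte line size are assumptions for illustration, and a real performance model would be far more detailed.

```python
from collections import OrderedDict


def miss_rate(addresses, num_lines, line_bytes=128):
    """Miss rate of a fully associative LRU cache with `num_lines` lines."""
    cache = OrderedDict()  # keys are line numbers, ordered oldest to newest
    misses = 0
    for addr in addresses:
        line = addr // line_bytes
        if line in cache:
            cache.move_to_end(line)        # hit: mark as most recently used
        else:
            misses += 1
            cache[line] = True
            if len(cache) > num_lines:
                cache.popitem(last=False)  # evict the least recently used line
    return misses / len(addresses)


# Made-up trace: loop over 4 distinct cache lines, 10 times.
addresses = [i * 128 for i in range(4)] * 10

# Sweep the cache capacity and record the miss rate for each size.
sweep = {size: miss_rate(addresses, size) for size in (2, 4, 8)}
for size, rate in sweep.items():
    print(f"{size} lines: miss rate {rate:.2f}")
```

In this toy trace the miss rate stops improving once the cache holds the 4-line working set, which is the kind of knee a sensitivity analysis looks for when sizing a cache or queue.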

Software Performance Optimizations

• Analyzing application performance bottlenecks
    • Back-to-back latency issues, LSU stalls, branch mispredictions, etc.
• Compiler optimizations
    • Microarchitecture dependent
        • Scheduling, ISA exploitation
    • Microarchitecture independent
        • Inlining, unrolling, etc.
    • Flag tuning
• System tuning
    • SMT levels
    • Prefetch settings
    • Large pages

Micro-architecture Pipeline View for Optimizations

Cycle-accurate simulator
• Micro-architecture statistics
• Pipeline view for the instruction mix

Reference: IBM POWER8 Performance Simulator https://www-304.ibm.com/webapp/set2/sas/f/lopdiags/sdkdownload.html

Performance Verification

• Workload traces are used for performance verification of the hardware model
• Broader performance comparison of the final hardware model and the performance model
    • To identify delta gaps in performance
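
The delta-gap comparison can be sketched as a per-trace check of the two models' results. The example below is only illustrative: the trace names, CPI values, and 5% threshold are invented, and CPI is assumed as the comparison metric.

```python
def delta_gaps(perf_model_cpi, hw_model_cpi, threshold=0.05):
    """Return traces whose hardware-model CPI deviates from the
    performance-model CPI by more than `threshold` (relative)."""
    gaps = {}
    for name, expected in perf_model_cpi.items():
        measured = hw_model_cpi[name]
        relative = (measured - expected) / expected
        if abs(relative) > threshold:
            gaps[name] = relative
    return gaps


# Made-up CPI results per workload trace (not measured data).
perf_model = {"analytics": 1.20, "deep_learning": 0.80, "cloud": 1.50}
hw_model = {"analytics": 1.22, "deep_learning": 0.92, "cloud": 1.49}

gaps = delta_gaps(perf_model, hw_model)
for name, relative in gaps.items():
    print(f"{name}: {relative:+.1%}")
```

Traces flagged this way point to where the hardware model and the performance model disagree and need a closer look.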

Summary

• OpenPOWER processors are designed to deliver superior performance
• Performance evaluation and micro-architecture analysis tools and methods are available for open innovation
• Key insights are derived from emerging workloads through traces
    • Enables micro-architecture design evaluations, trade-off analysis, software/compiler optimizations, and verification

Thank you