CDA 3101 Fall 2013 Introduction to Computer Organization Benchmarks 30 August 2013
Transcript
Page 1:

CDA 3101 Fall 2013
Introduction to Computer Organization
Benchmarks
30 August 2013

Page 2: Overview

• Benchmarks
• Popular benchmarks
  – Linpack
  – Intel's iCOMP
• SPEC Benchmarks
• MIPS Benchmark
• Fallacies and Pitfalls

Page 3: Benchmarks

• Benchmarks measure different aspects of component and system performance
• Ideal situation: use real workload
  • Engineering or scientific applications
  • Software development tools
  • Transaction processing
  • Office applications
• Types of Benchmarks
  • Real programs
  • Kernels
  • Toy benchmarks
  • Synthetic benchmarks
• Risk: adjust design to benchmark requirements
  – (partial) solution: use real programs and update constantly

Page 4: A Benchmark Story

1. You create a benchmark called the vmark

2. Run it on lots of different computers

3. Publish the vmarks in www.vmark.org

4. vmark and www.vmark.org become popular
   – Users start buying their PCs based on vmark
   – Vendors would be banging on your door

5. Vendors examine the vmark code and fix up their compilers and/or microarchitecture to run vmark

6. Your vmark benchmark has been broken

7. Create vmark 2.0

Page 5: Performance Reports

• Reproducibility
  – Include hardware / software configuration (SPEC)
  – Evaluation process conditions

• Summarizing performance
  – Total time: Σ exec time_i

  – Arithmetic mean: AM = (1/n) * Σ exec time_i
  – Harmonic mean: HM = n / Σ (1/rate_i)
  – Weighted mean: WM = Σ w_i * exec time_i
  – Geometric mean: GM = (Π exec time ratio_i)^(1/n)

  GM(X_i) / GM(Y_i) = GM(X_i / Y_i)
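
For concreteness, here is a minimal Python sketch of these summary statistics; the execution times, rates, and weights below are invented for illustration and are not from the slides:

import math

exec_times = [2.0, 4.0, 8.0]                 # hypothetical execution times in seconds
rates      = [1.0 / t for t in exec_times]   # corresponding rates (work per second)
weights    = [0.5, 0.25, 0.25]               # hypothetical workload weights (sum to 1)
n = len(exec_times)

am = sum(exec_times) / n                                  # arithmetic mean of times
hm = n / sum(1.0 / r for r in rates)                      # harmonic mean of rates
wm = sum(w * t for w, t in zip(weights, exec_times))      # weighted mean of times
gm = math.prod(exec_times) ** (1.0 / n)                   # geometric mean of times (Python 3.8+)

# Property from the slide: GM(X_i) / GM(Y_i) == GM(X_i / Y_i)
ys  = [1.0, 2.0, 4.0]
lhs = gm / (math.prod(ys) ** (1.0 / n))
rhs = math.prod(x / y for x, y in zip(exec_times, ys)) ** (1.0 / n)
print(am, hm, wm, gm, abs(lhs - rhs) < 1e-12)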

Page 6: Ex.1: Linpack Benchmark

• "Mother of all benchmarks"
• Time to solve a dense system of linear equations

DO I = 1, N
   DY(I) = DY(I) + DA * DX(I)
END DO
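
For readers less familiar with Fortran, a direct Python rendering of the same DAXPY update (sizes and values are made up for illustration):

n  = 4
da = 2.0
dx = [1.0, 2.0, 3.0, 4.0]
dy = [10.0, 20.0, 30.0, 40.0]

for i in range(n):                 # same update as the DO loop above
    dy[i] = dy[i] + da * dx[i]

print(dy)                          # [12.0, 24.0, 36.0, 48.0]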

• Metrics
  – Rpeak: system peak Gflops
  – Nmax: matrix size that gives the highest Gflops
  – N_1/2: matrix size that achieves half the rated Rmax Gflops
  – Rmax: the Gflops achieved for the Nmax size matrix

• Used in http://www.top500.org

Page 7: Ex.2: Intel's iCOMP Index 3.0

• New version (3.0) reflects:
  • Mix of instructions for existing and emerging software
  • Increasing use of 3D, multimedia, and Internet software
• Benchmarks
  • 2 integer productivity applications (20% each)
  • 3D geometry and lighting calculations (20%)
  • FP engineering and finance programs and games (5%)
  • Multimedia and Internet application (25%)
  • Java application (10%)
• Weighted GM of relative performance
  – Baseline processor: Pentium II processor at 350 MHz
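
As a hedged sketch of how such a weighted geometric mean of relative performance can be formed in Python (the weights mirror the percentages above, the performance ratios versus the 350 MHz Pentium II baseline are invented, and Intel's exact formula is not spelled out on the slide):

weights = [0.20, 0.20, 0.20, 0.05, 0.25, 0.10]   # two integer apps, 3D, FP, multimedia/Internet, Java
ratios  = [1.8, 2.1, 1.5, 1.2, 1.7, 1.4]         # hypothetical speedups vs. the baseline processor

score = 1.0
for w, r in zip(weights, ratios):
    score *= r ** w                              # weighted geometric mean: product of ratio_i ** weight_i

print(round(score, 3))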

Page 8: Ex.3: SPEC CPU Benchmarks

• SPEC: Standard Performance Evaluation Corporation
• Need to update/upgrade benchmarks
  – Longer run time
  – Larger problems
  – Application diversity
• Rules to run and report
  – Baseline and optimized
  – Geometric mean of normalized execution times
  – Reference machine: Sun Ultra5_10 (300 MHz SPARC, 256 MB)
• CPU2006: latest SPEC CPU benchmark (4th version)
  – 12 integer and 17 floating-point programs

• Metrics: response time and throughput

www.spec.org

Page 9: Ex.3: SPEC CPU Benchmarks

[Figure: SPEC CPU benchmark suites, 1989-2006; previous benchmarks, now retired]

Page 10: Ex.3: SPEC CPU Benchmarks

• Observe: We will use the SPEC 2000 and 2006 CPU benchmarks in this set of notes.
• Task: However, you are asked to read about the SPEC 2006 CPU benchmark suite, described at www.spec.org/cpu2006
• Result: Compare the SPEC 2006 data with the SPEC 2000 data (www.spec.org/cpu2000) to answer the extra-credit questions in Homework #2.

Page 11: SPEC CINT2000 Benchmarks

1.  164.gzip      C    Compression
2.  175.vpr       C    FPGA Circuit Placement and Routing
3.  176.gcc       C    C Programming Language Compiler
4.  181.mcf       C    Combinatorial Optimization
5.  186.crafty    C    Game Playing: Chess
6.  197.parser    C    Word Processing
7.  252.eon       C++  Computer Visualization
8.  253.perlbmk   C    PERL Programming Language
9.  254.gap       C    Group Theory, Interpreter
10. 255.vortex    C    Object-oriented Database
11. 256.bzip2     C    Compression
12. 300.twolf     C    Place and Route Simulator

Page 12: SPEC CFP2000 Benchmarks

1.  168.wupwise   F77  Physics / Quantum Chromodynamics
2.  171.swim      F77  Shallow Water Modeling
3.  172.mgrid     F77  Multi-grid Solver: 3D Potential Field
4.  173.applu     F77  Parabolic / Elliptic Partial Differential Equations
5.  177.mesa      C    3-D Graphics Library
6.  178.galgel    F90  Computational Fluid Dynamics
7.  179.art       C    Image Recognition / Neural Networks
8.  183.equake    C    Seismic Wave Propagation Simulation
9.  187.facerec   F90  Image Processing: Face Recognition
10. 188.ammp      C    Computational Chemistry
11. 189.lucas     F90  Number Theory / Primality Testing
12. 191.fma3d     F90  Finite-element Crash Simulation
13. 200.sixtrack  F77  High Energy Nuclear Physics Accelerator Design
14. 301.apsi      F77  Meteorology: Pollutant Distribution

Page 13: SPECINT2000 Metrics

• SPECint2000: The geometric mean of 12 normalized ratios (one for each integer benchmark) when each benchmark is compiled with "aggressive" optimization

• SPECint_base2000: The geometric mean of 12 normalized ratios when compiled with "conservative" optimization

• SPECint_rate2000: The geometric mean of 12 normalized throughput ratios when compiled with "aggressive" optimization

• SPECint_rate_base2000: The geometric mean of 12 normalized throughput ratios when compiled with "conservative" optimization
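
To make these definitions concrete, here is a minimal Python sketch of the general recipe: each normalized ratio is the reference machine's run time divided by the measured run time, and the reported number is the geometric mean of those ratios (the times below are invented, not actual SPEC data):

import math

ref_times      = [1400.0, 1800.0, 1100.0]   # reference (Sun Ultra5_10) run times in seconds (invented)
measured_times = [ 350.0,  600.0,  275.0]   # run times on the machine under test (invented)

ratios = [ref / t for ref, t in zip(ref_times, measured_times)]   # normalized ratios
score  = math.prod(ratios) ** (1.0 / len(ratios))                 # geometric mean -> SPEC-style metric

print([round(r, 2) for r in ratios], round(score, 2))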

Page 14: SPECint_base2000 Results

[Chart: SPECint_base2000 results for Alpha/Tru64 (21264 @ 667 MHz), MIPS/IRIX (R12000 @ 400 MHz), and Intel/NT 4.0 (PIII @ 733 MHz)]

Page 15: SPECfp_base2000 Results

[Chart: SPECfp_base2000 results for Alpha/Tru64 (21264 @ 667 MHz), MIPS/IRIX (R12000 @ 400 MHz), and Intel/NT 4.0 (PIII @ 733 MHz)]

Page 16: Effect of CPI: SPECint95 Ratings

[Chart: SPECint95 ratings, annotated "Microarchitecture improvements"]

CPU time = IC * CPI * clock cycle
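
A quick numeric check of the relation above in Python (all values are invented for illustration):

ic          = 2_000_000_000      # instruction count (dynamic instructions executed)
cpi         = 1.5                # average clock cycles per instruction
clock_cycle = 1.0 / 500e6        # seconds per cycle for a 500 MHz clock

cpu_time = ic * cpi * clock_cycle    # CPU time = IC * CPI * clock cycle
print(cpu_time)                      # 6.0 seconds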

Page 17: Effect of CPI: SPECfp95 Ratings

[Chart: SPECfp95 ratings, annotated "Microarchitecture improvements"]

Page 18: SPEC Recommended Readings

SPEC 2006 – Survey of Benchmark Programs

http://www.spec.org/cpu2006/publications/CPU2006benchmarks.pdf

SPEC 2006 Benchmarks - Journal Articles on Implementation Techniques and Problems

http://www.spec.org/cpu2006/publications/SIGARCH-2007-03/

SPEC 2006 Installation, Build, and Runtime Issues

http://www.spec.org/cpu2006/issues/

Page 19: Another Benchmark: MIPS

• Millions of Instructions Per Second
• MIPS = IC / (CPU time * 10^6)
• Comparing apples to oranges
• Flaw: 1 MIPS on one processor does not accomplish the same work as 1 MIPS on another
  – It is like determining the winner of a foot race by counting who used fewer steps
  – Some processors do FP in software (e.g. 1 FP = 100 INT)
  – Different instructions take different amounts of time

• Useful for comparisons between 2 processors from the same vendor that support the same ISA with the same compiler (e.g. Intel’s iCOMP benchmark)
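
A small Python illustration of why MIPS can mislead: with invented numbers, machine B posts the higher MIPS rating because it executes more (simpler) instructions, yet machine A finishes the same program sooner.

ic_a, time_a = 1_000_000_000, 2.0    # machine A: fewer instructions, 2.0 s CPU time (invented)
ic_b, time_b = 4_000_000_000, 2.5    # machine B: more, simpler instructions, 2.5 s CPU time (invented)

mips_a = ic_a / (time_a * 1e6)       # MIPS = IC / (CPU time * 10^6)
mips_b = ic_b / (time_b * 1e6)

print(mips_a, mips_b)                # 500.0 vs. 1600.0: B "wins" on MIPS
print(time_a < time_b)               # True: A is actually faster on this program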

Page 20: Fallacies and Pitfalls

• Ignoring Amdahl's law
• Using clock rate or MIPS as a performance metric
• Using the Arithmetic Mean of normalized CPU times (ratios) instead of the Geometric Mean (see the sketch below)
• Using hardware-independent metrics
  – Using code size as a measure of speed
• Synthetic benchmarks predict performance
  – They do not reflect the behavior of real programs
• The geometric mean of CPU time ratios is proportional to the total execution time [NOT!!]
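
The arithmetic-mean pitfall can be seen with a tiny Python experiment (run times invented): normalizing to machine A makes A look better under the arithmetic mean, normalizing to machine B makes B look better, while the geometric mean gives the same comparison either way.

import math

def am(xs): return sum(xs) / len(xs)                  # arithmetic mean
def gm(xs): return math.prod(xs) ** (1 / len(xs))     # geometric mean

times_a = [1.0, 1000.0]    # machine A's run times on programs P1, P2 (invented)
times_b = [10.0, 100.0]    # machine B's run times on the same programs (invented)

for ref in (times_a, times_b):                        # normalize to A, then normalize to B
    norm_a = [t / r for t, r in zip(times_a, ref)]
    norm_b = [t / r for t, r in zip(times_b, ref)]
    print(am(norm_a), am(norm_b), gm(norm_a), gm(norm_b))

# The arithmetic means swap which machine looks better when the reference changes;
# the geometric means give the same comparison (here a tie) for either reference.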

Page 21: Conclusions

• Performance is specific to a particular program(s)
• CPU time: the only adequate measure of performance
• For a given ISA, performance increases come from:
  – increases in clock rate (without adverse CPI effects)
  – improvements in processor organization that lower CPI
  – compiler enhancements that lower CPI and/or IC

• Your workload: the ideal benchmark
• You should not believe everything you read!

Page 22:

Happy & Safe Holiday Weekend