
Lecture 2a: Performance Measurement

Jan 13, 2016


Elmer

Transcript
Page 1: Lecture 2a: Performance Measurement

Lecture 2a: Performance Measurement

Page 2: Lecture 2a: Performance Measurement

Goals of Performance Analysis

The goal of performance analysis is to provide quantitative information about the performance of a computer system

Page 3: Lecture 2a: Performance Measurement

Goals of Performance Analysis

Compare alternatives

• When purchasing a new computer system, to provide quantitative information

Determine the impact of a feature

• In designing a new system or upgrading, to provide before-and-after comparison

System tuning

• To find the best parameters that produce the best overall performance

Identify relative performance

• To quantify the performance relative to previous generations

Performance debugging

• To identify the performance problems and correct them

Set expectations

• To determine the expected capabilities of the next generation

Page 4: Lecture 2a: Performance Measurement

Performance Evaluation

Performance Evaluation steps:

1. Measurement / Prediction

• What to measure? How to measure?

• Modeling for prediction

  • Simulation

  • Analytical Modeling

2. Analysis & Reporting

• Performance metrics

Page 5: Lecture 2a: Performance Measurement

Performance Measurement

Interval Timers

• Hardware Timers

• Software Timers

Page 6: Lecture 2a: Performance Measurement

Performance Measurement

Hardware Timers

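• Counter value is read from a memory location

• Time is calculated as

Time = (x2 - x1) × Tc

where x1 and x2 are the counter values read at the start and at the end of the interval, and Tc is the clock period.

[Figure: a clock with period Tc drives an n-bit counter, which is connected to the processor memory bus]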
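The calculation above can be sketched in C as follows. This is a minimal illustration, not code from the lecture: the function name `counter_elapsed` and the 32-bit counter width are assumptions. Because the subtraction is done in unsigned arithmetic, the result stays correct even if the counter wraps once between the two reads.

```c
#include <assert.h>
#include <stdint.h>

/* Elapsed time in seconds between two reads (x1, then x2) of a free-running
   32-bit hardware counter that is clocked with period tc_seconds.
   The name and the 32-bit width are illustrative assumptions. */
double counter_elapsed(uint32_t x1, uint32_t x2, double tc_seconds)
{
    assert(tc_seconds > 0.0);
    /* Unsigned subtraction is modulo 2^32, so a single wrap is handled. */
    uint32_t ticks = (uint32_t)(x2 - x1);
    return (double)ticks * tc_seconds;
}
```

For example, reads of 100 and 350 with Tc = 10 ns give 250 ticks, i.e. 2.5 µs.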

Page 7: Lecture 2a: Performance Measurement

Performance Measurement

Software Timers

• Interrupt-based

• When an interrupt occurs, the interrupt-service routine increments the timer value, which is read by a program

• Time is calculated as

Time = (x2 - x1) × T’c

[Figure: a clock with period Tc feeds a prescaling counter, whose output with period T’c drives the processor interrupt input]
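A software timer of this kind might look like the sketch below. It only imitates the mechanism: `timer_isr` stands in for the real interrupt-service routine and is called by hand, and the relation T’c = prescale × Tc is an assumption about how the prescaling counter divides the clock. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Tick count maintained by the interrupt-service routine. */
static volatile uint32_t tick_count = 0;

/* Hypothetical interrupt-service routine: runs once per prescaled clock tick. */
void timer_isr(void)
{
    tick_count++;
}

/* What a program calls to read the software timer. */
uint32_t read_ticks(void)
{
    return tick_count;
}

/* Elapsed time between two reads, assuming the prescaler divides the clock
   by `prescale`, so that the interrupt period is T'c = prescale * Tc. */
double software_elapsed(uint32_t x1, uint32_t x2, double tc, uint32_t prescale)
{
    assert(tc > 0.0 && prescale > 0u);
    double tc_prime = tc * (double)prescale;
    return (double)(uint32_t)(x2 - x1) * tc_prime;
}
```

The resolution of such a timer is T’c, not Tc, which is why a large prescale value trades accuracy for a longer rollover time.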

Page 8: Lecture 2a: Performance Measurement

Performance Measurement

Timer Rollover

Occurs when an n-bit counter undergoes a transition from its maximum value 2^n - 1 to zero

There is a trade-off between rollover time and accuracy

Rollover time by clock period and counter width:

T’c      32-bit       64-bit
10 ns    42 s         5850 years
1 µs     1.2 hours    0.5 million years
1 ms     49 days      0.5 × 10^9 years
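The table entries follow from rollover time = 2^n × T’c. A small helper makes the arithmetic easy to reproduce; the function name is an assumption chosen for illustration.

```c
#include <assert.h>

/* Time in seconds for an n-bit counter with tick period tc_prime (seconds)
   to count from zero through its maximum value and wrap: 2^bits * tc_prime. */
double rollover_seconds(int bits, double tc_prime)
{
    assert(bits > 0 && tc_prime > 0.0);
    double t = tc_prime;
    for (int i = 0; i < bits; i++)
        t *= 2.0;               /* doubling is exact in floating point */
    return t;
}
```

For instance, `rollover_seconds(32, 10e-9)` is about 42.9 seconds, and `rollover_seconds(32, 1e-3) / 86400.0` is about 49.7 days, matching the 32-bit column.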

Page 9: Lecture 2a: Performance Measurement

Timers

Solution:

1. Use a 64-bit integer (over half a million years)

2. Timer returns two values:

• One represents seconds

• One represents microseconds since the last second

With a 32-bit seconds value, the rollover time is over 100 years
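The two-value scheme is the one used by `gettimeofday`, whose `struct timeval` holds `tv_sec` and `tv_usec`. The sketch below uses a stand-in structure, `struct time_stamp`, which is a hypothetical name, to show how two such stamps are turned into an interval.

```c
#include <assert.h>

/* Stand-in for a (seconds, microseconds) time stamp; hypothetical type that
   mirrors the layout idea of struct timeval. */
struct time_stamp {
    long sec;    /* whole seconds */
    long usec;   /* microseconds since the last whole second, 0..999999 */
};

/* Interval in seconds between two stamps. The microsecond difference may be
   negative; adding it to the seconds difference still gives the right sum. */
double stamp_diff(struct time_stamp start, struct time_stamp end)
{
    assert(start.usec >= 0 && start.usec < 1000000);
    assert(end.usec >= 0 && end.usec < 1000000);
    return (double)(end.sec - start.sec)
         + (double)(end.usec - start.usec) * 1e-6;
}
```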

Page 10: Lecture 2a: Performance Measurement

Performance Measurement

Interval Timers

T0 = Read current time
Event being timed ();
T1 = Read current time

Time for the event is: T1-T0
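One possible rendering of this pattern in C uses the POSIX call `clock_gettime` with `CLOCK_MONOTONIC`. The lecture does not prescribe a particular timer; the helper names `now_seconds`, `time_event`, and `busy_work` are assumptions made for this sketch.

```c
#define _POSIX_C_SOURCE 200809L  /* expose clock_gettime under strict -std= modes */
#include <assert.h>
#include <time.h>

/* Current time in seconds from a clock that never goes backwards. */
double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Hypothetical event to be timed: a short loop the compiler must keep. */
void busy_work(void)
{
    volatile long sum = 0;
    for (long i = 0; i < 100000; i++)
        sum += i;
}

/* T0 = read time; run the event; T1 = read time; return T1 - T0. */
double time_event(void (*event)(void))
{
    assert(event);
    double t0 = now_seconds();
    event();
    double t1 = now_seconds();
    return t1 - t0;
}
```

Calling `time_event(busy_work)` returns the elapsed time of one run in seconds, including the overhead of the two timer reads.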

Page 11: Lecture 2a: Performance Measurement

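Performance Measurement

Timer Overhead

Timeline of one measurement:

T1: from initiating read_time until the current time is read

T2: from the current time being read until the event begins

T3: from the event beginning until the event ends

T4: from initiating the second read_time (when the event ends) until the current time is read

Measured time:

Tm = T2 + T3 + T4

Desired measurement:

Te = T3 = Tm - (T2 + T4)

= Tm - (T1 + T2) since T1 = T4

Timer overhead:

Tovhd = T1 + T2

Te should be 100-1000 times greater than Tovhd.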
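The overhead Tovhd = T1 + T2 is the cost of one complete timer call, so it can be estimated by timing many back-to-back calls and dividing. The sketch below assumes the POSIX `clock_gettime` as the timer under test; `read_time`, `timer_overhead`, and `event_long_enough` are illustrative names, and the factor 100 is the lower end of the range given above.

```c
#define _POSIX_C_SOURCE 200809L  /* expose clock_gettime under strict -std= modes */
#include <assert.h>
#include <time.h>

/* The timer whose overhead is being estimated. */
static double read_time(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Average cost in seconds of one timer call, from `reps` consecutive calls. */
double timer_overhead(int reps)
{
    assert(reps > 0);
    double start = read_time();
    for (int i = 0; i < reps; i++)
        (void)read_time();
    double end = read_time();
    return (end - start) / (double)reps;
}

/* Rule of thumb from the slide: the event should last at least
   100 times the timer overhead for the measurement to be trusted. */
int event_long_enough(double te, double tovhd)
{
    return te >= 100.0 * tovhd;
}
```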

Page 12: Lecture 2a: Performance Measurement

Performance Measurement

Timer Resolution

Resolution is the smallest change that can be detected by an interval timer.

nT’c < Te < (n+1)T’c

If T’c is large relative to the event being measured, it may be impossible to measure the duration of the event.

Page 13: Lecture 2a: Performance Measurement

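Performance Measurement

Measuring Short Intervals

Te < T’c

[Figure: an event of duration Te shorter than the clock period T’c. If a clock tick falls within the event, the measured count is 1; otherwise it is 0.]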

Page 14: Lecture 2a: Performance Measurement

Performance Measurement

Measuring Short Intervals

Solution: Repeat measurements n times. The number of 1s measured approximates a binomial distribution.

Average execution time: T’e = (m/n) × T’c

m: number of 1s measured
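The estimate T’e = (m/n) × T’c can be checked with a small simulation. The sketch assumes that each event starts at a uniformly random point within a clock period, so a tick falls inside the event with probability Te/T’c. The function name and the use of `rand()` as the random source are choices made here, not part of the lecture.

```c
#include <assert.h>
#include <stdlib.h>

/* Simulate n measurements of an event of true duration te using a timer
   with period tc (te < tc). Each measurement reads 1 if a clock tick falls
   inside the event and 0 otherwise. Returns the estimate (m/n) * tc. */
double estimate_short_interval(double te, double tc, int n, unsigned seed)
{
    assert(te > 0.0 && te < tc && n > 0);
    int m = 0;                                   /* number of 1s measured */
    srand(seed);
    for (int i = 0; i < n; i++) {
        /* Start phase of the event within the clock period, in [0, tc). */
        double phase = tc * ((double)rand() / ((double)RAND_MAX + 1.0));
        if (phase + te >= tc)                    /* the next tick is crossed */
            m++;
    }
    return ((double)m / (double)n) * tc;
}
```

With a large n the estimate approaches te; the spread of the estimate shrinks roughly as 1/sqrt(n), as expected for a binomial proportion.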

Page 15: Lecture 2a: Performance Measurement

Performance Measurement

Measuring Short Intervals

Solution: Repeat measurements n times. Measure the total execution time (Tt)

Average execution time: T’e = (Tt / n) - h

Tt: total execution time of n repetitions

h: repetition overhead

[Figure: n repetitions of the event Te measured as one total time Tt that spans several clock periods T’c]
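A sketch of this second approach follows. It estimates the repetition overhead h by timing the same loop around an empty function, which is one possible way to obtain h and is an assumption of this example. The timer is the POSIX `clock_gettime`, and all function names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L  /* expose clock_gettime under strict -std= modes */
#include <assert.h>
#include <time.h>

static double wall_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* T'e = (Tt / n) - h */
double average_from_total(double tt, long n, double h)
{
    assert(n > 0);
    return tt / (double)n - h;
}

/* Total time Tt of n repetitions of `event`. */
static double loop_total(void (*event)(void), long n)
{
    double start = wall_seconds();
    for (volatile long i = 0; i < n; i++)   /* volatile keeps the loop intact */
        event();
    return wall_seconds() - start;
}

static void no_op(void) { }

/* Hypothetical short event used for illustration. */
void sample_event(void)
{
    volatile int x = 0;
    for (int k = 0; k < 10; k++)
        x += k;
}

/* Measure Tt for the event, estimate h from an empty loop, and combine. */
double measure_average(void (*event)(void), long n)
{
    double h = loop_total(no_op, n) / (double)n;
    double tt = loop_total(event, n);
    return average_from_total(tt, n, h);
}
```

Because h is itself measured, the result for a very short event can be noisy and may even come out slightly negative; repeating the measurement and increasing n reduces this.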

Page 16: Lecture 2a: Performance Measurement

Performance Measurement

Time

• Elapsed time / wall-clock time / response time

  • Latency to complete a task, including disk access, memory access, I/O, operating system overhead, and everything (includes time consumed by other programs in a time-sharing system)

• CPU time

  • The time CPU is computing, not including I/O time or waiting time

  • User time / user CPU time

    • CPU time spent in the program

  • System time / system CPU time

    • CPU time spent in the operating system performing tasks requested by the program
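On POSIX systems the three quantities can be read from inside a program: `gettimeofday` gives wall-clock time and `getrusage` gives user and system CPU time for the calling process. The sketch below shows one way to collect them; the `struct time_report` type and the function names are assumptions made for illustration.

```c
#define _XOPEN_SOURCE 700        /* expose gettimeofday/getrusage under strict -std= modes */
#include <stddef.h>
#include <sys/resource.h>
#include <sys/time.h>

/* Snapshot of the three times, all in seconds. Hypothetical type. */
struct time_report {
    double elapsed;   /* wall-clock time since the epoch */
    double user;      /* CPU time spent in the program itself */
    double sys;       /* CPU time spent in the OS on the program's behalf */
};

static double tv_seconds(struct timeval tv)
{
    return (double)tv.tv_sec + (double)tv.tv_usec * 1e-6;
}

struct time_report read_times(void)
{
    struct timeval now;
    struct rusage ru;
    struct time_report r;

    gettimeofday(&now, NULL);
    getrusage(RUSAGE_SELF, &ru);
    r.elapsed = tv_seconds(now);
    r.user = tv_seconds(ru.ru_utime);
    r.sys = tv_seconds(ru.ru_stime);
    return r;
}
```

Taking one snapshot before and one after a code section and subtracting field by field gives the elapsed, user, and system time of that section.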

Page 17: Lecture 2a: Performance Measurement

Performance Measurement

UNIX time command

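90.7u 12.9s 2:39 65%

• 90.7u: User time (seconds)

• 12.9s: System time (seconds)

• 2:39: Elapsed time (minutes:seconds)

• 65%: CPU time (user + system) as a percentage of elapsed time

Drawbacks:

• Resolution is in milliseconds

• Different sections of the code cannot be timed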
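The percentage in the sample output is the CPU time (user plus system) divided by the elapsed time. The arithmetic can be reproduced with a one-line helper; the function name is an assumption.

```c
#include <assert.h>

/* CPU utilization as reported by the time command:
   100 * (user + system) / elapsed, all in seconds. */
double cpu_percent(double user_s, double system_s, double elapsed_s)
{
    assert(elapsed_s > 0.0);
    return 100.0 * (user_s + system_s) / elapsed_s;
}
```

For the output above, the elapsed time 2:39 is 159 seconds, so the value is 100 × (90.7 + 12.9) / 159, which is about 65.2 and is shown as 65%.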

Page 18: Lecture 2a: Performance Measurement

Timers

Timer is a function, subroutine or program that can be used to return the amount of time spent in a section of code.

t0 = timer();
…
< code segment >
…
t1 = timer();
time = t1 - t0;

zero = 0.0;
t0 = timer(&zero);
…
< code segment >
…
t1 = timer(&t0);
time = t1;
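The second calling style, where the timer subtracts a caller-supplied start value, could be implemented as in the sketch below. This is one possible implementation built on the POSIX `gettimeofday`; the lecture does not say which underlying clock `timer` uses, so that choice is an assumption.

```c
#define _XOPEN_SOURCE 700        /* expose gettimeofday under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Returns the current wall-clock time in seconds minus *t0.
   Passing a pointer to 0.0 returns the current time itself;
   passing a pointer to an earlier result returns the elapsed time. */
double timer(const double *t0)
{
    struct timeval tv;
    assert(t0 != NULL);
    gettimeofday(&tv, NULL);
    return ((double)tv.tv_sec + (double)tv.tv_usec * 1e-6) - *t0;
}
```

Storing seconds since the epoch in a double leaves a granularity of a few tenths of a microsecond, which is adequate for a microsecond clock but would not be for a nanosecond one.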

Page 19: Lecture 2a: Performance Measurement

Timers

Read:

Wadleigh, Crawford pg 130-136 for:

time, clock, gettimeofday, etc.

Page 20: Lecture 2a: Performance Measurement

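Timers

Measuring Timer Resolution

#include <stdio.h>

double timer(double *t0);   /* returns the current time minus *t0 */
int foo(int n);

int main(void)
{
    double zero, t0, t1 = 0.0;
    int j = 0;

    /* Increase the amount of work until the timer reports a nonzero time */
    while (t1 == 0.0) {
        j++;
        zero = 0.0;
        t0 = timer(&zero);
        foo(j);
        t1 = timer(&t0);
    }
    printf("It took %d iterations for a nonzero time\n", j);
    if (j == 1)
        printf("timer resolution <= %13.7f seconds\n", t1);
    else
        printf("timer resolution is %13.7f seconds\n", t1);
    return 0;
}

/* Work proportional to n; volatile keeps the loop from being optimized away */
int foo(int n)
{
    volatile int i = 0;
    int k;
    for (k = 0; k < n; k++)
        i++;
    return i;
}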

Page 21: Lecture 2a: Performance Measurement

Timers

Measuring Timer Resolution

Using clock():

It took 682 iterations for a nonzero time
timer resolution is 0.0200000 seconds

Using times():

It took 720 iterations for a nonzero time
timer resolution is 0.0200000 seconds

Using getrusage():

It took 7374 iterations for a nonzero time
timer resolution is 0.0002700 seconds

Page 22: Lecture 2a: Performance Measurement

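Timers

Spin Loops

For codes that take less time to run than the resolution of the timer, run the code repeatedly in a loop and divide the total time by the number of repetitions.

The first call to a function may require an inordinate amount of time. Therefore the minimum of all times may be desired.

#include <stdio.h>

#define NTRIALS 5            /* number of timed trials; assumed value */

double timer(double *t0);    /* returns the current time minus *t0 */
void foo(int n);

int main(void)
{
    int n = 1000;            /* spin-loop length; assumed value */
    double zero = 0.0, t0, t1, t2 = 100000.0;
    int j;

    for (j = 0; j < NTRIALS; j++) {
        t0 = timer(&zero);
        foo(n);              /* assumed to run the code segment n times */
        t1 = timer(&t0);
        if (t1 < t2)
            t2 = t1;         /* keep the fastest trial */
    }
    t2 = t2 / n;             /* time per execution of the code segment */
    printf("Minimum time is %13.7f seconds\n", t2);
    return 0;
}

void foo(int n)
{
    int k;
    for (k = 0; k < n; k++) {
        /* < code segment > */
    }
}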

Page 23: Lecture 2a: Performance Measurement

Profilers

A profiler automatically inserts timing calls into applications to generate a profile of where the execution time is spent.

It is used to identify the portions of the program that consume the largest fraction of the total execution time.

It may also be used to find system-level bottlenecks in a multitasking system.

Profilers may alter the timing of a program’s execution.

Page 24: Lecture 2a: Performance Measurement

Profilers

Data collection techniques

• Sampling-based

  • This type of profiler uses a predefined clock; at every multiple of the clock tick, the program is interrupted and the state information is recorded.

  • They give the statistical profile of the program behavior.

  • They may miss some important events.

• Event-based

  • Events are defined (e.g. entry into a subroutine) and data about these events are collected.

  • The collected information shows the exact execution frequencies.

  • They have a substantial run-time overhead and memory requirement.

Information kept

• Trace-based: The profiler keeps all the information it collects.

• Reductionist: Only statistical information is collected.
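To make the event-based, reductionist combination concrete, the sketch below instruments subroutine entry by hand and keeps only a count per routine. Real profilers insert such hooks automatically; the macro `PROFILE_ENTRY`, the routine names, and the counter table are all hypothetical.

```c
#include <assert.h>

/* One slot per instrumented routine. */
enum { ROUTINE_A, ROUTINE_B, NUM_ROUTINES };

/* Reductionist storage: only a call count per routine, no event trace. */
static unsigned long call_count[NUM_ROUTINES];

/* Event hook placed at the entry of each routine. */
#define PROFILE_ENTRY(id) (call_count[(id)]++)

static void routine_b(void)
{
    PROFILE_ENTRY(ROUTINE_B);
}

static void routine_a(int n)
{
    PROFILE_ENTRY(ROUTINE_A);
    for (int i = 0; i < n; i++)
        routine_b();
}

/* Exact execution frequency of a routine so far. */
unsigned long calls_to(int id)
{
    assert(id >= 0 && id < NUM_ROUTINES);
    return call_count[id];
}
```

A trace-based variant would instead append a record (routine id, time stamp) to a buffer on every entry, which preserves ordering at the cost of far more memory.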

Page 25: Lecture 2a: Performance Measurement

Performance Evaluation

Performance Evaluation steps:

1. Measurement / Prediction

• What to measure? How to measure?

• Modeling for prediction

  • Simulation

  • Analytical Modeling

    • Queuing Theory

2. Analysis & Reporting

• Performance metrics

Page 26: Lecture 2a: Performance Measurement

Predicting Performance

Performance of simple kernels can be predicted to a high degree of accuracy

Theoretical performance and peak performance must be close

It is preferred that the measured performance is over 80% of the theoretical peak performance

Page 27: Lecture 2a: Performance Measurement

Homework 1

Write a C program to measure the execution time (elapsed time) of an addition operation (i.e. a=b+c). Run your program on both Windows and Linux systems. Use a timer that has at least µs resolution.

Prepare a one-page report and explain the following:

• Your method to measure time

• Your code

• Specifications of the system that you run your code on (processor, clock speed, etc.)

• Your measurement results

• Comments on your results