
Apdex Implementation at AOL

CMG International Conference

San Diego, California

December 5, 2007

Eric Goldsmith
Operations Architect
eric.goldsmith@corp.aol.com

Session 45A

Slide 2

Our Environment

Operations organization

Measuring Web site performance from a customer-centric view
– Full page load, measured from outside the datacenter
– Multiple geographic locations

Goals
– Short-term: identify product issues/outages
– Long-term: achieve uniform geographic performance, in parity with competitors

Slide 3

Current Metrics & Shortcomings

Response Time & Availability Often don’t tell whole user-experience story

Reported as averages Hides variance, and is skewed by outliers

Reported in absolute numbers No context of a target (goal) value

Slide 4

Goals of Apdex use

Inclusive view of performance, availability, and data distribution

“Building in” of a target, and data normalization around it

Performance is evaluated qualitatively against a target

Slide 5

Data Source and Collection

Using commercial 3rd-party tool to gather measurements from multiple geographic locations

Data of interest for our Apdex calculations (see the record sketch below):

1. Date/Time

2. Measurement Value

3. Success/Error (Error = Frustrated)

4. Test Location

Data collection is batched (daily)
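
As a concrete illustration of those four fields, here is a minimal record type in Python (the names are ours for illustration, not AOL's actual schema):

    from dataclasses import dataclass
    from datetime import datetime

    @dataclass
    class Measurement:
        # One synthetic test result, carrying the four Apdex inputs.
        timestamp: datetime   # 1. Date/Time
        seconds: float        # 2. Measurement value (full page load time)
        error: bool           # 3. Success/Error (error rows score as Frustrated)
        location: str         # 4. Test location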

Slide 6

Calculation and Graphing in Excel

Calculate a sub-score for each row (data point):

    if (error)                 score = 0
    else if (measurement <= T) score = 1
    else if (measurement <= F) score = 0.5
    else                       score = 0

Define the interval over which to calculate the Apdex score:
– Hourly, daily, weekly, etc.
– Segregate by location, if desired
– Apdex spec recommends >100 data points per interval

Then calculate the overall Apdex score for the interval:

    Apdex = sum(sub-scores) / count(measurements)

Get fancy with DSUM() and DCOUNT()
– Database lookup functions simplify segregation by date, location, etc.
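
The deck does all of this in Excel; for readers who prefer code, here is a minimal Python sketch of the same calculation (function and field names are illustrative, not AOL's implementation; F defaults to 4 × T per the Apdex spec):

    from collections import defaultdict

    def sub_score(error, seconds, T, F):
        # Per-row sub-score, exactly as on this slide.
        if error:
            return 0.0            # Error = Frustrated
        if seconds <= T:
            return 1.0            # Satisfied
        if seconds <= F:
            return 0.5            # Tolerating
        return 0.0                # Frustrated

    def apdex(rows, T, F=None):
        # Overall score for an interval: sum(sub-scores) / count(measurements).
        F = 4 * T if F is None else F
        scores = [sub_score(e, s, T, F) for (e, s) in rows]
        return sum(scores) / len(scores)

    def apdex_by_location(rows, T):
        # The DSUM()/DCOUNT() trick in code: segregate rows by location.
        groups = defaultdict(list)
        for (location, error, seconds) in rows:
            groups[location].append((error, seconds))
        return {loc: apdex(g, T) for loc, g in groups.items()}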

Slide 7

Target ‘T’ Determination

We chose our targets based on competitor performance
– For a given Web site, identify its target competitor (may be self)

The ‘T’ marker method we chose initially was based on “Best Time Multiple”
– “Measure average response time from a ‘good’ location, then add 50% to build in tolerance for other locations”

Instead, we averaged data from all locations
– Our thinking was that the 50% inflation wasn’t necessary because of the natural diversity of the data from multiple geographic locations
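
A small sketch of the two approaches, in Python (hypothetical helper names; the 1.5 multiplier encodes the 50% tolerance of the Best Time Multiple method):

    def t_best_time_multiple(good_location_seconds):
        # Best Time Multiple: average response time from one 'good'
        # location, plus 50% tolerance for other locations.
        avg = sum(good_location_seconds) / len(good_location_seconds)
        return avg * 1.5

    def t_all_locations(seconds_by_location):
        # Our variant: average across all locations, with no inflation;
        # geographic diversity is already built into the data.
        all_times = [s for times in seconds_by_location.values() for s in times]
        return sum(all_times) / len(all_times)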

Slide 8

Example Results Presentation

Performance - National

[Chart: daily Apdex scores (0.00-1.00) for sites A, B, and C with T = 1.1, 1-31 Aug 2007; banded Unacceptable / Poor / Fair / Good / Excellent.]

Slide 9

Example Results Presentation cont’d

Performance - Regional

[Chart: daily Apdex scores for A-East, B-East, C-East, A-West, B-West, and C-West with T = 1.1, 1-31 Aug 2007; banded Unacceptable / Poor / Fair / Good / Excellent.]

Slide 10

Problems with our initial T

Initial results were promising… but as we examined the data over time, the Apdex results didn’t always correlate well with observations

Performance - West

[Chart: daily Apdex scores for A-West, B-West, and C-West with T = 1.1, 1-31 Aug 2007; banded Unacceptable / Poor / Fair / Good / Excellent.]

Target competitor never achieves Excellent level

Significant performance change not reflected (see next slide)

Slide 11

West Coast Page Load Time

[Chart: page load time (seconds, 0.0-5.0), 12-17 Aug 2007, showing Before and After series with the T and F thresholds marked.]

Example of Initial T Problem

• 44% reduction in average load time

• But Apdex score didn’t change
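
Why the score was insensitive: with T = 1.1 s and therefore F = 4T = 4.4 s, a load time that drops from roughly 4.0 s to 2.24 s (a ~44% reduction; these specific values are illustrative, not the measured data) never leaves the Tolerating band, so every sub-score stays at 0.5. A minimal Python check:

    T, F = 1.1, 4.4             # F = 4 * T per the Apdex spec
    before, after = 4.0, 2.24   # hypothetical daily averages, ~44% apart

    def sub_score(seconds):
        return 1.0 if seconds <= T else (0.5 if seconds <= F else 0.0)

    assert sub_score(before) == sub_score(after) == 0.5
    # Every sample stays Tolerating, so the interval Apdex cannot move.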

Slide 12

Plan B

We experimented with various T determination techniques, and eventually settled on the “Empirical Data” method:
– “Find T that results in the proper Apdex for a well studied group”

In our environment…
– For a given Web site, identify its target competitor (may be self)
– The performance of this competitor is defined as “Excellent”
– Determine the smallest T such that the competitor’s Apdex score remains Excellent for a period of time (at least 1 month)
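
One way to express that search, as a Python sketch (0.94 is the standard lower bound of the Apdex “Excellent” band; the daily grouping and the 0.1 s step size are our assumptions, not from the deck):

    EXCELLENT = 0.94  # lower bound of the Apdex "Excellent" rating

    def apdex(measurements, T):
        # measurements: list of (error, seconds) tuples; F = 4 * T.
        def sub(error, seconds):
            if error or seconds > 4 * T:
                return 0.0
            return 1.0 if seconds <= T else 0.5
        return sum(sub(e, s) for e, s in measurements) / len(measurements)

    def smallest_excellent_T(daily_measurements, start=0.1, step=0.1, max_T=30.0):
        # Smallest T (seconds) keeping the competitor's score Excellent
        # on every day of the study period (at least one month of days).
        T = start
        while T <= max_T:
            if all(apdex(day, T) >= EXCELLENT for day in daily_measurements):
                return T
            T += step
        return None  # no T in range keeps the competitor Excellent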

Slide 13

Performance - West

[Chart: daily Apdex scores for A-West, B-West, and C-West with the new T = 1.6, 1-31 Aug 2007; banded Unacceptable / Poor / Fair / Good / Excellent. The new T is annotated on the chart.]

With the new T, the Apdex results correlate better with observations

Target competitor now achieves Excellent level

Performance change now reflected

Slide 14

Changing T

Define a technique for reevaluating T on an ongoing basis
– But don’t want to change T too often

Suggestions for reevaluating T:
– Quarterly, looking at the prior 3 months of data
– When a significant product change occurs
– When requested (by the business)

Slide 15

Example - T Change

Performance - National

[Chart: daily Apdex scores for sites A, B, and C under both the old T = 1.1 and the new T = 1.6, 1-31 Aug 2007; banded Unacceptable / Poor / Fair / Good / Excellent.]

Slide 16

Apdex vs. Other Metrics

Performance - National

[Chart: daily Apdex scores for sites A, B, and C with T = 1.6, 1-30 Sep 2007; banded Unacceptable / Poor / Fair / Good / Excellent.]

Slide 17

Apdex vs. Performance & Availability

[Charts: three aligned panels for 1-30 Sep 2007: Availability (percent, 95.0-100.0); Apdex score for A, B, and C with T = 1.6, banded Unacceptable / Poor / Fair / Good / Excellent; and Performance (seconds, 0.0-5.0) for A, B, and C.]

Deep Dive 1

Virtually no change in Apdex for B, despite large change in performance and availability.

Deep Dive 2

Apdex shows B performing better than A. Perf/Avail charts show opposite.

Slide 18

Performance - National

[Chart: daily performance for site B with its average (B-Avg) and the T and F thresholds marked, 12-15 Sep 2007; Performance (seconds), 0.0-7.0.]

Deep Dive 1

[Chart: the same period on a 0-60 second Performance scale.]

Per-day breakdown for B (S/T/F = Satisfied/Tolerating/Frustrated sample counts; parenthesized values are each category’s contribution to the Apdex score; A = resulting Apdex):

    Day 1:  S 377 (0.53)   T 314 (0.22)   F 22   A 0.75
    Day 2:  S 422 (0.60)   T 195 (0.14)   F 88   A 0.74
    Day 3:  S 419 (0.59)   T 219 (0.15)   F 73   A 0.74

Virtually no change in Apdex for B, despite large change in performance and availability.
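
The Apdex values above follow directly from the counts; a quick check in Python:

    def apdex_from_counts(satisfied, tolerating, frustrated):
        # Satisfied counts fully, Tolerating counts half, Frustrated zero.
        n = satisfied + tolerating + frustrated
        return (satisfied + tolerating / 2) / n

    for s, t, f in [(377, 314, 22), (422, 195, 88), (419, 219, 73)]:
        print(round(apdex_from_counts(s, t, f), 2))   # 0.75, 0.74, 0.74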

Slide 19

Performance - National

[Chart: daily performance for sites A and B with their averages (A-Avg, B-Avg) and the T and F thresholds marked, 19-23 Sep 2007; Performance (seconds), 0.0-7.0.]

Deep Dive 2

[Chart: the same period on a 0-60 second Performance scale.]

Per-day breakdown for A and B (same notation as Deep Dive 1):

    Day 1:  A: S 807 (0.59)   T 552 (0.20)    F 15    Apdex 0.79
            B: S 390 (0.55)   T 215 (0.15)    F 106   Apdex 0.70
    Day 2:  A: S 341 (0.24)   T 1074 (0.38)   F 7     Apdex 0.62
            B: S 436 (0.61)   T 213 (0.15)    F 63    Apdex 0.76
    Day 3:  A: S 506 (0.36)   T 919 (0.32)    F 5     Apdex 0.68
            B: S 553 (0.79)   T 146 (0.10)    F 5     Apdex 0.89
    Day 4:  A: S 942 (0.66)   T 469 (0.17)    F 4     Apdex 0.83
            B: S 607 (0.84)   T 111 (0.08)    F 3     Apdex 0.92

Apdex shows B performing better than A. Perf/Avail charts show opposite.

Slide 20

Closing Thoughts

We’re still exploring the application of Apdex in an Operations organization
– Can Apdex be used to identify the day-to-day “issues” traditionally identified through analysis of performance and availability metrics?
– Or is it better suited as a method of performance representation for the business side of the house?

Interesting to calculate: what would it take for a product to achieve the next “band” of performance? (A sketch follows below.)
– What performance level do I need to move from Poor to Fair?
– Helps in establishing interim targets
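
A sketch of that calculation (Python; the band boundaries are the standard Apdex rating thresholds, and the simplifying assumption that improvements turn Tolerating samples into Satisfied ones is ours, not the deck's):

    import math

    BANDS = [(0.94, "Excellent"), (0.85, "Good"), (0.70, "Fair"),
             (0.50, "Poor"), (0.00, "Unacceptable")]

    def rating(score):
        # First band whose lower bound the score meets.
        return next(name for floor, name in BANDS if score >= floor)

    def conversions_to_next_band(satisfied, tolerating, frustrated):
        # Tolerating samples that must become Satisfied to reach the
        # next band; each conversion adds 0.5 / n to the score.
        n = satisfied + tolerating + frustrated
        score = (satisfied + tolerating / 2) / n
        floors = sorted(f for f, _ in BANDS if f > score)
        if not floors:
            return 0  # already Excellent
        needed = math.ceil((floors[0] * n - satisfied - tolerating / 2) / 0.5)
        return min(needed, tolerating)  # past this cap, Frustrated must improve too

    # e.g. a Deep Dive 1 day: S=377, T=314, F=22 gives Apdex 0.75 ("Fair");
    # 145 Tolerating samples must become Satisfied to reach "Good" (0.85).
    print(rating(0.75), conversions_to_next_band(377, 314, 22))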

Thank You

Questions?
