Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

Nov 22, 2014

Craig Sullivan

An updated deck of a short talk (30m) given at the first Brighton CRO meetup. Contains useful AB testing tools as well as full speaker notes for most of the slides.
Transcript
Page 1: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

Oh Boy!

@OptimiseOrDie

These A/B tests appear to be bullshit!

Page 2: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

@OptimiseOrDie

• UX, Analytics, Testing and Innovation
• Started doing testing & CRO 2004
• Split tested over 40M visitors in 19 languages
• 60+ mistakes with AB testing
• I’ve made every one of them
• Like riding a bike…
• Testing, Workshops, Mentoring, CRO Methodology, Growth

Page 3: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me
Page 4: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

Oh Boy!

@OptimiseOrDie

These A/B tests appear to be bullshit!

Page 5: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

Top F***ups for 2014
1. Testing in the wrong place
2. Your hypothesis inputs are crap
3. No analytics integration
4. Your test will finish after you die
5. You don’t test for long enough
6. You peek before it’s ready
7. No QA for your split test
8. Opportunities are not prioritised
9. Testing cycles are too slow
10. You don’t know when tests are ready
11. Your test fails
12. The test is ‘about the same’
13. Test flips behaviour
14. Test keeps moving around
15. You run an A/A test and waste time
16. Nobody ‘feels’ the test
17. You forgot you were responsive
18. You forgot you had no traffic
19. You ran the wrong test type
20. You didn’t try all the flavours of testing

@OptimiseOrDie

slidesha.re/1wBbZ9c

Page 6: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

#fail

@OptimiseOrDie

Page 7: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

@OptimiseOrDie

Page 8: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

@OptimiseOrDie

Page 9: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

Oppan Gangnam Style!

@OptimiseOrDie

Page 10: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

@OptimiseOrDie

Page 11: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

@OptimiseOrDie

Page 12: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

@OptimiseOrDie

Page 13: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

The 95% Stopping Problem
• Many people use 95, 99% ‘confidence’ to stop
• This value is unreliable
• Read this Nature article : bit.ly/1dwk0if
• You can hit 95% early in a test
• If you stop, it could be a false positive
• Tools need to be smarter about inference
• This 95% thingy – it’s last on your list for reasons to stop testing
• Let me explain
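
To see why hitting 95% early can be a false positive, here is a minimal simulation sketch. It is not from the talk: the 5% conversion rate, 5,000 visitors per arm, the peek every 250 visitors and the pooled two-proportion z-test are all assumptions chosen for illustration. It runs A/A tests (both arms identical) and compares how often "significance" shows up at any peek versus only at the planned end.

```python
import math
import random


def z_test_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (conv_a / n_a - conv_b / n_b) / se
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_aa_test(rng, rate=0.05, visitors_per_arm=5000, peek_every=250):
    """Run one A/A test. Returns (hit 95% at any peek, hit 95% at the end).

    All parameter values are illustrative assumptions.
    """
    conv_a = conv_b = 0
    hit_at_any_peek = False
    p = 1.0
    for n in range(1, visitors_per_arm + 1):
        conv_a += rng.random() < rate  # both arms share the same true rate
        conv_b += rng.random() < rate
        if n % peek_every == 0:
            p = z_test_p_value(conv_a, n, conv_b, n)
            if p < 0.05:
                hit_at_any_peek = True
    # The last peek lands on the final visitor, so p is the end-of-test value.
    return hit_at_any_peek, p < 0.05


def false_positive_rates(runs=400, seed=1):
    """Share of A/A tests declared 'winners' with peeking vs a fixed horizon."""
    rng = random.Random(seed)
    results = [simulate_aa_test(rng) for _ in range(runs)]
    peeking = sum(early for early, _ in results) / runs
    fixed = sum(final for _, final in results) / runs
    return peeking, fixed


if __name__ == "__main__":
    peeking, fixed = false_positive_rates()
    print(f"stopped at first 95% peek: {peeking:.1%} false winners")
    print(f"analysed only at the end:  {fixed:.1%} false winners")
```

The peeking figure can never be lower than the fixed-horizon figure, because every test that is significant at the end also counts as a peek. The more often you look, the wider the gap tends to get.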

@OptimiseOrDie

Page 14: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

@OptimiseOrDie

The 95% Stopping Problem

Page 15: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

The 95% Stopping Problem

@OptimiseOrDie

abtestguide.com/calc/

Page 16: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

The 95% Stopping Problem

@OptimiseOrDie

Page 17: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

“You should know that stopping a test once it’s significant is deadly sin number 1 in A/B testing land. 77% of A/A tests (testing the same thing as A and B) will reach significance at a certain point.”
Ton Wesseling, Online Dialogue

“I always tell people that you need a representative sample if your data needs to be valid. What does ‘representative’ mean? First of all you need to include all the weekdays and weekends. You need different weather, because it impacts buyer behavior. But most important: Your traffic needs to have all traffic sources, especially newsletter, special campaigns, TV,… everything! The longer the test runs, the more insights you get.”
Andre Morys, Web Arts

The 95% Stopping Problem

Page 18: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

Three Articles you MUST read

“Statistical Significance does not equal Validity”
http://bit.ly/1wMfmY2

“Why every Internet Marketer should be a Statistician”
http://bit.ly/1wMfs1G

“Understanding the Cycles in your site”
http://mklnd.com/1pGSOUP

Page 19: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

Business & Purchase Cycles

@OptimiseOrDie

• Customers change
• Your traffic mix changes
• Markets, competitors
• Be aware of all the waves
• Always test whole cycles
• Minimum 2 cycles (wk/mo)
• Don’t exclude slower buyers

[Diagram labels: Start Test, Finish, Avg Cycle]

Page 20: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me


Page 21: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

When to stop?
• MINIMUM two business cycles (week/mo)
• MINIMUM of 1 purchase cycle
• MINIMUM 250 outcomes/conversions per creative
• MORE if relative difference is low
• ALWAYS test full weeks
• KNOW what marketing and cycles are doing
• RUN a test length calculator - bit.ly/XqCxuu
• SET your test run time
• Run it
• Stop it
• Analyse the data
• Sometimes I run longer but beware!
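
The minimums above can be turned into a run time that is fixed before the test starts. The following is a rough sketch only, not the linked calculator: the function name, its parameters and the 7-day defaults are assumptions. It takes the longest of the three floors (conversion volume, business cycles, purchase cycle) and rounds up to full weeks.

```python
import math


def planned_test_days(daily_conversions_per_variant,
                      business_cycle_days=7,
                      purchase_cycle_days=7,
                      min_conversions=250,
                      min_business_cycles=2):
    """Return a test run time in days, decided before the test starts.

    Hypothetical helper: names and default values are illustrative.
    """
    if daily_conversions_per_variant <= 0:
        raise ValueError("need a positive daily conversion volume")

    # Days needed to collect the minimum outcomes in each creative.
    days_for_volume = math.ceil(min_conversions / daily_conversions_per_variant)

    # The run time must clear every minimum, so take the largest floor.
    floor_days = max(days_for_volume,
                     min_business_cycles * business_cycle_days,
                     purchase_cycle_days)

    # Round up to full weeks so every weekday is covered equally.
    return math.ceil(floor_days / 7) * 7


if __name__ == "__main__":
    print(planned_test_days(10))                          # 28
    print(planned_test_days(100))                         # 14
    print(planned_test_days(50, purchase_cycle_days=30))  # 35
```

A site with 10 conversions a day per variant needs 25 days for volume alone, which rounds up to four full weeks. A busy site with 100 a day still runs for two weeks, because the business cycle minimum wins. The sketch does not cover the "MORE if relative difference is low" rule, which needs a proper sample size calculation.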

@OptimiseOrDie

Page 22: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

No QA testing for the AB test?

Page 23: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

QA Test or Die!
• Over 40% of tests have had QA issues.
• It’s very easy to break or bias the testing

Browser testing
www.crossbrowsertesting.com
www.browserstack.com
www.spoon.net
www.saucelabs.com
www.multibrowserviewer.com

Mobile devices
www.appthwack.com
www.deviceanywhere.com
www.opendevicelab.com

Article
bit.ly/1wBccsJ

@OptimiseOrDie

Page 24: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

Gamble the Company AWAY!

• I get 60-70% right
• UX and Copywriters good at picking!
• C level execs are easy marks
• Ironically, many decide ‘designs’
• You need collaborative test design
• It’s a team game, with customers
• Flip a coin, anyone?

Page 25: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

WE’RE ALL WINGING IT

Page 26: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

2004 Headspace

What I thought I knew in 2004

Reality

Page 27: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

2014 Headspace

What I KNOW I know

Me, on a good day

Page 28: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

Guessaholics Anonymous

Page 29: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

Rumsfeldian Space

@OptimiseOrDie

Page 30: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

Rumsfeldian Space

@OptimiseOrDie

Page 31: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

23 : Business Future Testing?

“Congratulations! Today you’re the lucky winner of our random awards programme. You get all these extra features for free, on us. Enjoy.”

Page 32: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

The 5 Legged Optimisation Barstool

@OptimiseOrDie

#1 Smart Talented Polymath People

Flexible and Agile teams

Page 33: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

@OptimiseOrDie

Fittest? Agile!

Page 34: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

@OptimiseOrDie

#2 : Analytics Investment (tools, people, dev time)

Page 35: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

@OptimiseOrDie

#3 : User research and insight

Page 36: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

@OptimiseOrDie

#3 : THE BEST IDEAS COME FROM?

Page 37: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

#4 : GREAT COPYWRITING

“On the average, five times as many people read the headline as read the body copy. When you have written your headline, you have spent eighty cents out of your dollar.”
David Ogilvy

“In 9 years and 40M split tests with visitors, the majority of my testing success came from playing with the words.”
@OptimiseOrDie

Page 38: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

#5 : Split Testing Tools

• Google Content Experiments - bit.ly/Ljg7Ds
• Optimizely - www.optimizely.com
• Visual Website Optimizer - www.visualwebsiteoptimizer.com
• Multi Armed Bandit Explanation - bit.ly/Xa80O8
• New Machine Learning Tools - www.conductrics.com, www.rekko.com

@OptimiseOrDie

Page 39: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

@OptimiseOrDie

The 5 Legged Optimisation Barstool

#1 Culture & Team
#2 Toolkit & Analytics investment
#3 UX, CX, Service Design, Insight
#4 Persuasive Copywriting
#5 Experimentation (testing) tools

Page 40: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

READ STUFF

Page 41: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

READ STUFF

Page 42: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

READ STUFF

Page 43: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

#5 : FIND STUFF

@OptimiseOrDie

@danbarker Analytics
@fastbloke Analytics
@timlb Analytics
@jamesgurd Analytics
@therustybear Analytics
@carmenmardiros Analytics
@davechaffey Analytics
@priteshpatel9 Analytics
@cutroni Analytics
@avinash Analytics
@Aschottmuller Analytics, CRO
@cartmetrix Analytics, CRO
@Kissmetrics CRO / UX
@Unbounce CRO / UX
@Morys CRO / Neuro
@UXFeeds UX / Neuro
@Psyblog Neuro
@Gfiorelli1 SEO / Analytics
@PeepLaja CRO
@TheGrok CRO
@UIE UX
@LukeW UX / Forms
@cjforms UX / Forms
@axbom UX
@iatv UX
@Chudders Photo UX
@JeffreyGroks Innovation
@StephanieRieger Innovation
@BrianSolis Innovation
@DrEscotet Neuro
@TheBrainLady Neuro
@RogerDooley Neuro
@Cugelman Neuro
@Smashingmag Dev / UX
@uxmag UX
@Webtrends UX / CRO

Page 44: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

#5 : LEARN STUFF

@OptimiseOrDie

Baymard.com
Lukew.com
Smashingmagazine.com
ConversionXL.com
Medium.com
Whichtestwon.com
Unbounce.com
Measuringusability.com
RogerDooley.com
Kissmetrics.com
Uxmatters.com
Smartinsights.com
Econsultancy.com
Cutroni.com
www.GetMentalNotes.com

Page 45: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

#12 : The Best Companies…
• Invest continually in analytics instrumentation, tools, people
• Use an Agile, iterative, cross-silo, one team project culture
• Prefer collaborative tools to having lots of meetings
• Prioritise development based on numbers and insight
• Practice real continuous product improvement, not SLEDD*
• Are fixing bugs, cruft, bad stuff as well as optimising
• Source photos and content that support persuasion and utility
• Have cross channel, cross device design, testing and QA
• Segment their data for valuable insights, every test or change
• Continually reduce cycle (iteration) time in their process
• Blend ‘long’ design, continuous improvement AND split tests
• Make optimisation the engine of change, not the slave of ego

* Single Large Expensive Doomed Developments

Page 46: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

THE FUTURE OF TESTING

Page 47: Brighton CRO Meetup #1 - Oh Boy These AB tests Sure Look Like Bullshit to Me

Thank You!

@OptimiseOrDie

Mail : [email protected]
slideshare.com/sullivac
Linkedin : linkd.in/pvrg14