Oh Boy! @OptimiseOrDie These A/B tests appear to be bullshit!
Nov 22, 2014
@OptimiseOrDie
• UX, Analytics, Testing and Innovation
• Started doing testing & CRO 2004
• Split tested over 40M visitors in 19 languages
• 60+ mistakes with AB testing
• I’ve made every one of them
• Like riding a bike…
• Testing, Workshops, Mentoring, CRO Methodology, Growth
Top F***ups for 2014
1. Testing in the wrong place
2. Your hypothesis inputs are crap
3. No analytics integration
4. Your test will finish after you die
5. You don’t test for long enough
6. You peek before it’s ready
7. No QA for your split test
8. Opportunities are not prioritised
9. Testing cycles are too slow
10. You don’t know when tests are ready
11. Your test fails
12. The test is ‘about the same’
13. Test flips behaviour
14. Test keeps moving around
15. You run an A/A test and waste time
16. Nobody ‘feels’ the test
17. You forgot you were responsive
18. You forgot you had no traffic
19. You ran the wrong test type
20. You didn’t try all the flavours of testing
slidesha.re/1wBbZ9c
#fail
Oppan Gangnam Style!
The 95% Stopping Problem
• Many people use 95% or 99% ‘confidence’ to stop
• This value is unreliable
• Read this Nature article: bit.ly/1dwk0if
• You can hit 95% early in a test
• If you stop, it could be a false positive
• Tools need to be smarter about inference
• This 95% thingy – it’s last on your list for reasons to stop testing
• Let me explain
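The early-95% problem described above can be checked with a small simulation. The sketch below is an illustration, not something taken from the slides: it runs A/A tests (both arms share the same true conversion rate), peeks at a two-sided two-proportion z-test after every batch of visitors, and compares how often any peek crosses 95% with how often the planned final reading does. The conversion rate, batch size, number of peeks and all function names are assumptions chosen for the example.

```python
import math
import random


def z_test_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no conversions (or only conversions) so far: nothing to compare
    z = (conv_a / n_a - conv_b / n_b) / se
    return math.erfc(abs(z) / math.sqrt(2))


def run_aa_test(rng, rate, batch, peeks, alpha=0.05):
    """Run one A/A test; return (significant at any peek, significant at the end)."""
    conv_a = conv_b = n = 0
    hit_at_some_peek = False
    p = 1.0
    for _ in range(peeks):
        # Both arms draw from the same true rate, so any "winner" is noise.
        conv_a += sum(rng.random() < rate for _ in range(batch))
        conv_b += sum(rng.random() < rate for _ in range(batch))
        n += batch
        p = z_test_p_value(conv_a, n, conv_b, n)
        if p < alpha:
            hit_at_some_peek = True
    return hit_at_some_peek, p < alpha


def false_positive_rates(tests=300, rate=0.05, batch=200, peeks=20, seed=1):
    """Share of A/A tests flagged at any peek vs. only at the planned end."""
    rng = random.Random(seed)
    results = [run_aa_test(rng, rate, batch, peeks) for _ in range(tests)]
    any_peek = sum(r[0] for r in results) / tests
    final_only = sum(r[1] for r in results) / tests
    return any_peek, final_only


if __name__ == "__main__":
    any_peek, final_only = false_positive_rates()
    print(f"'significant' at some peek:       {any_peek:.0%}")
    print(f"'significant' at the planned end: {final_only:.0%}")
```

The first figure can never be lower than the second, and with repeated peeking it is generally well above the nominal 5%. That gap is the reason a 95% reading on its own is a poor stopping rule.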
abtestguide.com/calc/
“You should know that stopping a test once it’s significant is deadly sin number 1 in A/B testing land. 77% of A/A tests (testing the same thing as A and B) will reach significance at a certain point.”
Ton Wesseling, Online Dialogue
“I always tell people that you need a representative sample if your data needs to be valid. What does ‘representative’ mean? First of all you need to include all the weekdays and weekends. You need different weather, because it impacts buyer behavior. But most important: Your traffic needs to have all traffic sources, especially newsletter, special campaigns, TV,… everything! The longer the test runs, the more insights you get.”
Andre Morys, Web Arts
Three Articles you MUST read
• “Statistical Significance does not equal Validity” http://bit.ly/1wMfmY2
• “Why every Internet Marketer should be a Statistician” http://bit.ly/1wMfs1G
• “Understanding the Cycles in your site” http://mklnd.com/1pGSOUP
Business & Purchase Cycles
• Customers change
• Your traffic mix changes
• Markets, competitors
• Be aware of all the waves
• Always test whole cycles
• Minimum 2 cycles (wk/mo)
• Don’t exclude slower buyers
[Figure: test timeline with labels: Start Test, Finish, Avg Cycle]
When to stop?
• MINIMUM two business cycles (week/mo)
• MINIMUM of 1 purchase cycle
• MINIMUM 250 outcomes/conversions per creative
• MORE if relative difference is low
• ALWAYS test full weeks
• KNOW what marketing and cycles are doing
• RUN a test length calculator - bit.ly/XqCxuu
• SET your test run time
• Run it
• Stop it
• Analyse the data
• Sometimes I run longer but beware!
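A rough sketch of what a test length calculator does, for readers who want to see the arithmetic. This is not the calculator linked above. It assumes the textbook sample-size approximation for a two-sided two-proportion z-test, then applies the minimums from the list: at least 250 conversions per creative, full weeks only, and at least two weekly business cycles. The function names, the 5% significance level and the 80% power are assumptions for illustration. The purchase-cycle minimum is not modelled and has to be checked separately.

```python
import math
from statistics import NormalDist


def visitors_per_variant(base_rate, relative_lift, alpha=0.05, power=0.8):
    """Approximate visitors needed per variant to detect the given relative lift."""
    p1 = base_rate
    p2 = base_rate * (1 + relative_lift)
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


def planned_weeks(base_rate, relative_lift, daily_visitors, variants=2,
                  min_weeks=2, min_conversions=250):
    """Whole weeks to run the test, honouring the minimums in the list above."""
    n_stat = visitors_per_variant(base_rate, relative_lift)
    # Visitors a variant needs before the control rate yields the minimum conversions.
    n_conv = math.ceil(min_conversions / base_rate)
    n = max(n_stat, n_conv)
    days = math.ceil(n * variants / daily_visitors)
    weeks = math.ceil(days / 7)   # ALWAYS test full weeks
    return max(weeks, min_weeks)  # MINIMUM two business cycles (weekly here)


if __name__ == "__main__":
    # Hypothetical site: 5% conversion rate, hoping for a 10% relative lift,
    # 2,000 visitors a day split across two creatives.
    print(planned_weeks(base_rate=0.05, relative_lift=0.10, daily_visitors=2000))
```

Halving the relative lift you want to detect roughly quadruples the visitors required, which is what "MORE if relative difference is low" means in practice. Once the run time is set, the point of the list is to stick to it: run it, stop it, analyse the data.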
No QA testing for the AB test?
QA Test or Die!
• Over 40% of tests have had QA issues.
• It’s very easy to break or bias the testing
Browser testing
• www.crossbrowsertesting.com
• www.browserstack.com
• www.spoon.net
• www.saucelabs.com
• www.multibrowserviewer.com
Mobile devices
• www.appthwack.com
• www.deviceanywhere.com
• www.opendevicelab.com
Article
• bit.ly/1wBccsJ
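Browser and device checks catch broken creatives. A complementary check, not mentioned in the slides and offered here only as one possible sketch, is to confirm that traffic is actually being split the way the test was configured. The example below assumes a 50/50 split and uses a chi-square goodness-of-fit test with one degree of freedom. The function names and the 0.001 alarm threshold are assumptions for illustration.

```python
import math


def srm_p_value(visitors_a, visitors_b, expected_share_a=0.5):
    """Chi-square goodness-of-fit p-value (1 degree of freedom) for the split."""
    total = visitors_a + visitors_b
    expected_a = total * expected_share_a
    expected_b = total - expected_a
    chi2 = ((visitors_a - expected_a) ** 2 / expected_a
            + (visitors_b - expected_b) ** 2 / expected_b)
    # With 1 degree of freedom the chi-square tail equals erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def split_looks_broken(visitors_a, visitors_b, threshold=0.001):
    """True when the observed split is very unlikely under the configured split."""
    return srm_p_value(visitors_a, visitors_b) < threshold


if __name__ == "__main__":
    print(split_looks_broken(5000, 4950))  # small imbalance
    print(split_looks_broken(5000, 4500))  # large imbalance
```

A flagged split usually points to a QA problem, such as a creative that fails to load or track in some browsers, rather than to anything about user behaviour, so it is worth resolving before reading the results.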
Gamble the Company AWAY!
• I get 60-70% right
• UX and Copywriters good at picking!
• C level execs are easy marks
• Ironically, many decide ‘designs’
• You need collaborative test design
• It’s a team game, with customers
• Flip a coin, anyone?
WE’RE ALL WINGING IT
2004 Headspace
What I thought I knew in 2004
Reality
2014 Headspace
What I KNOW I know
Me, on a good day
Guessaholics Anonymous
Rumsfeldian Space
23 : Business Future Testing?
“Congratulations! Today you’re the lucky winner of our random awards programme. You get all these extra features for free, on us. Enjoy.”
The 5 Legged Optimisation Barstool
#1 Smart Talented Polymath People
Flexible and Agile teams
Fittest? Agile!
#2 : Analytics Investment (tools, people, dev time)
#3 : User research and insight
#3 : THE BEST IDEAS COME FROM?
#4 : GREAT COPYWRITING
“On the average, five times as many people read the headline as read the body copy. When you have written your headline, you have spent eighty cents out of your dollar.”
David Ogilvy
“In 9 years and 40M split tests with visitors, the majority of my testing success came from playing with the words.”
@OptimiseOrDie
#5 : Split Testing Tools
• Google Content Experiments: bit.ly/Ljg7Ds
• Optimizely: www.optimizely.com
• Visual Website Optimizer: www.visualwebsiteoptimizer.com
• Multi Armed Bandit Explanation: bit.ly/Xa80O8
• New Machine Learning Tools: www.conductrics.com, www.rekko.com
The 5 Legged Optimisation Barstool
#1 Culture & Team
#2 Toolkit & Analytics investment
#3 UX, CX, Service Design, Insight
#4 Persuasive Copywriting
#5 Experimentation (testing) tools
READ STUFF
#5 : FIND STUFF
• @danbarker Analytics
• @fastbloke Analytics
• @timlb Analytics
• @jamesgurd Analytics
• @therustybear Analytics
• @carmenmardiros Analytics
• @davechaffey Analytics
• @priteshpatel9 Analytics
• @cutroni Analytics
• @avinash Analytics
• @Aschottmuller Analytics, CRO
• @cartmetrix Analytics, CRO
• @Kissmetrics CRO / UX
• @Unbounce CRO / UX
• @Morys CRO / Neuro
• @UXFeeds UX / Neuro
• @Psyblog Neuro
• @Gfiorelli1 SEO / Analytics
• @PeepLaja CRO
• @TheGrok CRO
• @UIE UX
• @LukeW UX / Forms
• @cjforms UX / Forms
• @axbom UX
• @iatv UX
• @Chudders Photo UX
• @JeffreyGroks Innovation
• @StephanieRieger Innovation
• @BrianSolis Innovation
• @DrEscotet Neuro
• @TheBrainLady Neuro
• @RogerDooley Neuro
• @Cugelman Neuro
• @Smashingmag Dev / UX
• @uxmag UX
• @Webtrends UX / CRO
#5 : LEARN STUFF
• Baymard.com
• Lukew.com
• Smashingmagazine.com
• ConversionXL.com
• Medium.com
• Whichtestwon.com
• Unbounce.com
• Measuringusability.com
• RogerDooley.com
• Kissmetrics.com
• Uxmatters.com
• Smartinsights.com
• Econsultancy.com
• Cutroni.com
• www.GetMentalNotes.com
#12 : The Best Companies…
• Invest continually in analytics instrumentation, tools, people
• Use an Agile, iterative, cross-silo, one team project culture
• Prefer collaborative tools to having lots of meetings
• Prioritise development based on numbers and insight
• Practice real continuous product improvement, not SLEDD*
• Are fixing bugs, cruft, bad stuff as well as optimising
• Source photos and content that support persuasion and utility
• Have cross channel, cross device design, testing and QA
• Segment their data for valuable insights, every test or change
• Continually reduce cycle (iteration) time in their process
• Blend ‘long’ design, continuous improvement AND split tests
• Make optimisation the engine of change, not the slave of ego
* Single Large Expensive Doomed Developments
THE FUTURE OF TESTING
Thank You!
@OptimiseOrDie
Mail : [email protected]
slideshare.com/sullivac
Linkedin : linkd.in/pvrg14