© 2012 SOASTA. All rights reserved.
SOASTA Presents
Testing in Production (TiP) Advances with Big Data and the Cloud
Webinar, October 30, 2012
Methodologies and technology for Testing in Production (TiP)
In This Webinar
TODAY’S PRESENTERS
Seth Eliot: Sr. Knowledge Engineer in Test, Microsoft- @setheliot
Rob Holcomb: VP Performance Engineering, SOASTA - @rcholcomb
Moderator: Brad Johnson - @bradjohnsonsv
Agenda:
• Poll question
• Leveraging active and passive monitoring for TiP
• Testing and measuring system stress in production
• Experimentation and iterative improvement
• SOASTA CloudTest for TiP
• Closing poll
Questions: Submit in the question box during event
Let’s talk TiP
Seth Eliot
Sr. Knowledge Engineer in Test
About Seth
o Currently with Microsoft Engineering Excellence focused on helping teams transition to The Cloud
o Previously with Bing, and before that Amazon.com
o Seth wishes to thank Brad Johnson, Rob Holcomb and SOASTA for this opportunity
The author is an employee of Microsoft Corporation.
The views expressed in this presentation are those of the author and do not necessarily reflect
the views or positions of Microsoft, nor do they imply any relationship between Microsoft and SOASTA.
Testing at Microsoft, 1985
o Design, execute and document tests
o Generate Test Scripts and automatic testing packages
The Three (or more) V’s of Big Data
What is Big Data?
Value
Velocity
Variety
Volume
MB  GB  TB  PB  EB  ZB
[Strata Jan 2012]
TestOps
o Monitoring: what Ops does
o Testing: what Test does
o TestOps: change (augment) the “signal” used for quality
From Test Results… …to Big Data
The Big Data Signal
o Is often found in Production
o May not always be “Big”
o The Quality Insights, however, should be Big
o TestOps: use this Big Data for quality assessment
o Big Data is in production
o Therefore we Test in Production
Why do we Test in Production?
o Leverage the diversity of real users
o …and real prod environment…
o …to find bugs you cannot find pre-production
The Big Data Pipeline
o Facebook: Developers Instrument Everything
o Amazon: Central Monitoring
o Add some config: Trending and Alerts
o Netflix: Custom libraries + AWS CloudWatch
How does TiP fit into Test strategy?
Does TiP Replace Up-Front Testing (UFT)?
The Death of BUFT (Big UFT)?
[Diagram: one test strategy that is all BUFT, versus a test strategy that combines UFT with TiP]
Four Categories of TiP
o Passive Monitoring
  o with Real Data
o Active Monitoring
  o with Synthetic Transactions
o Experimentation
  o on Real Users
o System Stress
  o of the Service and Environment
User Performance Testing
o Collect specific telemetry about how long things take from the user's point of view
o Real User Data – Real User Experience
o End to End = complete request and response cycle
  o From user to back-end round-trip
  o Includes traffic to partners, dependency response time
o Measured from the user point of view
  o From around the world
  o From a diversity of browsers, OSes, devices
Hotmail JSI User Performance Testing
Big Data?
o Hotmail's JavaScript Instrumentation (JSI)
o Budget for 500 Million measurements / month
o Scale for backend collection and analysis
o PLT (page load time) by browser, OS, country, cluster, etc.
o As experienced by Millions of Real Users
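As a sketch of how measurements like JSI's might be aggregated, the snippet below groups page-load-time (PLT) beacons by a segment field and reports a percentile per segment. The beacon format and function names are illustrative assumptions, not Hotmail's actual pipeline:

```python
from collections import defaultdict

def percentile(sorted_values, p):
    """Nearest-rank percentile of an already-sorted list."""
    if not sorted_values:
        raise ValueError("no measurements")
    k = max(0, int(round(p / 100.0 * len(sorted_values))) - 1)
    return sorted_values[k]

def plt_by_segment(beacons, key="browser", p=95):
    """Group PLT beacons by a segment field (browser, OS, country, ...)
    and report the p-th percentile PLT (ms) for each segment."""
    groups = defaultdict(list)
    for b in beacons:
        groups[b[key]].append(b["plt_ms"])
    return {seg: percentile(sorted(v), p) for seg, v in groups.items()}

beacons = [
    {"browser": "IE9",    "country": "US", "plt_ms": 1200},
    {"browser": "IE9",    "country": "US", "plt_ms": 1500},
    {"browser": "Chrome", "country": "DE", "plt_ms": 800},
    {"browser": "Chrome", "country": "DE", "plt_ms": 900},
]
print(plt_by_segment(beacons, key="browser", p=95))
```

At Hotmail scale the grouping would run in a distributed backend, but the segmentation logic is the same.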
User Performance Testing Examples
o Hotmail
  o Re-architected from the ground up around performance
  o Read messages are 50% faster
o Windows Azure™
  o Every API tracks how many calls were made, how many succeeded, and how long each call took to process
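The Azure-style per-API counters can be sketched as a small telemetry class; the names and record structure here are illustrative, not the actual Windows Azure implementation:

```python
from collections import defaultdict

class ApiTelemetry:
    """Per-API counters: calls made, calls succeeded, processing time."""
    def __init__(self):
        self.calls = defaultdict(int)
        self.successes = defaultdict(int)
        self.total_ms = defaultdict(float)

    def record(self, api, succeeded, elapsed_ms):
        """Record one API call's outcome and duration."""
        self.calls[api] += 1
        if succeeded:
            self.successes[api] += 1
        self.total_ms[api] += elapsed_ms

    def report(self, api):
        """Summarize call count, success count, and mean latency."""
        n = self.calls[api]
        return {"calls": n,
                "succeeded": self.successes[api],
                "avg_ms": self.total_ms[api] / n if n else 0.0}

t = ApiTelemetry()
t.record("PutBlob", True, 12.0)
t.record("PutBlob", False, 40.0)
print(t.report("PutBlob"))  # {'calls': 2, 'succeeded': 1, 'avg_ms': 26.0}
```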
TiP Test Execution
o From the Inside
  o Against internal APIs
  o Automated
o From the Outside
  o From User Entry Point
  o E2E Scenario in Production
  o Automated or Manual
Active Monitoring
o Microsoft Exchange
o Instead of a pass/fail signal, look at thousands of continuous runs
o Did we meet the "five nines" (99.999%) availability target for the scenario?
o Is the scenario slower this release than last? (performance)
[Deschamps, Johnston, Jan 2012]
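A minimal sketch of the availability check behind "five nines" over many continuous synthetic runs (function names and the run format are illustrative, not Exchange's actual tooling):

```python
def availability(results):
    """Fraction of synthetic-transaction runs that passed."""
    if not results:
        raise ValueError("no runs recorded")
    return sum(1 for r in results if r) / len(results)

def meets_five_nines(results, target=0.99999):
    """Did the scenario meet 'five nines' across its continuous runs?"""
    return availability(results) >= target

# 100,000 runs with a single failure sits exactly at five nines
runs = [True] * 99999 + [False]
print(meets_five_nines(runs))  # True
```

This is why thousands of runs matter: with only a few hundred runs, a single failure would put the measured availability far below any meaningful nines target.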
Test Data Handling
o Synthetic Tests + Real Data = Potential Trouble
  o Avoid it
  o Tag it
  o Clean it up
o Example: Facebook Test Users
  o Cannot interact with real users
  o Can only friend other Test Users
  o Create 100s
  o Programmatic Control
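"Tag it" and "Clean it up" can be sketched as below: synthetic records carry a marker so they can be filtered out before real-user analysis or billing. The tag name and record shape are hypothetical, not Facebook's actual mechanism:

```python
SYNTHETIC_TAG = "tip-synthetic"  # hypothetical marker, not a real API

def make_order(user, amount, synthetic=False):
    """Create an order record, tagging it when it comes from a test."""
    order = {"user": user, "amount": amount, "tags": []}
    if synthetic:
        order["tags"].append(SYNTHETIC_TAG)
    return order

def scrub(records):
    """'Clean it up': drop tagged synthetic records before analytics."""
    return [r for r in records if SYNTHETIC_TAG not in r["tags"]]

records = [make_order("alice", 10), make_order("test-bot-1", 5, synthetic=True)]
print([r["user"] for r in scrub(records)])  # ['alice']
```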
Experimentation
o Try new things… in production
o Build on successes
o Cut your losses… before they get expensive
“To have a great idea, have a lot of them”
-- Thomas Edison
Mitigate Risk with Exposure Control
o Launch a new Service – everyone sees it
o Exposure Control – only some see it
  o By Browser
  o By Location
  o By Percent (scale)
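Exposure control "By Percent" is commonly implemented with deterministic hash bucketing, sketched below (the function name and hashing scheme are illustrative assumptions, not any specific vendor's implementation):

```python
import hashlib

def in_exposure_group(user_id, feature, percent):
    """Deterministically expose `percent`% of users: hash (feature, user)
    into 100 buckets; buckets below the threshold see the feature."""
    digest = hashlib.md5(f"{feature}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100 < percent

# Sticky: the same user always lands in the same bucket, so ramping
# exposure from 1% -> 10% -> 100% only ever adds users, never flip-flops.
print(in_exposure_group("u42", "new-search", 100))  # True
print(in_exposure_group("u42", "new-search", 0))    # False
```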
Example: Controlled Test Flight: Netflix
1B API requests per day
“Canary” Deployment [Cockcroft, March 2012]
Load Testing in Production
o Injects load on top of real user traffic
o Monitors for performance
o To assess system capabilities and scalability
o Big Data
  o Traffic mix: real user queries, simulated scenarios
  o Real-time telemetry: Monitor and Back-Off
o After-the-fact analysis
  o Tune SLAs/targets
  o Tune real-time monitors and alerts
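"Monitor and Back-Off" can be sketched as a single control step that cuts injected load when observed latency breaches the SLA and ramps up gently otherwise (the thresholds, factors, and function name are illustrative assumptions):

```python
def adjust_load(current_rps, observed_p99_ms, sla_ms=500,
                backoff=0.5, ramp=1.1, max_rps=10000):
    """One control step for a production load test: back off sharply
    when p99 latency breaches the SLA, otherwise ramp up gently."""
    if observed_p99_ms > sla_ms:
        return max(0, int(current_rps * backoff))  # Monitor and Back-Off
    return min(max_rps, int(current_rps * ramp))

print(adjust_load(1000, 620))  # SLA breached: halve injected load -> 500
print(adjust_load(1000, 300))  # healthy: ramp up 10% -> 1100
```

The asymmetry (halve on breach, +10% when healthy) is deliberate: because real users share the system, backing off must be much faster than ramping up.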
Load Testing in Production
o Identified issues that could only be found in production
o Agile approach to implementation
o Rob will discuss some SOASTA case studies
Destructive Testing in Production
o Google, first year of a new data center:
  o 20 rack failures, 1,000 server failures, and thousands of hard drive failures
[Google DC, 2008]
o High Availability means you must embrace failure
o How do you test this?
Netflix Tests its “Rambo Architecture”
o …the system has to be able to succeed, no matter what, even all on its own
o Test with Fault Injection
o Netflix Simian Army
  o Chaos Monkey randomly kills a production instance in AWS
  o Chaos Gorilla simulates an outage of an entire Amazon Availability Zone (AZ)
  o Janitor Monkey, Security Monkey, Latency Monkey…
[Netflix Army, July 2011]
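A minimal Chaos-Monkey-style sketch, assuming a fleet of instance records and a terminate callback (not Netflix's actual Simian Army code):

```python
import random

def chaos_monkey(instances, terminate, rng=random):
    """Minimal Chaos-Monkey-style step: pick one healthy production
    instance at random and terminate it; return its id (or None)."""
    candidates = [i for i in instances if i["healthy"]]
    if not candidates:
        return None
    victim = rng.choice(candidates)
    terminate(victim["id"])
    return victim["id"]

killed = []
fleet = [{"id": "i-1", "healthy": True}, {"id": "i-2", "healthy": False}]
print(chaos_monkey(fleet, killed.append, rng=random.Random(0)))  # i-1 (only healthy instance)
```

The point of running this continuously in production is that recovery paths get exercised every day, not only during an outage.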
Big Data Quality Signal (aka TestOps)
KPIs: Key Performance Indicators
• Request latency
• RPS (requests per second)
• Availability / MTTR (mean time to repair)
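The availability and MTTR KPIs can be computed directly from incident records; a minimal sketch with assumed field names and units:

```python
def mttr(incident_minutes):
    """MTTR: mean time to repair, averaged over recorded incidents."""
    return sum(incident_minutes) / len(incident_minutes)

def availability_from_downtime(total_minutes, downtime_minutes):
    """Availability as the fraction of the period the service was up."""
    return 1 - downtime_minutes / total_minutes

month = 30 * 24 * 60  # minutes in a 30-day month
print(mttr([12, 30]))                                     # 21.0
print(round(availability_from_downtime(month, 43.2), 3))  # 0.999
```

Note the useful inverse reading: "three nines" over a 30-day month allows only about 43 minutes of total downtime, which is why MTTR dominates the availability KPI.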
Thank You!
Seth Eliot
Twitter: @setheliot
Blog: http://bit.ly/seth_qa
CloudTest for TiP
Rob Holcomb
VP Performance Engineering, Founder
Testing in Production: Best Practices / Methodology
o Start testing early and often!
o Don't wait until the last minute
o Test in production for real results
o Test mix: baseline, stress, spike, endurance, failover, diagnostic
  o Start with a baseline to understand general performance characteristics
  o Test types chosen depend on the defined goals
o Test case selection: performance testing is not functional testing
o Integrate monitoring data; know when to say when
o Define a clear test strategy with test plans, goals, and deliverable dates
o Focus on actionable results!
Thank You!
Contact SOASTA: www.soasta.com/cloudtest/ | [email protected]
Follow us:
twitter.com/cloudtest
facebook.com/cloudtest
White Papers, Webinar Recordings, Case Studies: www.soasta.com - Knowledge Center
Next Webinar: Nov. 8, 2012, 10 a.m. PST - “RUM Expert Roundtable”
Buddy Brewer & Philip Tellis (LogNormal founders); Aaron Kulick (WalmartLabs); Moderator: Cliff Crocker (SOASTA)
Register at www.soasta.com/knowledge-center/webinars
Contact Seth: [email protected] @setheliot
Contact Rob: [email protected] @rcholcomb
References[Google Talk, June 2007] Google: Seattle Conference on Scalability: Lessons In Building Scalable Systems, Reza Behforooz
http://video.google.com/videoplay?docid=6202268628085731280
[Unpingco, Feb 2011] Edward Unpingco; Bug Miner; Internal Microsoft Presentation, Bing Quality Day
[Barranco, Dec 2011] René Barranco; Heuristics-Based Testing; Internal Microsoft Presentation
[Dell, 2012] http://whichtestwon.com/dell%e2%80%99s-site-wide-search-box-test
[Microsoft.com, TechNet] http://technet.microsoft.com/en-us/library/cc627315.aspx
[Cockcroft, March 2012] http://perfcap.blogspot.com/2012/03/ops-devops-and-noops-at-netflix.html
[Deschamps, Johnston, Jan 2012]
Experiences of Test Automation; Dorothy Graham; Jan 2012; ISBN 0321754069; Chapter: “Moving to the Cloud: The Evolution of TiP, Continuous Regression Testing in Production”; Ken Johnston, Felix Deschamps
[Google DC, 2008] http://content.dell.com/us/en/gen/d/large-business/google-data-center.aspx?dgc=SM&cid=57468&lid=1491495 ; http://perspectives.mvdirona.com/2008/06/11/JeffDeanOnGoogleInfrastructure.aspx
[Kohavi, Oct 2010] Tracking Users’ Clicks and Submits: Tradeoffs between User Experience and Data Loss; http://www.exp-platform.com/Pages/TrackingClicksSubmits.aspx
[Strata Jan 2012] What is big data? An introduction to the big data landscape; http://radar.oreilly.com/2012/01/what-is-big-data.html
References, continued
[Netflix Army, July 2011] The Netflix Simian Army; July 2011; http://techblog.netflix.com/2011/07/netflix-simian-army.html
[Google-Wide Profiling, 2010]
Ren, Gang, et al. Google-wide Profiling: A Continuous Profiling Infrastructure for Data Centers. [Online] July 30, 2010. research.google.com/pubs/archive/36575.pdf
[Facebook ships, 2011] http://framethink.blogspot.com/2011/01/how-facebook-ships-code.html
[Google BusinessWeek, April 2008]
How Google Fuels Its Idea Factory, BusinessWeek, April 29, 2008; http://www.businessweek.com/magazine/content/08_19/b4083054277984.htm
[IBM 2011] http://www.ibm.com/developerworks/websphere/techjournal/1102_supauth/1102_supauth.html
[Kokogiak, 2006] http://www.kokogiak.com/gedankengang/2006/08/amazons-digital-video-sneak-peek.html
[Google GTAC 2010] Whittaker, James. GTAC 2010: Turning Quality on its Head. [Online] October 29, 2010. http://www.youtube.com/watch?v=cqwXUTjcabs&feature=BF&list=PL1242F05D3EA83AB1&index=16.
[Google, JW 2009] http://googletesting.blogspot.com/2009/07/plague-of-homelessness.html
[STPCon, 2012] STPCon Spring 2012 - Testing Wanted: Dead or Alive – March 26, 2012
[Cook, June 2010] Ganglia, OSD: Cook, Tom. A Day in the Life of Facebook Operations. Velocity 2010. [Online] June 2010. http://www.youtube.com/watch?v=T-Xr_PJdNmQ