mBenchLab: Measuring QoE of Web Applications Using Mobile Devices
Emmanuel Cecchet, Robert Sims, Prashant Shenoy and Xin He
University of Massachusetts Amherst
http://benchlab.cs.umass.edu/
Dec 23, 2015
mBenchLab – cecchet@cs.umass.edu
THE WEB CIRCA 2000
[Logos of popular Web sites with their launch years: 1995, 1998, 1999, 2000, Wikipedia (2001), 2004]
THE WEB TODAY…
Amazon.com:
- Tablet: 225 requests (30 .js, 10 CSS, 10 HTML, 175 multimedia files)
- Phone: 15 requests (1 .js, 1 CSS, 10 HTML, 8 multimedia files)
THE WEB TODAY…
Wikipedia page on Montreal: 226 requests, 3MB.
WEB SITES ARE MORE AND MORE COMPLEX
BENCHMARKING TOOLS HAVE NOT KEPT UP
Traditional approach (TPC-W, RUBiS…): a workload definition drives a Web emulator that replays an HTTP trace against the application under test.
BenchLab approach: real devices, real browsers and real networks drive the application under test.
MOBILES ADD COMPLEXITY
BROWSERS MATTER FOR QOE?
QoE measurement:
- Old way: QoE = Server + Network
- Modern way: QoE = Servers + Network + Browser
Browsers are smart:
- Parallelism on multiple connections
- JavaScript execution can trigger additional queries
- Rendering introduces delays in resource access
- Caching and pre-fetching
HTTP replay cannot approximate real Web browser access to resources.
[Waterfall of a Wikipedia page load: after GET /wiki/page the server generates the page, then the browser issues successive waves of CSS/JS/image requests (combined.min.css, jquery.min.js, skin images…) interleaved with rendering and JavaScript execution. Total network time: 3.86s, plus 2.21s total rendering time, versus 1.88s for a plain HTTP replay of the same trace.]
USE CASES
Researchers: provide an open source infrastructure for accurate QoE measurement on mobile devices; benchmarking of real applications & workloads, over real networks, using real devices & browsers.
Users: provide QoE on relevant Web sites; different from a basic SpeedTest.
OUTLINE
The mBenchLab approach
Experimental results
What’s next?
MBENCHLAB ARCHITECTURE
The BenchLab Dashboard (Web frontend + experiment scheduler) stores traces (HAR or access_log), results (HAR or latency), experiment configs and benchmark VMs.
mBenchLab App (mBA) clients on mobile devices talk to the Dashboard for experiment start/stop, trace download, browser registration and results upload, and exercise the backend application under test.
Collected data: devices, metrics, HTML snapshots…
RUNNING AN EXPERIMENT WITH MBENCHLAB
1. Start the BenchLab Dashboard and create an experiment (upload traces, define the experiment, start it)
2. Start the mBenchLab App on the mobile device
3. The browser plays the trace: it issues HTTP requests while the App performs QoE measurements
4. Upload collected results (detailed network and browser timings) to the Dashboard
5. Analyze results: view results, repeat the experiment, export setup/traces/VMs/results
MBENCHLAB ANDROID APP (MBA)
- Issues HTTP requests in the native Android browser
- Collects QoE measurements in HAR format
- Uploads results to the BenchLab Dashboard when done
Components: the native Android browser driven through the Selenium Android driver, a HAR recording proxy, and the mBenchLab runtime (GPS, network, storage for the trace, HAR files and screen snapshots), connected over Wifi or 3G/4G to cloud Web services.
MBA MEASUREMENTS
QoE:
- Overall page loading time, including JavaScript execution and rendering time
- Request failure/success/cache hit rate
- HTML correctness
- Rendering overview with screen snapshots
Network:
- DNS resolution time
- Connection establishment time
- Send/wait/receive time on network connections
Device:
- Hardware and software configurations
- Location (optional)
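Since mBA stores its measurements in HAR format, the metrics above can be aggregated directly from the capture. A minimal sketch, where the `summarize_har` helper and the sample data are illustrative rather than part of mBenchLab (the field names follow the HAR 1.2 structure):

```python
def summarize_har(har):
    """Aggregate request counts and network timings from a HAR 1.2 dict."""
    entries = har["log"]["entries"]
    # HAR uses -1 for timings that do not apply (e.g. a reused connection);
    # clamp those to 0 before summing.
    t = lambda e, k: max(e["timings"].get(k, 0), 0)
    return {
        "requests": len(entries),
        "failures": sum(1 for e in entries if e["response"]["status"] >= 400),
        "dns_ms": sum(t(e, "dns") for e in entries),
        "connect_ms": sum(t(e, "connect") for e in entries),
        "wait_ms": sum(t(e, "wait") for e in entries),
        # onLoad covers JavaScript execution and rendering, not only network time
        "page_load_ms": har["log"]["pages"][0]["pageTimings"]["onLoad"],
    }

# Illustrative two-request capture, not an actual mBenchLab trace.
sample = {"log": {
    "pages": [{"pageTimings": {"onLoad": 3860}}],
    "entries": [
        {"response": {"status": 200},
         "timings": {"dns": 12, "connect": 30, "wait": 140}},
        {"response": {"status": 404},
         "timings": {"dns": -1, "connect": -1, "wait": 90}},
    ]}}
print(summarize_har(sample))
```

The same summary works for captures recorded in the browser, at the proxy, or rebuilt from server logs, since all three end up in HAR form.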
OUTLINE
The mBenchLab approach
Experimental results
What’s next?
EXPERIMENTAL SETUP
Desktop: MacBook Pro using Firefox
Tablets: Trio Stealth Pro ($50 Android 4 tablet), Kindle Fire
Phones: Samsung S3 GT-I9300 (3G), Motorola Droid Razr (4G), HTC Desire C
Traces: Amazon, Craigslist, Wikipedia, Wikibooks
QOE ON DIFFERENT WEB SITES
Web sites can be very dynamic:
- Amazon content is very dynamic (hurts repeatability)
- Craigslist is very simple and similar across platforms
INSTRUMENTATION OVERHEAD
Hardware matters:
- Single-core underpowered hardware shows ~3s instrumentation overhead on Wikipedia pages
- Modern devices are powerful enough to instrument with negligible overhead
QOE ON DIFFERENT MOBILE NETWORKS
Quantify QoE variation between Wifi vs Edge vs 3G vs 4G.
Performance varies based on location, network provider, hardware…
IDENTIFYING QOE ISSUES
Why is the page loading time so high?
QOE BUGS: THE SAMSUNG S3 PHONE
The number of HTTP requests and page sizes are off for Wikipedia pages.
QOE BUGS: THE SAMSUNG S3 PHONE
Bug in the srcset implementation:
<img src="pear-desktop.jpeg" srcset="pear-mobile.jpeg 720w, pear-tablet.jpeg 1280w" alt="The pear">
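For reference, a simplified sketch of how a correct implementation should pick among `w`-descriptor candidates: compute each candidate's effective pixel density (assuming the image spans the viewport) and take the smallest one that still meets the device pixel ratio. The full HTML spec algorithm also honors the `sizes` attribute; the helper below is a hypothetical illustration, not the browser's actual code:

```python
def pick_srcset_candidate(candidates, viewport_css_px, dpr):
    """candidates: (url, width_descriptor) pairs; returns the URL to fetch."""
    # Effective density of each candidate when the image fills the viewport.
    scored = sorted((w / viewport_css_px, url) for url, w in candidates)
    for density, url in scored:
        if density >= dpr:      # first candidate dense enough for this screen
            return url
    return scored[-1][1]        # nothing suffices: take the densest available

candidates = [("pear-mobile.jpeg", 720), ("pear-tablet.jpeg", 1280)]
print(pick_srcset_candidate(candidates, 360, 2))  # a 360px-wide, 2x phone
```

A buggy implementation, as observed on the S3, fetches the wrong candidate (or extra ones), which inflates both the request count and the page size.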
OUTLINE
The mBenchLab approach
Experimental results
What’s next?
RELATED WORK
Benchmarking tools have not kept up:
- RUBiS, TPC-*: obsolete backends
- Httperf: unrealistic replay
Commercial benchmarks (Spec-*): not open nor free.
Mobility adds complexity:
- Networks: Hossein et al., A first look at traffic on smartphones, IMC’10
- Hardware: Hyojun et al., Revisiting Storage for Smartphones, FAST’12; Thiagarajan et al., Who killed my battery?, WWW’12
- Location: Nordström et al., Serval: An End-Host Stack for Service-Centric Networking, NSDI’12
SUMMARY AND FUTURE WORK
Real devices, browsers, networks and application backends are needed for modern WebApp benchmarking.
mBenchLab provides:
- For researchers: an infrastructure for Internet-scale benchmarking of real applications with mobile devices
- For users: insight on QoE with real Web sites rather than a simple SpeedTest
Future work: larger scale experiments (more users, devices, locations; more Web applications and traces).
Q&A
SOFTWARE, DOCUMENTATION, RESULTS: http://benchlab.cs.umass.edu/
WATCH TUTORIALS AND DEMOS ON YOUTUBE
RELATED WORK
Hossein et al., A first look at traffic on smartphones, IMC’10:
- Majority of traffic is Web browsing
- 3G performance varies according to network provider
- Mobile proxies improve performance
Hyojun et al., Revisiting Storage for Smartphones, FAST’12:
- Device storage performance affects browsing experience
Thiagarajan et al., Who killed my battery?, WWW’12:
- Battery consumption can be reduced with better JS and CSS
- Energy savings with JPEG
Nordström et al., Serval: An End-Host Stack for Service-Centric Networking, NSDI’12:
- Transparent switching between networks
- Location-based performance
WEB APPLICATIONS HAVE CHANGED
Web 2.0 applications:
- Rich client interactions (AJAX, JS…)
- Multimedia content
- Replication, caching…
- Large databases (few GB to multiple TB)
Complex Web interactions:
- HTTP 1.1, CSS, images, Flash, HTML 5…
- WAN latencies, caching, Content Delivery Networks…
EVOLUTION OF WEB APPLICATIONS
Application     HTML  CSS  JS  Multimedia  Total
RUBiS              1    0   0           1      2
eBay.com           1    3   3          31     38
TPC-W              1    0   0           5      6
amazon.com         6   13  33          91    141
CloudStone         1    2   4          21     28
facebook.com       6   13  22         135    176
wikibooks.org      1   19  23          35     78
wikipedia.org      1    5  10          20     36

Number of interactions to fetch the home page of various web sites and benchmarks.
TYPING SPEED MATTERS
Auto-completion in search fields is common. Each keystroke can generate a query, and text searches use a lot of resources:
GET /api.php?action=opensearch&search=W
GET /api.php?action=opensearch&search=Web
GET /api.php?action=opensearch&search=Web+
GET /api.php?action=opensearch&search=Web+2
GET /api.php?action=opensearch&search=Web+2.
GET /api.php?action=opensearch&search=Web+2.0
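The amplification is easy to reproduce: one backend query per keystroke. A small sketch, where the helper name is illustrative and the endpoint matches the MediaWiki opensearch API shown above:

```python
from urllib.parse import quote_plus

def autocomplete_queries(text):
    """One GET URL per keystroke, as a naive auto-completion client sends them."""
    return ["/api.php?action=opensearch&search=" + quote_plus(text[:i])
            for i in range(1, len(text) + 1)]

# Typing a 7-character search term produces 7 backend queries.
for url in autocomplete_queries("Web 2.0"):
    print("GET " + url)
```

Real clients usually debounce (wait for a typing pause) or require a minimum prefix length to cut this load; a naive implementation hits the server on every character.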
STATE SIZE MATTERS
Does the entire DB of Amazon or eBay fit in the memory of a cell phone?
- TPC-W DB size: 684MB
- RUBiS DB size: 1022MB

Impact of CloudStone database size on performance:

Dataset size   State size (GB)   Database rows   Avg CPU load with 25 users
25 users            3.2              173745              8%
100 users            12              655344             10%
200 users            22             1151590             16%
400 users            38             1703262             41%
500 users            44             1891242             45%

CloudStone Web application server load observed for various dataset sizes using a workload trace of 25 users replayed with Apache HttpClient 3.
OUTLINE
What has changed in WebApps
Benchmarking real applications with BenchLab
Experimental results
Demo
RECORDING HTTP TRACES
3 options to record traces in HTTP Archive (HAR) format:
- directly in the Web browser
- at the HA Proxy load balancer level
- using Apache httpd logs
[Architecture: Internet → frontend/load balancer (HA Proxy recorder) → app servers (httpd recorder) → databases; recording can also happen in the Web browser itself.]
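For the httpd-log option, access_log traces have to be mapped onto HAR-style entries before replay. A minimal sketch for the standard combined LogFormat; the function name and the subset of fields kept are illustrative, not BenchLab's actual importer:

```python
import re

# One combined-format access_log line: host, identd, user, timestamp,
# request line, status, response size.
LOG_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+|-)')

def access_log_to_har_entry(line):
    """Map a combined-format log line onto a minimal HAR-style entry."""
    m = LOG_RE.match(line)
    if not m:
        raise ValueError("not a combined-format log line")
    return {
        "request": {"method": m.group("method"), "url": m.group("url")},
        "response": {"status": int(m.group("status")),
                     "bodySize": 0 if m.group("size") == "-" else int(m.group("size"))},
    }

line = '10.0.0.1 - - [23/Dec/2015:10:00:00 +0000] "GET /wiki/Montreal HTTP/1.1" 200 3145728'
print(access_log_to_har_entry(line))
```

Note what is lost in this direction: server logs carry no browser-side timings, so entries rebuilt from access_log can drive replay but cannot reconstruct rendering or JavaScript time.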
WIKIMEDIA FOUNDATION WIKIS
Wikimedia wiki open source software stack: lots of extensions, very complex to set up/install.
Real database dumps (up to 6TB): 3 months to create a dump, 3 years to restore with default tools.
Multimedia content: images, audio, video; generators (dynamic or static) to avoid copyright issues.
Real Web traces from Wikimedia. Packaged as virtual appliances.
COMPLEXITY BEHIND THE SCENES
Browsers are smart: caching, prefetching, parallelism… JavaScript can trigger additional requests.
Real network latencies vary a lot; too complex to simulate to get accurate QoE.
BENCHLAB DASHBOARD
- Upload traces / VMs
- Define and run experiments
- Compare results
- Distribute benchmarks, traces, configs and results
JEE WebApp with an embedded database: repository of benchmarks, traces and results; schedules and controls experiment execution; can be used to distribute / reproduce experiments and compare results.
OPEN VS CLOSED
Open Versus Closed: A Cautionary Tale – B. Schroeder, A. Wierman, M. Harchol-Balter – NSDI’06
- The response time difference between open and closed systems can be large
- Scheduling is more beneficial in open systems
SUMMARY AND FUTURE WORK
- Larger scale experiments: more users, devices, locations; more Web applications and traces
- Social and economic aspects: user privacy, cost of mobile data plans, experiment feedback (to and from users)
- Automated result processing: anomaly detection, performance comparison
- Server-side measurements with Wikipedia virtual appliances; mobile proxy
- Software distribution for other researchers to set up their own BenchLab infrastructure