1 Analyzing Browse Patterns of Mobile Clients Lili Qiu Joint work with Atul Adya and Victor Bahl {adya,bahl,liliq}@microsoft.com Microsoft Research ACM SIGCOMM Measurement Workshop San Francisco, CA, November 2001
Mar 27, 2015
2
Outline
  Overview
  Related work
  Analysis of a popular mobile Web site
    Document popularity analysis
    User behavior analysis
    System load analysis
    Content analysis
  Summary and implications
3
Motivation
  Phenomenal growth in the cellular industry and in handheld devices
  Crucial to understand the performance of the wireless Web
  Limited understanding of how wireless Web services are being used
4
Related Work
  Workload of clients on wireline networks
    Server-based studies: [ABC+96], [AW96], [MS97], [AJ99], [PQ00]
    Proxy-based studies: [BCF+99], [DMF97], [GB97], [VDA+99], [WVS+99]
    Client-based studies: [CBC95] and [BBB+98]
  Workload of wireless clients
    [KBZ+2000]: only 80K requests over seven months
5
Overview
  A popular mobile Web site
  Content
    News, weather, stock quotes, email, yellow pages, travel reservations, entertainment, etc.
  Period studied
    August 15, 2000 – August 26, 2000
    33 million accesses in 12 days
  Types of analyses
    This paper is part of a larger analysis study
    Analysis of browse patterns
    Analysis of notification logs
    Correlation between how browsing and notification services are being used
6
Overview: Types of Analysis
  Document popularity analysis
  User behavior analysis
  System load analysis
  Content analysis
7
Overview: User Categories
  Cellular users
    Browse the Web in real time on cellular phones
  Offline users
    Download content onto their PDAs for later (offline) browsing, e.g. AvantGo
  Desktop users
    Sign up for services and specify preferences
  Many more users now

User Type    # Users    # Requests
Cellular      58,432     2,210,758
Offline       50,968    20,508,272
Desktop      639,971     7,342,206
Misc.          1,634     2,944,708
8
Document Popularity
  Previous Web research has found that Web accesses follow a Zipf-like distribution (i.e. the request frequency of the i-th most popular document is proportional to 1/i^α)
  Two definitions of a document
    URL
    <URL, parameter> pair (i.e. query)
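One way to check for Zipf-like behavior under either document definition is to fit a straight line to log(frequency) against log(rank). The following is a minimal sketch, not the method used in the study: the helper names `document_key` and `fit_zipf_exponent` are hypothetical, and the request log is synthetic.

```python
import math
from collections import Counter
from urllib.parse import urlsplit

def document_key(url, with_params):
    """Identify a document by URL path only, or by the <URL, parameter> pair."""
    parts = urlsplit(url)
    return (parts.path, parts.query) if with_params else parts.path

def fit_zipf_exponent(documents):
    """Least-squares fit of log(frequency) vs. log(rank); returns the exponent alpha.

    Needs at least two distinct documents.
    """
    counts = sorted(Counter(documents).values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(c) for c in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx  # a Zipf-like curve has slope -alpha on a log-log plot

# Synthetic log (assumption): document i is requested about 1000 / i times.
synthetic = [f"/doc{i}" for i in range(1, 51) for _ in range(round(1000 / i))]
alpha = fit_zipf_exponent(document_key(u, with_params=False) for u in synthetic)
print(f"fitted alpha = {alpha:.2f}")
```

A fitted line that leaves large residuals on the log-log plot would suggest that the popularity curve is not Zipf-like.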
9
Document Popularity (Cont.)
[Figure: two log-log plots of # requests vs. popularity ranking, one for URLs and one for URL and parameter pairs]
Document popularity does not closely follow a Zipf-like distribution.
10
Document Popularity (Cont.)
  The majority of requests are concentrated on a small number of documents
  0.1% - 0.5% of URL and parameter combinations (i.e. 112 – 442) account for 90% of requests
[Figure: percentage of requests vs. percentage of documents]
Only a very small amount of memory is needed to cache popular query results.
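The concentration figure above can be reproduced from a request log by sorting documents by popularity and accumulating counts until the target share is reached. This is a minimal sketch with a made-up four-document log; `documents_needed` is a hypothetical helper, not code from the study.

```python
from collections import Counter

def documents_needed(requests, coverage=0.9):
    """Return (k, k / distinct documents): how many of the most popular
    documents are needed to cover the given share of all requests."""
    counts = sorted(Counter(requests).values(), reverse=True)
    target = coverage * sum(counts)
    running = 0
    for k, count in enumerate(counts, start=1):
        running += count
        if running >= target:
            return k, k / len(counts)
    return len(counts), 1.0

# Toy log (assumption): 100 requests spread over four documents.
log = ["a"] * 80 + ["b"] * 15 + ["c"] * 3 + ["d"] * 2
k, fraction = documents_needed(log, coverage=0.9)
print(k, fraction)
```

The returned `k` bounds the number of query results a cache would have to hold to serve that share of requests.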
11
User Behavior Analysis
  Understand how long a wireless user stays on the channel as he/she browses the Web
  Determine user sessions
    Intuition: if a session is idle for a sufficiently long time, we say it has ended
    Use a heuristic to determine the session inactivity period
12
User Behavior Analysis (Cont.)
  Determine the session inactivity period (s)
    Too small s => too many sessions
    Too large s => too few sessions
    An appropriate value is at the knee point
    The knee point is between 30 and 45 seconds
  95% of users
    Have session times of less than 3 minutes
    Initiated fewer than 35 sessions during the 12 days
[Figure: No. of sessions vs. session inactivity period (secs)]
IP addresses can be reclaimed more quickly than the 90 seconds used previously in [KBZ+2000].
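The session heuristic on this slide can be sketched as follows: split each user's requests into sessions wherever the gap between consecutive requests exceeds s, then count sessions for a range of s values and look for the knee. The event format (user id, timestamp in seconds), the function names, and the sample data are assumptions for illustration only.

```python
from collections import defaultdict

def count_sessions(events, inactivity_secs):
    """events: iterable of (user_id, timestamp_secs).

    A new session starts whenever the gap since the same user's previous
    request exceeds inactivity_secs.
    """
    by_user = defaultdict(list)
    for user, ts in events:
        by_user[user].append(ts)
    sessions = 0
    for stamps in by_user.values():
        stamps.sort()
        sessions += 1  # the user's first request always opens a session
        sessions += sum(1 for prev, cur in zip(stamps, stamps[1:])
                        if cur - prev > inactivity_secs)
    return sessions

def sweep(events, thresholds):
    """Number of sessions for each candidate inactivity period."""
    events = list(events)
    return {s: count_sessions(events, s) for s in thresholds}

# Toy trace (assumption): two users, timestamps in seconds.
trace = [("u1", 0), ("u1", 10), ("u1", 50), ("u1", 400), ("u2", 5), ("u2", 100)]
result = sweep(trace, [5, 30, 60, 600])
print(result)
```

Plotting the swept counts against s gives a curve of the kind shown in the figure; the knee is where further increases in s stop merging many sessions.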
13
System Load Analysis
  Understand how to optimize the Web server for better performance
  Small replies
    98% of replies to wireless users are < 3 KB
    99% of replies to offline users are < 6.3 KB
  Diurnal pattern and weekday vs. weekend variation
  Over 60% of browsing requests are from offline PDA users, and less than 7% are from wireless users.
[Figure: CDF of No. of entries vs. reply size/100, with curves for Wireless, Offline, All, and Desktop users]
1) Highly optimize sending small replies.
2) Identify what type of user issued the request, and prioritize the request according to the user type.
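The request shares quoted on this slide can be checked against the per-category request counts from the user categories table. The short calculation below assumes that "wireless users" here corresponds to the "Cellular" row of that table.

```python
# Request counts copied from the user categories table.
requests_by_user_type = {
    "Cellular": 2_210_758,
    "Offline": 20_508_272,
    "Desktop": 7_342_206,
    "Misc.": 2_944_708,
}

total = sum(requests_by_user_type.values())
shares = {user_type: count / total
          for user_type, count in requests_by_user_type.items()}

for user_type, share in shares.items():
    print(f"{user_type:<9}{share:6.1%}")
```

Under that assumption the counts sum to about 33 million, with offline users above 60% and cellular users below 7%, in line with the figures on the slide.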
14
Content Analysis
Top three preferences for different kinds of users

User Type    Rank #1         Rank #2    Rank #3
Wireless     Stock quotes    News       Yellow pages
Offline      Help            News       Stock quotes
Desktop      Sign-ups        Email      Sports

Important to content providers: what content is interesting to users
15
Summary of Results and Implications
Fact: 0.1% - 0.5% of queries (i.e. 121-442) account for 90% of requests.
  Implication: Caching the results of popular queries can be very effective.
Fact: A large fraction of requests come from automated sync programs.
  Implication: System designers should prioritize requests according to user type.
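Prioritizing by user type could be sketched with a priority queue, as below. The priority order (interactive cellular requests first, automated offline sync last) is an assumption made for illustration; the talk does not prescribe an order, and `RequestQueue` is a hypothetical class.

```python
import heapq
import itertools

# Hypothetical priority order (lower value is served first).
PRIORITY = {"cellular": 0, "desktop": 1, "offline": 2}

class RequestQueue:
    """Serves requests by user-type priority, FIFO within a type."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps arrival order

    def push(self, user_type, request):
        # Unknown user types get the lowest priority.
        priority = PRIORITY.get(user_type, len(PRIORITY))
        heapq.heappush(self._heap, (priority, next(self._counter), request))

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)

queue = RequestQueue()
queue.push("offline", "sync-1")
queue.push("cellular", "stock-quote")
queue.push("offline", "sync-2")
queue.push("desktop", "sign-up")
served = [queue.pop() for _ in range(len(queue))]
print(served)
```

A strict priority order like this can starve low-priority sync traffic under load, so a real server would likely pair it with aging or per-type rate limits.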
16
Summary of Results and Implications
Fact: Most of the replies are short (< 3 KB for wireless users, and < 6 KB for offline users).
  Implication: Wireless Web servers should highly optimize sending short replies.
Fact: The session inactivity period is between 30 and 45 seconds.
  Implication: IP addresses may be reclaimed more quickly than the 90 seconds used previously.