1 Analyzing Browse Patterns of Mobile Clients Lili Qiu Joint work with Atul Adya and Victor Bahl {adya,bahl,liliq}@microsoft.com Microsoft Research ACM SIGCOMM Measurement Workshop San Francisco, CA, November 2001

Mar 27, 2015

Analyzing Browse Patterns of Mobile Clients

Lili Qiu
Joint work with Atul Adya and Victor Bahl

{adya,bahl,liliq}@microsoft.com
Microsoft Research

ACM SIGCOMM Measurement Workshop
San Francisco, CA, November 2001

Outline
- Overview
- Related work
- Analysis of a popular mobile Web site
  - Document popularity analysis
  - User behavior analysis
  - System load analysis
  - Content analysis
- Summary and implications

Motivation
- Phenomenal growth in the cellular industry and in handheld devices
- Crucial to understand the performance of the wireless Web
- Limited understanding of how wireless Web services are being used

Related Work
- Workload of clients on wireline networks
  - Server-based studies: [ABC+96], [AW96], [MS97], [AJ99], [PQ00]
  - Proxy-based studies: [BCF+99], [DMF97], [GB97], [VDA+99], [WVS+99]
  - Client-based studies: [CBC95] and [BBB+98]
- Workload of wireless clients
  - [KBZ+2000]: only 80K requests over seven months

Overview
- A popular mobile Web site
  - Content: news, weather, stock quotes, email, yellow pages, travel reservations, entertainment, etc.
- Period studied
  - August 15, 2000 – August 26, 2000
  - 33 million accesses in 12 days
- Types of analyses
  - This paper is part of a larger analysis study
  - Analysis of browse patterns
  - Analysis of notification logs
  - Correlation between how browsing and notification services are being used

Overview: Types of Analysis
- Document popularity analysis
- User behavior analysis
- System load analysis
- Content analysis

Overview: User Categories
- Cellular users
  - Browse the Web in real time on cellular phones
- Offline users
  - Download content onto their PDAs for later (offline) browsing, e.g. AvantGo
- Desktop users
  - Sign up for services and specify preferences

Many more users now

User Type    # Users    # Requests
Cellular      58,432     2,210,758
Offline       50,968    20,508,272
Desktop      639,971     7,342,206
Misc.          1,634     2,944,708
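As a rough cross-check of the table above, the sketch below computes each category's share of requests and its average requests per user. The counts are copied from the table; the dictionary layout and function names are illustrative assumptions, not part of the original study.

```python
# Counts copied from the user-category table; the structure is an assumption.
USER_STATS = {
    "Cellular": {"users": 58_432, "requests": 2_210_758},
    "Offline": {"users": 50_968, "requests": 20_508_272},
    "Desktop": {"users": 639_971, "requests": 7_342_206},
    "Misc.": {"users": 1_634, "requests": 2_944_708},
}


def total_requests(stats):
    """Sum the request counts over all user categories."""
    return sum(row["requests"] for row in stats.values())


def request_shares(stats):
    """Return each category's fraction of all requests."""
    total = total_requests(stats)
    return {name: row["requests"] / total for name, row in stats.items()}


def requests_per_user(stats):
    """Return the average number of requests per user in each category."""
    return {name: row["requests"] / row["users"] for name, row in stats.items()}


if __name__ == "__main__":
    shares = request_shares(USER_STATS)
    per_user = requests_per_user(USER_STATS)
    for name in USER_STATS:
        print(f"{name:9s} share={shares[name]:.1%} requests/user={per_user[name]:.1f}")
```

The totals add up to roughly the 33 million accesses quoted for the 12-day period, and the offline category dominates the request volume even though desktop users are far more numerous.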

Document Popularity
- Previous Web research has found that Web accesses follow a Zipf-like distribution (i.e. the request frequency of the i-th most popular document ∝ 1/i^α)
- Two definitions of document
  - URL
  - <URL, parameter> (i.e. query)
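The two document definitions can be illustrated with a small sketch that ranks requests either by URL alone or by the <URL, parameter> pair. The request strings and function names are hypothetical; the real log format is not described in the slides.

```python
from collections import Counter
from urllib.parse import urlsplit


def doc_key_url(request_url):
    """Definition 1: a document is the URL path, ignoring parameters."""
    return urlsplit(request_url).path


def doc_key_query(request_url):
    """Definition 2: a document is the <URL, parameter> pair (a query)."""
    parts = urlsplit(request_url)
    return (parts.path, parts.query)


def popularity(request_urls, key):
    """Return (document, request count) pairs, most popular first."""
    return Counter(key(url) for url in request_urls).most_common()


if __name__ == "__main__":
    # Made-up requests, for illustration only.
    sample = ["/stocks?sym=MSFT", "/stocks?sym=IBM", "/stocks?sym=MSFT", "/news"]
    print(popularity(sample, doc_key_url))
    print(popularity(sample, doc_key_query))
```

Under the first definition all stock lookups collapse into one document; under the second, each distinct query is ranked separately, which is why the second ranking has many more entries.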

Document Popularity (Cont.)

[Figure: two log-log plots of # requests vs. popularity ranking, one for URLs and one for URL and parameter pairs]

Document popularity does not closely follow a Zipf-like distribution.
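One common quick check for Zipf-like behavior is a least-squares line fit on the log-log rank/frequency plot: a good fit with slope -α suggests request frequency ∝ 1/i^α. The sketch below is a generic illustration of that check, not necessarily the procedure used in the study, and the function name is an assumption.

```python
import math


def fit_zipf_alpha(counts):
    """Fit log(count) = c - alpha * log(rank) by least squares.

    Returns (alpha, r_squared). An r_squared well below 1 indicates that
    the popularity curve is not a straight line on a log-log plot.
    """
    ranked = sorted((c for c in counts if c > 0), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two documents with nonzero counts")
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(c) for c in ranked]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 1.0
    return -slope, r_squared


if __name__ == "__main__":
    # Synthetic counts that follow 1/i exactly, so alpha should be close to 1.
    ideal = [1000.0 / i for i in range(1, 101)]
    print(fit_zipf_alpha(ideal))
```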

Document Popularity (Cont.)
- The majority of the requests are concentrated on a small number of documents
- 0.1% - 0.5% of URL and parameter combinations (i.e. 112 – 442) account for 90% of requests

[Figure: percentage of requests vs. percentage of documents]

Very small amount of memory needed to cache popular query results.
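The concentration claim can be checked on any request log by asking how many of the most popular documents are needed to cover a target fraction of requests. The sketch below is a minimal illustration; the function name and the sample counts are assumptions.

```python
def docs_needed_for_coverage(counts, target=0.90):
    """Return (n, fraction): the smallest number n of most-popular documents
    whose requests cover at least `target` of all requests, and n as a
    fraction of all documents."""
    ranked = sorted(counts, reverse=True)
    total = sum(ranked)
    if not ranked or total <= 0:
        raise ValueError("need at least one document with requests")
    covered = 0
    for n, count in enumerate(ranked, start=1):
        covered += count
        if covered >= target * total:
            return n, n / len(ranked)
    return len(ranked), 1.0


if __name__ == "__main__":
    # Made-up skewed workload: one hot query and fifty rare ones.
    counts = [950] + [1] * 50
    print(docs_needed_for_coverage(counts))
```

The returned document count is also a rough upper bound on how many query results a cache would need to hold to serve that fraction of requests.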

User Behavior Analysis
- Understand how long a wireless user stays on the channel as he/she browses the Web
- Determine user sessions
  - Intuition: if a session is idle for a sufficiently long time, we say it has ended.
  - Heuristic to determine a session inactivity period

User Behavior Analysis (Cont.)
- Determine the session inactivity period (s)
  - Too small s => too many sessions
  - Too large s => too few sessions
  - An appropriate value is at the knee point
  - The knee point is between 30 and 45 seconds
- 95% of users
  - Have a session time of less than 3 minutes
  - Initiated fewer than 35 sessions during the 12 days

[Figure: No. of sessions vs. session inactivity period (secs)]

We can reclaim IP addresses more quickly than the 90 seconds used previously in [KBZ+2000].
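The inactivity heuristic can be sketched as follows: for a candidate threshold s, a gap longer than s between consecutive requests from the same user starts a new session, and sweeping s yields the sessions-vs-threshold curve whose knee is sought. The (user, timestamp) log format and the function names are assumptions for illustration.

```python
from collections import defaultdict


def count_sessions(requests, inactivity_secs):
    """Count sessions in an iterable of (user_id, timestamp_secs) pairs.

    A user's session ends when the gap to that user's next request
    exceeds `inactivity_secs`.
    """
    by_user = defaultdict(list)
    for user, timestamp in requests:
        by_user[user].append(timestamp)
    sessions = 0
    for stamps in by_user.values():
        stamps.sort()
        sessions += 1  # the first request always opens a session
        for prev, cur in zip(stamps, stamps[1:]):
            if cur - prev > inactivity_secs:
                sessions += 1
    return sessions


def sweep_inactivity(requests, thresholds):
    """Return (threshold, session count) pairs for each candidate threshold."""
    requests = list(requests)
    return [(s, count_sessions(requests, s)) for s in thresholds]


if __name__ == "__main__":
    # Made-up log entries, for illustration only.
    log = [("a", 0), ("a", 10), ("a", 100), ("b", 5), ("b", 400)]
    print(sweep_inactivity(log, [30, 120, 600]))
```

The session count can only stay flat or fall as s grows; the knee is where further increases in s stop merging many sessions.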

System Load Analysis
- Understand how to optimize the Web server for better performance
- Small replies
  - 98% to wireless users < 3 KB
  - 99% to offline users < 6.3 KB
- Diurnal pattern and weekday vs. weekend variation
- Over 60% of browsing requests are from offline PDA users, and less than 7% are from wireless users.

[Figure: CDF of no. of entries vs. reply size/100, with curves for Wireless, Offline, All and Desktop]

1) Highly optimize sending small replies.
2) Identify what type of user issued the request, and prioritize the request according to the user type.
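Prioritizing by user type could look like the sketch below, which uses a priority queue keyed on the requester's category. The slides do not specify a priority order; the order used here (interactive cellular users first, automated offline sync last) and all names are assumptions for illustration.

```python
import heapq
import itertools

# Assumed priority order (lower value is served first); not from the slides.
PRIORITY = {"cellular": 0, "desktop": 1, "offline": 2}


class RequestQueue:
    """Serve pending requests by user-type priority, FIFO within a type."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker that preserves arrival order

    def push(self, user_type, request):
        # Unknown user types are served after all known ones.
        priority = PRIORITY.get(user_type, max(PRIORITY.values()) + 1)
        heapq.heappush(self._heap, (priority, next(self._seq), request))

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


if __name__ == "__main__":
    queue = RequestQueue()
    queue.push("offline", "sync-1")
    queue.push("cellular", "quote")
    queue.push("desktop", "signup")
    queue.push("cellular", "news")
    while queue:
        print(queue.pop())
```

The sequence counter keeps requests of the same type in arrival order and avoids comparing request payloads when priorities tie.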

Content Analysis

            Rank #1         Rank #2    Rank #3
Wireless    Stock quotes    News       Yellow pages
Offline     Help            News       Stock quotes
Desktop     Sign-ups        Email      Sports

Top three preferences for different kinds of users

Important to content providers: what content is interesting to users

Summary of Results and Implications

Fact: 0.1% - 0.5% of queries (i.e. 121-442) account for 90% of requests.
Implication: Caching the results of popular queries can be very effective.

Fact: A large fraction of requests come from automated sync programs.
Implication: System designers should prioritize requests according to user type.

Summary of Results and Implications (Cont.)

Fact: Most of the replies are short (< 3 KB for wireless users, and < 6 KB for offline users).
Implication: Wireless Web servers should highly optimize sending short replies.

Fact: The session inactivity period is between 30 and 45 seconds.
Implication: We may reclaim IP addresses more quickly than the 90 seconds used previously.