Request Distribution in Server Clusters Krithi Ramamritham Indian Institute of Technology Bombay

Dec 14, 2015

Page 1:

Request Distribution in Server Clusters

Krithi Ramamritham

Indian Institute of Technology Bombay

Page 2:

Web site infrastructure: clustered, multi-tiered architectures

[Figure: Web Switch, Web Server Cluster, Application Server Cluster]

e-Shopping:
• Open the portal home page
• Login
• View items, prices, availability
• Select an item type
• Specify the no. of items
• Confirm by entering the credit card number
• Logout

Page 3:

WS vs. AS

• Web servers
– Do well-defined and quantifiable local work
   • e.g., processing HTTP headers, serving static content

• Application servers
– Run multi-layer programs
   • e.g., scripts involving calls to backends

Page 4:

ReDal: Request Distribution for the Application Layer

In clustered, multi-tiered architectures, there are two request distribution points:

– Web Server Request Distribution (WSRD): the web switch distributes requests to the web server cluster
– Application Server Request Distribution (ASRD): the web server distributes requests requiring business logic to the application server cluster

[Figure: Web Switch, Web Server Cluster, Application Server Cluster]

ReDal is an approach for efficient distribution of requests across a cluster of application servers

Page 5:

Web Server Request Distribution

Many policies: Random, Round Robin (RR), Weighted Round Robin (WRR), Least Connections

– Several of these policies are commercially implemented (e.g., Cisco’s LocalDirector and F5’s BIG-IP)

Two improvements:
1. Session Affinity
2. Locality-Aware Request Distribution (LARD)
   • attempts to exploit locality of working sets on different servers
   – not applicable to dynamically generated content

Session Affinity:

Consecutive requests in a given user session will be served faster if they are handled by the same server
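
The basic policies named on this slide can be sketched in a few lines. This is only an illustration and not how any commercial switch implements them; the server names, weights, and connection counts are made up.

```python
import itertools
import random


def round_robin(servers):
    """RR: hand out servers in a fixed cyclic order."""
    return itertools.cycle(servers)


def weighted_round_robin(weights):
    """WRR: cycle through servers, repeating each one according to its integer weight."""
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    return itertools.cycle(expanded)


def least_connections(open_connections):
    """Least Connections: pick the server with the fewest open connections."""
    return min(open_connections, key=open_connections.get)


def pick_random(servers, rng=random):
    """Random: pick any server with equal probability."""
    return rng.choice(servers)


# Hypothetical three-server web cluster.
rr = round_robin(["ws1", "ws2", "ws3"])
rr_picks = [next(rr) for _ in range(5)]

wrr = weighted_round_robin({"ws1": 2, "ws2": 1})
wrr_picks = [next(wrr) for _ in range(6)]

lc_pick = least_connections({"ws1": 4, "ws2": 1, "ws3": 3})
```

None of these policies looks at which session a request belongs to, which is the gap that Session Affinity fills.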

Page 6:

Application Server Request Distribution

Dynamic scheduling techniques usually presuppose some knowledge of the task (e.g., duration, weight) and/or the resource (e.g., queue sizes, service times)

– In ASRD, both tasks and resources are highly dynamic

So, techniques are adaptations of WSRD techniques

Most common technique: combination of RR and Session Affinity
– Requests starting new sessions are dispatched according to RR
– Subsequent requests in a session are routed to the server where the session’s previous request was served, i.e., where the session object resides

=> frequently results in load imbalances
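
A minimal sketch of the RR plus Session Affinity combination described above. The class name, method names, and session ids are hypothetical; the routing rule is the one stated on the slide.

```python
import itertools


class RRAffinityDispatcher:
    """Route new sessions by RR and later requests by session affinity."""

    def __init__(self, servers):
        self._next_server = itertools.cycle(servers)
        self._affinity = {}  # session id -> server holding the session object

    def route(self, session_id):
        # A session seen for the first time is placed by round robin.
        if session_id not in self._affinity:
            self._affinity[session_id] = next(self._next_server)
        # Every later request follows the session object.
        return self._affinity[session_id]

    def end_session(self, session_id):
        self._affinity.pop(session_id, None)


dispatcher = RRAffinityDispatcher(["A1", "A2"])
routes = [dispatcher.route(s) for s in ["s1", "s2", "s1", "s3", "s2"]]
```

The dispatcher never consults server load, so a server that happens to receive several long sessions keeps all of their requests.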

Page 7:

ReDal: Motivation

Request distribution combining RR and Session Affinity

Short (S) and long (L) sessions arrive at one-minute intervals: S S L S S L S L L S

[Figure: Number of Active Sessions vs. Time (minutes) on application servers A1 and A2, showing load imbalances]
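
The effect can be reproduced with a small simulation of the arrival pattern S S L S S L S L L S. The session lengths used here (1 minute for short, 4 minutes for long) are assumptions for illustration only; the slide does not give them.

```python
ARRIVALS = "SSLSSLSLLS"        # from the slide: one new session per minute
DURATION = {"S": 1, "L": 4}    # assumed session lengths in minutes (not from the slide)
SERVERS = ["A1", "A2"]


def active_sessions(arrivals, duration, servers):
    """Count active sessions per server per minute under RR with session affinity."""
    horizon = len(arrivals) + max(duration.values())
    active = {server: [0] * horizon for server in servers}
    for minute, kind in enumerate(arrivals):
        # RR places each new session; affinity keeps it there until it ends.
        server = servers[minute % len(servers)]
        for t in range(minute, minute + duration[kind]):
            active[server][t] += 1
    return active


load = active_sessions(ARRIVALS, DURATION, SERVERS)
for server, counts in load.items():
    print(server, counts)
```

Under these assumed durations the two servers alternate between holding two sessions and holding none, even though each received the same number of sessions.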

Page 8:

ReDAL Objective

Distribute requests across a cluster of application servers such that:
• Load on each application server is kept below a certain threshold
• Session affinity is preserved where possible

[Figure: throughput (Trs per Sec) vs. #users, marking the throughput peak at the peak load, with lightly loaded and heavily loaded regions]

Page 9:

ReDAL Components

Application Analyzer
• Characterizes the behavior of the application server
• Runs in an offline phase to record peak throughput/load values, which are used at runtime by the Request Dispatcher

Request Dispatcher
• Routes requests to a set of application servers
• Monitors expected and actual load on each application server
• Routes a given request to the affined server if it is lightly loaded, else to the application server having the lowest expected load
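
One way the offline phase could derive a peak load value is to take the load level at which measured throughput is highest. The sketch below assumes that interpretation; the function names and the sample measurements are hypothetical.

```python
def find_peak(samples):
    """Return the (load, throughput) sample with the highest throughput."""
    return max(samples, key=lambda sample: sample[1])


def is_lightly_loaded(actual_load, peak_load):
    """A server counts as lightly loaded while its load is below the recorded peak."""
    return actual_load < peak_load


# Made-up offline measurements: (concurrent users, transactions per second).
samples = [(10, 12.0), (20, 23.5), (30, 31.0), (40, 29.0), (50, 24.5)]
peak_load, peak_throughput = find_peak(samples)
```

At runtime the dispatcher would only need the recorded `peak_load` per server, not the full curve.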

Page 10:

ReDAL Algorithm

based on key observation:

think-time or view-time on a page is predictable based on past behavior

Jeffrey Heer and Ed H. Chi (Xerox Palo Alto Research Center), “Mining the Structure of User Activity using Cluster Stability”, Proceedings of the Web Analytics Workshop, SIAM Conference on Data Mining (2002)

Page 11:

ReDal: Capacity Reservation

• Consider a finite lookahead period partitioned into discrete time periods or slices

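[Figure: timeline starting at the current time and divided into time slices (Slice 0, Slice 1, Slice 2); requests r1 and r2 occur at times t1 and t2, separated by the think time]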

Load metrics:

• Actual Load = number of requests in time slice

• Expected Load = number of requests expected in a time slice based on think time, i.e., time between subsequent requests in a session

– e.g., Capacity is reserved for request r2 on this application server during time slice 2

• Modified Load = Actual Load + α × Expected Load (0 ≤ α ≤ 1)

– α accounts for prediction errors
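
The bookkeeping behind these load metrics can be sketched as follows. The sketch assumes that the modified load weights the expected load by a factor α in [0, 1] (the parameter later slides call Alpha), and that a reservation is not released when the predicted request actually arrives. The class and method names are hypothetical.

```python
from collections import defaultdict


class SliceLoad:
    """Per-server load bookkeeping over discrete time slices."""

    def __init__(self, slice_seconds, alpha):
        if not 0.0 <= alpha <= 1.0:
            raise ValueError("alpha must lie in [0, 1]")
        self.slice_seconds = slice_seconds
        self.alpha = alpha
        self.actual = defaultdict(int)    # slice index -> requests received
        self.expected = defaultdict(int)  # slice index -> reserved requests

    def slice_of(self, t):
        """Map a time in seconds to its slice index."""
        return int(t // self.slice_seconds)

    def record_request(self, t, think_time):
        """Count a request at time t and reserve capacity for the session's next request."""
        self.actual[self.slice_of(t)] += 1
        self.expected[self.slice_of(t + think_time)] += 1

    def modified_load(self, t):
        """Actual load plus alpha-weighted expected load for the slice containing t."""
        s = self.slice_of(t)
        return self.actual[s] + self.alpha * self.expected[s]


load = SliceLoad(slice_seconds=10, alpha=0.5)
load.record_request(t=3, think_time=20)   # r1 lands in slice 0, r2 is expected in slice 2
load.record_request(t=25, think_time=20)  # a request from another session arrives in slice 2
```

With α = 0.5, the reserved request in slice 2 counts as half a request, so a wrong think-time prediction distorts the load estimate less than it would at α = 1.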

Page 12:

ReDal: Algorithm Overview

Inputs: Request in a session, Think time, Time slice duration,

Output: Assignment of request to application server A

    A = NULL
    A = SessionAffinity()
    If A is NULL
        A = LeastLoaded()
    UpdateLoadMetrics()
    AdvanceTimeSlice()
    Return A

SessionAffinity:
    If ActualLoad() < PeakLoad()
        Return AffinedServer()

LeastLoaded:
    If request is part of new session
        A = LeastLoaded(modified)
    Else
        A = LeastLoaded(actual)
    Return A
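
The overview above can be turned into a small runnable sketch. It is a simplification under stated assumptions: load is a single number per server rather than a value per time slice, UpdateLoadMetrics() is reduced to incrementing the actual load, AdvanceTimeSlice() is omitted, and the modified load is taken to be actual load plus α times expected load. All class and field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    peak_load: float
    actual_load: float = 0.0
    expected_load: float = 0.0

    def modified_load(self, alpha):
        return self.actual_load + alpha * self.expected_load


@dataclass
class Dispatcher:
    servers: list
    alpha: float = 0.9
    affinity: dict = field(default_factory=dict)  # session id -> Server

    def session_affinity(self, session_id):
        """Return the affined server if it is below its peak load, else None."""
        server = self.affinity.get(session_id)
        if server is not None and server.actual_load < server.peak_load:
            return server
        return None

    def least_loaded(self, is_new_session):
        """New sessions compare modified load; existing sessions compare actual load."""
        if is_new_session:
            return min(self.servers, key=lambda s: s.modified_load(self.alpha))
        return min(self.servers, key=lambda s: s.actual_load)

    def dispatch(self, session_id):
        is_new = session_id not in self.affinity
        server = self.session_affinity(session_id)
        if server is None:
            server = self.least_loaded(is_new)
        self.affinity[session_id] = server
        server.actual_load += 1  # simplified stand-in for UpdateLoadMetrics()
        return server


a1 = Server("A1", peak_load=2)
a2 = Server("A2", peak_load=2, expected_load=3)
dispatcher = Dispatcher([a1, a2], alpha=0.9)
first = dispatcher.dispatch("u1")   # new session: lowest modified load
second = dispatcher.dispatch("u1")  # affined server still below its peak
third = dispatcher.dispatch("u1")   # affined server at its peak: fall back to actual load
```

In this run the session starts on A1, stays there while A1 is below its peak, and moves to A2 once A1 reaches its peak load.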

Page 13:

Consistent global view of metadata

• Multicasting of changed load info by WS request dispatcher
• Session objects virtualized in a shared db
• Web server records time of response in a cookie
– useful for estimating think times in web server clusters

[Figure: Web Switch, Web Server Cluster, Application Server Cluster]
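
The cookie-based think-time estimate might look like the following sketch, which uses Python's standard `http.cookies` module. The cookie name and the timestamp format are assumptions; the slide only says that the time of response is recorded in a cookie.

```python
from http.cookies import SimpleCookie

COOKIE_NAME = "last_response_ts"  # hypothetical cookie name


def stamp_response(now):
    """Build a cookie string recording when the response was sent."""
    cookie = SimpleCookie()
    cookie[COOKIE_NAME] = f"{now:.3f}"
    return cookie[COOKIE_NAME].OutputString()


def think_time(cookie_header, now):
    """Return seconds since the previous response, or None if no timestamp is present."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    if COOKIE_NAME not in cookie:
        return None
    return now - float(cookie[COOKIE_NAME].value)


header = stamp_response(now=100.0)
elapsed = think_time(header, now=112.5)
```

Because the timestamp travels with the client, any web server in the cluster can compute the think time without having handled the previous request itself.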

Page 14:

ReDal: Evaluation

• ReDal, RR, HJ implemented as Apache Web Server plug-ins

• Load generator simulates a varying number of simultaneous user sessions, each session submitting a stream of requests

• Each request chosen from a uniform distribution across the high and low load transaction requests

• Load generator (LoadRunner 6), Web server (Apache), 10 application server instances (WebLogic 7.1), and session repository (Oracle 8), each running on separate hardware

• Machine configuration: single-CPU (900 MHz), 1 GB RAM, 20 GB disk, running Windows 2000 Advanced Server (SP3)

HJ (Hwang and Jung, 2002) uses a “least-active-requests” routing policy, which is not applicable to stateful applications

Page 15:

ReDal: Experimental Results

Performance Metrics:

• Average Throughput per Application Server (ATAS): average number of transactions per second an application server in the cluster provides

• Average Response Time (ART): average response time provided by the application servers, measured from the end user perspective

• Web Server CPU Utilization (WSCU): percentage CPU utilization on the web server, measured by OS utilities

• Peak % CPU on the Application Servers: peak percentage CPU usage among a cluster of application servers, measured by OS utilities

• Scaling with Application Servers: percentage CPU usage on the web server for varying numbers of application servers in the application server cluster
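
As a worked example of the first two metrics, the sketch below computes ATAS and ART from made-up measurements. The function names and all numbers are hypothetical.

```python
def atas(transactions_per_server, duration_seconds):
    """Average Throughput per Application Server, in transactions per second."""
    per_server_tps = [n / duration_seconds for n in transactions_per_server]
    return sum(per_server_tps) / len(per_server_tps)


def art(response_times_ms):
    """Average Response Time in milliseconds, as seen by the end user."""
    return sum(response_times_ms) / len(response_times_ms)


# Made-up measurements: a 60-second run on three application servers.
throughput = atas([1200, 1500, 900], duration_seconds=60)
latency = art([120.0, 180.0, 150.0, 150.0])
```

Here the three servers deliver 20, 25, and 15 transactions per second, so ATAS is 20 transactions per second, and the four sampled requests average 150 ms.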

Page 16:

Throughput Performance

[Figure: ATAS vs. Number of Simultaneous Sessions (0 to 100) for ReDAL (0.9), ReDAL (0.5), HJ, and RR]

• ReDAL (0.9) is the ReDAL algorithm with α = 0.9
• ReDAL (0.5) is the ReDAL algorithm with α = 0.5

ReDAL with α = 0.9 has the highest throughput

Page 17:

Response Time Performance

[Figure: ART (ms) vs. Number of Simultaneous Sessions (0 to 100) for ReDAL (0.9), ReDAL (0.5), HJ, and RR]

ReDAL with α = 0.9 has the best response time

Page 18:

CPU Overhead on the Web Server

[Figure: WSCU (%) vs. Number of Simultaneous Sessions (0 to 100) for HJ, RR, and ReDAL (0.9)]

Additional overhead of the ReDal algorithm is 1.5% or less

Page 19:

Peak CPU Utilization on Application Servers

[Figure: Peak % CPU on the Application Servers vs. Number of Simultaneous Sessions (0 to 100) for ReDAL-Alpha=0.9, ReDAL-Alpha=0.5, HJ, and RR]

Highest in the RR case and lowest in the ReDAL (α = 0.9) case

Page 20:

Scaling with Application Servers

[Figure: WSCU (%) vs. Number of Simultaneous Sessions (0 to 100) for #App-Server = 5, 10, and 20]

Overhead of the ReDAL algorithm is at or below 15% for 100 concurrent sessions

Page 21:

Real World Evaluation

Online credit card application
• 30 WebLogic application servers on Linux RedHat 9.0
• Apache Web Server on Linux RedHat 9.0
• Machine hardware configuration: 1 GB RAM, 2.2 GHz dual processors
• Load was simulated by re-tracing web logs collected at various times over a day

At a peak load of 1000 simultaneous sessions, ReDAL improved the response time of RR by 100%.

[Figure: ART (ms) vs. Number of Simultaneous Sessions (0 to 1000) for ReDal-0.8, HJ, and RR]

Page 22:

Summary

[Figure: Web Switch, Web Server Cluster, Application Server Cluster]

ReDal: Application server load Distribution

Maximizes affinity

Exploits application characteristics

Practical and scalable