
http://www.cic.eng.wayne.edu

Quality Assurance and Adaptation:

A Key to Next Generation of Stress-Resilient Internet Services

Cheng-Zhong Xu

Cluster & Internet Computing Lab

Dept. of Electrical/Computer Engineering

Wayne State University

Overview of Research

Cluster & Internet Computing Lab

http://cic.eng.wayne.edu


Pervasive Internet Services

• New communication services
  – Email, chat, instant messaging
  – Voice, telephony, video conferencing

• New information services
  – News, stock, weather, etc.
  – Location-aware: ATM, restaurant, parking
  – Mobility-aware: banking, ticketing, etc.

• Services accessible anytime and anywhere


Characteristics

• Diversity
  – Diverse access networks: PSTN, Bluetooth, cellular, DSL, cable, LAN, satellite, etc.
  – Diverse access devices: PDA, phone, computer, “Dick Tracy” watch, etc.
• Resource-constrained
  – Information processing capacity: CPU, memory
  – Storage, networking
  – Battery power, etc.
• Mobile
  – Mobility is inherent in human nature: moving toward resources or away from scarcity
  – User (device) and computation


MAPS Solution @ CIC Group

• MAPS: System Support for Mobility and Adaptation in Pervasive Services

• Design goals:
  – Scalable and secure service architecture
    • Rapid development/deployment of new services
  – Mobility support: access on the move
    • User/device (physical) vs. computation (logical)
  – Adaptation: proactive in response to changes in
    • user requirements and preferences
    • available resources and operating conditions


MAPS Ongoing Projects

• Energy-aware resource management in mobile and embedded systems
• Connection migration in mobile computing
• P2P file sharing and load balancing
• Mobile codes for network applications
• Service migration for adaptive grid
• Cluster-based Internet services
• Client-aware streaming service adaptation
• Service quality assurance and adaptation

[Figure: clients, servers, and a service overlay network]

Quality Assurance and Adaptation:

A Key to Next Generation of Stress-Resilient Internet Services


Outline

• User-Perceived Quality of Service
• The Problem and Related Work
• Approach I: Model Predictive Control
• Approach II: Model-Free Self-Tuning Fuzzy Control
• Performance Evaluation
• Summary


User-Perceived QoS

• Client-perceived response time includes network transfer time as well as server delay and processing time
• The network alone is not sufficient to support end-to-end QoS assurance

[Figure: access to www.wayne.edu, annotated with delay and processing time]


Critical Path Analysis

• Early studies (Barford and Crovella, 2001) showed:
  – For large files (>500K), user-perceived delay came mostly from network delay
  – For small files (~50K), server-side delay constituted up to 80% of latency
• Network/systems trends
  – Over-provisioning of network bandwidth makes QoS failure rare in the network core
  – Servers are more vulnerable to congestion and performance loss
    • Due to the open-access nature of Internet services
    • Caused by flash crowd-like DDoS attacks


Our Experience on PlanetLab

• Ran an Apache server at Wayne State under various loads
• Accessed it from clients in North America and Europe
• Server-side delay becomes the dominant factor when system utilization reaches 50%


Objectives

• QoS assurance and adaptation on servers
  – QoS-aware resource management to achieve guaranteed performance and resilience even in the face of system stress
    • Observe and respond to per-class traffic changes
    • Graceful performance degradation
  – In contrast to the best-effort, same-service-to-all model
• Perspectives for QoS assurance
  – On an indiscriminate Web site
    • Control the behavior of aggressive clients for fairness
    • Protect servers from flash crowd-like DDoS attacks
  – On an e-commerce site
    • Give higher priority to sessions of buyers than to those of visitors, without over-compromising the needs of occasional visitors
    • Guarantee the performance of purchase requests when the server is stressed


Problem Statement

• QoS control over requests in different classes
• Schedule requests for processing so as to provide predictable and controllable fair-sharing (PCF) services
  – Predictability: schedules must be consistent, independent of variations in the class workloads
  – Controllability: controllable parameters adjust quality factors between classes
  – Fairness: lower classes must not be over-compromised, especially when the workload is high

[Figure: IP network, central queue, dispatcher, per-class queues Q1, Q2, …, QN (queueing delay), IP network]


Related Work

• QoS-aware admission control
  – Early random dropping (Chen & Mohapatra, 1999)
  – Feedback control to bound utilization (Abdelzaher et al., 2002)
  – Session-based admission control (Cherkasova & Phaal, 2002)
  – The on/off admission-control model does not support graceful performance degradation
• Priority-based request scheduling
  – Differentiates QoS between classes of requests by setting priorities (Almeida et al., 1998; Eggert et al., 1999)
  – No guarantee of absolute/relative QoS
• Processing rate allocation
  – Queueing-model based: calculates the resource amount from a queueing model w.r.t. processing delay (Cardellini01, Zhu01, Pradhan02, Zhou04)
    • However, it relies on an accurate server model
    • Mean-value analysis provides control over the average quality of requests in the long run, but cannot control their QoS variance
• Model predictive feedback control


QoS Assurance

• Client-Perceived QoS Assurance
• Related Work
• Approach I: Model Predictive Control
• Approach II: Model-Free Self-Tuning



Model Predictive Feedback Control

• MPFC = queueing model + feedback control
  – Queueing model to estimate a processing rate
  – Feedback control to deal with the impact of traffic self-similarity and burstiness
• Performance metric: slowdown
  – Slowdown = queueing delay / service time
  – Requests have different service times; users tend to tolerate longer delays for “large” requests


MPFC Resource Allocation

• Classifier determines requests’ classes
• Scheduler dispatches requests to the server based on each class’s allocated processing rate
• QoS controller adjusts a class’s rate according to measured system conditions


Queueing Analysis of Slowdown

• Performance metric: slowdown
  – Slowdown = queueing delay (W) / service time (X)
• For a general M/G/1 FCFS queue with a bounded Pareto service-time distribution, the expected slowdown S is
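The slide's own equation is not reproduced in this text. As a hedged reconstruction from standard M/G/1 FCFS theory (not copied from the slide), the expected slowdown can be written as below. It assumes Poisson arrivals at rate λ, a stable queue (λE[X] < 1), and relies on the FCFS property that a request's queueing delay W is independent of its own service time X, with E[W] given by the Pollaczek-Khinchine formula.

```latex
\[
E[S] \;=\; E\!\left[\frac{W}{X}\right]
     \;=\; E[W]\,E\!\left[X^{-1}\right]
     \;=\; \frac{\lambda\,E[X^{2}]\,E[X^{-1}]}{2\,\bigl(1-\lambda E[X]\bigr)},
\qquad \lambda E[X] < 1 .
\]
```

The bounded Pareto assumption matters here because it keeps both E[X²] and E[X⁻¹] finite.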


Proportional Slowdown Differentiation

Determine the processing rate C_i for each class so that its slowdown S_i is proportional to its target quality factor δ_i:

C_i: processing rate of class i

δ_i: differentiation parameter of class i

Subject to
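The objective and its constraint are not preserved in this text. Reconstructed from the definitions above rather than taken from the slide, they can plausibly be stated as follows; the number of classes N and the total server capacity C are symbols assumed here for illustration.

```latex
\[
\frac{S_i}{S_j} \;=\; \frac{\delta_i}{\delta_j},
\qquad 1 \le i, j \le N,
\]
\[
\text{subject to}\quad \sum_{i=1}^{N} C_i \;=\; C .
\]
```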


Queueing Model-based Estimates

Processing rate of class i is

• First term: baseline rate of class i; prevents the class from being overloaded
• Second term: portion of the surplus rate, determined by the class's normalized arrival rate; controls quality differences between classes
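The two-term structure described above can be sketched in a few lines of Python. This is a reconstruction under stated assumptions, not the slide's own formula: every class shares one service-time distribution with mean E[X], a class served at rate c sees service times X/c, and the class rates sum to the server capacity. Under those assumptions, solving S_i/S_j = δ_i/δ_j gives a baseline of λ_i·E[X] plus a share of the surplus weighted by λ_i/δ_i. The function names and the sample numbers are hypothetical.

```python
def allocate_rates(arrival_rates, deltas, mean_service, capacity=1.0):
    """Split `capacity` so that predicted slowdowns are proportional to `deltas`.

    Assumes positive arrival rates and deltas, and a shared service-time
    distribution with mean `mean_service` (all illustrative assumptions).
    """
    if any(lam <= 0 for lam in arrival_rates) or any(d <= 0 for d in deltas):
        raise ValueError("arrival rates and deltas must be positive")
    load = sum(lam * mean_service for lam in arrival_rates)
    if load >= capacity:
        raise ValueError("offered load must stay below capacity")
    surplus = capacity - load
    # Normalized arrival rate of each class: (lam_i / delta_i) / sum_j (lam_j / delta_j)
    weights = [lam / d for lam, d in zip(arrival_rates, deltas)]
    total = sum(weights)
    # Baseline rate (keeps the class stable) plus its share of the surplus.
    return [lam * mean_service + surplus * w / total
            for lam, w in zip(arrival_rates, weights)]


def expected_slowdown(lam, rate, m1, m2, m_inv):
    """M/G/1 FCFS slowdown for a class served at `rate`.

    m1, m2, m_inv are E[X], E[X^2], E[1/X] at unit processing rate.
    """
    return lam * m2 * m_inv / (2.0 * (rate - lam * m1))


# Made-up example: two classes, class 2 allowed twice the slowdown of class 1.
lams, deltas = [0.2, 0.3], [1.0, 2.0]
rates = allocate_rates(lams, deltas, mean_service=1.0)
slowdowns = [expected_slowdown(l, c, 1.0, 4.0, 2.0) for l, c in zip(lams, rates)]
```

In this example the rates sum to the capacity and the predicted slowdown ratio equals δ_2/δ_1, which is the property the allocation is meant to provide on average.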


Properties of the Solution

• [Controllability] As the differential weight of a class increases, its quality factor increases
• [Self-adaptability] The quality factor of a class drops as its arrival rate increases
  – Resilience to flash crowd-like DDoS attacks, load surges, etc.
  – Guarantee good requests, block bad ones, and slow down suspicious ones
• [Self-management] A load decrease in a higher-weighted class causes a large quality increase for the others

Per-class quality factor:


Simulation Results

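• Simulation setting: exponential interarrival times and a bounded Pareto service-time distribution for each traffic class
• Targets are achieved on average
• Large variance: unstable quality

[Figure: simulated slowdown ratio; target = 8, gap between 95th and 5th percentiles = 25]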


Why large variance?

• Web traffic is dynamic in nature
• The processing rate is calculated from an arrival rate estimated from history
  – The estimate is inaccurate
  – The sum of errors ≈ 0, so the target ratio is achieved on average


Basic Ideas of MPFC

• Adjust a class’s processing rate according to errors (feedback) and the estimated arrival rate (queueing)
• Classical integral feedback control
  – Adjusts the service rate in proportion to the errors integrated over time
  – No steady-state error; insensitive to measurement noise
  – A long process delay poses a severe instability issue
• From the perspective of feedback control, a model-based estimate tackles the instability issue


Structure of MPFC

Rate predictor: estimates a class’s processing rate using queueing theory

Feedback controller: adjusts the rate allocation according to errors using integral control


Definition of Control Loop

• A control loop includes a reference input r(k), an output y(k), and an error e(k)
• Class 1 is the base class
• A control loop is associated with every other class

Reference input:

Loop output:

Error:
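The three definitions are not preserved in this text. One reading consistent with the proportional-slowdown target and with class 1 as the base class is given below; it is an assumption for illustration, not the slide's confirmed notation. Here S_i(k) denotes the slowdown of class i measured in sampling period k.

```latex
\[
r_i(k) = \frac{\delta_i}{\delta_1}, \qquad
y_i(k) = \frac{S_i(k)}{S_1(k)}, \qquad
e_i(k) = r_i(k) - y_i(k), \qquad i = 2, \dots, N .
\]
```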


Processing Rate using MPFC

MPFC output (rate of class i):

Predictor output (queueing theory):

Controller output (integral control):
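How a queueing-based prediction and an integral correction might be combined can be sketched as follows. The slide's equations are not present in this text, so the additive combination, the gain value, the error sign convention (target ratio minus measured ratio), and all names here are assumptions made for illustration.

```python
class IntegralRateController:
    """Integral controller: accumulates a rate correction from ratio errors."""

    def __init__(self, gain):
        self.gain = gain          # hypothetical tuning constant
        self.correction = 0.0     # accumulated (integrated) adjustment

    def update(self, target_ratio, measured_ratio):
        # Assumed convention: error = reference - output.
        error = target_ratio - measured_ratio
        # A measured slowdown ratio above target means the class is too slow,
        # so a negative error must raise its rate; hence the minus sign.
        self.correction -= self.gain * error
        return self.correction


def mpfc_rate(predicted_rate, correction, min_rate=0.0):
    """Combine the model-based prediction with the feedback correction."""
    return max(min_rate, predicted_rate + correction)


# Made-up trace: target slowdown ratio 2.0, measurements drifting toward it.
controller = IntegralRateController(gain=0.1)
history = []
for measured in (3.0, 2.5, 2.0):
    correction = controller.update(2.0, measured)
    history.append(mpfc_rate(0.4, correction))
```

The sketch adjusts one class in isolation. Keeping the per-class rates within the total server capacity would need an extra normalization step that is left out here.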


Simulation Results

• MPFC achieves the target consistently on both small and large time scales
• It assumes an M/Gp/1 server model for requests to single-object pages, and aims at maintaining the slowdown ratio

[Figure: simulated slowdown ratio under MPFC; target = 8, small variance]


18 objects

Challenges in QoS Assurance

• Dynamics of Internet traffic
  – No accurate models for requests
• Multi-object Web pages
  – Pageview quality vs. request response time
• Non-deterministic process delay
  – Long delay between the resource allocation time and the time when QoS is measured (observed)


Client-Experienced Pageview QoS

Current queuing models are limited to requests to single objects; no models available for multi-object Web pages

The multi-phase handshaking of the HTTP protocol makes it possible to take network conditions into account in resource allocation

[Figure: client/server timeline of a pageview: setup connection, base page, object 1, object 2, …, last object, waiting for new requests, connection close; contrasts client-perceived pageview QoS with request-based QoS]


Presentation Outline

• Client-Perceived QoS Assurance
• Related Work
• Approach I: Model Predictive Control
• Approach II: Model-Free Self-Tuning



eQoS: Model-Free Self-Tuning Control

• It monitors and controls client-perceived end-to-end pageview response time in Web servers
• It is middleware residing between the operating system and the Web server software

Fuzzy control provides a model-free way to translate heuristic control knowledge into a set of control rules


Service rate u(k+1) of a class in sampling period k+1 is adjusted according to its error e(k) and change of error ∆e(k) in previous sampling period k

Self-tuning fuzzy controller

The first level is a fuzzy resource controller that addresses the lack of an accurate server model

The second level is a fuzzy scaling-factor controller that compensates for the effect of process delay


Resource controller

Rule base contains quantified control knowledge about how to adjust a class’s service rate according to the e(k) and ∆e(k).


Experimental Setting

• Implemented as a plugin for Apache HTTP/1.1 on Linux
• Testbeds
  – PlanetLab, a worldwide distributed testbed
    • Server in Detroit, Michigan
    • Clients in Boston (RTT: 45 ms)
    • Clients in San Diego (RTT: 70 ms)
    • Clients in the UK (RTT: 130 ms)
  – Network simulator (Dummynet)
    • Random transmission time (RTT, packet loss)
    • RTT: 40, 80, and 180 ms
• Benchmarks
  – Surge workload generator
    • Maximum number of embedded objects: 150
    • Base: 30%, embedded objects: 38%, loner: 32%
  – World Cup 98 trace
    • Requests replayed by clients on PlanetLab against objects in the trace


Input Traffic Profile

• Workload is measured in terms of page requests
• Page requests from a class are stochastic and change frequently


Transient Behavior of eQoS

[Figure panels: on PlanetLab (World Cup trace); on PlanetLab (Surge)]

• Statistical guarantee of the target response time


Robustness of eQoS

• Self-adaptive to load changes
• Self-adaptive to network conditions


Performance Comparison

• Fuzzy controller without self-tuning
• Traditional proportional-integral (PI) controller, based on an M/G/1 model
• Adaptive PI controller (Kamra et al., IWQoS’04)
• All controllers are carefully tuned for RTT = 180 ms and load = 700 clients


Performance Relative to eQoS

• eQoS outperforms the others in most test cases

• eQoS is slightly worse than the static controller only in the case for which the latter was best tuned


Summary

• QoS assurance on Internet servers
  – Web servers, e-commerce servers, streaming servers
• User-perceived performance
  – Slowdown: normalized response time
  – Response time for multi-object Web pages
• Model predictive feedback control approach for queueing delays of individual requests, relative to their processing time
• Model-free self-tuning control approach for pageview response time
  – Robustness on both short and long time scales
  – Self-adaptive to changes in server load
  – Self-adaptive to network conditions


Related Publications

• Robust processing rate allocation for proportional slowdown differentiation on Internet servers, IEEE Trans. on Computers, 2005
• Resource allocation for session-based 2D service differentiation on e-commerce servers, IEEE Trans. on Parallel and Distrib. Systems, 2005
• Harmonic bandwidth allocation for QoS control on streaming servers, IEEE Trans. on Parallel and Distrib. Systems, 2004
• eQoS: Provisioning of client-perceived end-to-end QoS guarantees in Web servers, Proc. of IWQoS’05
• Modeling and analysis of 2D service differentiation on e-commerce servers, Proc. of IEEE ICDCS 2004
• Processing rate allocation for proportional slowdown differentiation on Internet servers, Proc. of IPDPS’04


Other MAPS Publications

• Energy-aware resource management
  – “Energy-aware modeling and scheduling of real-time tasks for dynamic voltage scaling”, IEEE RTSS’05
  – “Delay-constrained energy-efficient wireless packet scheduling”, Globecom’05
• Intelligent personalized info agent and prefetching
  – “Keywords-based semantic prefetching to tolerate Web access latency”, IEEE TKDE’04
• Continuous media adaptation for service differentiation on streaming servers
  – “Harmonic bandwidth allocation for QoS control on streaming servers”, IEEE TPDS’04
• Mobility support for network-centric, data-intensive applications
  – “Naplet: A flexible and reliable mobile agent framework”, IPDPS’02
  – “Mobile codes and security”, Handbook of Info Security, John Wiley & Sons, 2005
• Load balancing in a cluster of servers and overlay network
  – “Cycloid: A scalable and constant-degree lookup-efficient P2P overlay network”, Perf. Eval.’06
  – “Locality-aware randomized load balancing on DHT networks”, ICPP’05 and IPDPS’06
• Service migration for adaptive grid computing
  – “Service migration in distributed virtual machines for adaptive grid computing”, ICPP’04, ICPP’05
• Transparent connection migration in mobile computing
  – “A reliable connection migration mechanism for synchronous transient communication between mobile objects”, ICPP’04
• Scalable and Secure Internet Services and Architecture, Chapman & Hall/CRC Press, June 2005


MAPS Project in CIC@WSU

• MAPS: System support for mobility and adaptation in pervasive services
• Team
  – C. Xu, Principal Investigator
  – Visiting/Guest Faculty (3)
    • X. Zhou, G. Chen, Y.-S. Jeong
  – PhD Students (7)
    • J. Wei, H. Shen, X. Zhong, S. Fu, B. Liu, M. Xu, B. Wims
  – M.Sc. Thesis Students (5)
    • A. Brodie, W. Chen, R. Sudhindra, E. Henne, S. Shashidhara
• Funded by
  – U.S. NSF: ACI-0303592; NASA: 03-OBPR-01-0049
  – WSU Research Enhanced Program, Career Development Chair Award

http://cic.eng.wayne.edu


Thanks.

Cluster and Internet Computing Laboratory

Wayne State University, Detroit, Michigan

HTTP://www.cic.eng.wayne.edu

Backup: Self-tuning Rules


Rule-base design

[Figure: five zones (1 to 5) defined by the signs of e(k) and ∆e(k): e(k) > 0 and ∆e(k) < 0; e(k) < 0 and ∆e(k) > 0; e(k) < 0 and ∆e(k) < 0; e(k) > 0 and ∆e(k) > 0]

• Zone 1 and Zone 3: self-correcting; slow down/speed up the current trend
• Zone 2 and Zone 4: moving away; reverse the current trend
• Zone 5: small e and ∆e; maintain the current trend
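The zone logic above can be illustrated with a small classifier. This is a sketch under assumptions: the mapping of zone numbers to specific sign combinations is not recoverable from this text, so the function returns descriptive labels instead of zone numbers, and the threshold `eps` for "small" is a hypothetical parameter.

```python
def classify_trend(e, de, eps=0.05):
    """Classify the control state from the error e(k) and its change de(k).

    Returns "steady" (small e and de), "self-correcting" (opposite signs,
    the error is shrinking), or "moving-away" (same signs, the error grows).
    """
    if abs(e) <= eps and abs(de) <= eps:
        return "steady"
    if e * de < 0:
        return "self-correcting"
    # Same signs. A zero product (error not shrinking) is also treated as
    # moving away here, which is an arbitrary conservative choice.
    return "moving-away"
```

For instance, `classify_trend(0.4, -0.1)` falls in the self-correcting case, while `classify_trend(-0.4, -0.1)` is moving away from the target.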


Rule-base design (cont.)

Rules are described as IF-THEN statements using linguistic values

Linguistic values

Linguistic value    Meaning
PL (NL)             Positive (negative) large
PM (NM)             Positive (negative) medium
PS (NS)             Positive (negative) small
ZE                  Zero


Rule-base design (cont.)

IF error is NM and change of error is NL, THEN change of service rate is PL
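A rule of this form can be evaluated with a small fuzzy inference routine. The sketch below is not the eQoS rule base: only the single rule quoted above comes from the slides. The triangular membership functions on a normalized range [-1, 1], the rule table (output level = clamped negative sum of the two input levels, a common textbook choice that happens to reproduce the quoted rule), and the weighted-average defuzzification are all assumptions.

```python
LABELS = ["NL", "NM", "NS", "ZE", "PS", "PM", "PL"]
CENTERS = [-1.0, -2 / 3, -1 / 3, 0.0, 1 / 3, 2 / 3, 1.0]
WIDTH = 1 / 3  # half-width of each triangular membership function


def memberships(x):
    """Degree of membership of x in each linguistic value (x clamped to [-1, 1])."""
    x = max(-1.0, min(1.0, x))
    return [max(0.0, 1.0 - abs(x - c) / WIDTH) for c in CENTERS]


def rule_output(i, j):
    """Hypothetical rule table: index of the output label for input labels i, j."""
    level = -((i - 3) + (j - 3))          # levels run from -3 (NL) to +3 (PL)
    return max(-3, min(3, level)) + 3


def fuzzy_delta_rate(e, de):
    """Normalized change of service rate from normalized e(k) and de(k)."""
    num = den = 0.0
    for i, mu_e in enumerate(memberships(e)):
        for j, mu_de in enumerate(memberships(de)):
            strength = min(mu_e, mu_de)   # AND of the two rule premises
            if strength > 0.0:
                num += strength * CENTERS[rule_output(i, j)]
                den += strength
    return num / den if den else 0.0
```

With these assumptions, an error of NM combined with a change of error of NL fires the rule whose output is PL, matching the example rule; an actual controller would then scale this normalized output into a service-rate change.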


Scaling factor controller

• e(k) is large
  – e(k) and ∆e(k) have the same sign
    • Far away from the target and moving farther away: large change of resource allocation
  – Different signs
    • Moving closer: small change of resource allocation
• e(k) is small
  – Change the resource allocation to prevent overshoot or undershoot according to transient states


Scaling factor controller (cont.)

Linguistic value    Meaning
ZE                  Zero
VS                  Very small
SM                  Small
SL                  Small large
ML                  Medium large
LG                  Large
VL                  Very large


Scaling factor controller (cont.)

e(k) is large, ∆e(k) has same sign, large change of resource allocation (VL: very large)

e(k) is large, ∆e(k) has different sign, small change of resource allocation (VS: very small)
