DEXA 2005 Control-based Quality Adaptation in Data Stream Management Systems (DSMS) Yicheng Tu†, Mohamed Hefeeda‡, Yuni Xia†, Sunil Prabhakar†, and Song Liu ¥ †Department of Computer Sciences, Purdue University, USA ‡School of Computing Science, Simon Fraser University at Surrey, Canada ¥ School of Mechanical Engineering, Purdue University, USA

Dec 15, 2015


DEXA 2005

Control-based Quality Adaptation in Data Stream

Management Systems (DSMS)

Yicheng Tu†, Mohamed Hefeeda‡, Yuni Xia†, Sunil Prabhakar†, and Song Liu¥

†Department of Computer Sciences, Purdue University, USA

‡School of Computing Science, Simon Fraser University at Surrey, Canada

¥ School of Mechanical Engineering, Purdue University, USA


Data Stream Management

• Continuous data, discarded after being processed

• Continuous query
• Data-active, query-passive model
• Applications
  – Financial analysis
  – Mobile services
  – Sensor networks
  – Network monitoring
  – More …

[Figure: a DSMS serving several users; labels: Data, Query, Results]


DSMS architecture

• Network of query operators (O1 – O3)

• Each operator has its own queue (q1 – q4)

• Scheduler decides which operator to execute

• Query results (Q1, Q2) pushed to clients

• Example systems:
  – Aurora/Borealis
  – STREAM


Quality-of-Service (QoS) in DSM

• Data processing is QoS-critical in DSMS
  – Tuple delay is the major concern: results generated from old data are useless!
• A highly dynamic environment makes QoS hard to maintain
  – Bursty data input
  – Unpredictable unit processing cost
• Overloading during spikes → degraded (delay) QoS
• Solution: adjust the following (i.e., quality adaptation)
  – Sampling rate (source side)
  – Data loss (DSMS side) → load shedding


Load Shedding

• Eliminating excessive load by dropping data items → fewer QoS violations
• Basic algorithm (Tatbul et al., 2003): periodically
• CPU is the bottleneck resource
• Key questions
  – When?
  – How much?
  – Where?
  – Which tuples?


What’s missing?

• Current solutions focus on steady-state performance

• Assuming input level changes between stable states

• However, arrivals are bursty in practice – always in transient state

• Taking averages (baseline) wouldn’t work

[Figure: load over time, compared against CPU capacity]


Our approach

• View load shedding as a feedback control problem
• Feedback control: manipulation of system behavior by adjusting system input based on system output
  – Cruise control of automobiles, room temperature control, etc.
• The feedback control loop:
  – Plant
  – Monitor
  – Controller
  – Actuator
• How it works
  – Error = measured output – desirable output
  – Focal point: the controller, which maps the error to a control signal


Why Feedback Control ?

• Maintain system performance under internal/external uncertainties

• Control theory provides tools to choose and tune the controller toward desired performance
  – The current load shedding solution is also feedback-based
  – Difference: we use control theory to guide the controller design
• Steps of problem-solving using control theory
  1. Map the problem to a feedback control loop; determine input/output
  2. System identification: model the input/output relationship
  3. Controller design: can be done analytically


The feedback control loop

• Plant: current DSMS
  – Input: load admitted
  – Output: delay QoS
  – Reference output: specified by DBA
• Actuator
  – adaptor: load shedder
  – admission controller
• Monitor: new
• Controller: new
• System dynamics: disturbances
• Discrete control: control period T
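The loop components listed above can be wired together in a few lines. The following is a minimal illustration, not the authors' implementation: `ToyPlant` is a hypothetical single-queue stand-in for the DSMS, the delay estimate (backlog divided by capacity) and the proportional control rule are assumptions chosen for brevity, and the error uses the slides' convention (measured output minus desirable output).

```python
from dataclasses import dataclass


@dataclass
class ToyPlant:
    """Hypothetical stand-in for the DSMS: one FIFO queue with a fixed service rate."""
    capacity: int      # tuples processed per control period (assumed constant)
    backlog: int = 0   # tuples still queued at the end of the period

    def step(self, admitted: int) -> float:
        """Process one control period and return the monitored delay (in periods)."""
        self.backlog = max(0, self.backlog + admitted - self.capacity)
        return self.backlog / self.capacity


def run_loop(arrivals, target_delay, capacity, gain):
    """Run the loop once per control period; return (delays, shed counts)."""
    plant = ToyPlant(capacity)
    delay = 0.0
    delays, shed = [], []
    for offered in arrivals:
        # Controller: map the error to a desired admitted load (proportional rule).
        error = delay - target_delay
        desired = capacity * (1.0 - gain * error)
        # Actuator: the load shedder admits at most the desired load.
        admitted = min(offered, max(0, round(desired)))
        # Plant + monitor: process the admitted load and measure the delay.
        delay = plant.step(admitted)
        delays.append(delay)
        shed.append(offered - admitted)
    return delays, shed
```

Under a constant overload, for example `run_loop([150] * 20, target_delay=1.0, capacity=100, gain=0.5)`, the shedder in this toy model begins discarding tuples as the measured delay approaches the reference, and the delay settles near the reference instead of growing without bound.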


System identification

• To build dynamic model that describes the relationship between input and output

• Most systems can be modeled by the following linear difference equation:

$$O(k) = \sum_{i=1}^{n} a_i\, O(k-i) + \sum_{i=1}^{n} b_i\, I(k-i)$$

  – I(x): input at period x
  – O(x): output at period x
  – n: order of the equation
  – a_i, b_i: system-specific coefficients
• Determine n, a_i, b_i by experiments using synthetic inputs
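As a rough illustration of the identification step, the sketch below fits the first-order case (n = 1) of the difference equation, O(k) = a·O(k-1) + b·I(k-1), by ordinary least squares. The function names and the choice of least squares are assumptions made for this example; the slides do not say which fitting procedure was used.

```python
def simulate_first_order(inputs, a, b, o0=0.0):
    """Generate outputs of O(k) = a*O(k-1) + b*I(k-1) for a given input sequence."""
    outputs = [o0]
    for k in range(1, len(inputs)):
        outputs.append(a * outputs[k - 1] + b * inputs[k - 1])
    return outputs


def fit_first_order(inputs, outputs):
    """Least-squares estimate of (a, b) in O(k) = a*O(k-1) + b*I(k-1)."""
    s_oo = s_oi = s_ii = s_yo = s_yi = 0.0
    for k in range(1, len(outputs)):
        o_prev, i_prev, y = outputs[k - 1], inputs[k - 1], outputs[k]
        s_oo += o_prev * o_prev
        s_oi += o_prev * i_prev
        s_ii += i_prev * i_prev
        s_yo += y * o_prev
        s_yi += y * i_prev
    # Solve the 2x2 normal equations [[s_oo, s_oi], [s_oi, s_ii]] @ [a, b] = [s_yo, s_yi].
    det = s_oo * s_ii - s_oi * s_oi
    if det == 0:
        raise ValueError("input is not rich enough to identify both coefficients")
    a = (s_yo * s_ii - s_yi * s_oi) / det
    b = (s_yi * s_oo - s_yo * s_oi) / det
    return a, b
```

Feeding a varied synthetic input through `simulate_first_order` and passing the result to `fit_first_order` should recover the coefficients used to generate it; with measured data, the residual of the fit gives a first indication of whether a first-order structure is adequate.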


Controller design

• PI controller:

$$I_d(k) = g\,E(k) + r \sum_{i=0}^{k} E(i)$$

  – E(k): error
  – g, r: controller coefficients
  – I_d(k): desirable input
• More efficiently:

$$I_d(k) = I_d(k-1) + (g + r)\,E(k) - g\,E(k-1)$$

• Transfer function of the PI controller:

$$C(z) = \frac{(g + r)\,z - g}{z - 1}$$

• For example, a second order system has TF:

$$G(z) = \frac{O(z)}{I(z)} = \frac{b_1 z + b_2}{z^2 - a_1 z - a_2}$$

• Closed-loop TF (CLTF): formed by closing the feedback loop around C(z) and G(z)
• Determine g and r by pole placement of the CLTF (details skipped)
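To make the "more efficient" form concrete, the sketch below compares two ways of evaluating a PI law, assuming it has the form I_d(k) = g·E(k) + r·(E(0) + … + E(k)) with zero initial conditions (I_d(-1) = 0, E(-1) = 0). The function names are hypothetical. The incremental version keeps only the previous control signal and previous error instead of the full error history.

```python
def pi_positional(errors, g, r):
    """Evaluate I_d(k) = g*E(k) + r*sum_{i=0..k} E(i) for every period k."""
    signals, running_sum = [], 0.0
    for e in errors:
        running_sum += e
        signals.append(g * e + r * running_sum)
    return signals


def pi_incremental(errors, g, r):
    """Same law in recursive form: I_d(k) = I_d(k-1) + (g + r)*E(k) - g*E(k-1)."""
    signals, prev_signal, prev_error = [], 0.0, 0.0
    for e in errors:
        signal = prev_signal + (g + r) * e - g * prev_error
        signals.append(signal)
        prev_signal, prev_error = signal, e
    return signals
```

Both functions return the same sequence up to floating-point rounding; the recursive form is the one a controller running once per control period would typically use, since its cost and memory do not grow with k.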


Actuator (load shedder) design

• Id(k) is the desirable load (# of data tuples) entering the DSMS during the next control period k
• Let S(k) be the real load during period k; we need to discard S(k) - Id(k) tuples
• Two implementations of load shedder:
  – Admit the first Id(k) tuples during period k
    • Pros: easy to implement, generates a (100%) accurate control signal
    • Cons: skewed toward the early arrivals
  – Sampling-based shedding: each tuple is discarded with probability 1-p, where p = Id(k) / S(k)
    • However, S(k) is unknown at the beginning of period k
    • Solution: use S(k-1) to estimate S(k); this does not affect controller performance (see backup slide)
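The two shedder variants can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, tuples are represented as a plain list for one control period, and the sampling variant uses the previous period's load S(k-1) as the estimate of S(k), as the slide suggests.

```python
import random


def admit_first(tuples, desired_load):
    """Variant 1: admit the first Id(k) tuples of the period and drop the rest."""
    return tuples[:max(0, desired_load)]


def sample_shed(tuples, desired_load, prev_load, rng):
    """Variant 2: keep each tuple with probability p = Id(k) / S(k-1).

    prev_load is S(k-1), used because S(k) is unknown when period k starts.
    """
    if prev_load <= 0:
        p = 1.0  # no history yet: admit everything (an assumption of this sketch)
    else:
        p = min(1.0, max(0.0, desired_load / prev_load))
    return [t for t in tuples if rng.random() < p]
```

The first variant hits the target exactly but always favors early arrivals; the second spreads the loss evenly over the period, at the cost of meeting the target only in expectation. Passing a seeded `random.Random` instance as `rng` keeps runs reproducible.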


Determining control period

• Control period T is critical in controller design
• Two primary concerns in setting T
  – Should be short enough to capture the changes of input rate
    • Nyquist-Shannon sampling theorem
    • The shorter the better
  – Output signal (delay) is measured as an average over all data tuples in one control period
    • T too short → small number of sampled tuples
    • T cannot be too short, as the output signal may fail to represent the real system status
• We trade off these two factors and set T to one second


Experiments

• We evaluate our control-based solution by simulations

• Set four classes of delays: 500ms – 2000ms

• Operator scheduling policy: Earliest Deadline First
  – Input: CPU utilization
  – Output: deadline miss ratio
• Small query network with 13 operators
• Stream data:
  – Synthetic: Poisson, Pareto
  – Real: TCP traces
• Comparison: static shedding
  – Amount of shedding follows a pre-determined STEPSIZE
  – Similar to TCP rate control
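The slides do not spell out the update rule of the static-shedding baseline. One plausible reading, offered purely as an assumption, is a fixed-step rule: raise the shed fraction by STEPSIZE whenever the measured deadline miss ratio exceeds the target, and lower it by STEPSIZE otherwise. The function name and signature below are hypothetical.

```python
def static_step(shed_fraction, miss_ratio, stepsize, target=0.0):
    """One update of a fixed-step shedder (assumed rule, not taken from the slides).

    shed_fraction: fraction of arriving tuples currently dropped, in [0, 1]
    miss_ratio:    deadline miss ratio measured over the last period
    stepsize:      the pre-determined STEPSIZE
    """
    if miss_ratio > target:
        shed_fraction += stepsize   # overloaded: shed more
    else:
        shed_fraction -= stepsize   # target met: shed less
    return min(1.0, max(0.0, shed_fraction))
```

Under this reading, the adjustment has the same size regardless of how large the error is, which is why the choice of STEPSIZE matters so much: a small step reacts slowly to bursts, while a large step over-sheds.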


Simulation results: Poisson inputs

Target deadline miss ratio (control goal) is set to zero

[Figure: two panels, Inputs and Outputs]


Simulation results: bursty inputs

[Figure panels: a. Pareto; b. TCP trace]

• Far fewer deadline misses than static shedding

• The same or lower level of data loss (load shed)

• Hard to get an appropriate STEPSIZE in static shedding – not a problem in control-based approach


Summary

• Load shedding is an important quality adaptation method
• Current solutions, which focus on steady-state performance, do not work well under bursty inputs
• We propose an approach, based on feedback control theory, to guide load shedding in a highly dynamic environment
• Initial experimental results from simulation show the promising potential of our approach


Verification of model

First order linear model


Simulation: unpredictable unit processing cost

Control-based method learns the real cost


Controller stability after replacing S(k) with S(k-1)

Let Id’(k) be the input signal as a result of using S(k-1) instead of S(k), we have

Id’(k) = p S(k-1)

and thus

S(k-1) Id (k) = S(k) Id’(k) .

In the z-domain, we get

Id (k) = z Id’(k) .

Plugging above into the CLTF, we have

According to control theory, the controller is still stable.


Ongoing work

• Performed all three steps in a real DSMS – the Borealis system
• We set output to average delay
• System identification gives a first-order model structure
• Control function:

$$I_d(k) = b_0\,E(k) + b_1\,E(k-1) + a\,I_d(k-1)$$

• Controller analysis gives the following set of parameters:

$$b_0 = 0.4,\quad b_1 = 0.31,\quad a = 0.8$$


Ongoing work: results

• Control target: 2000ms
• Comparison:
  – Adaptive: static shedding
  – BASELINE
  – NON-CTRL
• Metrics:
  – Total delay violations
  – Total delayed tuples
  – Max delay
  – Load shed


Ongoing work: results