Adaptive Provisioning of Stream Processing Systems in the Cloud


Javier Cerviño#1, Eva Kalyvianaki*2,

Joaquín Salvachúa#3, Peter Pietzuch*4

# Universidad Politécnica de Madrid, * Imperial College London


SMDB 2012


Data Stream Processing Systems (DSPS)

•  Real-time processing of continuous data
•  Financial trading, sensor networks, etc.
•  Data from sources arrive as streams
   –  Time-ordered sequence of tuples
•  Characteristics
   –  Tuple arrival rates are not uniform
•  Performance requirements
   –  Low latency
   –  Guaranteed throughput
•  Adaptive provisioning
   –  Use resources on demand


Cloud Computing

Cloud offers elastic computing by providing resources on demand
   –  Characteristics
      •  Scalability
      •  Geographical distribution
      •  Virtualization
      •  Application Programming Interface (API)
   –  Amazon EC2
      •  Public cloud provider
      •  Infrastructure as a Service
      •  Images and virtual machines


Related work

•  Cloud stream processing [Kleiminger et al, SMDB’11]
•  Cloud network performance
   –  Do cloud and Internet paths support streaming data into cloud DCs? [Barker et al, MMSys’07], [Wang et al, INFOCOM’10], [Jackson et al, CLOUDCOM’10]
•  Cloud computation performance
   –  Do best-effort VMs support low-latency, low-jitter and high-throughput stream processing? [Barker et al, MMSys’07]
   –  Computational power of Amazon EC2 VMs for standard stream processing tasks? [Dittrich et al, VLDB’10]


Contributions

•  Explore the suitability of cloud infrastructures for stream processing (case study on Amazon EC2)
   –  Measure network and processing latencies, jitter and throughput
•  An adaptive algorithm to allocate cloud resources on demand
   –  Resizes the number of VMs in a DSPS deployment
•  Algorithm evaluation
   –  Deploying the algorithm as part of a DSPS on Amazon EC2


Outline

1.  Cloud Performance
    1.  Network Measurements
    2.  Processing Measurements
    3.  Discussion
2.  Adaptive Cloud Stream Processing
    1.  Architecture
    2.  Algorithm
3.  Experimental Evaluation
    1.  Description
    2.  Results
4.  Future Work and Conclusions


Cloud Performance Network Measurements

•  Goal: explore network parameters that affect stream processing conditions
   –  Jitter, latency and bandwidth
•  Experimental set-up
   –  Stream engines
      •  Mock engines without processing
      •  9 Amazon EC2 instances: 3 in US, 3 in EU and 3 in Asia
      •  Large Amazon EC2 instances: 7.5 GB and 4 ECU
   –  Stream sources
      •  9 distributed PlanetLab nodes: 3 in US, 3 in EU and 3 in Asia
   –  Dataset
      •  Random data at three different data rates: 10 kbps, 100 kbps and 1 Mbps

[Figure: sources (PlanetLab nodes) and processing engines (cloud instances) located in Asia, Europe and the USA]



•  Average jitter is less than 2.5 μs
•  Some outliers have a value of almost 4 seconds
•  Low jitter, with fewer than 3% high outliers

[Figure: jitter (ms) for PlanetLab nodes 1–9 at high, medium and low data rates]
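The slides do not say how jitter was computed. As a minimal sketch, assuming jitter is the absolute deviation of each inter-arrival gap from the nominal sending interval, the average jitter and the share of high outliers could be derived from arrival timestamps as follows. The function name, the outlier threshold and the sample timestamps are all hypothetical.

```python
from statistics import mean


def jitter_stats(arrivals_ms, interval_ms, outlier_ms):
    """Return (average jitter, fraction of outliers) for one stream.

    Assumption: jitter is |inter-arrival gap - nominal interval|; the
    slides do not state the exact definition used in the measurements.
    """
    gaps = [b - a for a, b in zip(arrivals_ms, arrivals_ms[1:])]
    jitter = [abs(gap - interval_ms) for gap in gaps]
    outliers = sum(1 for j in jitter if j > outlier_ms)
    return mean(jitter), outliers / len(jitter)


# Hypothetical arrival times (ms) for tuples sent every 10 ms;
# one tuple is delayed by about 100 ms.
arrivals = [0, 10, 20, 31, 40, 50, 160, 170, 180, 190, 200]
avg_jitter, outlier_share = jitter_stats(arrivals, interval_ms=10, outlier_ms=50)
print(avg_jitter, outlier_share)
```

A single delayed tuple dominates the average here, which is why the slide reports the average and the outlier share separately.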


•  Application-level delay involves processing time: t_sent − t_received
•  Network-level delay between the source and the engine: RTT
•  Cloud DC does not increase application-level delay

[Figure: network-level round-trip time (ms) against application-level round-trip time (ms) for America, Asia and Europe, with the ideal line]


Cloud Performance Processing Measurements

•  Goal
   –  Explore performance variation with time of day (processing and latency)
   –  Check if cloud VMs can scale efficiently with varying input rate
•  Experimental set-up
   –  Dataset
      •  Esper benchmark tool
      •  Stream of shares and stock values for a given symbol at a fixed rate (30,000 tuples/sec)
   –  Submitter
      •  10 extra large Amazon EC2 VMs: 15 GB, 8 ECU
   –  Nodes
      •  10 small Amazon EC2 VMs: 1.7 GB, 1 ECU



•  Throughput remains relatively stable over the measurement period
•  Latency suffers more from unpredictable outliers
•  No obvious pattern to correlate performance with time of day

[Figure: latency (ms) and throughput (tuples/s) against time of day (7–19, 24-hour format) for Day 1 and Day 2]



•  Cloud VMs can be used to scale efficiently with an increasing input rate
•  The number of VMs depends on their type, as expected

[Figure: throughput (tuples/s) against input data rate (×10,000 tuples/s) for small and large VM instances]


Adaptive Cloud Stream Processing

•  Elastic stream processing system that scales the number of VMs to input stream rates
•  Goals
   –  Low latency with a given throughput
   –  Keep VMs operating at their maximum processing capacity
•  Workload is partitioned and balanced across multiple VMs
•  Many VMs are available to scale up and down to workload demands
•  Collector gathers results from engines and processes additional queries

[Diagram: the stream source (source 1, source 2) is split across several VMs, each running an engine that executes Sub-query 1; a collector executes Sub-query 2]


Adaptive Cloud Stream Processing Algorithm I

•  Gathering and calculation
   –  Gathers processing rates from VMs
   –  Obtains
      •  Total extra processing rate (Extra Rate)
      •  Average processing rate per VM (Average Rate)

[Diagram: N virtual machines, each running Esper, report their processing rates. Input Rate − Proc. Rate (input from the tuple submitter) is summed (Σ) into the Extra Rate; the processing rates are summed and divided (/) by N to give the Average Rate]
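The gathering stage above can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the function name `gather_rates` and the sample rates are hypothetical, and each VM is assumed to be expected to process an equal share of the input rate.

```python
def gather_rates(total_in_rate, vm_rates):
    """Return (extra_rate, average_rate) from per-VM processing rates."""
    n = len(vm_rates)
    expected_rate_per_vm = total_in_rate // n  # equal share of the input
    # Extra Rate: input that the VMs were expected to process but did not.
    extra_rate = sum(expected_rate_per_vm - rate for rate in vm_rates)
    # Average Rate: mean processing rate actually achieved per VM.
    average_rate = sum(vm_rates) // n
    return extra_rate, average_rate


# Hypothetical numbers: 100,000 tuples/s offered to two VMs that each
# manage only 30,000 tuples/s.
print(gather_rates(100_000, [30_000, 30_000]))  # (40000, 30000)
```

A positive Extra Rate means the VMs are falling behind the offered input.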


Adaptive Cloud Stream Processing Algorithm II

•  Decision stage
   –  Calculates the new number of machines (N’)
   –  Scale up
      •  Stores the average rate as the maximum average rate
   –  Scale down
      •  Uses the last maximum average rate

[Diagram: decision on "Extra Rate > 0?". Yes (scale up): the Extra Rate is divided by the Average Rate and added to N, and the Average Rate is stored. No (scale down): the Input Rate is divided by the stored Average Rate. Returns N’]
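The decision stage can be sketched as a small function. This is a hedged illustration: the name `decide_vm_count` and the sample numbers are hypothetical, and leaving the VM count unchanged when the Extra Rate is exactly zero is an assumption, since the slide does not cover that case.

```python
import math


def decide_vm_count(extra_rate, average_rate, total_in_rate, n, max_rate_per_vm):
    """Return (new_n, max_rate_per_vm) for one run of the decision stage."""
    if extra_rate > 0:
        # Scale up: the VMs are saturated, so their average rate is
        # stored as the new maximum rate per VM.
        new_n = n + math.ceil(extra_rate / average_rate)
        return new_n, average_rate
    if extra_rate < 0:
        # Scale down: size the deployment from the last stored maximum.
        return math.ceil(total_in_rate / max_rate_per_vm), max_rate_per_vm
    return n, max_rate_per_vm  # assumption: no change when extra rate is 0


print(decide_vm_count(40_000, 30_000, 100_000, 2, 30_000))   # (4, 30000)
print(decide_vm_count(-10_000, 30_000, 50_000, 4, 30_000))   # (2, 30000)
```

Storing the average rate only while scaling up means the scale-down branch always divides by a rate that was observed under overload.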


Experimental Evaluation Description

•  Goals
   –  Adaptability of the algorithm against varying input rates
   –  Implications of adaptation for stream processing performance
•  Experimental set-up
   –  Integrated with the Esper processing engine
   –  Framework to control VMs and to collect performance metrics
      •  Throughput, processing latency and network latency
      •  Collection of shell scripts
   –  Deployed on Amazon EC2

[Diagram: deployment on Amazon EC2 with a controller. Esper tuple submitters act as the stream source (random values of different stock symbols) and feed several VMs, each running Esper with the same Sub-query 1 (maximum value of each stock symbol per second); a further engine runs Sub-query 2 (collection and merge of all results)]
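The two sub-queries of this set-up can be illustrated in plain Python. This is only a sketch of the partition-and-merge idea, not Esper code: the function names, the round-robin split and the tick data are hypothetical.

```python
def sub_query_1(ticks):
    """Per-engine sub-query: maximum price per (second, symbol)."""
    maxima = {}
    for second, symbol, price in ticks:
        key = (second, symbol)
        maxima[key] = max(price, maxima.get(key, price))
    return maxima


def sub_query_2(partial_results):
    """Collector sub-query: merge the partial maxima from all engines."""
    merged = {}
    for partial in partial_results:
        for key, price in partial.items():
            merged[key] = max(price, merged.get(key, price))
    return merged


# Hypothetical ticks (second, symbol, price), split round-robin over 2 engines.
ticks = [(0, "AAA", 10.0), (0, "BBB", 5.0), (0, "AAA", 12.0),
         (0, "BBB", 4.0), (1, "AAA", 11.0), (1, "AAA", 9.0)]
partitions = [ticks[0::2], ticks[1::2]]
result = sub_query_2(sub_query_1(p) for p in partitions)
print(result)
```

Because the maximum of partial maxima equals the global maximum, the merged result matches what a single engine would compute over the whole stream, whichever way the ticks are split.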


Experimental Evaluation Results

[Figure: two panels, Large Instances and Small Instances, each plotting input rate and tuples dropped (tuples/sec) and the number of nodes (number of VMs) against time (sec)]

•  Processing latency remains low: 7–28 μs
•  Scales the number of VMs up and down as required by the input rate
•  There is a significant reaction delay before VMs are scaled up and down
•  VMs are pre-allocated


Future Work

•  Investigate ways to reduce the reaction delay to performance violations
•  Predict the future behaviour of input data rates
•  Investigate cost models for allocation of small and large VM instances
•  Evaluate our system in other cloud environments
•  Extensive evaluation over longer periods of time and different VM types


Conclusions

•  An adaptive approach to provision stream processing systems in the cloud
•  Public clouds are suitable for stream processing
•  Network latency is the dominating factor in public clouds
•  Our approach can adaptively scale the number of VMs to input rates
•  Processing latency and data loss remain low

Javier Cerviño email: [email protected]

Thank you!

Questions?


Adaptive Cloud Stream Processing Algorithm

[Figure: throughput (tuples/s) against input data rate (×10,000 tuples/s), one panel for small VM instances and one for large VM instances]

Fig. 5. Increase in throughput with different instance sizes on Amazon EC2 (Different shades/colours correspond to different VMs.)

Fig. 6. Elastic DSPS with query partitioning across processing VMs

ture itself does not increase end-to-end latency significantly. Therefore, it is preferable to deploy stream processing engines at cloud sites within close network proximity to sources.

Second, it is important to consider that jitter suffers from high outliers that can be orders of magnitude above the average. Typically systems compensate for jitter through buffering or discarding of late-arriving data. In our experiments, discarding delayed data items would have resulted in a small percentage of lost data (approx. 3%).
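The discarding strategy mentioned above can be sketched as a simple deadline filter. This is an illustrative assumption about how late data might be dropped, not a description of the measured system; the function name, the 500 ms deadline and the sample tuples are hypothetical.

```python
def discard_late(tuples, max_delay_ms):
    """Split tuples into kept and discarded payloads by delivery delay.

    Each tuple is (sent_ms, received_ms, payload).
    """
    kept, dropped = [], []
    for sent_ms, received_ms, payload in tuples:
        if received_ms - sent_ms <= max_delay_ms:
            kept.append(payload)
        else:
            dropped.append(payload)
    return kept, dropped


# Hypothetical stream: one tuple is held up for almost 4 seconds.
stream = [(0, 80, "a"), (10, 95, "b"), (20, 3900, "c"), (30, 112, "d")]
kept, dropped = discard_late(stream, max_delay_ms=500)
loss = len(dropped) / len(stream)
print(kept, dropped, loss)
```

The deadline trades data loss against latency: a tighter deadline keeps end-to-end delay bounded but discards more tuples.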

In summary, when deploying a DSPS in a public cloud, it is necessary to understand the trade-offs when scaling to different numbers of VMs. A challenging issue is to decide on the right number of VMs and their instance types to support a given stream processing workload. After deployment, it is necessary to monitor the performance of processing VMs, and if they show decreasing throughput, to scale out to more VMs.

IV. ADAPTIVE CLOUD STREAM PROCESSING

We now present an adaptive algorithm to scale the number of VMs required to deploy a DSPS in the cloud. Our goal is to build an elastic stream processing system that resizes the number of VMs in response to input stream rates. The goal is to maintain low latency with a given throughput, while keeping VMs operating at their maximum processing capacity. We assume that a workload can be partitioned among multiple VMs, balancing streams equally across them. We also assume that there are always sufficiently many VMs available to scale up to workload demands.

As shown in Fig. 6, we assume that the DSPS executes a query, which can be decomposed across multiple VMs by splitting the query into sub-queries, each processing a sub-stream on a given engine. The input stream can be equally partitioned into sub-streams. For example, queries that compute aggregate and topK functions are naturally decomposable in this fashion. The results from sub-queries are then sent to a collector that merges them by executing another sub-query, emitting the overall query result. We further assume that load shedding is employed by the DSPS in overloaded conditions to sustain low-latency processing.

Algorithm 1 Adaptive provisioning of a cloud-based DSPS
Require: totalInRate, N, maxRatePerVM
Ensure: N′ s.t. projRatePerVM × N′ = totalInRate
  expRatePerVM = ⌊totalInRate / N⌋
  totalExtraRateForVMs = 0; totalProcRate = 0
  for all deployed VMs do
    totalExtraRateForVMs += expRatePerVM − getRate(VM)
    totalProcRate += getRate(VM)
  end for
  avgRatePerVM = ⌊totalProcRate / N⌋
  if totalExtraRateForVMs > 0 then
    N′ = N + ⌈totalExtraRateForVMs / avgRatePerVM⌉
    maxRatePerVM = avgRatePerVM
  else if totalExtraRateForVMs < 0 then
    N′ = ⌈totalInRate / maxRatePerVM⌉
  end if
  projRatePerVM = totalInRate / N′
  return N′

Our proposed provisioning algorithm uses a black-box approach, i.e. it is independent of the specifics of queries running in the DSPS. It scales the number of VMs used solely based on measurements of input stream rates. It detects an overload condition when a decrease in the processing rate of input data occurs because of discarded data tuples due to load shedding. The algorithm is invoked periodically and calculates the new number of VMs that are needed to support the current workload demand. This number can be larger than (when the system is overloaded and requires more resources), smaller than (when the system has spare capacity) or equal to the current number of engines. The aim is to maintain the required number of VMs, operating almost at their maximum capacity.

A. Algorithm

We present the provisioning algorithm more formally in Alg. 1. The algorithm takes as input the aggregate rate of the input stream, totalInRate, and the number of VMs currently used by the DSPS, N. It also takes maxRatePerVM, which is the maximum rate that a single VM can process, from previous invocations based on measurements in overload conditions. The algorithm takes a conservative approach, in which it gradually increases the number of VMs to reach the required set for sustainable processing. The output of the algorithm is the number of VMs, N′, that is needed to sustain totalInRate. In this case, totalInRate is divided equally among VMs and each handles projRatePerVM.
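To make the periodic invocation concrete, the following toy simulation calls a Python transcription of Alg. 1 once per period. It is a sketch under stated assumptions, not the evaluated system: every VM has the same hypothetical capacity and sheds tuples beyond it, measured VM rates lag the input by one period, maxRatePerVM starts at that capacity, and N is left unchanged when the extra rate is zero. The names `provision` and `simulate` are illustrative.

```python
import math


def provision(total_in_rate, vm_rates, max_rate_per_vm):
    """One invocation of Alg. 1; returns (new_n, max_rate_per_vm)."""
    n = len(vm_rates)
    exp_rate_per_vm = total_in_rate // n
    total_extra = sum(exp_rate_per_vm - rate for rate in vm_rates)
    avg_rate_per_vm = sum(vm_rates) // n
    if total_extra > 0:
        return n + math.ceil(total_extra / avg_rate_per_vm), avg_rate_per_vm
    if total_extra < 0:
        return math.ceil(total_in_rate / max_rate_per_vm), max_rate_per_vm
    return n, max_rate_per_vm  # assumption: keep N when extra rate is 0


def simulate(input_trace, n, capacity):
    """Invoke the algorithm once per period against a toy VM model."""
    max_rate, prev_in, history = capacity, input_trace[0], [n]
    for total_in_rate in input_trace:
        # Measured rates reflect the previous period; tuples beyond the
        # per-VM capacity are shed.
        rates = [min(prev_in // n, capacity)] * n
        n, max_rate = provision(total_in_rate, rates, max_rate)
        history.append(n)
        prev_in = total_in_rate
    return history


trace = [100_000, 100_000, 100_000, 40_000, 40_000]
print(simulate(trace, n=1, capacity=30_000))  # [1, 4, 4, 4, 2, 2]
```

In this toy model the VM count rises from 1 to 4 while the input stays at 100,000 tuples/s and falls to 2 once the input drops to 40,000 tuples/s.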

The algorithm initially estimates the stream rate each VM


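Adaptive Cloud Stream Processing Algorithm

// maxRatePerVM is kept between invocations
getExpectedVMs(totalInRate, currentVMs, maxRatePerVM) {
  // Input rate calculations
  expectedRatePerVM = floor(totalInRate / currentVMs)
  totalExtraRate = 0
  totalProcRate = 0
  for each deployed VM {
    vmRate = getRate(VM)
    totalExtraRate += (expectedRatePerVM - vmRate)
    totalProcRate += vmRate
  }
  avgRatePerVM = floor(totalProcRate / currentVMs)
  expectedVMs = currentVMs  // unchanged when the extra rate is zero
  if (totalExtraRate > 0) {
    // Increasing input rate: scale up
    expectedVMs = currentVMs + ceil(totalExtraRate / avgRatePerVM)
    maxRatePerVM = avgRatePerVM
  }
  else if (totalExtraRate < 0) {
    // Decreasing input rate: scale down
    expectedVMs = ceil(totalInRate / maxRatePerVM)
  }
  return expectedVMs
}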
