YeezyScore: A comparison of stream processing software. By: Kat Chuang (@katychuang)

Insight DE project

Jan 16, 2017

Transcript
Page 1: Insight DE project

YeezyScore: A comparison of stream processing software

By: Kat Chuang

@katychuang

Page 2: Insight DE project

10 mins

Page 3: Insight DE project

High level overview

Batch

Streaming

Microbatching

                     Storm Trident    Spark Streaming
Released             2011             2010
Delivery semantics   Exactly once     Exactly once
State management     Yes              Yes
Latency              Seconds          Seconds
Output               MapState         Resilient Distributed Dataset (RDD)
Throughput           10k/node/sec?    400k/node/sec?
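
Micro-batching, the model both frameworks in this comparison share, groups an incoming stream into small batches and processes each batch as a unit. Here is a minimal sketch of the idea in plain Python. The function name `micro_batches` and the fixed `batch_size` are assumptions made for illustration; neither framework exposes this function, and Spark Streaming, for instance, cuts batches by time interval rather than by count.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def micro_batches(stream: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Group a stream into small fixed-size batches (hypothetical helper)."""
    batch: List[T] = []
    for item in stream:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Flush the final, possibly smaller, batch.
        yield batch


# Seven messages in batches of three: two full batches and one partial batch.
batches = list(micro_batches(range(7), batch_size=3))
```

A count-based cut is used here only to keep the sketch deterministic.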

Page 4: Insight DE project

Test Cases Metrics

1. Does every message pass through the pipeline?

2. How long does each message take to process?

Data

1. Timestamps
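
Both test cases can be answered from timestamps alone. The sketch below is one possible way to compute them, assuming each message has an ID, the set of sent IDs is known, and each message that comes out of the pipeline carries an entry and exit timestamp. The names `delivery_rate` and `latencies` and the sample values are hypothetical, not taken from the project.

```python
from statistics import mean
from typing import Dict, Set, Tuple

# message ID -> (timestamp on entry, timestamp on exit), in seconds
Received = Dict[str, Tuple[float, float]]


def delivery_rate(sent_ids: Set[str], received: Received) -> float:
    """Test case 1: fraction of sent messages that came out of the pipeline."""
    if not sent_ids:
        return 1.0
    return len(sent_ids.intersection(received)) / len(sent_ids)


def latencies(received: Received) -> Dict[str, float]:
    """Test case 2: processing time per message (exit minus entry)."""
    return {msg_id: t2 - t1 for msg_id, (t1, t2) in received.items()}


# Made-up sample data: message "d" was sent but never came out.
sent = {"a", "b", "c", "d"}
received = {"a": (0.0, 1.5), "b": (1.0, 2.0), "c": (2.0, 4.5)}

rate = delivery_rate(sent, received)
per_message = latencies(received)
average = mean(per_message.values())
```

With exactly-once delivery, `rate` would be expected to equal 1.0; anything lower points to dropped messages.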

Page 5: Insight DE project

Pipelines

[Diagram: in each pipeline a message enters carrying Timestamp1 and leaves as the pair (Timestamp1, Timestamp2).]
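
The timestamp pairs on the pipelines slide could be produced along these lines: stamp each message when it enters the pipeline, then add a second stamp once processing finishes. This is a sketch under assumptions, not the project's code. The functions `stamp_on_entry` and `stamp_on_exit` and the injectable `clock` parameter are hypothetical, and `str.upper` merely stands in for the real processing step.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple


def stamp_on_entry(
    messages: Iterable[str], clock: Callable[[], float] = time.time
) -> Iterator[Tuple[str, float]]:
    """Attach Timestamp1 as each message enters the pipeline (hypothetical)."""
    for msg in messages:
        yield msg, clock()


def stamp_on_exit(
    stamped: Iterable[Tuple[str, float]],
    process: Callable[[str], str],
    clock: Callable[[], float] = time.time,
) -> Iterator[Tuple[str, Tuple[float, float]]]:
    """Process each message, then emit it with (Timestamp1, Timestamp2)."""
    for msg, t1 in stamped:
        result = process(msg)
        yield result, (t1, clock())


# str.upper is a placeholder for the actual stream computation.
out = list(stamp_on_exit(stamp_on_entry(["a", "b"]), str.upper))
```

Passing the clock in as a parameter keeps the stamping logic testable with a fake clock.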

Page 6: Insight DE project

1. Does every message pass through the pipeline?

This is a scatterplot

Page 7: Insight DE project

2. How long does each message take to process?

This is a scatterplot

Page 8: Insight DE project

Storm Trident vs. Spark Streaming

Storm Trident

Stream processing framework that also does micro-batching.

Great for transforming or computing as data flows in.

Complex event processing (CEP), continuous computation.

Task-parallel computations, e.g. reading Twitter streams.

Spark Streaming

Batch processing framework that also does micro-batching.

Great for combining with historical data.

ML algorithms included. Requires an HDFS-backed data source.

Data-parallel computations, e.g. offering recommendations.

Page 9: Insight DE project

Kat Chuang
Data Engineering Fellow
#DE-2015c

[email protected]: katychuang
Twitter: katychuang
IG: katychuang.nyc