Introduction to Apache Flink™ Kostas Tzoumas @kostas_tzoumas
Flink is a stream processor with many faces
Streaming dataflow runtime
case class Path(from: Long, to: Long)

val tc = edges.iterate(10) { paths: DataSet[Path] =>
  val next = paths
    .join(edges)
    .where("to")
    .equalTo("from") { (path, edge) => Path(path.from, edge.to) }
    .union(paths)
    .distinct()
  next
}
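A plain-Java sketch of the same iteration, for readers without a Flink setup: known paths are repeatedly joined with the edge set, unioned, and de-duplicated for 10 rounds. The class and helper names are illustrative, not Flink API.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative transitive-closure loop mirroring the Flink iteration above.
public class TransitiveClosure {
    record Path(long from, long to) {}

    public static void main(String[] args) {
        Set<Path> edges = Set.of(new Path(1, 2), new Path(2, 3), new Path(3, 4));
        Set<Path> paths = new HashSet<>(edges);
        for (int i = 0; i < 10; i++) {
            Set<Path> next = new HashSet<>(paths);   // union(paths)
            for (Path p : paths)
                for (Path e : edges)
                    if (p.to() == e.from())
                        next.add(new Path(p.from(), e.to())); // join on to == from
            paths = next;                             // HashSet gives distinct()
        }
        System.out.println(paths.size()); // 6 paths in the closure of the 3-edge chain
    }
}
```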
[Diagram: the pre-flight phase on the client (optimizer, type extraction stack, task scheduling, dataflow metadata) turns the program into a dataflow graph; the JobManager deploys operators to the TaskManagers and tracks intermediate results. The example plan reads orders.tbl and lineitem.tbl through data sources, applies a Filter and a Map, joins them with a Hybrid Hash Join (one side builds the hash table, the other probes it; both inputs hash-partitioned on field [0]), and finishes with a sorted GroupReduce.]
Flink's internal execution model
Flink execution model
• A program is a DAG of operators
• Operators = computation + state
• Operators produce intermediate results = logical streams of records
• Other operators can consume those
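The model above can be sketched in a few lines of plain Java: each operator turns one intermediate result (a logical stream of records) into a new one that downstream operators consume. The `Operator` interface and names are hypothetical, not Flink's API.

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustrative execution model: operators consume and produce intermediate results.
interface Operator<IN, OUT> {
    List<OUT> run(List<IN> intermediateResult);
}

public class DagSketch {
    public static void main(String[] args) {
        Operator<String, Integer> map = in ->
            in.stream().map(String::length).collect(Collectors.toList());
        Operator<Integer, Integer> sum = in ->
            List.of(in.stream().mapToInt(Integer::intValue).sum());

        List<Integer> id1 = map.run(List.of("a", "bb", "ccc")); // intermediate result
        List<Integer> id2 = sum.run(id1);                        // consumed downstream
        System.out.println(id2.get(0));
    }
}
```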
[Diagram: a dataflow DAG in which a map operator and a join feed a sum, connected by intermediate results ID1, ID2, ID3.]
A map-reduce job with Flink
[Diagram: the ExecutionGraph on the JobManager; map tasks M1 and M2 run on TaskManager 1 and produce result partitions RP1 and RP2, which are consumed by reduce tasks R1 and R2 on TaskManager 2. Numbered steps 1 through 5 show scheduling and data exchange.]
One runtime for batch and streaming

               Pipelined              Blocked
Ephemeral      Stream data shuffles   Batch data shuffles
Checkpointed   Caching for recovery or reuse
Pipelining
• Basic building block to “keep the data moving”
• Note: pipelined systems do not usually transfer individual tuples, but buffers that batch several tuples!
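The buffering point can be sketched as follows: records accumulate in a fixed-size buffer that is shipped downstream only when full (or on a timeout, omitted here). Buffer size and class names are illustrative assumptions, not Flink internals.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative pipelined channel: batch several records per network transfer.
public class BufferedChannel {
    static final int BUFFER_SIZE = 4;
    private final List<Integer> buffer = new ArrayList<>();
    private final List<List<Integer>> shipped = new ArrayList<>();

    void emit(int record) {
        buffer.add(record);
        if (buffer.size() == BUFFER_SIZE) flush();
    }

    void flush() { // one "network transfer" per non-empty buffer
        if (!buffer.isEmpty()) {
            shipped.add(new ArrayList<>(buffer));
            buffer.clear();
        }
    }

    public static void main(String[] args) {
        BufferedChannel ch = new BufferedChannel();
        for (int i = 0; i < 10; i++) ch.emit(i);
        ch.flush(); // ship the partial last buffer
        System.out.println(ch.shipped.size()); // 10 records moved in 3 transfers
    }
}
```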
Streaming fault tolerance
• Ensure that operators see all events
  • “At least once”
  • Solved by replaying a stream from a checkpoint, e.g., from a past Kafka offset
• Ensure that operators do not perform duplicate updates to their state
  • “Exactly once”
  • Several solutions
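The replay idea behind “at least once” can be sketched in a few lines: the source remembers the offset of the last completed checkpoint and, after a failure, re-reads the log from that offset, so later events may be seen twice. Names and the toy log are illustrative.

```java
import java.util.List;

// Illustrative "at least once" recovery: replay a log from a checkpointed offset.
public class ReplaySource {
    public static void main(String[] args) {
        List<String> log = List.of("e0", "e1", "e2", "e3", "e4");
        int checkpointedOffset = 2; // last completed checkpoint covered e0, e1

        // After a failure, restart reading at the checkpointed offset.
        List<String> replayed = log.subList(checkpointedOffset, log.size());
        System.out.println(replayed);
    }
}
```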
Exactly once approaches
• Discretized streams (Spark Streaming)
  • Treat streaming as a series of small atomic computations
  • “Fast track” to fault tolerance, but restricts the computational and programming model (e.g., cannot mutate state across “mini-batches”; window functions correlated with mini-batch size)
• MillWheel (Google Cloud Dataflow)
  • State updates and derived events committed as an atomic transaction to a high-throughput transactional store
  • Requires a very high-throughput transactional store
• Chandy-Lamport distributed snapshots (Flink)
[Diagram: the JobManager registers a checkpoint barrier on the master; replay will start from this point in the stream.]
[Diagram: barriers “push” prior events through the dataflow (assumes in-order delivery on individual channels); as a barrier passes, operator checkpointing starts, is in progress, and finishes.]
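On a single channel, the barrier mechanism can be sketched as follows: ordinary records update operator state, and when the barrier arrives the state covering everything before it is snapshotted before the barrier is forwarded. This is a simplified single-input sketch; real operators align barriers across multiple input channels.

```java
import java.util.List;

// Illustrative barrier-triggered snapshot on one in-order channel.
public class BarrierSketch {
    public static void main(String[] args) {
        List<String> channel = List.of("a", "b", "BARRIER-1", "c");
        long count = 0;      // operator state: number of records seen
        Long snapshot = null;

        for (String msg : channel) {
            if (msg.startsWith("BARRIER")) {
                snapshot = count; // snapshot state, then forward the barrier
            } else {
                count++;          // a normal record updates the state
            }
        }
        // Snapshot covers exactly the records before the barrier.
        System.out.println(snapshot + " " + count);
    }
}
```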
[Diagram: an operator checkpoint takes a snapshot of the state after acknowledged data have updated it. Checkpoints are currently one-off and synchronous; incremental and asynchronous checkpointing is work in progress.]
State backup is a pluggable mechanism: currently either the JobManager (for small state) or a file system (HDFS/Tachyon), with in-memory grids work in progress.
[Diagram: state snapshots at the sinks signal the successful end of this checkpoint. On failure, the last checkpointed state is recovered and the sources are restarted from the last barrier, which guarantees at least once.]
Best of all worlds for streaming
• Low latency: thanks to the pipelined engine
• Exactly-once guarantees: a variation of Chandy-Lamport
• High throughput: controllable checkpointing overhead
• Separates app logic from recovery: the checkpointing interval is just a config parameter
Faces of a stream processor
• Stream processing
• Batch processing
• Machine Learning at scale
• Graph Analysis
Stream data analytics
DataStream API

case class Word (word: String, frequency: Int)

DataStream API (streaming):

val lines: DataStream[String] = env.fromSocketStream(...)

lines.flatMap { line => line.split(" ")
    .map(word => Word(word, 1)) }
  .window(Time.of(5, SECONDS)).every(Time.of(1, SECONDS))
  .groupBy("word").sum("frequency")
  .print()

DataSet API (batch):

val lines: DataSet[String] = env.readTextFile(...)

lines.flatMap { line => line.split(" ")
    .map(word => Word(word, 1)) }
  .groupBy("word").sum("frequency")
  .print()
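The sliding window in the streaming example (size 5, evaluated every 1) can be illustrated with plain arrays: each emission aggregates the last 5 elements, and consecutive windows overlap. The numbers here are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sliding-window aggregation: window size 5, slide 1.
public class SlidingWindowSketch {
    public static void main(String[] args) {
        int[] values = {1, 1, 1, 1, 1, 1, 1};
        int size = 5, slide = 1;
        List<Integer> sums = new ArrayList<>();
        for (int end = size; end <= values.length; end += slide) {
            int sum = 0;
            for (int i = end - size; i < end; i++) sum += values[i]; // last 5 elements
            sums.add(sum);
        }
        System.out.println(sums); // overlapping windows, one result per slide
    }
}
```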
Flink stack
[Diagram: the Flink stack at this point; the DataStream API (Java/Scala) on top of the streaming dataflow runtime.]
Batch data analytics
Batch is a special case of streaming
• Blocking work units are embedded in the streaming topology
• Lower-overhead fault tolerance via replaying intermediate results
[Diagram: the map/join/sum dataflow with intermediate results ID1–ID3, materialized as blocking work units.]
Managed memory in Flink
[Diagram: managed memory handling when memory runs out.]
Cost-based optimizer
Flink stack
[Diagram: the Flink stack; the Table API over the DataSet (Java/Scala) and DataStream (Java/Scala) APIs, plus Hadoop M/R compatibility, all on the streaming dataflow runtime; deployable locally, on a cluster (YARN, Tez), or embedded.]
Iterative processing
FlinkML
• API for ML pipelines inspired by scikit-learn
• Collection of packaged algorithms
  • SVM, Multiple Linear Regression, Optimization, ALS, ...

val trainingData: DataSet[LabeledVector] = ...
val testingData: DataSet[Vector] = ...

val scaler = StandardScaler()
val polyFeatures = PolynomialFeatures().setDegree(3)
val mlr = MultipleLinearRegression()

val pipeline = scaler.chainTransformer(polyFeatures).chainPredictor(mlr)

pipeline.fit(trainingData)

val predictions: DataSet[LabeledVector] = pipeline.predict(testingData)
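The chaining idea behind such pipelines can be sketched with plain function composition: transformers are functions that feed into a final predictor, and invoking the composed function runs the whole chain. The scaler and "model" here are made-up stand-ins, not FlinkML components.

```java
import java.util.Arrays;
import java.util.function.Function;

// Illustrative transformer/predictor chaining, scikit-learn style.
public class PipelineSketch {
    public static void main(String[] args) {
        Function<double[], double[]> scaler =
            v -> Arrays.stream(v).map(x -> x / 10.0).toArray(); // toy transformer
        Function<double[], Double> predictor =
            v -> Arrays.stream(v).sum();                        // toy fitted model
        Function<double[], Double> pipeline = scaler.andThen(predictor);

        System.out.println(pipeline.apply(new double[]{10, 20, 30}));
    }
}
```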
Gelly
• Graph API and library
• Packaged algorithms
  • PageRank, SSSP, Label Propagation, Community Detection, Connected Components

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

Graph<Long, Long, NullValue> graph = ...

DataSet<Vertex<Long, Long>> verticesWithCommunity =
    graph.run(new LabelPropagation<Long>(30)).getVertices();

verticesWithCommunity.print();

env.execute();
Iterative processing in Flink
Flink offers built-in iterations and delta iterations to execute ML and graph algorithms efficiently
[Diagram: the map/join/sum dataflow with intermediate results ID1–ID3, executed iteratively.]
Example: Matrix Factorization
Factorizing a matrix with 28 billion ratings for recommendations
More at: http://data-artisans.com/computing-recommendations-with-flink.html
The full stack
[Diagram: the full Flink stack; libraries Gelly, Table, ML, and SAMOA over the DataSet (Java/Scala) and DataStream APIs; compatibility layers for Hadoop M/R, Dataflow (WiP), MRQL, Table, Cascading (WiP), and Storm (WiP); the streaming dataflow runtime underneath, deployable locally, on a cluster (YARN, Tez), or embedded; Zeppelin as a notebook frontend.]
Closing
tl;dr: what was this about?
• The case for Flink as a stream processor
  • Low latency
  • High throughput
  • Exactly once
  • Easy to use APIs, library ecosystem
  • Growing community
• A stream processor that is great for batch analytics as well
Demo time
I Flink, do you?
If you find this exciting, get involved and start a discussion on Flink's mailing list, or stay tuned by
• subscribing to [email protected]
• following flink.apache.org/blog
• following @ApacheFlink on Twitter
flink-forward.org