Stream Processing with Kafka and Samza Diego Pacheco @diego_pacheco Principal Software Architect

Feb 16, 2017



Page 1: Stream Processing with Kafka and Samza

Stream Processing with Kafka and Samza

Diego Pacheco @diego_pacheco Principal Software Architect

Page 4: Stream Processing with Kafka and Samza

● LinkedIn 2011

● Implemented with Scala and Java

● Motivation: Real-time data feeds

● Goals:

– Low Latency

– High Throughput

● Kafka at LinkedIn (2014):

– 300+ brokers

– 18k topics

– 140k partitions

– 220B messages per day

– 40TB inbound

– 160TB outbound

– Peak Load: 3.25M messages/second

● Use case: Activity Stream, Offline log processing
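The 2014 figures above are easier to compare once they are reduced to rates and ratios. The sketch below is plain arithmetic on the slide's numbers. It assumes (the slide does not say so) that the TB figures are per day, that 1 TB = 10^12 bytes, and that "300+" brokers can be rounded down to 300. The per-broker figure also ignores replication, so treat it as a rough order of magnitude only.

```python
# Back-of-the-envelope figures derived from the 2014 numbers on the slide.
# Assumptions (not stated on the slide): TB figures are per day, decimal
# units (1 TB = 10**12 bytes), and "300+" brokers is treated as exactly 300.

SECONDS_PER_DAY = 24 * 60 * 60

messages_per_day = 220e9
peak_messages_per_second = 3.25e6
inbound_bytes_per_day = 40e12
outbound_bytes_per_day = 160e12
brokers = 300
topics = 18_000
partitions = 140_000

# Average message rate implied by the daily total.
average_messages_per_second = messages_per_day / SECONDS_PER_DAY

# How far the stated peak sits above that average.
peak_to_average = peak_messages_per_second / average_messages_per_second

# Bytes read per byte written: a rough measure of consumer fan-out.
fan_out = outbound_bytes_per_day / inbound_bytes_per_day

# Average size of one produced message.
average_message_bytes = inbound_bytes_per_day / messages_per_day

partitions_per_topic = partitions / topics
partitions_per_broker = partitions / brokers  # ignores replication

print(f"average rate:          {average_messages_per_second:,.0f} msg/s")
print(f"peak / average:        {peak_to_average:.2f}x")
print(f"outbound / inbound:    {fan_out:.1f}x")
print(f"average message size:  {average_message_bytes:.0f} bytes")
print(f"partitions per topic:  {partitions_per_topic:.1f}")
print(f"partitions per broker: {partitions_per_broker:.0f}")
```

Under these assumptions the average works out to roughly 2.5 million messages per second, so the stated peak is only about 1.3 times the daily average, and every byte written is read about four times.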

Page 5: Stream Processing with Kafka and Samza

NO JMS

Page 15: Stream Processing with Kafka and Samza

● LinkedIn 2013

● Stream Processing with Save Points.

● Multi-tenancy: 1 Thread per container

● State is simple

– You handle logging and restoring

– Single threaded programming

● Works with YARN

● Works well with Kafka

● Simple API – Record-like.
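The save-point idea and the record-at-a-time API above can be illustrated with a toy, single-threaded loop. This is not Samza code: `Envelope`, `CheckpointStore`, `UppercaseTask`, `run`, and the `commit_every` / `crash_after` parameters are hypothetical names invented for this sketch. It only shows the general pattern: process records in order, periodically save the last offset, and resume from that offset after a restart.

```python
from dataclasses import dataclass


@dataclass
class Envelope:
    """One record read from an input partition (hypothetical stand-in)."""
    offset: int
    key: str
    value: str


class CheckpointStore:
    """Toy stand-in for a durable place to keep the last saved offset."""

    def __init__(self):
        self.last_offset = -1

    def write(self, offset):
        self.last_offset = offset


class UppercaseTask:
    """Handles one record at a time, so no locking is needed."""

    def process(self, envelope, output):
        output.append((envelope.offset, envelope.key, envelope.value.upper()))


def run(task, partition, checkpoints, output, commit_every=2, crash_after=None):
    """Single-threaded loop: skip checkpointed records, process the rest in order."""
    processed = 0
    for envelope in partition:
        if envelope.offset <= checkpoints.last_offset:
            continue  # already covered by the last save point
        if crash_after is not None and processed == crash_after:
            return  # simulate the container dying mid-stream
        task.process(envelope, output)
        processed += 1
        if processed % commit_every == 0:
            checkpoints.write(envelope.offset)


partition = [Envelope(i, "user-1", v) for i, v in enumerate(["a", "b", "c", "d", "e"])]
checkpoints = CheckpointStore()

first_run, second_run = [], []
run(UppercaseTask(), partition, checkpoints, first_run, crash_after=3)
offset_after_crash = checkpoints.last_offset

# A fresh task resumes from the save point, not from the beginning.
run(UppercaseTask(), partition, checkpoints, second_run)

print("first run: ", [offset for offset, _, _ in first_run])
print("save point:", offset_after_crash)
print("second run:", [offset for offset, _, _ in second_run])
```

In this run the record at offset 2 is handled twice, because the crash happened after it was processed but before the next save point. That replay is the usual cost of checkpointing offsets periodically: downstream logic has to tolerate duplicates.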

Page 23: Stream Processing with Kafka and Samza

● Stream Processing

● Low Latency

● Async Processing

● Local State

– Stores data locally on DISK

– SAME machine where the container runs

– Awesome FIT for stateful processing

● Tight Integration with Kafka

● Strong Model For Streams: Ordered, Highly Available, Partitioned and Durable (Kafka).

● Full feature Set of Kafka

● Client Side Join
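One reading of the local-state and client-side-join points above is a stream-table join done inside the task: one stream keeps a local table up to date, and a second stream is enriched by looking records up in that table instead of calling a remote database. The sketch below is a toy model, not Samza code. `JoinTask`, `partition_for`, and the event shapes are hypothetical, a plain `dict` stands in for an on-disk store, and CRC32 stands in for whatever partitioner the real system uses.

```python
import zlib

NUM_PARTITIONS = 2


def partition_for(key, num_partitions=NUM_PARTITIONS):
    """Stable key hash, so every record for a key lands in the same partition."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


class JoinTask:
    """One task per partition, each with its own local key-value store."""

    def __init__(self):
        self.profiles = {}  # stand-in for a store on the task's own disk

    def on_profile(self, user, profile):
        # The "table" side: keep only the latest profile per user.
        self.profiles[user] = profile

    def on_click(self, user, page):
        # The "stream" side: a local lookup, no network call.
        profile = self.profiles.get(user)
        if profile is None:
            return None  # no profile seen yet for this user
        return {"user": user, "page": page, "country": profile["country"]}


tasks = [JoinTask() for _ in range(NUM_PARTITIONS)]

events = [
    ("profile", "ana", {"country": "BR"}),
    ("click", "ana", "/home"),
    ("click", "bob", "/cart"),  # arrives before bob's profile
    ("profile", "bob", {"country": "US"}),
    ("click", "bob", "/pay"),
]

joined = []
for kind, user, payload in events:
    task = tasks[partition_for(user)]  # same key, same task, same local store
    if kind == "profile":
        task.on_profile(user, payload)
    else:
        result = task.on_click(user, payload)
        if result is not None:
            joined.append(result)

for row in joined:
    print(row)
```

The join only works because both streams are partitioned by the same key, so a user's profile and clicks always reach the same task. The early click for `bob` is dropped here for simplicity; a real job would have to decide whether to buffer, drop, or emit such records unjoined.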

Page 25: Stream Processing with Kafka and Samza

Stream Processing with Kafka and Samza

Diego Pacheco @diego_pacheco Principal Software Architect

Thank You! Obrigado!