Transcript
Storm @Twitter
KARTHIK RAMASAMY @KARTHIKZ
#TwitterAtSigmod #TwitterDataStorm
Ankit Toshniwal, Siddarth Taneja, Amit Shukla, Jignesh Patel, Sanjeev Kulkarni, Jason Jackson, Krishna Gade, Maosong Fu, Jake Donham, Nikunj Bhagat, Sailesh Mittal, and Dmitriy Ryaboy
TALK OUTLINE
I. STORM OVERVIEW
II. STORM INTERNALS
III. OPERATIONAL OVERVIEW
IV. OPERATIONAL EXPERIENCES
V. STORM EXPERIMENTS
I. OVERVIEW
GUARANTEED MESSAGE PROCESSING
HORIZONTAL SCALABILITY
ROBUST FAULT TOLERANCE
CONCISE CODE: FOCUS ON LOGIC
WHAT IS STORM?
Streaming platform for analyzing realtime data as it arrives, so you can react to data as it happens.
STORM DATA MODEL
TOPOLOGY
Directed acyclic graph
Vertices=computation, and edges=streams of data tuples
SPOUTS
Sources of data tuples for the topology
Examples - Kafka/Kestrel/MySQL/Postgres
BOLTS
Process incoming tuples and emit outgoing tuples
Examples - filtering/aggregation/join/arbitrary function
STORM TOPOLOGY
[Diagram: example topology with vertices SPOUT 1, SPOUT 2, BOLT 1, BOLT 2, BOLT 3, BOLT 4, BOLT 5]
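The data model (a directed acyclic graph whose vertices are computations and whose edges are streams) can be sketched in a few lines of Python. This is only an illustration, not Storm's API: the slide names the vertices but not the edges, so the wiring in `upstream` is hypothetical, and `processing_order` is a helper name introduced here.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical wiring: the slide names the vertices but not the edges.
# Each key is a vertex; its value is the set of upstream vertices it reads from.
upstream = {
    "BOLT 1": {"SPOUT 1"},
    "BOLT 2": {"SPOUT 1", "SPOUT 2"},
    "BOLT 3": {"SPOUT 2"},
    "BOLT 4": {"BOLT 1", "BOLT 2"},
    "BOLT 5": {"BOLT 2", "BOLT 3"},
}


def processing_order(upstream):
    """Return the vertices so that every producer precedes its consumers.

    Raises ValueError if the graph contains a cycle, i.e. is not a DAG.
    """
    try:
        return list(TopologicalSorter(upstream).static_order())
    except CycleError as exc:
        raise ValueError("topology is not acyclic") from exc


order = processing_order(upstream)
# Spouts are the vertices with no upstream vertex: they only produce tuples.
spouts = sorted(v for v in order if v not in upstream)
```

Here `spouts` contains the two source vertices, and `order` lists every spout before any bolt that consumes from it.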
WORD COUNT TOPOLOGY
TWEET SPOUT -> PARSE TWEET BOLT -> WORD COUNT BOLT
Live stream of Tweets
#worldcup: 1M
soccer: 400K
….
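A minimal single-process sketch of the word count pipeline, assuming Python generators stand in for spouts and bolts. The function names and the sample tweets are made up for illustration, and a fixed list replaces the live stream of Tweets.

```python
from collections import Counter


def tweet_spout(tweets):
    """Spout: emit one tuple per tweet (a fixed list stands in for the live stream)."""
    for text in tweets:
        yield (text,)


def parse_tweet_bolt(stream):
    """Bolt: split each tweet into lowercase words and emit one tuple per word."""
    for (text,) in stream:
        for word in text.lower().split():
            yield (word,)


def word_count_bolt(stream):
    """Bolt: keep a running count per word and return the final counts."""
    counts = Counter()
    for (word,) in stream:
        counts[word] += 1
    return counts


# Hypothetical sample input.
sample = ["#worldcup soccer tonight", "soccer fans love #worldcup", "#worldcup"]
counts = word_count_bolt(parse_tweet_bolt(tweet_spout(sample)))
```

Each stage only sees the tuples emitted by the stage before it, which is the property that lets a real topology run the stages as separate tasks.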
WORD COUNT TOPOLOGY
TWEET SPOUT TASKS -> PARSE TWEET BOLT TASKS -> WORD COUNT BOLT TASKS
When a parse tweet bolt task emits a tuple, which word count bolt task should it send to?
STREAM GROUPINGS
SHUFFLE GROUPING: Random distribution of tuples
FIELDS GROUPING: Group tuples by a field or multiple fields
ALL GROUPING: Replicates tuples to all tasks
GLOBAL GROUPING: Sends the entire stream to one task
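The four groupings can be read as functions that map an emitted tuple to the downstream task indexes that should receive it. The sketch below is an illustration of that idea, not Storm's implementation: the function signatures, the CRC32 hash, and the choice of task 0 for the global grouping are all assumptions made here.

```python
import random
import zlib


def shuffle_grouping(tup, num_tasks, rng):
    """Pick one downstream task at random."""
    return [rng.randrange(num_tasks)]


def fields_grouping(tup, num_tasks, field_indexes):
    """Hash the chosen fields so tuples with equal values reach the same task."""
    key = "\x1f".join(str(tup[i]) for i in field_indexes).encode("utf-8")
    return [zlib.crc32(key) % num_tasks]


def all_grouping(tup, num_tasks):
    """Replicate the tuple to every downstream task."""
    return list(range(num_tasks))


def global_grouping(tup, num_tasks):
    """Send every tuple to one fixed task (task 0 here, by assumption)."""
    return [0]


rng = random.Random(0)  # seeded only to make the sketch repeatable
word_tuple = ("soccer", 1)
targets = {
    "shuffle": shuffle_grouping(word_tuple, 4, rng),
    "fields": fields_grouping(word_tuple, 4, [0]),
    "all": all_grouping(word_tuple, 4),
    "global": global_grouping(word_tuple, 4),
}
```

For the word count topology, a fields grouping on the word is what makes the counts correct: every occurrence of the same word lands on the same word count task, so no two tasks hold partial counts for it.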
II. STORM INTERNALS
STORM ARCHITECTURE
MASTER NODE: Nimbus
ZK CLUSTER
SLAVE NODE: SUPERVISOR (W1 W2 W3 W4)
SLAVE NODE: SUPERVISOR (W1 W2 W3 W4)
TOPOLOGY SUBMISSION
ASSIGNMENT MAPS
SYNC CODE
STORM WORKER
[Diagram: a worker is a JVM PROCESS containing EXECUTORs, each running one or more TASKs]
DATA FLOW IN STORM WORKERS
[Diagram: Kernel TCP Receive Buffer -> Global Receive Thread -> In Queues -> User Logic Threads -> Out Queues -> Send Threads -> Outgoing Message Buffer -> Global Send Thread -> Kernel TCP Send Buffer. Legend: Disruptor Queues, 0mq Queues]
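The thread-and-queue hand-off inside a worker can be sketched with Python's standard library. This is a simplified model under stated assumptions: `queue.Queue` stands in for the Disruptor queues, plain Python lists stand in for the TCP buffers, only one executor is modeled, and all function names are invented for the illustration.

```python
import queue
import threading

STOP = object()  # sentinel that tells each thread to shut down


def receive_thread(incoming, in_queue):
    """Stand-in for the global receive thread: route messages to the in queue."""
    for msg in incoming:
        in_queue.put(msg)
    in_queue.put(STOP)


def user_logic_thread(in_queue, out_queue, logic):
    """Apply the user logic to each message and place the result on the out queue."""
    while True:
        msg = in_queue.get()
        if msg is STOP:
            out_queue.put(STOP)
            return
        out_queue.put(logic(msg))


def send_thread(out_queue, sent):
    """Stand-in for the send path: drain the out queue into the outgoing buffer."""
    while True:
        msg = out_queue.get()
        if msg is STOP:
            return
        sent.append(msg)


def run_worker(incoming, logic):
    """Wire the three threads together and return what reached the send buffer."""
    in_q, out_q, sent = queue.Queue(), queue.Queue(), []
    threads = [
        threading.Thread(target=receive_thread, args=(incoming, in_q)),
        threading.Thread(target=user_logic_thread, args=(in_q, out_q, logic)),
        threading.Thread(target=send_thread, args=(out_q, sent)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sent


result = run_worker(["storm", "twitter"], str.upper)
```

Because each stage communicates only through a queue, the user logic never blocks on network I/O directly; it blocks only when its in queue is empty.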
III. OPERATIONAL OVERVIEW
STORM METRICS
1 SUPPORT AND TROUBLE SHOOTING
2 CONTINUOUS PERFORMANCE
3 CLUSTER AVAILABILITY
COLLECTING TOPOLOGY METRICS
TWEET SPOUT -> PARSE TWEET BOLT -> WORD COUNT BOLT
METRICS BOLT -> SCRIBE
SAMPLE TOPOLOGY DASHBOARD
IV. OPERATIONAL EXPERIENCES
OVERLOADED ZOOKEEPER
Shared configuration
[Diagram: one zk cluster shared by S1, S2, S3 and STORM workers (W)]
Quickly exceeded number of clients
Impacted uptime of other systems
OVERLOADED ZOOKEEPER
Detached configuration
[Diagram: S1, S2, S3 on one zk cluster; STORM workers (W) on a separate zk cluster]
Increased to 300 workers per cluster
Beyond 300 workers, workers get killed and relaunched
Worker heart beats written to zknode every 15 secs
OVERLOADED ZOOKEEPER
Scale up
[Diagram: S1, S2, S3 on one zk cluster; STORM workers (W) on a separate zk cluster]
Increased to 1200 workers per cluster
[Chart labels: 67%, 33%]
OVERLOADED ZOOKEEPER
Analyzing zookeeper traffic
KAFKA SPOUT
Offset/partition is written every 2 secs
STORM RUNTIME
Workers write heart beats every 3 secs
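A back-of-envelope estimate shows why periodic writes add up. The sketch assumes each writer issues exactly one ZooKeeper write per interval, which is a simplification; the worker counts and intervals come from the slides, while the partition count is a hypothetical number chosen only for illustration.

```python
def writes_per_second(num_writers, interval_secs):
    """Steady-state write rate if every writer writes once per interval."""
    if interval_secs <= 0:
        raise ValueError("interval_secs must be positive")
    return num_writers / interval_secs


# Worker counts quoted on the slides, with the 3 sec and 15 sec heart beat intervals.
heartbeat_rates = {
    workers: {
        "every 3 secs": writes_per_second(workers, 3),
        "every 15 secs": writes_per_second(workers, 15),
    }
    for workers in (300, 1200, 5000)
}

# Hypothetical: 1000 Kafka partitions, each committing its offset every 2 secs.
kafka_offset_rate = writes_per_second(1000, 2)
```

Under these assumptions the write rate grows linearly with the number of workers, so halving the interval or doubling the cluster has the same effect on ZooKeeper.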
OVERLOADED ZOOKEEPER
Heart beat daemons
[Diagram: zk, S1, S2, S3, STORM workers (W), heart beat daemons (H), KV]
5000 workers per cluster and still growing!
STORM OVERHEADS
EXPT 1: JAVA PROGRAM
Read from Kafka cluster and deserialize in a “for loop”
Sustain input rate of 300K msgs/sec from Kafka topic
EXPT 2: 1-STAGE TOPOLOGY
No acks to achieve at most once semantics
Storm processes were co-located using isolation scheduler
EXPT 3: 1-STAGE TOPOLOGY WITH ACKS
Enable acks for at least once semantics
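The difference between the two delivery semantics can be shown with a toy simulation. This is not how Storm tracks acks internally; it is a sketch in which a message whose ack is missing gets replayed once, and the function name and failure sets are invented for the illustration.

```python
def run_stage(messages, dropped_once, ack_lost_once, acks_enabled):
    """Simulate one processing stage and return the messages it processed.

    dropped_once:  messages lost before processing on their first attempt.
    ack_lost_once: messages processed on their first attempt whose ack is lost.
    """
    processed = []
    pending = list(messages)
    seen = set()
    while pending:
        msg = pending.pop(0)
        first_attempt = msg not in seen
        seen.add(msg)
        if first_attempt and msg in dropped_once:
            if acks_enabled:
                pending.append(msg)  # no ack arrived, so the source replays it
            continue  # without acks the loss goes unnoticed
        processed.append(msg)
        if first_attempt and acks_enabled and msg in ack_lost_once:
            pending.append(msg)  # processed, but the missing ack triggers a replay
    return processed


msgs = ["a", "b", "c"]
at_most_once = run_stage(msgs, {"b"}, {"c"}, acks_enabled=False)
at_least_once = run_stage(msgs, {"b"}, {"c"}, acks_enabled=True)
```

Without acks, "b" is never processed and nothing is processed twice. With acks, every message is processed, and "c" is processed twice because its first ack was lost.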
STORM OVERHEADS
[Figure: bar chart of Machines Used and Average CPU Utilization for JAVA, 1-STAGE, and 1-STAGE-ACK. Data labels: Avg. CPU 77%, 58.2%, 58.3%; Machines 3, 1, 1]
V. STORM EXPERIMENTS
STORM EXPERIMENTS
Examine resiliency and efficiency during machine failures