Distributed System Coordination by Zookeeper and Introduction to Kazoo Python Library
Jimmy Lai r97922028 [at] ntu.edu.tw
Dec. 22nd, 2014

Jul 12, 2015

Page 1: Distributed system coordination by zookeeper and introduction to kazoo python library

Distributed System Coordination by Zookeeper and Introduction to Kazoo Python Library

Jimmy Lai r97922028 [at] ntu.edu.tw

Dec. 22nd, 2014

Outline

1. Overview
2. Basics
3. Deployment
4. Recipes
5. References

Overview of Zookeeper

A Distributed System - Master-Worker

• Coordination tasks:
  1. Elect a new master when the master crashes.
  2. The master assigns tasks to workers.
  3. When a worker crashes, its task is re-assigned to another worker.
  4. When a worker finishes its task, the master assigns a new task to it.

[Diagram: one master and six workers]

Distributed System

• An application consists of programs that run on a group of computers.
• Coordination is more difficult than writing a standalone program.
• Developers may spend too much time handling the coordination, or end up with a fragile distributed system (e.g. race conditions, single points of failure).

Easy Distributed System by Zookeeper

• Common coordination tasks:
  • Naming service
  • Configuration management
  • Synchronization
  • Leader election
  • Message queue
  • Notification system
• Zookeeper provides a highly reliable API for these common coordination tasks.

http://en.wikipedia.org/wiki/Apache_ZooKeeper#Typical_use_cases

Powered By Zookeeper

• Zookeeper was built by Yahoo Research
• Users:
  • Hadoop, HBase
  • Solr
  • Neo4j
  • Flume
  • Facebook Messages

Benefits of Zookeeper

• With Zookeeper:
  • Developing a distributed system is simpler, and the result is more agile and robust.
  • Zookeeper is simple, fast and replicated.
• Without Zookeeper:
  • Coordination is more difficult, because each application has to handle it on its own.

Benefits of Zookeeper

• Servers replicate data.
• Each client connects to one of the servers.
• Throughput test
  • Hardware: dual 2 GHz Xeon and two SATA 15K RPM drives

Zookeeper Basics

Znode (1/2)

• Based on a shared storage model: each client stores data in, and reads data from, the Zookeeper service.
• File system-like API
• Znode: a node in a hierarchical tree. Each znode can optionally hold data and optionally have child znodes.
• A persistent znode stays until it is removed by a delete operation.
• An ephemeral znode disappears when the client that created it crashes or closes its connection, or when any client deletes it.
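The two znode lifetimes can be illustrated with a toy in-memory model. This is only a sketch: `TinyZk`, its methods and the session id `client-a` are hypothetical stand-ins written for this example, not the Zookeeper or kazoo API.

```python
class TinyZk:
    """Toy in-memory stand-in for a Zookeeper server (hypothetical)."""

    def __init__(self):
        # path -> (data, owning session id, or None for a persistent znode)
        self.nodes = {}

    def create(self, path, data=b"", session=None, ephemeral=False):
        if path in self.nodes:
            raise KeyError("node already exists: " + path)
        self.nodes[path] = (data, session if ephemeral else None)

    def delete(self, path):
        # any client may delete a znode, whether persistent or ephemeral
        del self.nodes[path]

    def close_session(self, session):
        # the server drops every ephemeral znode owned by the ended session
        owned = [p for p, (_, owner) in self.nodes.items() if owner == session]
        for path in owned:
            del self.nodes[path]


zk = TinyZk()
zk.create("/config", b"v1", session="client-a")                  # persistent
zk.create("/workers/host1", session="client-a", ephemeral=True)  # ephemeral
zk.close_session("client-a")  # client-a crashes or closes its connection
remaining = sorted(zk.nodes)
print(remaining)  # ['/config']
```

Only the persistent znode survives the end of the session, which is what makes ephemeral znodes useful as "this client is alive" markers.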

Znode (2/2)

• A sequential znode gets a monotonically increasing integer appended to the end of its path, e.g. /path-0000000001, /path-0000000002 (the suffix is zero-padded to 10 digits).
• Versions: each znode has a version number, which is incremented whenever its data changes.
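A small in-memory sketch of both ideas follows. `VersionedStore` and its methods are hypothetical names for this illustration, and the per-parent counter starting at 0 is a simplification of how the real server numbers sequential znodes.

```python
class VersionedStore:
    """Toy model of sequential names and version-checked writes (hypothetical)."""

    def __init__(self):
        self.nodes = {}     # path -> [data, version]
        self.counters = {}  # parent path -> next sequence number

    def create_sequential(self, prefix, data=b""):
        parent = prefix.rsplit("/", 1)[0]
        seq = self.counters.get(parent, 0)
        self.counters[parent] = seq + 1
        path = "%s%010d" % (prefix, seq)  # zero-padded 10-digit suffix
        self.nodes[path] = [data, 0]
        return path

    def set(self, path, data, expected_version=-1):
        node = self.nodes[path]
        # -1 means "write without checking the version"
        if expected_version != -1 and expected_version != node[1]:
            raise ValueError("bad version")
        node[0] = data
        node[1] += 1  # every data change increments the version
        return node[1]


store = VersionedStore()
first = store.create_sequential("/tasks/task-")
second = store.create_sequential("/tasks/task-")
new_version = store.set(first, b"running", expected_version=0)
try:
    store.set(first, b"done", expected_version=0)  # stale version
    stale_write_rejected = False
except ValueError:
    stale_write_rejected = True
```

Passing the version a client last read turns a write into a compare-and-set: if another client changed the data in between, the write is rejected instead of silently overwriting it.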

Operations

• Primitive operations:
  • create /path data
  • delete /path
  • exists /path
  • setData /path data
  • getData /path
  • getChildren /path

Notification

• Set a watch with a read operation on a znode (getData, getChildren, exists) to get a notification when the target changes.
• A watch is:
  • a one-time trigger: it fires once and must be set again to receive further notifications
  • delivered with an ordering guarantee: the client receives all events in the order in which the changes happened
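The one-time behaviour is the part that most often surprises people, so here is a minimal in-memory sketch of it. `WatchedNode` is a hypothetical class for this example only; it models a single znode and is not the Zookeeper or kazoo API.

```python
class WatchedNode:
    """Toy znode whose watches fire once and are then forgotten (hypothetical)."""

    def __init__(self, data):
        self.data = data
        self._watchers = []

    def get(self, watch=None):
        # reading with a watch registers the callback for the next change only
        if watch is not None:
            self._watchers.append(watch)
        return self.data

    def set(self, data):
        self.data = data
        # clear the list before firing: each watch is a one-time trigger
        watchers, self._watchers = self._watchers, []
        for callback in watchers:
            callback("changed")


node = WatchedNode(b"a")

events = []
node.get(watch=events.append)
node.set(b"b")  # the watch fires
node.set(b"c")  # no watch is left, so nothing fires

seen = []

def keep_watching(event):
    # read the new data and set the watch again inside the callback
    seen.append(node.get(watch=keep_watching))

node.get(watch=keep_watching)
node.set(b"d")
node.set(b"e")
```

After this runs, `events` holds a single entry while `seen` holds both later values, because `keep_watching` sets a new watch every time it is called.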

Session

• Session: a client creates a session by connecting to one of the servers and then starts issuing operations.
• Session states:
  • connecting
  • connected
  • closed
  • not_connected

Example - implement a lock

• Spec: n clients try to get the lock at the same time, but only one of them can get it.
• Solution: every client tries to create an ephemeral znode, e.g. /lock. The first one succeeds and holds the lock. The others fail to create the znode, so they set a watch on it to learn when the lock is released, and then try to acquire it again.
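The create-or-watch loop described above can be traced with a toy in-memory model. `LockService` and `acquire` are hypothetical names for this sketch; a real client would talk to Zookeeper, and the znode would also be deleted automatically when the holder's session ends.

```python
class LockService:
    """Toy in-memory model of the /lock znode (hypothetical, not Zookeeper)."""

    def __init__(self):
        self.owner = None   # session that created /lock, if it exists
        self.watchers = []  # one-time callbacks waiting for the deletion

    def try_create(self, session):
        """Create /lock as an ephemeral znode; only the first caller succeeds."""
        if self.owner is None:
            self.owner = session
            return True
        return False

    def watch_deleted(self, callback):
        self.watchers.append(callback)

    def delete(self, session):
        """Release by the holder, or cleanup when the holder's session ends."""
        if self.owner != session:
            return
        self.owner = None
        watchers, self.watchers = self.watchers, []
        for callback in watchers:
            callback()


def acquire(service, session, holders):
    if service.try_create(session):
        holders.append(session)  # this client now holds the lock
    else:
        # lost the race: wait for the deletion event, then try again
        service.watch_deleted(lambda: acquire(service, session, holders))


service = LockService()
holders = []
for client in ("c1", "c2", "c3"):
    acquire(service, client, holders)

first_holder = list(holders)
service.delete("c1")  # c1 releases; c2 and c3 are notified and retry
service.delete("c2")  # c2 releases; c3 finally gets the lock
```

Note that every waiting client retries on each release even though only one can win.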

Example - implement master-worker

• Spec:
  • Clients submit tasks.
  • The master watches for new workers and tasks, and assigns tasks to available workers.
  • A backup master takes over when the master fails.
  • Workers register themselves and then watch for new tasks.

Example - implement master-worker

• Solution:
  • ephemeral znode /master for master election
    • backup masters set up a watch on /master
  • persistent znode /workers
    • the master sets up a watch on /workers
    • each worker creates a znode in /workers, e.g. /workers/host1
  • persistent sequential znode /tasks
    • clients submit tasks by creating znodes under /tasks
  • persistent znode /assign
    • each worker sets up a watch on its own znode under /assign, e.g. /assign/host1
    • the master assigns a task to a worker by creating a znode under /assign, e.g. /assign/host1/task1
    • the worker marks a task as done by updating the task's data to "done"
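The znode layout above can be walked through with a plain dictionary standing in for the znode tree. Everything here is a hypothetical sketch: the helper names, the `task-` naming and the round-robin assignment policy are assumptions for illustration, and watches and master election are left out.

```python
tree = {}  # path -> data; stands in for the znode tree
for parent in ("/workers", "/tasks", "/assign"):
    tree[parent] = b""  # persistent parent znodes


def children(parent):
    """Names of the direct children of a path, in creation order."""
    prefix = parent + "/"
    names = [p[len(prefix):] for p in tree if p.startswith(prefix)]
    return [name for name in names if "/" not in name]


def register_worker(host):
    tree["/workers/" + host] = b""
    tree["/assign/" + host] = b""  # the worker would watch this znode


def submit_task(payload):
    name = "task-%010d" % len(children("/tasks"))  # sequential name
    tree["/tasks/" + name] = payload
    return name


def assign_pending():
    """Master: hand each unassigned task to a worker (round-robin, assumed)."""
    workers = children("/workers")
    if not workers:
        return
    assigned = {t for w in children("/assign") for t in children("/assign/" + w)}
    pending = [t for t in children("/tasks") if t not in assigned]
    for i, task in enumerate(pending):
        host = workers[i % len(workers)]
        tree["/assign/%s/%s" % (host, task)] = tree["/tasks/" + task]


def finish(host, task):
    tree["/assign/%s/%s" % (host, task)] = b"done"  # worker marks the task done


register_worker("host1")
register_worker("host2")
for payload in (b"a", b"b", b"c"):
    submit_task(payload)
assign_pending()
finish("host1", "task-0000000000")
host1_tasks = children("/assign/host1")
host2_tasks = children("/assign/host2")
```

In a real deployment each of these steps would be triggered by a watch firing rather than by direct function calls.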

Zookeeper Deployment

Zookeeper Server Run Modes

• Standalone: a single server
• Quorum: multiple servers replicate the data
  • The cluster uses majority voting to stay consistent, so it keeps working as long as fewer than half of its nodes crash.
• Commonly used ports: client (2181), quorum (2888), election (3888)
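The majority rule translates into simple arithmetic. The helper names below are made up for this sketch.

```python
def majority(n_servers):
    """Smallest number of servers that forms a majority."""
    return n_servers // 2 + 1


def tolerated_failures(n_servers):
    """How many servers may crash while a majority is still alive."""
    return n_servers - majority(n_servers)


table = {n: (majority(n), tolerated_failures(n)) for n in (1, 3, 4, 5)}
for n, (needed, spare) in sorted(table.items()):
    print("%d servers: majority %d, tolerates %d failure(s)" % (n, needed, spare))
```

Four servers tolerate no more failures than three, which is why clusters are usually sized with an odd number of servers.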

Clients

• Native primitive operations
  • C library
  • Java library
• Recipes (3rd party high level API)
  • Java: Curator (by Netflix)
  • Python: kazoo (by Mozilla and Zope)

Java Client Console

• bin/zkCli.sh -server 127.0.0.1:2181
• Commands
  • get path [watch]
  • ls path [watch]
  • set path data [version]
  • create path data acl
  • delete path [version]
  • setquota -n|-b val path

Python client - kazoo

    from kazoo.client import KazooClient
    zk = KazooClient(hosts='127.0.0.1:2181')
    zk.start()
    # ... use the client ...
    zk.stop()

https://kazoo.readthedocs.org/en/latest/

    import os
    import socket

    from kazoo.client import KazooClient
    from kazoo.client import KazooState

    def my_listener(state):
        if state == KazooState.LOST:
            print('lost session')
        elif state == KazooState.SUSPENDED:
            print('disconnected from Zookeeper')
        elif state == KazooState.CONNECTED:
            # try to become the master
            print('connected')

    zk = KazooClient(hosts='127.0.0.1:2181')
    zk.add_listener(my_listener)
    zk.start()
    # identify this contender by host name and process id
    lock = zk.Lock('/master', '%s-%d' % (socket.gethostname(), os.getpid()))

    zk.ensure_path("/path")
    zk.set("/path", "data_string".encode('utf8'))
    start_key, stat = zk.get("/path")

Zookeeper Recipes

Common Recipes

• lock
• election
• counter
• barrier
• partitioner
• party
• queue
• watch

Lock

    from kazoo.client import KazooClient
    zk = KazooClient()
    zk.start()
    lock = zk.Lock("/lockpath", "my-identifier")
    with lock:  # blocks waiting for lock acquisition
        pass  # do something with the lock; released when the block exits

    lock.acquire()  # explicit form, without the with block
    # do something with the lock
    lock.release()

Election

    from kazoo.client import KazooClient
    zk = KazooClient()
    zk.start()
    election = zk.Election("/electionpath", "my-identifier")
    # blocks until the election is won, then calls
    # my_leader_function()
    election.run(my_leader_function)

Counter

    from kazoo.client import KazooClient
    zk = KazooClient()
    zk.start()
    counter = zk.Counter("/int")
    counter += 2
    counter -= 1
    counter.value == 1
    counter = zk.Counter("/float", default=1.0)
    counter += 2.0
    counter.value == 3.0

Barrier

    barrier = zk.Barrier("/barrier")
    barrier.create()  # set up the barrier
    barrier.wait()    # workers block here until the barrier is removed
    # the master releases the barrier with
    barrier.remove()

Partitioner

    from kazoo.client import KazooClient
    client = KazooClient()
    client.start()
    qp = client.SetPartitioner(
        path='/work_queues', set=('queue-1', 'queue-2', 'queue-3'))
    while 1:
        if qp.failed:
            raise Exception("Lost or unable to acquire partition")
        elif qp.release:
            qp.release_set()
        elif qp.acquired:
            for partition in qp:
                pass  # Do something with each partition
        elif qp.allocating:
            qp.wait_for_acquire()

Party

    party1 = zk.Party("/party1", "my-identifier")
    party2 = zk.Party("/party2", "my-identifier")
    party1.join()
    "my-identifier" in party1
    "my-identifier" not in party2

Queue

    queue = zk.LockingQueue("/queue")
    for task in tasks:
        queue.put(task.encode('utf8'))
    task = queue.get()  # blocks until an item is available and locks it
    queue.consume()     # remove the locked item once it has been processed

Watch: watch znode continuously

    @zk.DataWatch('/last_scanned_card_key')
    def my_func(data, stat, event):
        print("Data is %s" % data)
        print("Version is %s" % stat.version)
        print("Event is %s" % event)

References

• Flavio Junqueira, Benjamin Reed, ZooKeeper: Distributed Process Coordination, O'Reilly Media, Inc., November 25, 2013
• Zookeeper website, http://zookeeper.apache.org/