ZooKeeper: a highly available and reliable distributed coordination system

May 29, 2015

amazingjxq
Page 1: Zoo keeper

ZooKeeper

A highly available and reliable distributed coordination system

Page 2: Zoo keeper

The role of zk

Human body    Distributed system
Nerves        Network
Organs        Servers
Brain         zk

Page 3: Zoo keeper

The role of zk

Page 4: Zoo keeper

Standalone mode

$ cd zookeeper-3.4.2/
$ cat conf/zoo.cfg
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181

Page 5: Zoo keeper

Standalone mode

$ bin/zkServer.sh start
JMX enabled by default
Using config: /home/jxq/code/zookeeper-3.4.2/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

$ bin/zkCli.sh -server 127.0.0.1:2181
Connecting to localhost:2181
Welcome to ZooKeeper!
JLine support is enabled
[zk: localhost:2181(CONNECTED) 0] help

Page 6: Zoo keeper

Standalone mode

[zk: localhost:2181(CONNECTED) 0] help
ZooKeeper -server host:port cmd args
    connect host:port
    get path [watch]
    ls path [watch]
    set path data [version]
    rmr path
    delquota [-n|-b] path
    quit
    printwatches on|off
    create [-s] [-e] path data acl
    stat path [watch]
    close
    ls2 path [watch]
    listquota path
    setAcl path acl
    getAcl path
    sync path
    delete path [version]

Page 7: Zoo keeper

Standalone mode

[zk: localhost:2181(CONNECTED) 1] ls /
[zookeeper]
[zk: localhost:2181(CONNECTED) 2] create /zk_test test_data
Created /zk_test
[zk: localhost:2181(CONNECTED) 3] ls /
[zookeeper, zk_test]
[zk: localhost:2181(CONNECTED) 4] get /zk_test
test_data
cZxid = 0x11
ctime = Thu Mar 08 15:33:55 CST 2012
mZxid = 0x11
mtime = Thu Mar 08 15:33:55 CST 2012
pZxid = 0x11
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 9
numChildren = 0

Page 8: Zoo keeper

Standalone mode

[zk: localhost:2181(CONNECTED) 5] set /zk_test test_2
cZxid = 0x11
ctime = Thu Mar 08 15:33:55 CST 2012
mZxid = 0x13
mtime = Thu Mar 08 15:46:27 CST 2012
pZxid = 0x11
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 6
numChildren = 0
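
The `set path data [version]` form from the CLI help makes an update conditional on the node's current dataVersion, which is the counter that moves from 0 to 1 in the output above. The following is a minimal Python sketch of that check, not the server's implementation; `Node` and `BadVersion` are hypothetical names used only for illustration.

```python
class BadVersion(Exception):
    """Raised when the expected version does not match the stored one."""


class Node:
    def __init__(self, data):
        self.data = data
        self.data_version = 0  # mirrors dataVersion in the stat output

    def set(self, data, version=-1):
        # version == -1 means "skip the check", as in the ZooKeeper API.
        if version != -1 and version != self.data_version:
            raise BadVersion("expected %d, found %d" % (version, self.data_version))
        self.data = data
        self.data_version += 1
        return self.data_version


node = Node("test_data")
v1 = node.set("test_2")              # unconditional: dataVersion 0 -> 1
v2 = node.set("test_3", version=1)   # conditional: matches, 1 -> 2
try:
    node.set("stale", version=1)     # stale version: rejected
    rejected = False
except BadVersion:
    rejected = True
print(v1, v2, rejected, node.data)
```

A writer that read the node at version 1 cannot silently overwrite a change made by someone else in the meantime, which is what makes the version argument useful for optimistic updates.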

Page 9: Zoo keeper

Standalone mode

[zk: localhost:2181(CONNECTED) 6] get /zk_test watch
test_2
cZxid = 0x11
ctime = Thu Mar 08 15:33:55 CST 2012
mZxid = 0x13
mtime = Thu Mar 08 15:46:27 CST 2012
pZxid = 0x11
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 6
numChildren = 0

Page 10: Zoo keeper

Standalone mode

[zk: 127.0.0.1:2181(CONNECTED) 6] set /zk_test test_3

WATCHER::

WatchedEvent state:SyncConnected type:NodeDataChanged path:/zk_test
cZxid = 0x11
ctime = Thu Mar 08 15:33:55 CST 2012
mZxid = 0x51
mtime = Mon Mar 19 10:43:59 CST 2012
pZxid = 0x11
cversion = 0
dataVersion = 11
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 6
numChildren = 0

Page 11: Zoo keeper

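zk consistency guarantees

• Sequential consistency: a client's requests take effect in the order they were sent
• Atomicity
• Single system image
• Reliability: once an update has taken effect, it persists until the next update replaces it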

Page 12: Zoo keeper

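znode

• 3 kinds of znode
  • persistent znode: stays until it is explicitly deleted
  • ephemeral znode: temporary node, removed when the session that created it ends
  • sequential znode: node whose name carries a sequence number
• Data per znode is less than 1 MB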
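
As a small illustration of sequential znodes: the server appends a monotonically increasing counter, formatted as ten zero-padded decimal digits, to the requested path. The sketch below only models that naming rule in Python; `SequentialNamer` is a hypothetical helper, and the path prefix is just an example.

```python
class SequentialNamer:
    """Toy model of how a parent znode numbers its sequential children."""

    def __init__(self):
        self.counter = 0  # per-parent counter, assumed to start at 0

    def create(self, prefix):
        # ZooKeeper appends a 10-digit, zero-padded sequence number.
        name = "%s%010d" % (prefix, self.counter)
        self.counter += 1
        return name


namer = SequentialNamer()
first = namer.create("/q/element_")
second = namer.create("/q/element_")
print(first)
print(second)
```

Because of the zero padding, sorting the names as plain strings gives the same order as sorting by sequence number.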

Page 13: Zoo keeper

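watches

• getData(), getChildren(), exists()
• One-time trigger
• data watches and child watches
• Ordered:
  • The client receives watch events in the same order in which the node changes happened
  • The client sees the new data only after it has received the watch event
• Mind the latency: the data may change several times between receiving the watch event and fetching the new data

void watcher(zhandle_t *zzh, int type, int state, const char *path, void* context)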
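
To make the one-time trigger concrete, here is a minimal in-memory sketch in Python. It is a toy model, not a ZooKeeper client: `WatchedNode` and its methods are hypothetical, and the event name is borrowed from the CLI output only as a label.

```python
class WatchedNode:
    """Toy in-memory node with one-time data watches."""

    def __init__(self, path, data):
        self.path = path
        self.data = data
        self.watchers = []

    def get(self, watcher=None):
        # Reading with a watcher registers it for the NEXT change only.
        if watcher is not None:
            self.watchers.append(watcher)
        return self.data

    def set(self, data):
        self.data = data
        # Fire and clear: each watch is delivered at most once.
        pending, self.watchers = self.watchers, []
        for watcher in pending:
            watcher("NodeDataChanged", self.path)


events = []
node = WatchedNode("/zk_test", "test_2")
node.get(watcher=lambda kind, path: events.append((kind, path)))
node.set("test_3")   # triggers the watch
node.set("test_4")   # no watch registered any more, nothing fires
print(events)
```

The event carries only the path and type, not the data. A client that reads the node after the single event here would find "test_4", not "test_3", which is the latency caveat in practice.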

Page 14: Zoo keeper

ACL

• Similar to unix file permissions
• Applies only to a single node (not recursive)
• Permissions:
  • CREATE, READ, WRITE, DELETE, ADMIN
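
The five permissions are bit flags that are OR-ed into a mask. The sketch below uses the numeric values of the `ZOO_PERM_*` constants from the C client; the `acls` table and the `allowed` helper are hypothetical and only illustrate that a node's ACL is looked up on that node alone.

```python
# Permission bits as defined by ZooKeeper (ZOO_PERM_* in the C client).
READ, WRITE, CREATE, DELETE, ADMIN = 1, 2, 4, 8, 16
ALL = READ | WRITE | CREATE | DELETE | ADMIN

# Hypothetical per-node ACL table: each path carries its own permission mask.
acls = {
    "/app": READ | CREATE,
    "/app/config": READ,
}


def allowed(path, perm):
    # Only the ACL stored on this exact node is consulted; parents are ignored.
    return bool(acls.get(path, 0) & perm)


print(allowed("/app", CREATE))
print(allowed("/app/config", CREATE))
print(ALL)
```

CREATE on "/app" says nothing about "/app/config": the child has its own mask, so the second check fails.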

Page 15: Zoo keeper

API

Asynchronous          Synchronous
zoo_acreate()         zoo_create()
zoo_aexists()         zoo_exists()
zoo_aget()            zoo_get()
zoo_aget_children()   zoo_get_children()
zoo_aset()            zoo_set()
zoo_adelete()         zoo_delete()

Page 16: Zoo keeper

Typical applications

• Naming service
• Configuration management
• Cluster monitoring
• Barriers
• Distributed queue
• Distributed lock
• leader election

Page 17: Zoo keeper

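Configuration management

• Configuration files, machine lists, and so on
• Managed centrally
• Services update their configuration automatically
  • The client sets a watch
  • When the content of the zk node (the configuration) changes, zk notifies the client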
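
A minimal sketch of the reload loop, written in Python against an in-memory stand-in. `ConfigStore` and `Service` are hypothetical names, and the sketch assumes the one-time watch behaviour described for watches, so the service re-registers its watch on every reload.

```python
class ConfigStore:
    """Stand-in for a zk node holding configuration, with one-time watches."""

    def __init__(self, value):
        self.value = value
        self.watchers = []

    def get(self, watcher):
        self.watchers.append(watcher)
        return self.value

    def set(self, value):
        self.value = value
        pending, self.watchers = self.watchers, []
        for watcher in pending:
            watcher()


class Service:
    """A service that reloads its configuration whenever it is notified."""

    def __init__(self, store):
        self.store = store
        self.reloads = 0
        self.reload()

    def reload(self):
        # Re-read AND re-register: a watch fires only once.
        self.config = self.store.get(self.reload)
        self.reloads += 1


store = ConfigStore("timeout=5")
service = Service(store)
store.set("timeout=10")
store.set("timeout=30")
print(service.config, service.reloads)
```

If `reload` read the value without passing itself as the watcher again, the service would pick up the first change and then silently miss every later one.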

Page 18: Zoo keeper

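Cluster monitoring

• Each service creates an ephemeral node "/clusterServers/{hostname}"
• The monitoring service watches the children of "/clusterServers"
• When a monitored service stops, its node disappears and the monitoring service receives a watch event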
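
The following Python sketch models this pattern in memory. `Registry` is a hypothetical stand-in for the "/clusterServers" node, and session ids are plain integers chosen for the example; closing a session plays the role of a monitored service stopping.

```python
class Registry:
    """Toy model of /clusterServers with ephemeral children."""

    def __init__(self):
        self.children = {}      # hostname -> owning session id
        self.child_watch = None

    def register(self, hostname, session_id):
        self.children[hostname] = session_id
        self._notify()

    def close_session(self, session_id):
        # Ephemeral nodes vanish together with the session that created them.
        gone = [h for h, s in self.children.items() if s == session_id]
        for hostname in gone:
            del self.children[hostname]
        if gone:
            self._notify()

    def get_children(self, watch=None):
        self.child_watch = watch
        return sorted(self.children)

    def _notify(self):
        watch, self.child_watch = self.child_watch, None
        if watch is not None:
            watch()


registry = Registry()
seen = []


def on_change():
    # Re-list the children and re-arm the one-time child watch.
    seen.append(registry.get_children(watch=on_change))


on_change()
registry.register("web01", session_id=1)
registry.register("web02", session_id=2)
registry.close_session(1)
print(seen)
```

The monitor never polls: every membership change arrives as a child watch event, and the list it fetches afterwards is the current set of live hosts.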

Page 19: Zoo keeper

Barriers

Page 20: Zoo keeper

Barriers

• "/barrier/[n]"

1. create("/barrier/n", EPHEMERAL_SEQUENTIAL)
2. getChildren("/barrier/", true), which sets a watch that fires when the number of children changes
3. if the number of children is less than x, wait for the watch notification
4. else return
5. goto 2

watcher function: pthread_cond_signal()
waiting: pthread_cond_wait()
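
The wait/signal structure of these steps can be sketched inside one process with a condition variable. This Python version is only an analogue: a counter stands in for the children of "/barrier", `threading.Condition` takes the place of the pthread calls, and `Barrier` is a hypothetical class.

```python
import threading


class Barrier:
    """In-process analogue of the /barrier recipe, for x participants."""

    def __init__(self, x):
        self.x = x
        self.children = 0                   # stands in for the child count of /barrier
        self.cond = threading.Condition()   # plays the role of the pthread condition

    def enter(self):
        with self.cond:
            self.children += 1              # step 1: create our node
            self.cond.notify_all()          # the watcher would signal here
            while self.children < self.x:   # steps 2-5: re-check after each wake-up
                self.cond.wait()


barrier = Barrier(3)
passed = []
lock = threading.Lock()


def worker(name):
    barrier.enter()
    with lock:
        passed.append(name)


threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join(timeout=5)
print(sorted(passed))
```

The `while` loop matters for the same reason as the "goto 2" step: a wake-up only says that something changed, so the count has to be checked again before proceeding.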

Page 21: Zoo keeper

Distributed queue

• "/q/element_[n]"
• Producer: create("/q" + "/element_", message, ZOO_SEQUENCE);
• Consumer: get_children(); delete();
• Relies on the data consistency of the zk cluster
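
A compact Python model of the queue recipe, assuming the ten-digit zero-padded sequence suffix that ZooKeeper gives sequential nodes. `Queue` is a hypothetical in-memory class; a dictionary stands in for the children of "/q".

```python
class Queue:
    """Toy model of the /q recipe using sequential child names."""

    def __init__(self):
        self.children = {}   # child name -> message
        self.counter = 0

    def produce(self, message):
        # Equivalent of create("/q/element_", message, ZOO_SEQUENCE).
        name = "element_%010d" % self.counter
        self.counter += 1
        self.children[name] = message
        return name

    def consume(self):
        # get_children(), take the lowest sequence number, then delete() it.
        if not self.children:
            return None
        name = min(self.children)
        return self.children.pop(name)


q = Queue()
q.produce("a")
q.produce("b")
q.produce("c")
print(q.consume(), q.consume(), q.consume(), q.consume())
```

With several real consumers, the delete can fail because another consumer removed the same element first; in that case the consumer would move on to the next-lowest child, a case this single-process sketch does not model.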

Page 22: Zoo keeper

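Distributed lock

• "_locknode_/lock-[n]"
• Acquiring the lock:

1. create("_locknode_/lock-[n]")
2. getChildren()
3. Check whether the node created in (1) has the lowest sequence number; if so, the lock is acquired
4. Call exists() on the node with the next-lowest sequence number and set a watch on it
5. If exists() returns false, goto 2. Otherwise wait for the notification, then go to 2

• Releasing the lock: delete the node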
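
The decision in steps 2 to 4 can be written as a pure function over the child list. This Python sketch follows the common form of the recipe, in which a waiting client watches the node immediately before its own; `sequence_of` and `lock_step` are hypothetical helpers, and the node names are made up for the example.

```python
def sequence_of(name):
    # "lock-0000000007" -> 7
    return int(name.rsplit("-", 1)[1])


def lock_step(my_node, children):
    """Return ("acquired", None) or ("wait", node_to_watch)."""
    ordered = sorted(children, key=sequence_of)
    index = ordered.index(my_node)
    if index == 0:
        return ("acquired", None)          # lowest sequence number holds the lock
    return ("wait", ordered[index - 1])    # watch only the immediate predecessor


children = ["lock-0000000003", "lock-0000000001", "lock-0000000002"]
print(lock_step("lock-0000000001", children))
print(lock_step("lock-0000000003", children))
```

Watching only the predecessor means that releasing the lock wakes a single waiter, not every client queued behind it.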

Page 23: Zoo keeper

leader election

Page 24: Zoo keeper

ZK internal design

Page 25: Zoo keeper

• ZooKeeper Atomic Broadcast
• Guarantees:
  • Reliable delivery of messages
  • Total order
  • Causal order
• Two phases of message delivery
  • Election
  • Synchronization

Page 26: Zoo keeper

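Roles of zk nodes

Role: description
• leader: initiates votes and decides their outcome
• learner
  • follower: accepts client requests and returns results; takes part in voting during an election
  • observer: accepts client requests and forwards writes to the leader; does not take part in voting and only syncs the leader's state; improves read performance
• client: issues requests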

Page 27: Zoo keeper

zk server working states

• LOOKING: this server does not know who the leader is
• LEADING: this server is the leader
• FOLLOWING: a leader has been elected and this server follows it

Page 28: Zoo keeper

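Election process

• basic paxos
• After starting, each server asks every other server whom it is going to vote for. Once it has received replies from all servers, it works out which server has the largest zxid and records that server as the one to vote for in the next round. If the winning server has by then collected n/2 + 1 votes, the server sets the winner as its currently recommended leader and updates its own state.
• Figure: election.jpg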
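
The two decisions in this description, preferring the largest zxid and requiring n/2 + 1 votes, can be sketched as follows. This is a simplified Python illustration, not the server's election code: `pick_candidate` and `has_quorum` are hypothetical, the zxid values are made up, and the tie-break by server id is an assumption of this sketch.

```python
from collections import Counter


def pick_candidate(replies):
    """replies: {server_id: last_zxid}. Prefer the largest zxid."""
    # Ties on zxid are broken by the larger server id (an assumption of this sketch).
    return max(replies, key=lambda sid: (replies[sid], sid))


def has_quorum(votes, n):
    """votes: {voter_id: candidate_id}. Return the winner or None."""
    candidate, count = Counter(votes.values()).most_common(1)[0]
    return candidate if count >= n // 2 + 1 else None


replies = {1: 0x50, 2: 0x51, 3: 0x4F}
candidate = pick_candidate(replies)
votes = {1: candidate, 2: candidate, 3: 1}
print(candidate, has_quorum(votes, n=3))
```

Preferring the largest zxid means the server that has seen the most recent update becomes leader, so no acknowledged change is lost when leadership moves.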

Page 29: Zoo keeper

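Synchronization process

• The leader waits for servers to connect
• A follower connects to the leader and sends it its largest zxid
• The leader uses the follower's zxid to determine the synchronization point
• Once synchronization is complete, the leader notifies the follower that it is now uptodate
• After receiving the uptodate message, the follower can accept client requests and serve them again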
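
Determining the synchronization point can be pictured as a lookup in the leader's ordered history. The Python sketch below is a deliberate simplification: `entries_to_send` is hypothetical, the log entries are invented for the example, and it ignores cases a real server must handle, such as a follower that is too far behind or that holds entries the leader lacks.

```python
def entries_to_send(leader_log, follower_zxid):
    """leader_log: list of (zxid, change) sorted by zxid."""
    # Everything after the follower's last zxid is what it is missing.
    return [entry for entry in leader_log if entry[0] > follower_zxid]


leader_log = [(0x11, "create /zk_test"), (0x13, "set test_2"), (0x51, "set test_3")]
missing = entries_to_send(leader_log, follower_zxid=0x13)
print(missing)
```

A follower that reports the leader's own latest zxid gets an empty list and can be told it is uptodate straight away.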

Page 30: Zoo keeper