
Apache ZooKeeper TechTuesday

May 06, 2015


Andrei Savu
Transcript
Page 1: Apache ZooKeeper TechTuesday

Apache ZooKeeper

Andrei Savu @TechTuesday

Why use it? What to expect in the future?

Page 2: Apache ZooKeeper TechTuesday

Outline

Why use it?
  Crash Course
  Practical Example

What to expect in the future (3.4.0 release)?
  GSoC 2010
  New Contrib
  Work in Progress

Page 3: Apache ZooKeeper TechTuesday

Crash Course

Page 4: Apache ZooKeeper TechTuesday

What is ZooKeeper?

A highly available, scalable, distributed, configuration, consensus, group membership, leader election, naming and coordination service.

Page 5: Apache ZooKeeper TechTuesday

What is ZooKeeper? (2)

replicated in-memory tree data structure
somewhat similar to a file system
no partial reads / writes
no renames
ordered updates
strong persistence guarantees
conditional updates (version)
watches for data changes
ephemeral nodes
generated file names

Page 6: Apache ZooKeeper TechTuesday

ZooKeeper Data Model

hierarchical namespace
each znode has data and children
data is read and written in its entirety

Page 7: Apache ZooKeeper TechTuesday

Basic ZooKeeper API

string create(path, data, acl, flags)

delete(path, expected_version)

stat set_data(path, data, expected_version)

(data, stat) get_data(path, watch)

stat exists(path, watch)

string[] get_children(path, watch)
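The conditional-update semantics of this API can be sketched with a toy, in-memory model. This is not the real client library: `ToyZooKeeper`, `BadVersionError` and the `SEQUENCE` flag are hypothetical names chosen for illustration, ACLs, watches, `delete` and `exists` are left out, and the sequence counter is global here rather than per parent.

```python
class BadVersionError(Exception):
    """Raised when expected_version does not match the znode's version."""


class ToyZooKeeper:
    """Single-process stand-in for a znode tree (illustration only)."""

    SEQUENCE = 1  # hypothetical flag: append a generated numeric suffix

    def __init__(self):
        self._nodes = {"/": {"data": b"", "version": 0}}
        self._counter = 0

    def create(self, path, data, flags=0):
        if flags & self.SEQUENCE:
            # "generated file names": zero-padded, increasing suffix
            path = "%s%010d" % (path, self._counter)
            self._counter += 1
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self._nodes:
            raise KeyError("parent does not exist: " + parent)
        if path in self._nodes:
            raise KeyError("node exists: " + path)
        self._nodes[path] = {"data": data, "version": 0}
        return path

    def get_data(self, path):
        node = self._nodes[path]
        return node["data"], {"version": node["version"]}

    def set_data(self, path, data, expected_version):
        node = self._nodes[path]
        # -1 means "any version"; otherwise the update is conditional
        if expected_version != -1 and expected_version != node["version"]:
            raise BadVersionError(path)
        node["data"] = data  # data is always replaced in its entirety
        node["version"] += 1
        return {"version": node["version"]}

    def get_children(self, path):
        prefix = path.rstrip("/") + "/"
        names = []
        for candidate in self._nodes:
            if candidate == "/" or not candidate.startswith(prefix):
                continue
            name = candidate[len(prefix):]
            if "/" not in name:  # direct children only
                names.append(name)
        return sorted(names)
```

Two writers that both read version 0 and then call `set_data(path, data, 0)` cannot both succeed: the second call raises `BadVersionError`, which is what makes optimistic read-modify-write loops safe.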

Page 8: Apache ZooKeeper TechTuesday

ZooKeeper Service

Facts:
1) all servers store a copy of the data in memory
2) the leader is elected at startup
3) followers respond to clients
4) all updates go through the leader
5) responses are sent when a majority of servers have persisted the change
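Fact 5 implies simple majority arithmetic. A minimal sketch, with hypothetical helper names, of how many servers must acknowledge a change and how many failures an ensemble can survive:

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """Servers that can fail while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
```

A 4-server ensemble tolerates one failure, the same as a 3-server one, which is why odd ensemble sizes are the usual choice.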

Page 9: Apache ZooKeeper TechTuesday

Practical Example

Page 10: Apache ZooKeeper TechTuesday

Distributed Queue (Python)

http://www.cloudera.com/blog/2009/05/building-a-distributed-concurrent-queue-with-apache-zookeeper/
http://github.com/henryr/pyzk-recipes

Retry operation on ConnectionLoss:
http://github.com/andreisavu/pyzk-recipes
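Retrying on connection loss can be sketched as a decorator. This is not the code from the repositories above: `ConnectionLossError` is a hypothetical stand-in for whatever exception the client bindings raise, and the attempt count and delay are arbitrary.

```python
import functools
import time


class ConnectionLossError(Exception):
    """Hypothetical stand-in for the client's connection-loss error."""


def retry_on_connection_loss(max_attempts=3, delay=0.1):
    """Re-run the wrapped call when the connection drops mid-request."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except ConnectionLossError:
                    if attempt == max_attempts:
                        raise  # give up and surface the error
                    time.sleep(delay * attempt)  # linear back-off
        return wrapper
    return decorator


# Demo: a call that fails twice before succeeding.
calls = {"count": 0}


@retry_on_connection_loss(max_attempts=3, delay=0)
def flaky_get():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionLossError()
    return b"payload"
```

Blind retries are only safe for idempotent operations. A connection loss does not say whether the request reached the server, so retrying a sequential `create` may leave two queue items behind.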

Page 11: Apache ZooKeeper TechTuesday

GSoC 2010

3 projects / 5 months

Page 12: Apache ZooKeeper TechTuesday

1. Monitoring & Web-based interface

Status: Committed to the trunk

1. JIRA: ZOOKEEPER-701
2. Progress Tracking Wiki
3. Monitoring for Ganglia, Nagios and Cacti
   1. contrib / monitoring
   2. 'mntr' 4letter word
4. Web interface available as a Hue application
   1. contrib / huebrowser
   2. complete install instructions
   3. requirements: rest gateway, Hue 1.0

Page 13: Apache ZooKeeper TechTuesday

2. Read-Only Mode

Status: Under Review (Ready to be committed)

1. JIRA: ZOOKEEPER-704
2. Progress Tracking Wiki
3. Description: "When a ZooKeeper server loses contact with over half of the other servers in an ensemble ('loses a quorum'), it stops responding to client requests. For some applications, it would be beneficial if a server still responded to read requests when the quorum is lost, but caused an error condition when a write request was attempted."
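The behavior described in the quote can be sketched with a toy server object. This only models the idea, not the ZOOKEEPER-704 patch: `ToyServer`, `QuorumLostError` and `ReadOnlyModeError` are hypothetical names.

```python
class QuorumLostError(Exception):
    """Server cannot answer at all because it lost the quorum."""


class ReadOnlyModeError(Exception):
    """Write attempted against a server running in read-only mode."""


class ToyServer:
    def __init__(self, data, read_only_enabled):
        self.data = dict(data)
        self.read_only_enabled = read_only_enabled
        self.has_quorum = True

    def read(self, path):
        if not self.has_quorum and not self.read_only_enabled:
            raise QuorumLostError(path)  # classic behavior: stop responding
        return self.data[path]  # may be stale while partitioned

    def write(self, path, value):
        if not self.has_quorum:
            if self.read_only_enabled:
                raise ReadOnlyModeError(path)  # reads work, writes fail
            raise QuorumLostError(path)
        self.data[path] = value
```

Reads served without a quorum can lag behind the rest of the ensemble, so this mode suits applications that prefer possibly stale data over no data.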

Page 14: Apache ZooKeeper TechTuesday

3. Failure Detector Model

Status: Under Review

1. JIRA: ZOOKEEPER-702
2. Progress Tracking Wiki
3. Detectors: Phi Accrual, Chen, Bertier, Fixed Heartbeat
4. Why? Check the concluding remarks on the wiki.
5. Conclusion snippet: "in scenarios where we have a changing network behavior, such in a WAN, the adaptive methods can be a good pick"
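The difference between a fixed heartbeat timeout and an adaptive detector can be sketched as follows. This is a simplified illustration, not the ZOOKEEPER-702 code: the class names are hypothetical, and the phi computation assumes exponentially distributed heartbeat intervals, which is a simplification of the Phi Accrual method.

```python
import math
from collections import deque


class FixedHeartbeatDetector:
    """Suspects a peer once no heartbeat arrived for `timeout` seconds."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last = None

    def heartbeat(self, now):
        self.last = now

    def is_suspected(self, now):
        return self.last is not None and now - self.last > self.timeout


class SimplePhiDetector:
    """Adaptive detector: suspicion grows relative to observed intervals."""

    def __init__(self, threshold=8.0, window=100):
        self.threshold = threshold
        self.intervals = deque(maxlen=window)  # recent inter-arrival times
        self.last = None

    def heartbeat(self, now):
        if self.last is not None:
            self.intervals.append(now - self.last)
        self.last = now

    def phi(self, now):
        if not self.intervals:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        if mean <= 0:
            return 0.0
        # Exponential assumption: P(next heartbeat later than t) = exp(-t/mean),
        # so phi = -log10(P) = t / (mean * ln 10).
        return (now - self.last) / (mean * math.log(10))

    def is_suspected(self, now):
        return self.phi(now) > self.threshold
```

On a slow link where heartbeats arrive every 5 seconds, a fixed 3-second timeout suspects a healthy peer between every pair of heartbeats, while the adaptive detector scales its expectation to the observed intervals.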

Page 15: Apache ZooKeeper TechTuesday

Contrib

Page 16: Apache ZooKeeper TechTuesday

Large Scale Pub/Sub (hedwig)

1. JIRA: ZOOKEEPER-775
2. Uses ZooKeeper and BookKeeper
3. Committed to the trunk
4. Developed at Yahoo! Research
5. Used for PNUTS cross data center replication
6. http://vimeo.com/13282102

Page 17: Apache ZooKeeper TechTuesday

Work in Progress

only some interesting JIRAs

Page 18: Apache ZooKeeper TechTuesday

#834 Children for ephemerals

JIRA: ZOOKEEPER-834
Allow ephemeral nodes to have children owned by the same session.
Useful when publishing status information.
No need to do serialization for basic data structures (hash tables)
Similar to /proc in *nix systems.
Examples: /agent-01/ip, /agent-01/memory, /agent-01/load
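The two layouts being compared can be sketched as plain path-to-bytes mappings. The helper names and the sample values are hypothetical, and JSON is only one possible serialization for the single-znode case.

```python
import json


def status_as_single_znode(agent, status):
    """Without child znodes: the whole hash table is serialized into one blob."""
    return {"/" + agent: json.dumps(status, sort_keys=True).encode("utf-8")}


def status_as_child_znodes(agent, status):
    """Proposed layout: one child znode per key, no serialization format needed."""
    return {
        "/%s/%s" % (agent, key): str(value).encode("utf-8")
        for key, value in status.items()
    }


# Made-up sample values for illustration
status = {"ip": "10.0.0.7", "memory": 2048, "load": 0.42}
single = status_as_single_znode("agent-01", status)
children = status_as_child_znodes("agent-01", status)
```

With the child layout a reader interested only in `/agent-01/load` can read or watch that one znode instead of fetching and parsing the whole blob.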

Page 19: Apache ZooKeeper TechTuesday

#829 /zookeeper/sessions/* 

JIRA: ZOOKEEPER-829
Requested by HBase developers: "we'd like the ability to forcible expire someone else's ZK session"

Page 20: Apache ZooKeeper TechTuesday

Plenty of bug fixes

Join the community!

Page 21: Apache ZooKeeper TechTuesday

Resources

http://hadoop.apache.org/zookeeper/
http://wiki.apache.org/hadoop/ZooKeeper/ProjectDescription
http://wiki.apache.org/hadoop/ZooKeeper/Tao

Page 22: Apache ZooKeeper TechTuesday

Thanks! Questions?