Postman: an Elastic Highly Resilient Pub/Sub Framework for Self Sustained Service Independent P2P Networks

Gil Einziger, Roy Friedman, Computer Science Department, Technion

Jan 12, 2016

Page 1:

Gil Einziger, Roy Friedman
Computer Science Department, Technion

Postman: an Elastic Highly Resilient Pub/Sub Framework for Self Sustained Service Independent P2P Networks

Page 2:

Background: Publish/Subscribe

• Publisher: any entity that wishes to publish some event
  – Example publication: "Look at my new hairstyle"

Page 3:

Background: Publish/Subscribe

• Subscriber: any entity that wishes to be notified about events that match its interests (the set of interests is also called a subscription)
  – Example subscription: "I want to know everything about hairstyles"
  – Example subscription: "I want to know everything about Ariana Grande"
  – Example subscription: "I only care about science fiction"

Page 4:

Background: Publish/Subscribe

• The system’s goal is to deliver notifications about events to interested subscribers and only to them
  – Decoupling of information producers from consumers

• Applications:
  – Social networking (Twitter and the like)
  – Stock quotes
  – Control systems
  – Data-center management
  – Etc.
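The delivery goal above (notify interested subscribers and only them) can be illustrated with a minimal topic-based matcher. This is a sketch under assumptions: the class and method names (`TopicMatcher`, `subscribe`, `publish`, `received`) are hypothetical and are not taken from Postman's code, and a single in-memory map stands in for the whole network.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy topic-based pub/sub matcher (hypothetical names, not Postman's API). */
final class TopicMatcher {
    // topic -> ids of subscribers interested in it
    private final Map<String, Set<String>> subscribersByTopic = new HashMap<>();
    // subscriber id -> notifications received so far
    private final Map<String, List<String>> inbox = new HashMap<>();

    void subscribe(String subscriber, String topic) {
        subscribersByTopic.computeIfAbsent(topic, t -> new HashSet<>()).add(subscriber);
        inbox.computeIfAbsent(subscriber, s -> new ArrayList<>());
    }

    /** Delivers the event to subscribers of the topic only; returns how many were notified. */
    int publish(String topic, String event) {
        Set<String> interested = subscribersByTopic.getOrDefault(topic, Collections.emptySet());
        for (String subscriber : interested) {
            inbox.get(subscriber).add(event);
        }
        return interested.size();
    }

    List<String> received(String subscriber) {
        return inbox.getOrDefault(subscriber, Collections.emptyList());
    }

    public static void main(String[] args) {
        TopicMatcher m = new TopicMatcher();
        m.subscribe("alice", "hairstyles");
        m.subscribe("bob", "science fiction");
        m.publish("hairstyles", "Look at my new hairstyle");
        System.out.println("alice: " + m.received("alice"));
        System.out.println("bob: " + m.received("bob"));
    }
}
```

The publisher never names a recipient; it only names a topic, which is the decoupling of producers from consumers mentioned on the slide.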

Page 5:

Background: P2P

• Decentralized systems in which (most of) the communication is performed directly between the end nodes of the system – the peers

• Often, peers are machines donated by users
  – But can also be set-top boxes, routers, a large datacenter’s servers, etc.

• Famous example applications:
  – Skype, BitTorrent (and other file sharing), IPTV, Bitcoin

Page 6:

It’s a Brave New World Out There

• Most users access online content through their mobile devices
  – Intermittent connectivity
  – Limited bandwidth
  – Limited battery life
  – Limited resources

• We need to decouple the devices used to access the data from the ones serving the P2P network

Page 7:

A Vision for Future P2P Solutions

• Whether run as a true P2P network between donated machines or inside a data center:
  – Decoupling between devices that consume services and the ones providing the service
    • Incentives might be in the form of revenue share with advertisers or paid subscribers
  – P2P machines are used for providing multiple services
    • Not feasible to optimize the P2P overlay for a specific service

Page 8:

Problem Statement

• A scalable and efficient pub/sub system for self-sustained P2P networks
  – The challenge
    • Subscribers might not be present much of the time
    • Short client sessions
    • Use the existing overlay
  – Quality Goals
    • High delivery rate
    • Efficient publication delivery
    • Reasonable latency
    • High churn resilience

Page 9:

Our Solution: Overview

Page 10:

Our Solution: Subscribing to Mailbox

• Clients that are aware of mailboxes serving their topics simply notify these mailboxes of their subscription

• A client that is unaware of such mailboxes initiates multiple biased random walks
  – Each mailbox distributes a hint (a Bloom filter) of the topics it subscribes to, to its overlay neighbors up to some distance
  – The random walks favor visiting nodes whose hints include a match
  – The random walks continue for a given TTL, trying to find as many matching mailboxes as possible

• If none is found, then the home node becomes a mailbox for these topics
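The subscription steps above can be sketched as follows. This is an illustration under stated assumptions, not the Postman implementation: hints are propagated to direct neighbors only (distance 1), the bias is the simplest possible one (pick uniformly among neighbors whose hint matches, otherwise among all neighbors), and the tiny Bloom filter uses two probes derived from `String.hashCode`. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

final class BiasedWalkSketch {
    /** Tiny Bloom filter used as a hint; false positives are possible, false negatives are not. */
    static final class Hint {
        private static final int BITS = 256;
        private final BitSet bits = new BitSet(BITS);

        private static int[] probes(String topic) {
            int h = topic.hashCode();
            return new int[] { Math.floorMod(h, BITS), Math.floorMod(h * 31 + 17, BITS) };
        }
        void add(String topic) { for (int p : probes(topic)) bits.set(p); }
        boolean mightContain(String topic) {
            for (int p : probes(topic)) if (!bits.get(p)) return false;
            return true;
        }
    }

    static final class Node {
        final String id;
        final List<Node> neighbors = new ArrayList<>();
        final Set<String> mailboxTopics = new HashSet<>(); // empty unless the node is a mailbox
        final Hint hint = new Hint();                      // topics served by this node or a neighbor
        Node(String id) { this.id = id; }
    }

    static void link(Node a, Node b) { a.neighbors.add(b); b.neighbors.add(a); }

    /** Turns n into a mailbox for the topic and pushes the hint to direct neighbors (assumed distance 1). */
    static void makeMailbox(Node n, String topic) {
        n.mailboxTopics.add(topic);
        n.hint.add(topic);
        for (Node nb : n.neighbors) nb.hint.add(topic);
    }

    /** One biased random walk of at most ttl hops; returns the matching mailboxes it visited. */
    static Set<Node> walk(Node home, String topic, int ttl, Random rnd) {
        Set<Node> found = new LinkedHashSet<>();
        Node current = home;
        for (int step = 0; step < ttl && !current.neighbors.isEmpty(); step++) {
            List<Node> matching = new ArrayList<>();
            for (Node nb : current.neighbors) if (nb.hint.mightContain(topic)) matching.add(nb);
            List<Node> pool = matching.isEmpty() ? current.neighbors : matching;
            current = pool.get(rnd.nextInt(pool.size()));
            if (current.mailboxTopics.contains(topic)) found.add(current);
        }
        return found;
    }

    /** Runs several walks; if none finds a mailbox, the home node becomes one. */
    static Set<Node> subscribe(Node home, String topic, int walks, int ttl, Random rnd) {
        Set<Node> found = new LinkedHashSet<>();
        for (int i = 0; i < walks; i++) found.addAll(walk(home, topic, ttl, rnd));
        if (found.isEmpty()) { makeMailbox(home, topic); found.add(home); }
        return found;
    }

    public static void main(String[] args) {
        Node home = new Node("home"), a = new Node("a"), b = new Node("b"), c = new Node("c");
        link(home, a); link(a, b); link(b, c);
        makeMailbox(c, "hairstyles");
        for (Node n : subscribe(home, "hairstyles", 2, 3, new Random(1))) System.out.println(n.id);
    }
}
```

Because a Bloom filter can report false positives, a walk may occasionally be steered toward a node that has no matching mailbox nearby; the TTL bounds the cost of such detours.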

Page 11:

Our Solution: Mailboxes

• An overloaded mailbox can refuse to accept new clients and topics

• Mailboxes disappear naturally due to churn or when they are underutilized

• The important objective is load sharing rather than load balancing
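The mailbox behavior above can be sketched as a simple admission rule. The thresholds and names here (`ElasticMailbox`, `maxRegistrations`, `minRegistrations`, `shouldDissolve`) are hypothetical; the slides do not say how overload or underutilization is measured, so this sketch simply counts (client, topic) registrations.

```java
import java.util.HashSet;
import java.util.Set;

/** Sketch of an elastic mailbox; the load measure and thresholds are assumptions. */
final class ElasticMailbox {
    private final int minRegistrations; // below this the mailbox counts as underutilized
    private final int maxRegistrations; // at this level the mailbox counts as overloaded
    private final Set<String> registrations = new HashSet<>(); // "client|topic" pairs

    ElasticMailbox(int minRegistrations, int maxRegistrations) {
        this.minRegistrations = minRegistrations;
        this.maxRegistrations = maxRegistrations;
    }

    /** Returns false when the mailbox is overloaded and refuses the new registration. */
    boolean register(String client, String topic) {
        String key = client + "|" + topic;
        if (registrations.contains(key)) return true;                 // already registered
        if (registrations.size() >= maxRegistrations) return false;   // refuse: overloaded
        registrations.add(key);
        return true;
    }

    void unregister(String client, String topic) {
        registrations.remove(client + "|" + topic);
    }

    int load() { return registrations.size(); }

    /** An underutilized mailbox may step down and return to being an ordinary node. */
    boolean shouldDissolve() { return registrations.size() < minRegistrations; }

    public static void main(String[] args) {
        ElasticMailbox mailbox = new ElasticMailbox(1, 2);
        System.out.println(mailbox.register("alice", "hairstyles"));
        System.out.println(mailbox.register("bob", "hairstyles"));
        System.out.println(mailbox.register("carol", "hairstyles"));
    }
}
```

A refused client is not lost: it can keep walking to find another mailbox or fall back to its home node, which shares the load across mailboxes without requiring it to be perfectly balanced.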

Page 12:

Our Solution: Dissemination

• Spanning tree among mailboxes that know each other

• Random walks to discover new mailboxes and disseminate to them
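The spanning-tree part of dissemination can be sketched as forwarding along tree edges with duplicate suppression. The slides do not describe how the tree is built or maintained, so this sketch assumes the tree edges are already given; `TreeDissemination`, `Mailbox`, and `disseminate` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Sketch: forward an event over a given tree of mailboxes (tree construction is assumed). */
final class TreeDissemination {
    static final class Mailbox {
        final String id;
        final List<Mailbox> treeNeighbors = new ArrayList<>();
        final Set<String> seenEvents = new HashSet<>();
        Mailbox(String id) { this.id = id; }
    }

    static void connect(Mailbox a, Mailbox b) {
        a.treeNeighbors.add(b);
        b.treeNeighbors.add(a);
    }

    /**
     * Delivers eventId to current and forwards it to every tree neighbor except the sender.
     * Returns the number of mailbox-to-mailbox messages sent.
     */
    static int disseminate(Mailbox current, Mailbox sender, String eventId) {
        if (!current.seenEvents.add(eventId)) return 0; // already seen: drop the duplicate
        int messages = 0;
        for (Mailbox next : current.treeNeighbors) {
            if (next != sender) {
                messages += 1 + disseminate(next, current, eventId);
            }
        }
        return messages;
    }

    public static void main(String[] args) {
        Mailbox a = new Mailbox("a"), b = new Mailbox("b"), c = new Mailbox("c"), d = new Mailbox("d");
        connect(a, b); connect(a, c); connect(c, d);
        System.out.println("messages sent: " + disseminate(a, null, "event-1"));
    }
}
```

On a tree with n mailboxes this costs n - 1 messages per event, which is why the tree handles mailboxes that already know each other, while random walks are reserved for discovering mailboxes outside the tree.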

Page 14:

Our Solution: Dissemination

• Spanning tree + random walks between mailboxes

• Normally, a mailbox pushes events to corresponding registered clients

• Additionally, out-of-band gossip between mailboxes and clients
  – Clients poll their set of known mailboxes
  – Exchange list of known events with each polled mailbox
  – Occurs periodically plus after re-connection
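The polling step above can be sketched as a set reconciliation between a client and each of its mailboxes. This is a sketch under assumptions: events are identified by opaque string ids, all events belong to topics both sides care about, and missing events flow in both directions during an exchange (the slides only say that lists of known events are exchanged). The names `GossipExchange`, `Peer`, `exchange`, and `poll` are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Sketch of the out-of-band gossip between a client and its known mailboxes. */
final class GossipExchange {
    static final class Peer {
        final Set<String> knownEvents = new TreeSet<>();
    }

    /** Both sides compare known event ids and fill each other's gaps; returns events transferred. */
    static int exchange(Peer client, Peer mailbox) {
        Set<String> missingAtClient = new HashSet<>(mailbox.knownEvents);
        missingAtClient.removeAll(client.knownEvents);
        Set<String> missingAtMailbox = new HashSet<>(client.knownEvents);
        missingAtMailbox.removeAll(mailbox.knownEvents);
        client.knownEvents.addAll(missingAtClient);
        mailbox.knownEvents.addAll(missingAtMailbox);
        return missingAtClient.size() + missingAtMailbox.size();
    }

    /** One polling round, run periodically and after the client reconnects. */
    static int poll(Peer client, List<Peer> mailboxes) {
        int transferred = 0;
        for (Peer mailbox : mailboxes) transferred += exchange(client, mailbox);
        return transferred;
    }

    public static void main(String[] args) {
        Peer client = new Peer(), mb1 = new Peer(), mb2 = new Peer();
        client.knownEvents.add("e1");
        mb1.knownEvents.add("e2");
        mb2.knownEvents.add("e3");
        poll(client, Arrays.asList(mb1, mb2));
        System.out.println("client knows: " + client.knownEvents);
    }
}
```

Under these assumptions a client that polls several mailboxes also carries events between them, so an event missed by the tree or the random walks can still reach a mailbox on a later round.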

Page 15:

Implementation

– Written in Java
– Open source project
  • All code, including tests, is available online
– Can be run on top of real IP networks as well as the PeerSim simulator
  • In the real-network case, executed on top of the OpenKAD implementation of the Kademlia DHT
– Measurements confirm that simulated and real runs give similar results for networks of similar size
  • Simulations can be used to explore scalability
  • Real networks can be used to validate simulation results

Page 16:

Evaluation: Methodology

• Traces:
  – Synthetic traces
    • Subscriptions are spread uniformly across clients/home nodes
    • Topic publication distribution is Zipf-like with α=0.9
  – Twitter traces

• Metrics:
  – Delivery rate
  – Communication load
  – Mailbox subscription pattern vs. users’
  – Effects of churn
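To make the synthetic workload concrete: a Zipf-like distribution with exponent α gives the topic of popularity rank k a publication probability proportional to 1/k^α. The sketch below samples topic ranks this way; the number of topics (100), the seed, and the class name `ZipfTopics` are assumptions for illustration, since the slides state only α=0.9.

```java
import java.util.Random;

/** Sketch of a Zipf-like topic sampler: P(rank k) is proportional to 1 / k^alpha. */
final class ZipfTopics {
    private final double[] cdf; // cdf[i] = P(rank <= i + 1)

    ZipfTopics(int topics, double alpha) {
        double[] weight = new double[topics];
        double sum = 0;
        for (int k = 1; k <= topics; k++) {
            weight[k - 1] = 1.0 / Math.pow(k, alpha);
            sum += weight[k - 1];
        }
        cdf = new double[topics];
        double acc = 0;
        for (int i = 0; i < topics; i++) {
            acc += weight[i] / sum;
            cdf[i] = acc;
        }
    }

    /** Probability that a publication goes to the topic with the given rank (1 = most popular). */
    double probability(int rank) {
        return cdf[rank - 1] - (rank == 1 ? 0 : cdf[rank - 2]);
    }

    /** Draws a topic rank in [1, topics] by inverting the CDF with a linear scan. */
    int sample(Random rnd) {
        double u = rnd.nextDouble();
        for (int i = 0; i < cdf.length; i++) {
            if (u <= cdf[i]) return i + 1;
        }
        return cdf.length; // guards against floating-point rounding at the tail
    }

    public static void main(String[] args) {
        ZipfTopics zipf = new ZipfTopics(100, 0.9);
        Random rnd = new Random(42);
        for (int i = 0; i < 5; i++) System.out.println("publish to topic rank " + zipf.sample(rnd));
    }
}
```

With α=0.9 the most popular topic is published to about 1.87 times as often as the second one (2^0.9), so a few topics dominate the traffic while a long tail of topics sees few publications.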

Page 17:

Results: Synthetic Workload

[Figure: Delivery rate over time vs. network size. x-axis: Time (minutes); y-axis: Delivery Rate]

1) Delivery rate approaches 100% after a few minutes

2) For 1500 nodes, simulation results are close to those of real runs

Page 18:

Results: Twitter Traces

[Figure: Delivery rate over time. x-axis: Time (minutes); y-axis: Delivery Rate]

Page 19:

Results: Load Distribution

[Figure: Load Distribution (Twitter). x-axis: # node; y-axis: Total Handled Messages]

1) Almost all load goes to mailboxes only

2) Even the most loaded nodes need to handle fewer than 10 messages per second

Page 20:

Results: Subscription Pattern

[Figure: Twitter (Feb. 7 2010 19:00-20:20). x-axis: # node; y-axis: Subscription Number]

Subscription pattern of mailboxes is much more uniform than that of clients => balanced dissemination trees

Page 21:

Results: Subscription Pattern

[Figure: #Registered Clients/#Registered Mailboxes. Axis labels: Client Subscriptions, Mailbox Subscriptions]

Only a small number of mailboxes register even to the most popular topics => the dissemination trees are relatively small

Page 22:

Results: Single 10% Churn Event

[Figure: Churn Recovery Time. x-axis: Time (minutes); y-axis: Miss Rate]

Page 23:

Results: Repetitive 10% Churn

[Figure: Churn Recovery Over Time. x-axis: Time (minutes); y-axis: Miss Rate]

Page 24:

Results: Repetitive 100% Churn

[Figure: Churn Recovery Time (Losing All Mailboxes). x-axis: Time (minutes); y-axis: Miss Rate]

Page 25:

Summary

• The concept of elastic mailboxes
  – Self electing, self evaporating, self organizing

• Complementing delivery mechanisms
  – Spanning tree
  – Random walks
  – Out-of-band gossip through client interaction

• End result:
  – Mailboxes dramatically reduce the scalability problem
  – Highly efficient, highly effective, highly robust to failures and churn

Page 26:

Open Issues

• Exploit subscription similarity
• Privacy

Page 27:

Q&A

• Thanks for listening…